Commit Graph

225 Commits

Author SHA1 Message Date
Richard Henderson
6820964e2f
tcg/aarch64: Fix tcg_out_movi
There were some patterns, like 0x0000_ffff_ffff_00ff, for which we
would select to begin a multi-insn sequence with MOVN, but would
fail to set the 0x0000 lane back from 0xffff.

Backports commit 50b468d42107a2c646b1c566ed17d9ec362c51c4 from qemu
2018-03-01 09:15:34 -05:00
Richard Henderson
a03666f2f2
tcg/aarch64: Fix addsub2 for 0+C
When al == xzr, we cannot use addi/subi because that encodes xsp.
Force a zero into the temp register for that (rare) case.

Backports commit 028fbea47713f909d6ea761a457779a82b276247 from qemu
2018-03-01 09:13:54 -05:00
Joseph Myers
7ff441826c
tcg: correct 32-bit tcg_gen_ld8s_i64 sign-extension
The version of tcg_gen_ld8s_i64 for 32-bit systems does a load into
the low part of the return value - then attempts a sign extension into
the high part, but wrongly sets the high part to a sign extension of
itself rather than of the low part. This results in TCG internal
errors from the use of the uninitialized high part (in some GCC tests
of AArch64 NEON shift intrinsics, in particular). This patch corrects
the sign-extension logic, making it match other functions such as
tcg_gen_ld16s_i64.

Backports commit 3ff91d7e85176f8b4b131163d7fd801757a2c949 from qemu
2018-03-01 08:41:23 -05:00
Peter Maydell
f9c5c1a604
tcg/tcg.h: Improve documentation of TCGv_i32 etc types
The typedefs we use for the TCGv_i32, TCGv_i64 and TCGv_ptr
types are somewhat confusing, because we define them as
pointers to structs, but the structs themselves are never
defined. Explain in the comments a bit more clearly why
this is OK and what is going on under the hood.

Backports commit a40d4701bc9f6e6a3bbfb7b4fbe756a5b72b5df1 from qemu
2018-03-01 08:40:35 -05:00
Richard Henderson
f5a35908da
tcg: Add tcg_gen_mulsu2_{i32,i64,tl}
This multiply has one signed input and one unsigned input,
producing the full double-width result.

Backports commit 5087abfb7dfd1d368ae6939420057036b4d8e509 from qemu
2018-03-01 08:39:37 -05:00
Paolo Bonzini
9d64a89acf
tcg: comment on which functions have to be called with tb_lock held
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks. Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.

This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.

Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock. The next one will rectify this.

Backports commit 7d7500d99895f888f97397ef32bb536bb0df3b74 from qemu
2018-02-28 10:26:28 -05:00
Richard Henderson
b48508a6c1
tcg: Emit barriers with parallel_cpus
Backports commit 91682118aa330aff7e8ef0cc685c32d101f49940 from qemu
2018-02-27 22:28:33 -05:00
Richard Henderson
064543a415
tcg: Add CONFIG_ATOMIC64
Allow qemu to build on 32-bit hosts without 64-bit atomic ops.

Even if we only allow 32-bit hosts to multi-thread emulate 32-bit
guests, we still need some way to handle the 32-bit guest using a
64-bit atomic operation. Do so by dropping back to single-step.

Backports commit df79b996a7b21c6ea7847f7927a2e1a294b86c72 from qemu
2018-02-27 22:25:36 -05:00
Richard Henderson
da01e53757
tcg: Add atomic128 helpers
Force the use of cmpxchg16b on x86_64.

Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction. Further, it's required by Windows 8 so no new cpus
will ever omit it.

If we truely care about these, then we could check this at startup time
and then avoid executing paths that use it.

Backports commit 7ebee43ee3e2fcd7b5063058b7ef74bc43216733 from qemu
2018-02-27 21:43:48 -05:00
Richard Henderson
5c0ce1b99c
tcg: Add atomic helpers
Add all of cmpxchg, op_fetch, fetch_op, and xchg.
Handle both endian-ness, and sizes up to 8.
Handle expanding non-atomically, when emulating in serial.

Backports commit c482cb117cc418115ca9c6d21a7a2315414c0a40 from qemu
2018-02-27 15:57:47 -05:00
Richard Henderson
4e498cc54d
target-m68k: Reorg flags handling
Separate all ccr bits. Continue to batch updates via cc_op.

Backports commit 620c6cf66584bfbee90db84a7e87a6eabf230ca9 from qemu
2018-02-27 10:02:02 -05:00
Paolo Bonzini
8734e13a73
tcg: try sti when moving a constant into a dead memory temp
This comes from free from unifying tcg_reg_alloc_mov and
tcg_reg_alloc_movi's handling of TEMP_VAL_CONST. It triggers
often on moves to cc_dst, such as the following translation
of "sub $0x3c,%esp":

before: after:
subl $0x3c,%ebp subl $0x3c,%ebp
movl %ebp,0x10(%r14) movl %ebp,0x10(%r14)
movl $0x3c,%ebx movl $0x3c,0x2c(%r14)
movl %ebx,0x2c(%r14)

Backports commit 0fe4fca4e1a5e06a270127dd80bb753d4dda61c6 from qemu
2018-02-26 10:08:47 -05:00
Alex Bennée
bf72733576
tcg/optimize: move default return out of if statement
This is to appease sanitizer builds which complain that:

"error: control reaches end of non-void function"

Backports commit 550276ae0a88851edda2cb7fcdd64256dbb8e314 from qemu
2018-02-26 05:05:21 -05:00
Richard Henderson
2ab4b8fa4d
tcg/i386: Extend TARGET_PAGE_MASK to the proper type
TARGET_PAGE_MASK, as defined, has type "int". We need to extend
that to the proper target width before oring in an "unsigned".

Backports commit ebb90a005da67147245cd38fb04a965a87a961b7 from qemu
2018-02-26 03:32:38 -05:00
Pranith Kumar
16d71f0f10
tcg: Optimize fence instructions
This commit optimizes fence instructions. Two optimizations are
currently implemented: (1) unnecessary duplicate fence instructions,
and (2) merging weaker fences into a stronger fence.

[rth: Merge tcg_optimize_mb back into tcg_optimize, so that we only
loop over the opcode stream once. Merge "unrelated" weaker barriers
into one stronger barrier.]

Backports commit 34f939218ce78163171addd63750e1e0300376ab from qemu
2018-02-26 03:29:59 -05:00
Pranith Kumar
65a73763e3
tcg/sparc: Add support for fence
Backports commit f8f03b3707b49898052fb8cd75ee31d19c8161fc from qemu
2018-02-26 03:20:39 -05:00
Pranith Kumar
a6fdc24e28
tcg/s390: Add support for fence
Backports commit c9314d610e0e5da4d2cd5a36f3563d102b3294e0 from qemu
2018-02-26 03:19:41 -05:00
Pranith Kumar
bdd9cad15c
tcg/ppc: Add support for fence
Backports commit 7b4af5ee8a1336bc39714b6de47924ee71fba761 from qemu
2018-02-26 03:18:43 -05:00
Pranith Kumar
5f10101245
tcg/mips: Add support for fence
Backports commit 6f0b99104a396905870edc3049310ece29b6b8d6 from qemu
2018-02-26 03:17:34 -05:00
Pranith Kumar
e29cbe9640
tcg/arm: Add support for fence
Backports commit 40f191ab8226fdada185efa49c44b60d8f494890 from qemu
2018-02-26 03:13:17 -05:00
Pranith Kumar
907060b865
tcg/aarch64: Add support for fence
Backports commit c7a59c2a92592e556b9361437c9c4229917bd1e3 from qemu
2018-02-26 03:11:03 -05:00
Pranith Kumar
d49bd55f52
tcg/i386: Add support for fence
Generate a 'lock orl $0,0(%esp)' instruction for ordering instead of
mfence which has similar ordering semantics.

Backports commit a7d00d4effb58889ac6df64f98ac50c9d1594149 from qemu
2018-02-26 03:10:58 -05:00
Pranith Kumar
5e44ce9be8
Introduce TCGOpcode for memory barrier
This commit introduces the TCGOpcode for memory barrier instruction.

This opcode takes an argument which is the type of memory barrier
which should be generated.

Backports commit f65e19bc2c9e8358e634d309606144ac2a3c2936 from qemu
2018-02-26 03:02:41 -05:00
Richard Henderson
91f5cf0417
tcg: Support arbitrary size + alignment
Previously we allowed fully unaligned operations, but not operations
that are aligned but with less alignment than the operation size.

In addition, arm32, ia64, mips, and sparc had been omitted from the
previous overalignment patch, which would have led to that alignment
being enforced.

Backports commit 85aa80813dd9f5c1f581c743e45678a3bee220f8 from qemu
2018-02-26 02:47:26 -05:00
Ladi Prosek
7acc14da16
Remove unused function declarations
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.

Backports commit d4b84d564ee3eb7a58e4585d671fb3c220b6c3b9 from qemu
2018-02-26 02:31:46 -05:00
Thomas Huth
b581d4033f
tcg: Remove duplicate header includes
host-utils.h and timer.h are included twice in tcg.c.
One time should be enough.

Backports commit 347519eb9d68303a6c23a7663c0fa6c20a225191 from qemu
2018-02-26 02:29:38 -05:00
Richard Henderson
ede1cae3dc
tcg: Lower indirect registers in a separate pass
Rather than rely on recursion during the middle of register allocation,
lower indirect registers to loads and stores off the indirect base into
plain temps.

For an x86_64 host, with sufficient registers, this results in identical
code, modulo the actual register assignments.

For an i686 host, with insufficient registers, this means that temps can
be (temporarily) spilled to the stack in order to satisfy an allocation.
This as opposed to the possibility of not being able to spill, to allocate
a register for the indirect base, in order to perform a spill.

Backports commit 5a18407f55ade924aa6397c9a043a9ffd59645fe from qemu
2018-02-25 22:32:28 -05:00
Richard Henderson
8a012ff6d3
tcg: Require liveness analysis
Backports commit c0ef05b5e62ab0c291a94022f14104e61e306f03 from qemu
2018-02-25 22:20:42 -05:00
Richard Henderson
2aa46dd9a1
tcg: Include liveness info in the dumps
Backports commit bdfb460ef77500f7b186759b585f06ff2120929d from qemu
2018-02-25 22:13:08 -05:00
Richard Henderson
e973e89a57
tcg: Compress dead_temps and mem_temps into a single array
We only need two bits per temporary. Fold the two bytes into one,
and reduce the memory and cachelines required during compilation.

Backports commit c70fbf0a9938baf3b4f843355a77c17a7e945b98 from qemu
2018-02-25 22:07:08 -05:00
Richard Henderson
690985a582
tcg: Fold life data into TCGOp
Reduce the size of other bitfields to make room.
This reduces the cache footprint of compilation.

Backports commit bee158cb4dde35c41632a3a129c869f14a32f8f0 from qemu
2018-02-25 21:49:42 -05:00
Richard Henderson
1547048a22
tcg: Reorg TCGOp chaining
Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Backports commit dcb8e75870e2de199db853697f8839cb603beefe from qemu
2018-02-25 21:44:50 -05:00
Richard Henderson
b2e6e351c2
tcg: Compress liveness data to 16 bits
This reduces both memory usage and per-insn cacheline usage
during code generation.

Backports commit a1b3c48d2b23d6eaeb4529d3e1183d2648731bf8 from qemu
2018-02-25 21:27:24 -05:00
Paolo Bonzini
a47c68164d
compiler: never omit assertions if using a static analysis tool
Assertions help both Coverity and the clang static analyzer avoid
false positives, but on the other hand both are confused when
the condition is compiled as (void)(x != FOO). Always expand
assertion macros when using Coverity or clang, through a new
QEMU_STATIC_ANALYSIS preprocessor symbol.

This fixes a couple false positives in TCG.

Backports commit 8bff06a0bbf257a2083223534c1607bf87d913e6 from qemu
2018-02-25 19:19:28 -05:00
Richard Henderson
1dcd14d434
target-sparc: Store %asi in TB flags
Knowing the value of %asi at translation time means that we
can handle the common settings without a function call.

The steady state appears to be %asi == ASI_P, so that sparcv9
code can use offset forms of lda/sta. The %asi register gets
pushed and popped on entry to certain functions, but it rarely
takes on values other than ASI_P or ASI_AIUP. Therefore we're
unlikely to be expanding the set of TBs created.

Backports commit a6d567e523ed7e928861f3caa5d49368af3f330d from qemu
2018-02-25 05:17:21 -05:00
Richard Henderson
395e00cdc5
target-sparc: Remove softint as a TCG global
The global is only ever read for one insn; we can just as well
use a load from env instead and generate the same code. This
also allows us to indicate the the associated helpers do not
touch TCG globals.

Backports commit e86ceb0d652baa5738e05a59ee0e7989dafbeaa1 from qemu
2018-02-25 04:49:27 -05:00
Markus Armbruster
25ec9ab016
tcg: Clean up tcg-target.h header guards
These use guard symbols like TCG_TARGET_$target.
scripts/clean-header-guards.pl doesn't like them because they don't
match their file name (they should, to make guard collisions less
likely).

Clean them up: use guard symbol $target_TCG_TARGET_H for
tcg/$target/tcg-target.h.

Backports commit 14e54f8ecfe9c5e17348f456781344737ed10b3b from qemu
2018-02-25 04:15:08 -05:00
Sergey Sorokin
e4d123caa9
tcg: Improve the alignment check infrastructure
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.

Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu
2018-02-25 02:23:28 -05:00
Richard Henderson
23586e2674
tcg: Optimize spills of constants
While we can store constants via constrants on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.

Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail). Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.

Backports commit 59d7c14eeff8d2ad7f61aed86ce5a176113bc153 from qemu
2018-02-25 01:45:29 -05:00
Richard Henderson
64fda683b1
tcg: Fix name for high-half register 2018-02-25 01:36:35 -05:00
Lluís Vilanova
2297527755
exec: [tcg] Track which vCPU is performing translation and execution
Information is tracked inside the TCGContext structure, and later used
by tracing events with the 'tcg' and 'vcpu' properties.

The 'cpu' field is used to check tracing of translation-time
events ("*_trans"). The 'tcg_env' field is used to pass it to
execution-time events ("*_exec").

Backports commit 7c2550432abe62f53e6df878ceba6ceaf71f0e7e from qemu
2018-02-24 19:21:39 -05:00
Emilio G. Cota
8518f55df7
compiler.h: add QEMU_ALIGNED() to enforce struct alignment
Backports commit 911a4d2215b05267b16925503218f49d607c6b29 from qemu
2018-02-24 17:32:43 -05:00
Paolo Bonzini
9485b7c2e1
cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions. It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Backports commit 63c915526d6a54a95919ebece83fa9ca631b2508 from qemu
2018-02-24 02:39:08 -05:00
Paolo Bonzini
58693409ea
exec: extract exec/tb-context.h
TCG backends do not need most of exec-all.h; extract what they actually
need to a separate file or move it directly to tcg.h. The next patch
will stop including exec-all.h from everywhere.

Backports commit 00f6da6a1a5d1ce085334eccbb50ec899ceed513 from qemu
2018-02-24 02:09:58 -05:00
Paolo Bonzini
37f26922dd
qemu-common: push cpu.h inclusion out of qemu-common.h
Backports commit 33c11879fd422b759483ed25fef133ea900ea8d7 from qemu
2018-02-24 01:50:56 -05:00
Sergey Fedorov
c9700af2bd
tcg: Clean up from 'next_tb'
The value returned from tcg_qemu_tb_exec() is the value passed to the
corresponding tcg_gen_exit_tb() at translation time of the last TB
attempted to execute. It is a little confusing to store it in a variable
named 'next_tb'. In fact, it is a combination of 4-byte aligned pointer
and additional information in its two least significant bits. Break it
down right away into two variables named 'last_tb' and 'tb_exit' which
are a pointer to the last TB attempted to execute and the TB exit
reason, correspondingly. This simplifies the code and improves its
readability.

Correct a misleading documentation comment for tcg_qemu_tb_exec() and
fix logging in cpu_tb_exec(). Also rename a misleading 'next_tb' in
another couple of places.

Backports commit 819af24b9c1e95e6576f1cefd32f4d6bf56dfa56 from qemu
2018-02-23 23:29:04 -05:00
Sergey Fedorov
ffdc9d6323
tcg: Allow goto_tb to any target PC in user mode
In user mode, there's only a static address translation, TBs are always
invalidated properly and direct jumps are reset when mapping change.
Thus the destination address is always valid for direct jumps and
there's no need to restrict it to the pages the TB resides in.

Backports commit 90aa39a1cc4837360889f0e033ca25cc82100308 from qemu
2018-02-23 23:12:14 -05:00
Sergey Fedorov
73c59faad5
tcg: Clean up direct block chaining safety checks
We don't take care of direct jumps when address mapping changes. Thus we
must be sure to generate direct jumps so that they always keep valid
even if address mapping changes. Luckily, we can only allow to execute a
TB if it was generated from the pages which match with current mapping.

Document tcg_gen_goto_tb() declaration and note the reason for
destination PC limitations.

Some targets with variable length instructions allow TB to straddle a
page boundary. However, we make sure that both of TB pages match the
current address mapping when looking up TBs. So it is safe to do direct
jumps into the both pages. Correct the checks for some of those targets.

Given that, we can safely patch a TB which spans two pages. Remove the
unnecessary check in cpu_exec() and allow such TBs to be patched.

Backports commit 5b053a4a28278bca606eeff7d1c0730df1b047e9 from qemu
2018-02-23 22:26:00 -05:00
Sergey Fedorov
e60c24cecf
tcg: Clean up direct block chaining data fields
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.

Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
tb_next_offset => jmp_reset_offset
tb_jmp_offset => jmp_insn_offset
tb_next => jmp_target_addr
jmp_next => jmp_list_next
jmp_first => jmp_list_first

Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.

Backports commit f309101c26b59641fc1aa8fb2a98a5441cdaea03 from qemu
2018-02-23 21:28:19 -05:00
Sergey Fedorov
87c3382dc8
tcg/mips: Make direct jump patching thread-safe
Ensure direct jump patching in MIPS is atomic by using
atomic_read()/atomic_set() for code patching.

Backports commit c82460a560176ef69c2f0662bd280612e274db96 from qemu
2018-02-23 21:28:18 -05:00