Commit Graph

107 Commits

Author SHA1 Message Date
Richard Henderson
ab8579123e
tcg: Add generic vector ops for multiplication
Backports commit 3774030a3e523689df24a7ed22854ce7a06b0116 from qemu
2018-03-06 16:10:08 -05:00
Richard Henderson
f9c4930ecd
tcg: Add generic vector ops for comparisons
Backports commit 212be173f01e85e6589fd76676827953a84a732b from qemu
2018-03-06 16:09:38 -05:00
Richard Henderson
577ee114c3
tcg: Add generic vector ops for constant shifts
Opcodes are added for scalar and vector shifts, but considering the
varied semantics of these do not expose them to the front ends. Do
go ahead and provide them in case they are needed for backend expansion.

Backports commit d0ec97967f940bbc11dced83422b39c224127f1e from qemu
2018-03-06 14:03:30 -05:00
Richard Henderson
64365612bf
tcg: Add generic vector expanders
Backports commit db432672dc50ed86dda17ac821b7eb07411a90af from qemu
2018-03-06 13:42:52 -05:00
Richard Henderson
b9cd924fa5
tcg: Add types and basic operations for host vectors
Nothing uses or enables them yet.

Backports commit d2fd745fe8b9ac574d28b7ac63c39f6529749bd2 from qemu
2018-03-06 12:13:32 -05:00
Richard Henderson
140058221d
tcg: Generalize TCGOp parameters
We had two fields specific to INDEX_op_call. Rename these and
add some macros so that the fields may be reused for other opcodes.

Backports commit cd9090aa9dbba30db8aec9a2fc103aaf1ab0f5a7 from qemu
2018-03-05 16:53:50 -05:00
Richard Henderson
7fe5f620df
tcg: Dynamically allocate TCGOps
With no fixed array allocation, we can't overflow a buffer.
This will be important as optimizations related to host vectors
may expand the number of ops used.

Use QTAILQ to link the ops together.

Backports commit 15fa08f8451babc88d733bd411d4c94976f9d0f8 from qemu
2018-03-05 16:34:40 -05:00
Richard Henderson
5f074f09ab
tcg: Remove TCGV_UNUSED* and TCGV_IS_UNUSED*
These are now trivial sets and tests against NULL. Unwrap.

Backports commit f764718d0cb30af9f1f8e1d6a33622cc05ca4155 from qemu
2018-03-05 15:58:15 -05:00
Richard Henderson
5ef155a68f
tcg/s390x: Use constant pool for prologue
Rather than have separate code only used for guest_base,
rely on a recent change to handle constant pool entries.

Backports commit ba2c747992f8c315c2fbddba196ce9137430d61d from qemu
2018-03-05 11:28:39 -05:00
Richard Henderson
ef3f552229
tcg: Allow constant pool entries in the prologue
Both ARMv6 and AArch64 currently may drop complex guest_base values
into the constant pool. But generic code wasn't expecting that, and
the pool is not emitted. Correct that.

Backports commit 5b38ee31616d1532c3c3a6dc644a9160d608ed2f from qemu
2018-03-05 11:25:56 -05:00
Richard Henderson
ab9df6244c
tcg: Use offsets not indices for TCGv_*
Using the offset of a temporary, relative to TCGContext, rather than
its index means that we don't use 0. That leaves offset 0 free for
a NULL representation without having to leave index 0 unused.

Backports commit e89b28a63501c0ad6d2501fe851d0c5202055e70 from qemu
2018-03-05 10:12:08 -05:00
Richard Henderson
d450156414
tcg: Remove GET_TCGV_* and MAKE_TCGV_*
The GET and MAKE functions weren't really specific enough.
We now have a full complement of functions that convert exactly
between temporaries, arguments, tcgv pointers, and indices.

The target/sparc change is also a bug fix, which would have affected
a host that defines TCG_TARGET_HAS_extr[lh]_i64_i32, i.e. MIPS64.

Backports commit dc41aa7d34989b552efe712ffe184236216f960b from qemu
2018-03-05 09:12:26 -05:00
Richard Henderson
960eb3f4f9
tcg: Introduce temp_tcgv_{i32,i64,ptr}
Backports commit 085272b35e0644fea373c33b5265c1818b7a978c from qemu
2018-03-05 08:55:52 -05:00
Richard Henderson
2bb5011b18
tcg: Introduce tcgv_{i32,i64,ptr}_{arg,temp}
Transform TCGv_* to an "argument" or a temporary.
For now, an argument is simply the temporary index.

Backports commit ae8b75dc6ec808378487064922f25f1e7ea7a9be from qemu
2018-03-05 08:46:12 -05:00
Richard Henderson
d104b792a6
tcg: Change temp_allocate_frame arg to TCGTemp
Backports commit 2272e4a791b7e1a01ffac143616ba4ece9a5762d from qemu
2018-03-05 07:51:40 -05:00
Richard Henderson
35a7a9c9a4
tcg: Avoid loops against variable bounds
Copy s->nb_globals or s->nb_temps to a local variable for the purposes
of iteration. This should allow the compiler to use low-overhead
looping constructs on some hosts.

Backports commit ac3b88911ebc6fc841f28898ee8aed40839debe2 from qemu
2018-03-05 07:50:06 -05:00
Richard Henderson
1f4ac863bf
tcg: Use per-temp state data in liveness
This avoids having to allocate external memory for each temporary.

Backports commit b83eabeac06e38706738bd5e92b1ba117a1b554d from qemu
2018-03-05 07:47:51 -05:00
Richard Henderson
010ded3088
tcg: Add temp_global bit to TCGTemp
This avoids needing to test the index of a temp against nb_globals.

Backports commit fa477d25470187030614288d35bc734edffa41ee from qemu
2018-03-05 07:21:10 -05:00
Richard Henderson
a9c46ad7a0
tcg: Introduce arg_temp
Backports commit 434391390ba99996af1591b427a73b3f5c05065e from qemu
2018-03-05 07:17:44 -05:00
Richard Henderson
c8f0f6901e
tcg: Propagate TCGOp down to allocators
Backports commit dd186292017641d5b31fc13225a420677e1d20d3 from qemu
2018-03-05 07:12:48 -05:00
Richard Henderson
f1e2ea6847
tcg: Propagate args to op->args in tcg.c
Backports commit efee3746fa471852daba7674b0d34f8c88be7559 from qemu
2018-03-05 07:06:50 -05:00
Richard Henderson
eb488f5bd6
tcg: Merge opcode arguments into TCGOp
Rather than have a separate buffer of 10*max_ops entries,
give each opcode 10 entries. The result is actually a bit
smaller and should have slightly more cache locality.

Backports commit 75e8b9b7aa0b95a761b9add7e2f09248b101a392 from qemu
2018-03-05 04:45:20 -05:00
Emilio G. Cota
239e9771df
tcg: define TCG_HIGHWATER
Will come in handy very soon.

Backports commit a505785cd221994dd3713bde860861869a059940 from qemu
2018-03-05 03:22:27 -05:00
Emilio G. Cota
8552d95c52
exec-all: extract tb->tc_* into a separate struct tc_tb
In preparation for adding tc.size to be able to keep track of
TB's using the binary search tree implementation from glib.

Backports commit e7e168f41364c6e83d0f75fc1b3ce7f9c41ccf76 from qemu
2018-03-05 02:57:22 -05:00
Richard Henderson
9a9c2ede4a
tcg: Remove tcg_regset_{or,and,andnot,not}
Backports commit 07ddf036fa66bca279590c09fe1c46bcdcc5bcff from qemu
2018-03-04 23:34:16 -05:00
Richard Henderson
7ba6f6f5e6
tcg: Remove tcg_regset_set
Backports commit d21369f5fb41299d5e7b032ec6da12da7f95f72f from qemu
2018-03-04 23:31:35 -05:00
Richard Henderson
49d09d6888
tcg: Remove tcg_regset_clear
Backports commit ccb1bb66ea2a42e773bfa04178d8b383ff86d4d8 from qemu
2018-03-04 23:24:45 -05:00
Richard Henderson
7b68a8f0ca
tcg: Add tcg_op_supported
Backports commit be0f34b5840312bbe9627c2b9f68a25f32903dae from qemu
2018-03-04 23:20:28 -05:00
Richard Henderson
e9d8cef430
tcg: Infrastructure for managing constant pools
A new shared header tcg-pool.inc.c adds new_pool_label,
for registering a tcg_target_ulong to be emitted after
the generated code, plus relocation data to install a
pointer to the data.

A new pointer is added to the TCGContext, so that we
dump the constant pool as data, not code.

Backports commit 57a269469dbf70013dab3a176e1735636010a772 from qemu
2018-03-04 22:17:33 -05:00
Richard Henderson
f96514a99c
tcg: Rearrange ldst label tracking
Dispense with TCGBackendData, as it has never been used for more than
holding a single pointer. Use a define in the cpu/tcg-target.h to
signal requirement for TCGLabelQemuLdst, so that we can drop the no-op
tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c.

Backports commit 659ef5cbb893872d25e9d95191cc23b16546c8a1 from qemu
2018-03-04 22:13:13 -05:00
Emilio G. Cota
d3ada2feb5
tcg: allocate TB structs before the corresponding translated code
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.

An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).

Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:

https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
Subject: Re: GSoC 2017 Proposal: TCG performance enhancements

Backports commit 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2 from qemu
2018-03-03 17:05:49 -05:00
Emilio G. Cota
8f4f15e5f5
tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr
Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.

Backports commit cedbcb01529cb6cf9a2289cdbebbc63f6149fc18 from qemu
2018-03-02 21:05:18 -05:00
Richard Henderson
b4b173615c
tcg: Allow an operand to be matching or a constant
This allows an output operand to match an input operand
only when the input operand needs a register.

Backports commit 17280ff4a5f264e01e55ae514ee6d3586f9577b2 from qemu
2018-03-01 15:49:05 -05:00
Richard Henderson
3f38611159
tcg: Pass the opcode width to target_parse_constraint
This will let us choose how to interpret a given constraint
depending on whether the opcode is 32- or 64-bit. Which will
let us share more constraint combinations between opcodes.

At the same time, change the interface to return the advanced
pointer instead of passing it in/out by reference.

Backports commit 069ea736b50b75fdec99c9b8cc603b97bd98419e from qemu
2018-03-01 15:45:40 -05:00
Richard Henderson
b8c93597b4
tcg: Transition flat op_defs array to a target callback
This will allow the target to tailor the constraints to the
auto-detected ISA extensions.

Backports commit f69d277ece43c42c7ab0144c2ff05ba740f6706b from qemu
2018-03-01 15:40:11 -05:00
Richard Henderson
551ef0a9f7
tcg: Add markup for output requires new register
This is the same concept as, and same markup as, the
early clobber markup in gcc.

Backports commit 82790a870992bd87d5fd9e607f40859dcf4f82ac from qemu
2018-03-01 15:24:58 -05:00
Paolo Bonzini
8734e13a73
tcg: try sti when moving a constant into a dead memory temp
This comes from free from unifying tcg_reg_alloc_mov and
tcg_reg_alloc_movi's handling of TEMP_VAL_CONST. It triggers
often on moves to cc_dst, such as the following translation
of "sub $0x3c,%esp":

before: after:
subl $0x3c,%ebp subl $0x3c,%ebp
movl %ebp,0x10(%r14) movl %ebp,0x10(%r14)
movl $0x3c,%ebx movl $0x3c,0x2c(%r14)
movl %ebx,0x2c(%r14)

Backports commit 0fe4fca4e1a5e06a270127dd80bb753d4dda61c6 from qemu
2018-02-26 10:08:47 -05:00
Thomas Huth
b581d4033f
tcg: Remove duplicate header includes
host-utils.h and timer.h are included twice in tcg.c.
One time should be enough.

Backports commit 347519eb9d68303a6c23a7663c0fa6c20a225191 from qemu
2018-02-26 02:29:38 -05:00
Richard Henderson
ede1cae3dc
tcg: Lower indirect registers in a separate pass
Rather than rely on recursion during the middle of register allocation,
lower indirect registers to loads and stores off the indirect base into
plain temps.

For an x86_64 host, with sufficient registers, this results in identical
code, modulo the actual register assignments.

For an i686 host, with insufficient registers, this means that temps can
be (temporarily) spilled to the stack in order to satisfy an allocation.
This as opposed to the possibility of not being able to spill, to allocate
a register for the indirect base, in order to perform a spill.

Backports commit 5a18407f55ade924aa6397c9a043a9ffd59645fe from qemu
2018-02-25 22:32:28 -05:00
Richard Henderson
8a012ff6d3
tcg: Require liveness analysis
Backports commit c0ef05b5e62ab0c291a94022f14104e61e306f03 from qemu
2018-02-25 22:20:42 -05:00
Richard Henderson
2aa46dd9a1
tcg: Include liveness info in the dumps
Backports commit bdfb460ef77500f7b186759b585f06ff2120929d from qemu
2018-02-25 22:13:08 -05:00
Richard Henderson
e973e89a57
tcg: Compress dead_temps and mem_temps into a single array
We only need two bits per temporary. Fold the two bytes into one,
and reduce the memory and cachelines required during compilation.

Backports commit c70fbf0a9938baf3b4f843355a77c17a7e945b98 from qemu
2018-02-25 22:07:08 -05:00
Richard Henderson
690985a582
tcg: Fold life data into TCGOp
Reduce the size of other bitfields to make room.
This reduces the cache footprint of compilation.

Backports commit bee158cb4dde35c41632a3a129c869f14a32f8f0 from qemu
2018-02-25 21:49:42 -05:00
Richard Henderson
1547048a22
tcg: Reorg TCGOp chaining
Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Backports commit dcb8e75870e2de199db853697f8839cb603beefe from qemu
2018-02-25 21:44:50 -05:00
Richard Henderson
b2e6e351c2
tcg: Compress liveness data to 16 bits
This reduces both memory usage and per-insn cacheline usage
during code generation.

Backports commit a1b3c48d2b23d6eaeb4529d3e1183d2648731bf8 from qemu
2018-02-25 21:27:24 -05:00
Sergey Sorokin
e4d123caa9
tcg: Improve the alignment check infrastructure
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.

Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu
2018-02-25 02:23:28 -05:00
Richard Henderson
23586e2674
tcg: Optimize spills of constants
While we can store constants via constrants on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.

Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail). Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.

Backports commit 59d7c14eeff8d2ad7f61aed86ce5a176113bc153 from qemu
2018-02-25 01:45:29 -05:00
Richard Henderson
64fda683b1
tcg: Fix name for high-half register 2018-02-25 01:36:35 -05:00
Emilio G. Cota
8518f55df7
compiler.h: add QEMU_ALIGNED() to enforce struct alignment
Backports commit 911a4d2215b05267b16925503218f49d607c6b29 from qemu
2018-02-24 17:32:43 -05:00
Paolo Bonzini
9485b7c2e1
cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions. It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Backports commit 63c915526d6a54a95919ebece83fa9ca631b2508 from qemu
2018-02-24 02:39:08 -05:00