Commit Graph

109 Commits

Author SHA1 Message Date
Lioncash
a2e7d86ccf
tcg/mips: Support r6 SEL{NE, EQ}Z instead of MOVN/MOVZ
Extend MIPS movcond implementation to support the SELNEZ/SELEQZ
instructions introduced in MIPS r6 (where MOVN/MOVZ have been removed).

Whereas the "MOVN/MOVZ rd, rs, rt" instructions have the following
semantics:
rd = [!]rt ? rs : rd

The "SELNEZ/SELEQZ rd, rs, rt" instructions are slightly different:
rd = [!]rt ? rs : 0

First we ensure that if one of the movcond input values is zero that it
comes last (we can swap the input arguments if we invert the condition).
This is so that it can exactly match one of the SELNEZ/SELEQZ
instructions and avoid the need to emit the other one.

Otherwise we emit the opposite instruction first into a temporary
register, and OR that into the result:
SELNEZ/SELEQZ TMP1, v2, c1
SELEQZ/SELNEZ ret, v1, c1
OR ret, ret, TMP1

Which does the following:
ret = cond ? v1 : v2

Backports commit 137d63902faf4960081856db9242cbaf234a23af from qemu
2018-02-17 15:24:04 -05:00
James Hogan
e71d19df81
tcg/mips: Support r6 multiply/divide encodings
MIPSr6 adds several new integer multiply, divide, and modulo
instructions, and removes several pre-r6 encodings, along with the HI/LO
registers which were the implicit operands of some of those
instructions. Update TCG to use the new instructions when built for r6.

The new instructions actually map much more directly to the TCG ops, as
they only provide a single 32-bit half of the result and in a normal
general purpose register instead of HI or LO.

The mulu2_i32 and muls2_i32 operations are no longer appropriate for r6,
so they are removed from the TCG opcode table. This is because they
would need to emit two separate host instructions anyway (for the high
and low half of the result), which TCG can arrange automatically for us
in the absense of mulu2_i32/muls2_i32 by splitting it into mul_i32 and
mul*h_i32 TCG ops.

Backports commit bc6d0c22b09a72897d9db4482076f89e7de97400 from qemu
2018-02-17 15:24:04 -05:00
James Hogan
9dac598855
tcg/mips: Support r6 JR encoding
MIPSr6 encodes JR as JALR with zero as the link register, and the pre-r6
JR encoding is removed. Update TCG to use the new encoding when built
for r6.

We still use the old encoding for pre-r6, so as not to confuse return
prediction stack hardware which may detect only particular encodings of
the return instruction.

Backports commit 6e0d096989be52c2b945fc83a9bd15d887bbdb47 from qemu
2018-02-17 15:24:04 -05:00
James Hogan
7f1bc28513
tcg/mips: Add use_mips32r6_instructions definition
Add definition use_mips32r6_instructions to the MIPS TCG backend which
is constant 1 when built for MIPS release 6. This will be used to decide
between pre-R6 and R6 instruction encodings.

Backports commit ce14bd4d469f3a14f6cbfceb6360aee066a60d72 from qemu
2018-02-17 15:24:04 -05:00
James Hogan
9d3a2feea0
tcg-opc.h: Simplify insn_start def
We already have a TLADDR_ARGS definition, so rearrange the order
slightly and use it in the definition of insn_start, instead of
having an #ifdef.

Backports commit c0e40dbdcc291c85faa289a53be60b7b1b7c7598 from qemu
2018-02-17 15:24:03 -05:00
Richard Henderson
d167379211
tcg/ppc: Prefer mask over andi.
Prefer the instruction that isn't required to modify cr0.

Backports commit 1e1df962e325e18a5188c4814cd1a10215a48f79 from qemu
2018-02-17 15:24:03 -05:00
Richard Henderson
3c3dee3747
tcg/ppc: Revise goto_tb implementation
Restrict the size of code_gen_buffer to 2GB on ppc64, which
lets us assert that everything is reachable with addis+addi
from tb_ret_addr. This lets us use a max of 4 insns for goto_tb
instead of 7.

Emit the indirect branch portion of goto_tb up front, which
means we only have to update two insns to update any link.
With a 64-bit store, we can update the link atomically, which
may be required in future.

Backports commit 5bfd75a35c11dd3aa61c73d0d2cd88137c31519c from qemu
2018-02-17 15:24:03 -05:00
Richard Henderson
13ad21a21f
tcg/ppc: Adjust exit_tb for change in prologue placement
Changing the prologue to the beginning of the code_gen_buffer
changes the direction of the "return" branch. Need to change
the logic to match.

Backports commit 70f897bdc4ce4101ec008317d43090f532bfb07d from qemu
2018-02-17 15:24:03 -05:00
Richard Henderson
bdf667fd4e
tcg: Check for overflow via highwater mark
We currently pre-compute an worst case code size for any TB, which
works out to be 122kB. Since the average TB size is near 1kB, this
wastes quite a lot of storage.

Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.

Backports commit b125f9dc7bd68cd4c57189db4da83b0620b28a72 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson
19a3c7e03f
tcg: Emit prologue to the beginning of code_gen_buffer
By putting the prologue at the end, we risk overwriting the
prologue should our estimate of maximum TB size. Given the
two different placements of the call to tcg_prologue_init,
move the high water mark computation into tcg_prologue_init.

Backports commit 8163b74938d8b7d12e70597c4553dd0dc49443d5 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson
532877a366
tcg: Remove tcg_gen_code_search_pc
It's no longer used, so tidy up everything reached by it.

Backports commit 04fe64000162c45d8974da9ca4d266f8d0e67eb7 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson
a5ac288135
tcg: Remove gen_intermediate_code_pc
It is no longer used, so tidy up everything reached by it.
This includes the gen_opc_* arrays, the search_pc parameter
and the inline gen_intermediate_code_internal functions.

Backports commit 4e5e1215156662b2b153255c49d4640d82c5568b from qemu
2018-02-17 15:23:59 -05:00
Richard Henderson
66de6cc37c
tcg: Save insn data and use it in cpu_restore_state_from_tb
We can now restore state without retranslation.

Backports commit fca8a500d519a56abeaedf8073167a61d3c6b9c4 from qemu
2018-02-17 15:23:59 -05:00
Richard Henderson
1cbd175736
tcg: Pass data argument to restore_state_to_opc
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the latter.

Backports commit bad729e272387de7dbfa3ec4319036552fc6c107 from qemu
2018-02-17 15:23:58 -05:00
Lioncash
b115c5509d
tcg: Add TCG_MAX_INSNS
Adjust all translators to respect it.

Backports commit 190ce7fbc79fd0883a6170d7f30da59d366e6830 from qemu
2018-02-17 15:23:58 -05:00
Richard Henderson
2c1ae7a408
target-sparc: Remove gen_opc_jump_pc
Since jump_pc[1] is always npc + 4, we can infer after incrementing
that jump_pc[1] == pc + 4. Because of that, we can encode the branch
destination into a single word, and store that in npc.

Backports commit 6c42444f9a53b6af39d46008cb9f650b11e96cb9 from qemu
2018-02-17 15:23:56 -05:00
Richard Henderson
500e116581
target-mips: Add delayed branch state to insn_start
Backports commit c20d594e45bc8c4b21be1a7637cba0f279f72879 from qemu
2018-02-17 15:23:56 -05:00
Aurelien Jarno
b5f5e2dbc2
tcg/mips: pass oi to tcg_out_tlb_load
Instead of computing mem_index and s_bits in both tcg_out_qemu_ld and
tcg_out_qemu_st function and passing them to tcg_out_tlb_load, directly
pass oi to the tcg_out_tlb_load function and compute mem_index and
s_bits there.

Backports commit 81dfaf1a8f7f95259801da9732472f879023ef77 from qemu
2018-02-17 15:23:54 -05:00
Peter Crosthwaite
2b15db6e12
tcg: split tcg_op_defs to -common
tcg_op_defs (and the _max) are both needed by the TCI disassembler. For
multi-arch, tcg.c will be multiple-compiled (arch-obj) with its symbols
hidden from common code. So split the definition off to new file,
tcg-common.c which will remain a regular obj-y for use by both the TCI
disas as well as the multiple tcg.c's.

Backports commit 7d8f787d9d261d6880b69e35ed682241e3f9242f from qemu
2018-02-17 15:23:51 -05:00
Pavel Dovgalyuk
6cdaaf9b1b
softmmu: add helper function to pass through retaddr
This patch introduces several helpers to pass return address
which points to the TB. Correct return address allows correct
restoring of the guest PC and icount. These functions should be used when
helpers embedded into TB invoke memory operations.

Backports commit 282dffc8a4bfe8724548cabb8a26698bde0a6e18 from qemu
2018-02-17 15:23:38 -05:00
Aurelien Jarno
11cfddad05
tcg/i386: use softmmu fast path for unaligned accesses
Softmmu unaligned load/stores currently goes through through the slow
path for two reasons:
  - to support unaligned access on host with strict alignement
  - to correctly handle accesses crossing pages

x86 is only concerned by the second reason. Unaligned accesses are
avoided by compilers, but are not uncommon. We therefore would like
to see them going through the fast path, if they don't cross pages.

For that we can use the fact that two adjacent TLB entries can't contain
the same page. Therefore accessing the TLB entry corresponding to the
first byte, but comparing its content to page address of the last byte
ensures that we don't cross pages. We can do this check without adding
more instructions in the TLB code (but increasing its length by one
byte) by using the LEA instruction to combine the existing move with the
size addition.

On an x86-64 host, this gives a 3% boot time improvement for a powerpc
guest and 4% for an x86-64 guest.

Backports commit 8cc580f6a0d8c0e2f590c1472cf5cd8e51761760 from qemu
2018-02-17 15:23:33 -05:00
Laurent Vivier
ea2ee48d9c
s390: fix softmmu compilation
guest_base must be used only in linux-user mode.

Backports commit 090d0bfd948343d522cd20bc634105b5cfe2483b from qemu
2018-02-17 15:23:32 -05:00
James Hogan
dba4828444
tcg/mips: Fix clobbering of qemu_ld inputs
The MIPS TCG backend implements qemu_ld with 64-bit targets using the v0
register (base) as a temporary to load the upper half of the QEMU TLB
comparator (see line 5 below), however this happens before the input
address is used (line 8 to mask off the low bits for the TLB
comparison, and line 12 to add the host-guest offset). If the input
address (addrl) also happens to have been placed in v0 (as in the second
column below), it gets clobbered before it is used.

addrl in t2 addrl in v0

1 srl a0,t2,0x7 srl a0,v0,0x7
2 andi a0,a0,0x1fe0 andi a0,a0,0x1fe0
3 addu a0,a0,s0 addu a0,a0,s0
4 lw at,9136(a0) lw at,9136(a0) set TCG_TMP0 (at)
5 lw v0,9140(a0) lw v0,9140(a0) set base (v0)
6 li t9,-4093 li t9,-4093
7 lw a0,9160(a0) lw a0,9160(a0) set addend (a0)
8 and t9,t9,t2 and t9,t9,v0 use addrl
9 bne at,t9,0x836d8c8 bne at,t9,0x836d838 use TCG_TMP0
10 nop nop
11 bne v0,t8,0x836d8c8 bne v0,a1,0x836d838 use base
12 addu v0,a0,t2 addu v0,a0,v0 use addrl, addend
13 lw t0,0(v0) lw t0,0(v0)

Fix by using TCG_TMP0 (at) as the temporary instead of v0 (base),
pushing the load on line 5 forward into the delay slot of the low
comparison (line 10). The early load of the addend on line 7 also needs
pushing even further for 64-bit targets, or it will clobber a0 before
we're done with it. The output for 32-bit targets is unaffected.

srl a0,v0,0x7
andi a0,a0,0x1fe0
addu a0,a0,s0
lw at,9136(a0)
-lw v0,9140(a0) load high comparator
li t9,-4093
-lw a0,9160(a0) load addend
and t9,t9,v0
bne at,t9,0x836d838
- nop
+ lw at,9140(a0) load high comparator
+lw a0,9160(a0) load addend
-bne v0,a1,0x836d838
+bne at,a1,0x836d838
addu v0,a0,v0
lw t0,0(v0)

Backports commit 33fca8589cf2aa7bf91564e6a8f26b3ba0910541 from qemu
2018-02-17 15:23:24 -05:00
Aurelien Jarno
45927edecf
tcg/mips: fix add2
The add2 code in the tcg_out_addsub2 function doesn't take into account
the case where rl == al == bl. In that case we can't compute the carry
after the addition. As it corresponds to a multiplication by 2, the
carry bit is the bit 31.

While this is a corner case, this prevents x86-64 guests to boot on a
MIPS host.

Backports commit c99d69694af4ed15b33e3f7c2e3ef6972c14358d from qemu
2018-02-17 15:23:23 -05:00
Aurelien Jarno
4e68b4167d
tcg/s390x: Mask TCGMemOp appropriately for indexing
Commit 2b7ec66f fixed TCGMemOp masking following the MO_AMASK addition,
but two cases were forgotten in the TCG S390 backend.

Backports commit 3c8691f568f49bf623dcb2850464d4156d95e61b from qemu
2018-02-17 15:23:23 -05:00
Aurelien Jarno
096d1a975d
tcg/mips: Mask TCGMemOp appropriately for indexing
Commit 2b7ec66f fixed TCGMemOp masking following the MO_AMASK addition,
but two cases were forgotten in the TCG MIPS backend.

Backports commit 4214a8cb7c15ec43d4b2a43ebf248b273a0f4d45 from qemu
2018-02-17 15:23:23 -05:00
Aurelien Jarno
8396601082
tcg/mips: fix TLB loading for BE host with 32-bit guests
For 32-bit guest, we load a 32-bit address from the TLB, so there is no
need to compensate for the low or high part. This fixes 32-bit guests on
big-endian hosts.

Backports commit e72c4fb81db52be881c9356f1c60e0a7817d2d32 from qemu
2018-02-17 15:23:23 -05:00
Aurelien Jarno
ba73fd9162
tcg/s390: fix branch target change during code retranslation
Make sure to not modify the branch target. This ensure that the
branch target is not corrupted during partial retranslation.

Backports commit cd3b29b745b0ff393b2d37317837bc726b8dacc8 from qemu
2018-02-17 15:23:17 -05:00
Peter Crosthwaite
a591219ad6
cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg
The usages of this define are pure TCG and there is no architecture
specific variation of the value. Localise it to the TCG engine to
remove another architecture agnostic piece from cpu-defs.h.

This follows on from a28177820a868eafda8fab007561cc19f41941f4 where
temp_buf was moved out of the CPU_COMMON obsoleting the need for
the super early definition.

Backports commit 6e0b07306d1793e8402dd218d2e38a7377b5fc27 from qemu
2018-02-17 15:23:15 -05:00
Paolo Bonzini
b34c233c2f
tcg: add TCG_TARGET_TLB_DISPLACEMENT_BITS
This will be used to size the TLB when more than 8 MMU modes are
used by the target. Limitations come from the limited size of
the immediate fields (which sometimes, as in the case of Aarch64,
extend to instructions that shift the immediate).

Backports commit 006f8638c62bca2b0caf609485f47fa5e14d8a3c from qemu
2018-02-13 08:28:29 -05:00
Peter Crosthwaite
3501c34344
tcg: Delete unused cpu_pc_from_tb()
No code uses the cpu_pc_from_tb() function. Delete from tricore and
arm which each provide an unused implementation. Update the comment
in tcg.h to reflect that this is obsoleted by synchronize_from_tb.

Backports commit fee068e4f190a36ef3bda9aa7c802f90434ef8e5 from qemu
2018-02-12 21:14:22 -05:00
Richard Henderson
3f9502dc8b
tcg: Allow extra data to be attached to insn_start
With an eye toward having this data replace the gen_opc_* arrays
that each target collects in order to enable restore_state_from_tb.

Backports commit 9aef40ed1f6e2bd794bbb3ba8c8b773e506334c9 from qemu
2018-02-11 13:03:51 -05:00
Lioncash
b3f9ff667b
tcg: Rename debug_insn_start to insn_start
With an eye toward making it mandatory.

Backports commit 765b842adec4c5a359e69ca08785553599f71496 from qemu
2018-02-11 12:34:01 -05:00
Lioncash
fb2fe4580f
optimize: Add missing extrh/extrl case 2018-02-11 02:57:55 -05:00
Richard Henderson
352f93a119
tcg/aarch64: Fix tcg_out_qemu_{ld, st} for guest_base == 0
In ffc6372851d8631a9f9fa56ec613b3244dc635b9, we swapped the guest
base to the address base register from the address index register.
Except that 31 in the base slot is SP not XZR, so we need to be
more intelligent about which reg gets placed in which slot.

Backports commit 352bcb0a2b816ff9ab9d75d0f2384650d9e9ab19 from qemu
2018-02-10 23:33:24 -05:00
Richard Henderson
7d57c2e4ce
tcg/aarch64: Use softmmu fast path for unaligned accesses
Backports commit 9ee14902bf107e37fb2c8119fa7bca424396237c from qemu
2018-02-10 23:25:34 -05:00
Richard Henderson
aaf89ed84d
tcg/s390: Use softmmu fast path for unaligned accesses
Backports commit a5e39810b9088b5d20fac8e0293f281e1c8b608f from qemu
2018-02-10 23:14:14 -05:00
Richard Henderson
a3aaf5a864
tcg: Remove tcg_gen_trunc_i64_i32
Replacing it with tcg_gen_extrl_i64_i32.

Backports commit ecc7b3aa71f5fdcf9ee87e74ca811d988282641d from qemu
2018-02-10 23:11:02 -05:00
Richard Henderson
58e939b91f
tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32
Rather than allow arbitrary shift+trunc, only concern ourselves
with low and high parts. This is all that was being used anyway.

Backports commit 609ad70562793937257c89d07bf7c1370b9fc9aa from qemu
2018-02-10 23:00:45 -05:00
Aurelien Jarno
a05256b206
tcg: update README about size changing ops
Backports commit 870ad1547ac53bc79c21d86cf453b3b20cc660a2 from qemu
2018-02-10 22:49:36 -05:00
Aurelien Jarno
4bd3d5005e
tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops
They behave the same as ext32s_i64 and ext32u_i64 from the constant
folding and zero propagation point of view, except that they can't
be replaced by a mov, so we don't compute the affected value.

Backports commit 8bcb5c8f34f9215d4f88f388c7ff14c9bd5cecd3 from qemu
2018-02-10 22:47:26 -05:00
Aurelien Jarno
f279c93768
tcg: implement real ext_i32_i64 and extu_i32_i64 ops
Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a
32-bit value is always converted to a 64-bit value and not propagated
through the register allocator or the optimizer.

Backports commit 4f2331e5b67af8172419eb1c8db510b497b30a7b from qemu
2018-02-10 22:45:13 -05:00
Aurelien Jarno
80223e7ad5
tcg: rename trunc_shr_i32 into trunc_shr_i64_i32
The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32,
and the name in the README doesn't match the name offered to the
frontends.

Always use the long name to make it clear it is a size changing op.

Backports commit 0632e555fc4d281d69cb08d98d500d96185b041f from qemu
2018-02-10 22:29:30 -05:00
Aurelien Jarno
5f0920ad0f
tcg/optimize: allow constant to have copies
Now that copies and constants are tracked separately, we can allow
constant to have copies, deferring the choice to use a register or a
constant to the register allocation pass. This prevent this kind of
regular constant reloading:

-OUT: [size=338]
+OUT: [size=298]
   mov    -0x4(%r14),%ebp
   test   %ebp,%ebp
   jne    0x7ffbe9cb0ed6
   mov    $0x40002219f8,%rbp
   mov    %rbp,(%r14)
-  mov    $0x40002219f8,%rbp
   mov    $0x4000221a20,%rbx
   mov    %rbp,(%rbx)
   mov    $0x4000000000,%rbp
   mov    %rbp,(%r14)
-  mov    $0x4000000000,%rbp
   mov    $0x4000221d38,%rbx
   mov    %rbp,(%rbx)
   mov    $0x40002221a8,%rbp
   mov    %rbp,(%r14)
-  mov    $0x40002221a8,%rbp
   mov    $0x4000221d40,%rbx
   mov    %rbp,(%rbx)
   mov    $0x4000019170,%rbp
   mov    %rbp,(%r14)
-  mov    $0x4000019170,%rbp
   mov    $0x4000221d48,%rbx
   mov    %rbp,(%rbx)
   mov    $0x40000049ee,%rbp
   mov    %rbp,0x80(%r14)
   mov    %r14,%rdi
   callq  0x7ffbe99924d0
   mov    $0x4000001680,%rbp
   mov    %rbp,0x30(%r14)
   mov    0x10(%r14),%rbp
   mov    $0x4000001680,%rbp
   mov    %rbp,0x30(%r14)
   mov    0x10(%r14),%rbp
   shl    $0x20,%rbp
   mov    (%r14),%rbx
   mov    %ebx,%ebx
   mov    %rbx,(%r14)
   or     %rbx,%rbp
   mov    %rbp,0x10(%r14)
   mov    %rbp,0x90(%r14)
   mov    0x60(%r14),%rbx
   mov    %rbx,0x38(%r14)
   mov    0x28(%r14),%rbx
   mov    $0x4000220e60,%r12
   mov    %rbx,(%r12)
   mov    $0x40002219c8,%rbx
   mov    %rbp,(%rbx)
   mov    0x20(%r14),%rbp
   sub    $0x8,%rbp
   mov    $0x4000004a16,%rbx
   mov    %rbx,0x0(%rbp)
   mov    %rbp,0x20(%r14)
   mov    $0x19,%ebp
   mov    %ebp,0xa8(%r14)
   mov    $0x4000015110,%rbp
   mov    %rbp,0x80(%r14)
   xor    %eax,%eax
   jmpq   0x7ffbebcae426
   lea    -0x5f6d72a(%rip),%rax        # 0x7ffbe3d437b3
   jmpq   0x7ffbebcae426

Backports commit 299f80130401153af1a6ddb3cc011781bcd47600 from qemu
2018-02-10 22:18:03 -05:00
Aurelien Jarno
59909fe549
tcg/optimize: track const/copy status separately
Instead of using an enum which could be either a copy or a const, track
them separately. This will be used in the next patch.

Constants are tracked through a bool. Copies are tracked by initializing
temp's next_copy and prev_copy to itself, allowing to simplify the code
a bit.

Backports commit b41059dd9deec367a4ccd296659f0bc5de2dc705 from qemu
2018-02-10 22:15:43 -05:00
Aurelien Jarno
134a7dfe82
tcg/optimize: add temp_is_const and temp_is_copy functions
Add two accessor functions temp_is_const and temp_is_copy, to make the
code more readable and make code change easier.

Backports commit d9c769c60948815ee03b2684b1c1c68ee4375149 from qemu
2018-02-10 22:07:02 -05:00
Aurelien Jarno
b450b79622
tcg/optimize: optimize temps tracking
The tcg_temp_info structure uses 24 bytes per temp. Now that we emulate
vector registers on most guests, it's not uncommon to have more than 100
used temps. This means we have initialize more than 2kB at least twice
per TB, often more when there is a few goto_tb.

Instead used a TCGTempSet bit array to track which temps are in used in
the current basic block. This means there are only around 16 bytes to
initialize.

This improves the boot time of a MIPS guest on an x86-64 host by around
7% and moves out tcg_optimize from the the top of the profiler list.

Backports commit 1208d7dd5fddc1fbd98de800d17429b4e5578848 from qemu
2018-02-10 21:51:46 -05:00
Aurelien Jarno
5f67ab74e7
tcg/optimize: fix constant signedness
By convention, on a 64-bit host TCG internally stores 32-bit constants
as sign-extended. This is not the case in the optimizer when a 32-bit
constant is folded.

This doesn't seem to have more consequences than suboptimal code
generation. For instance the x86 backend assumes sign-extended constants,
and in some rare cases uses a 32-bit unsigned immediate 0xffffffff
instead of a 8-bit signed immediate 0xff for the constant -1. This is
with a ppc guest:

before
------

 ---- 0x9f29cc
 movi_i32 tmp1,$0xffffffff
 movi_i32 tmp2,$0x0
 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
 mov_i32 r10,tmp0

0x7fd8c7dfe90c:  xor    %ebp,%ebp
0x7fd8c7dfe90e:  mov    %ebp,%r11d
0x7fd8c7dfe911:  mov    0x18(%r14),%r9d
0x7fd8c7dfe915:  add    %r9d,%r10d
0x7fd8c7dfe918:  adc    %ebp,%r11d
0x7fd8c7dfe91b:  add    $0xffffffff,%r10d
0x7fd8c7dfe922:  adc    %ebp,%r11d
0x7fd8c7dfe925:  mov    %r11d,0x134(%r14)
0x7fd8c7dfe92c:  mov    %r10d,0x28(%r14)

after
-----

 ---- 0x9f29cc
 movi_i32 tmp1,$0xffffffffffffffff
 movi_i32 tmp2,$0x0
 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
 mov_i32 r10,tmp0

0x7f37010d490c:  xor    %ebp,%ebp
0x7f37010d490e:  mov    %ebp,%r11d
0x7f37010d4911:  mov    0x18(%r14),%r9d
0x7f37010d4915:  add    %r9d,%r10d
0x7f37010d4918:  adc    %ebp,%r11d
0x7f37010d491b:  add    $0xffffffffffffffff,%r10d
0x7f37010d491f:  adc    %ebp,%r11d
0x7f37010d4922:  mov    %r11d,0x134(%r14)
0x7f37010d4929:  mov    %r10d,0x28(%r14)

Backports commit 29f3ff8d6cbc28f79933aeaa25805408d0984a8f from qemu
2018-02-10 21:40:20 -05:00
Aurelien Jarno
e273acf87a
tcg/optimize: fix tcg_opt_gen_movi
Due to a copy&paste, the new op value is tested against mov_i32 instead
of movi_i32. The test is therefore always false. Fix that.

Backports commit 961521261a3d600b0695b2e6d2b0f490076f7e90 from qemu
2018-02-10 21:38:09 -05:00
Aurelien Jarno
42dd2addbe
tcg/optimize: rename tcg_constant_folding
The tcg_constant_folding folding ends up doing all the optimizations
(which is a good thing to avoid looping on all ops multiple time), so
make it clear and just rename it tcg_optimize.

Backports commit 36e60ef6ac5d8a262d0fbeedfdb2b588514cb1ea from qemu
2018-02-10 21:36:34 -05:00