Commit Graph

153 Commits

Author SHA1 Message Date
Richard Henderson
8436080518
target/arm: Implement FCVT (scalar, integer) for fp16
Backports commit 564a0632504fad840491aa9a59453f4e64a316c4 from qemu
2018-05-15 22:06:49 -04:00
Richard Henderson
75643ab1cf
target/arm: Early exit after unallocated_encoding in disas_fp_int_conv
No sense in emitting code after the exception.

Backports commit 8c738d430796edeae5e13d6daf0895c02c62bd54 from qemu
2018-05-15 21:55:42 -04:00
Richard Henderson
bcaceb9bc7
target/arm: Implement FMOV (general) for fp16
Adding the fp16 moves to/from general registers.

Backports commit 68130236e30a1ec64363f4915349feee181bfbc1 from qemu
2018-05-15 21:54:32 -04:00
Richard Henderson
5902f32abf
target/arm: Clear SVE high bits for FMOV
Use write_fp_dreg and clear_vec_high to zero the bits
that need zeroing for these cases.

Backports commit 9a9f1f59521f46e8ff4527d9a2b52f83577e2aa3 from qemu
2018-05-14 08:43:55 -04:00
Richard Henderson
67740bbc7f
target/arm: Fix float16 to/from int16
The instruction "ucvtf v0.4h, v04h, #2", with input 0x8000u,
overflows the intermediate float16 to infinity before we have a
chance to scale the output. Use float64 as the intermediate type
so that no input argument (uint32_t in this case) can overflow
or round before scaling. Given the declared argument, the signed
int32_t function has the same problem.

When converting from float16 to integer, using u/int32_t instead
of u/int16_t means that the bounding is incorrect.

Backports commit 88808a022c06f98d81cd3f2d105a5734c5614839 from qemu
2018-05-14 08:41:20 -04:00
Richard Henderson
e403957a5e
target/arm: Implement vector shifted FCVT for fp16
While we have some of the scalar paths for FCVT for fp16,
we failed to decode the fp16 version of these instructions.

Backports commit d0ba8e74acd299b092786ffc30b306638d395a9e from qemu
2018-05-14 08:36:54 -04:00
Richard Henderson
ad6c191d96
target/arm: Implement vector shifted SCVF/UCVF for fp16
While we have some of the scalar paths for *CVF for fp16,
we failed to decode the fp16 version of these instructions.

Backports commit a6117fae4576edfe7a5a5b802a742c33112c0993 from qemu
2018-05-14 08:31:29 -04:00
Richard Henderson
688d0fd0ed
target/arm: Implement CAS and CASP
Backports commit 44ac14b06fa33f60982923b6b8a3bf8dd2fea61d from qemu
2018-05-14 08:28:45 -04:00
Richard Henderson
b23c543e1a
target/arm: Fill in disas_ldst_atomic
This implements all of the v8.1-Atomics instructions except
for compare-and-swap, which is decoded elsewhere.

Backports commit 74608ea45434c9b07055b21885e093528c5ed98c from qemu
2018-05-14 08:18:37 -04:00
Richard Henderson
7ae8671b5e
target/arm: Introduce ARM_FEATURE_V8_ATOMICS and initial decode
The insns in the ARMv8.1-Atomics are added to the existing
load/store exclusive and load/store reg opcode spaces.
Rearrange the top-level decoders for these to accomodate.
The Atomics insns themselves still generate Unallocated.

Backports commit 68412d2ecedbab5a43b0d346cddb27e00d724aff from qemu
2018-05-14 08:15:52 -04:00
Richard Henderson
b2af557a0f
target/arm: Use new min/max expanders
The generic expanders replace nearly identical code in the translator.

Backports commit ecb8ab8d71aab770555a6972428b711400a27248 from qemu
2018-05-14 07:34:52 -04:00
Emilio G. Cota
d26bf1d446
translator: merge max_insns into DisasContextBase
While at it, use int for both num_insns and max_insns to make
sure we have same-type comparisons.

Backports commit b542683d77b4f56cef0221b267c341616d87bce9 from qemu
2018-05-11 13:59:17 -04:00
Richard Henderson
5940a36394
target/arm: Tidy condition in disas_simd_two_reg_misc
Path analysis shows that size == 3 && !is_q has been eliminated.

Fixes: Coverity CID1385853

Backports commit a8766e3172c1671cab297c1ef4566a3c5d094822 from qemu
2018-05-08 08:26:31 -04:00
Richard Henderson
cb324fd039
target/arm: Tidy conditions in handle_vec_simd_shri
The (size > 3 && !is_q) condition is identical to the preceeding test
of bit 3 in immh; eliminate it. For the benefit of Coverity, assert
that size is within the bounds we expect.

Fixes: Coverity CID1385846
Fixes: Coverity CID1385849
Fixes: Coverity CID1385852
Fixes: Coverity CID1385857

Backports commit 8dae46970532afcf93470b00e83ca9921980efc3 from qemu
2018-05-08 08:25:37 -04:00
Peter Maydell
7a3ee5fd95
target/arm: Honour MDCR_EL2.TDE when routing exceptions due to BKPT/BRK
The MDCR_EL2.TDE bit allows the exception level targeted by debug
exceptions to be set to EL2 for code executing at EL0. We handle
this in the arm_debug_target_el() function, but this is only used for
hardware breakpoint and watchpoint exceptions, not for the exception
generated when the guest executes an AArch32 BKPT or AArch64 BRK
instruction. We don't have enough information for a translate-time
equivalent of arm_debug_target_el(), so instead make BKPT and BRK
call a special purpose helper which can do the routing, rather than
the generic exception_with_syndrome helper.

Backports commit c900a2e62dd6dde11c8f5249b638caad05bb15be from qemu
2018-03-25 16:33:04 -04:00
Victor Kamensky
ecd2ecb590
arm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT
In OE project 4.15 linux kernel boot hang was observed under
single cpu aarch64 qemu. Kernel code was in a loop waiting for
vtimer arrival, spinning in TC generated blocks, while interrupt
was pending unprocessed. This happened because when qemu tried to
handle vtimer interrupt target had interrupts disabled, as
result flag indicating TCG exit, cpu->icount_decr.u16.high,
was cleared but arm_cpu_exec_interrupt function did not call
arm_cpu_do_interrupt to process interrupt. Later when target
reenabled interrupts, it happened without exit into main loop, so
following code that waited for result of interrupt execution
run in infinite loop.

To solve the problem instructions that operate on CPU sys state
(i.e enable/disable interrupt), and marked as DISAS_UPDATE,
should be considered as DISAS_EXIT variant, and should be
forced to exit back to main loop so qemu will have a chance
processing pending CPU state updates, including pending
interrupts.

This change brings consistency with how DISAS_UPDATE is treated
in aarch32 case.

Backports commit a75a52d62418dafe462be4fe30485501d1010bb9 from qemu
2018-03-25 16:27:27 -04:00
Lioncash
0dd13de42f
target/arm/translate-a64: Correct bad merge 2018-03-12 11:17:33 -04:00
Richard Henderson
abd86b2287
target/arm: Decode aa64 armv8.3 fcmla
Backports commit d17b7cdcf4ea3e858ceee8b86fc8544bb71561e6 from qemu

Also remember to commit vec_helper.
2018-03-09 01:05:02 -05:00
Richard Henderson
4b39a36416
target/arm: Decode aa64 armv8.3 fcadd
Backports commit 1695cd61b08d4376c11e0658836c4f08b4fc3aa1 from qemu
2018-03-09 00:58:37 -05:00
Richard Henderson
152c9484bd
target/arm: Decode aa64 armv8.1 scalar/vector x indexed element
Backports commit d345df7a3f1336ceb0537c1fa0a7261030426768 from qemu
2018-03-09 00:12:00 -05:00
Lioncash
12fd2cc113
target/arm: Decode aa64 armv8.1 three same extra 2018-03-09 00:10:09 -05:00
Richard Henderson
4f585f71fb
target/arm: Decode aa64 armv8.1 scalar three same extra
Backports commit d9061ec3d27eb940402a7eafee3fb77ce1146ad4 from qemu
2018-03-09 00:02:23 -05:00
Richard Henderson
774cbded7a
target/arm: Refactor disas_simd_indexed size checks
The integer size check was already outside of the opcode switch;
move the floating-point size check outside as well. Unify the
size vs index adjustment between fp and integer paths.

Backports commit 449f264b1749ac0e59c58bbc2eacdb3dc302c2bf from qemu
2018-03-08 23:53:39 -05:00
Richard Henderson
1fd2644738
target/arm: Refactor disas_simd_indexed decode
Include the U bit in the switches rather than testing separately.

Backports commit 5f81b1de43259ed0969e62a7419ab9dd9da2c5c0 from qemu
2018-03-08 23:44:03 -05:00
Alex Bennée
6e41113897
arm/translate-a64: add all single op FP16 to handle_fp_1src_half
This includes FMOV, FABS, FNEG, FSQRT and FRINT[NPMZAXI]. We re-use
existing helpers to achieve this.

Backports commit c2c08713a6a5846bbe601d4d1b4f9708ba77efdc from qemu
2018-03-08 23:44:02 -05:00
Alex Bennée
c6c8a1cccc
arm/translate-a64: implement simd_scalar_three_reg_same_fp16
This covers the encoding group:

Advanced SIMD scalar three same FP16

As all the helpers are already there it is simply a case of calling the
existing helpers in the scalar context.

Backports commit 7c93b7741b29b3ffda81a6e9525771b4409db99f from qemu
2018-03-08 23:44:02 -05:00
Alex Bennée
dd29452046
arm/translate-a64: add all FP16 ops in simd_scalar_pairwise
I only needed to do a little light re-factoring to support the
half-precision helpers.

Backports commit 5c36d89567cfd049a7c59ff219639f788225068f from qemu
2018-03-08 23:44:02 -05:00
Alex Bennée
8bbabd7eb3
arm/translate-a64: add FP16 FMOV to simd_mod_imm
Only one half-precision instruction has been added to this group.

Backports commit 70b4e6a445715519ae55179dc54f6e961ab30c27 from qemu
2018-03-08 23:43:52 -05:00
Alex Bennée
b117df18df
arm/translate-a64: add FP16 FRSQRTE to simd_two_reg_misc_fp16
Backports commit c625ff95070e3ef96bd007de744e1d97c881efeb from qemu
2018-03-08 22:45:39 -05:00
Alex Bennée
fdb07713e6
arm/translate-a64: add FP16 FSQRT to simd_two_reg_misc_fp16
Backports commit b96a54c7e5576bd35b7d00d37b7929d2892d8cac from qemu
2018-03-08 21:57:35 -05:00
Alex Bennée
6102a61b14
arm/translate-a64: add FP16 FRCPX to simd_two_reg_misc_fp16
We go with the localised helper.

Backports commit 986950283837f697b35782b9ac3bc99fca614640 from qemu
2018-03-08 19:15:23 -05:00
Alex Bennée
4ea310c131
arm/translate-a64: add FP16 FRECPE
Now we have added f16 during the re-factoring we can simply call the
helper.

Backports commit fbd06e1e4b6566b4d727f9e553c819d034942f68 from qemu
2018-03-08 19:12:06 -05:00
Alex Bennée
c590ff441c
arm/translate-a64: add FP16 FNEG/FABS to simd_two_reg_misc_fp16
Neither of these operations alter the floating point status registers
so we can do a pure bitwise operation, either squashing any sign
bit (ABS) or inverting it (NEG).

Backports commit 15f8a233c8c023dbc77b6fe6cd7c79eac9bee263 from qemu
2018-03-08 18:51:35 -05:00
Alex Bennée
7161c1ed52
arm/translate-a64: add FP16 SCVTF/UCVFT to simd_two_reg_misc_fp16 2018-03-08 18:48:25 -05:00
Alex Bennée
8ac9e3cff2
arm/translate-a64: add FP16 FCMxx (zero) to simd_two_reg_misc_fp16
I re-use the existing handle_2misc_fcmp_zero handler and tweak it
slightly to deal with the half-precision case.

Backports commit 7d4dd1a73a023f75c893623710e43743501b318e from qemu
2018-03-08 18:32:36 -05:00
Alex Bennée
39a68548d1
arm/translate-a64: add FCVTxx to simd_two_reg_misc_fp16
This covers all the floating point convert operations.

Backports commit 2df581304193d70eaf0d22cf4cb4613f74b6e59b from qemu
2018-03-08 18:25:29 -05:00
Alex Bennée
d5f002b39a
arm/translate-a64: add FP16 FPRINTx to simd_two_reg_misc_fp16
This adds the full range of half-precision floating point to integral
instructions.

Backports commit 6109aea2d954891027acba64a13f1f1c7463cfac from qemu
2018-03-08 18:21:58 -05:00
Alex Bennée
33eda0f5d4
arm/translate-a64: initial decode for simd_two_reg_misc_fp16
This actually covers two different sections of the encoding table:

Advanced SIMD scalar two-register miscellaneous FP16
Advanced SIMD two-register miscellaneous (FP16)

The difference between the two is covered by a combination of Q (bit
30) and S (bit 28). Notably the FRINTx instructions are only
available in the vector form.

This is just the decode skeleton which will be filled out by later
patches.

Backports commit 5d432be6fd6efe37833ac82623c3abd35117b421 from qemu
2018-03-08 18:14:04 -05:00
Alex Bennée
82ffaab7de
arm/translate-a64: add FP16 x2 ops for simd_indexed
A bunch of the vectorised bitwise operations just operate on larger
chunks at a time. We can do the same for the new half-precision
operations by introducing some TWOHALFOP helpers which work on each
half of a pair of half-precision operations at once.

Hopefully all this hoop jumping will get simpler once we have
generically vectorised helpers here.

Backports commit 6089030c7322d8f96b54fb9904e53b0f464bb8fe from qemu
2018-03-08 18:08:39 -05:00
Alex Bennée
38815b2901
arm/translate-a64: add FP16 FMULX/MLS/FMLA to simd_indexed
The helpers use the new re-factored muladd support in SoftFloat for
the float16 work.

Backports commit 5d265064cf30daaacce5a4ce9945fc573015fb5f from qemu
2018-03-08 15:56:20 -05:00
Alex Bennée
c6fda07628
arm/translate-a64: add FP16 pairwise ops simd_three_reg_same_fp16
This includes FMAXNMP, FADDP, FMAXP, FMINNMP, FMINP.

Backports commit 7a2c6e618156674cf9eac8bf36e79f674fbf974e from qemu
2018-03-08 15:50:56 -05:00
Alex Bennée
4b2577537b
arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16
As some of the constants here will also be needed
elsewhere (specifically for the upcoming SVE support) we move them out
to softfloat.h.

Backports commit 026e2d6ef74000afb9049f46add4b94f594c8fb3 from qemu
2018-03-08 15:47:34 -05:00
Alex Bennée
a02b9b81a9
arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16
Backports commit 2deb992b767d28035fac3b374c7730494ff0b43d from qemu

Also backports the fp16 changes introduced in commit f566c0474a9b9bbd9ed248607e4007e24d3358c0
2018-03-08 15:42:48 -05:00
Alex Bennée
ba8df54753
arm/translate-a64: add FP16 F[A]C[EQ/GE/GT] to simd_three_reg_same_fp16
These use the generic float16_compare functionality which in turn uses
the common float_compare code from the softfloat re-factor.

Backports commit d32adeae1a71a8e71374fa48d3d6ab0ad4c23e94 from qemu
2018-03-08 12:59:37 -05:00
Alex Bennée
4a6a41d2c5
arm/translate-a64: add FP16 FADD/FABD/FSUB/FMUL/FDIV to simd_three_reg_same_fp16
The fprintf is only there for debugging as the skeleton is added to,
it will be removed once the skeleton is complete.

Backports commit 372087348d561e7f4051d7b32609bda417092ddf from qemu
2018-03-08 12:56:15 -05:00
Alex Bennée
2f850606e9
arm/translate-a64: initial decode for simd_three_reg_same_fp16
This is the initial decode skeleton for the Advanced SIMD three same
instruction group.

The fprintf is purely to aid debugging as the additional instructions
are added. It will be removed once the group is complete.

Backports commit 376e8d6cda985df31c8561db4b7ea365b6fe6f87 from qemu
2018-03-08 12:53:23 -05:00
Alex Bennée
fe74abd307
arm/translate-a64: handle_3same_64 comment fix
We do implement all the opcodes.

Backports commit 3840d219b433507f04a685120ff770ce4e06c55d from qemu
2018-03-08 12:51:01 -05:00
Alex Bennée
af75074fe7
arm/translate-a64: implement half-precision F(MIN|MAX)(V|NMV)
This implements the half-precision variants of the across vector
reduction operations. This involves a re-factor of the reduction code
which more closely matches the ARM ARM order (and handles 8 element
reductions).

Backports commit 807cdd504283c11addcd7ea95ba594bbddc86fe4 from qemu
2018-03-08 12:49:30 -05:00
Alex Bennée
27d8d01566
target/arm/helper: pass explicit fpst to set_rmode
As the rounding mode is now split between FP16 and the rest of
floating point we need to be explicit when tweaking it. Instead of
passing the CPU env we now pass the appropriate fpst pointer directly.

Backports commit 9b04991686785e18b18a36d193b68f08f7c91648 from qemu
2018-03-08 12:41:54 -05:00
Alex Bennée
996f38056f
target/arm/cpu.h: add additional float_status flags
Half-precision flush to zero behaviour is controlled by a separate
FZ16 bit in the FPCR. To handle this we pass a pointer to
fp_status_fp16 when working on half-precision operations. The value of
the presented FPCR is calculated from an amalgam of the two when read.

Backports commit d81ce0ef2c4f1052fcdef891a12499eca3084db7 from qemu
2018-03-08 12:34:39 -05:00
Richard Henderson
1f71084740
target/arm: Handle SVE registers when using clear_vec_high
When storing to an AdvSIMD FP register, all of the high
bits of the SVE register are zeroed. Therefore, call it
more often with is_q as a parameter.

Backports commit 4ff55bcb0ee6452b768835f86d94bd727185f812 from qemu
2018-03-08 09:32:33 -05:00
Richard Henderson
07b928eca4
target/arm: Enforce access to ZCR_EL at translation
This also makes sure that we get the correct ordering of
SVE vs FP exceptions.

Backports commit 490aa7f13a2ad31f92205879c4dc2387b602ef14 from qemu
2018-03-08 09:17:33 -05:00
Richard Henderson
d5c4d3e3c3
target/arm: Enforce FP access to FPCR/FPSR
Backports commit fe03d45f9e9baa89e8c4da50de771767d5d48990 from qemu
2018-03-08 09:14:52 -05:00
Richard Henderson
02516c53ff
target/arm: Add SVE state to TB->FLAGS
Add both SVE exception state and vector length.

Backports commit 1db5e96c54d8b3d1df0a6fed6771390be6b010da from qemu
2018-03-07 11:44:32 -05:00
Richard Henderson
834e3a1d04
target/arm: Expand vector registers for SVE
Change vfp.regs as a uint64_t to vfp.zregs as an ARMVectorReg.
The previous patches have made the change in representation
relatively painless.

Backports commit c39c2b9043ec59516c80f2c6f3e8193e99d04d4b from qemu
2018-03-07 11:33:49 -05:00
Ard Biesheuvel
85e6d710e4
target/arm: implement SM4 instructions
This implements emulation of the new SM4 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.

Backports commit b6577bcd251ca0d57ae1de149e3c706b38f21587 from qemu
2018-03-07 08:57:53 -05:00
Ard Biesheuvel
78d15a9cd0
target/arm: implement SM3 instructions
This implements emulation of the new SM3 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.

Backports commit 80d6f4c6bbb718f343a832df8dee15329cc7686c from qemu
2018-03-07 08:53:47 -05:00
Ard Biesheuvel
72078a7674
target/arm: implement SHA-3 instructions
This implements emulation of the new SHA-3 instructions that have
been added as an optional extensions to the ARMv8 Crypto Extensions
in ARM v8.2.

Backports commit cd270ade74ea86467f393a9fb9c54c4f1148c28f from qemu
2018-03-07 08:44:47 -05:00
Ard Biesheuvel
66b8b01f09
target/arm: implement SHA-3 instructions
This implements emulation of the new SHA-3 instructions that have
been added as an optional extensions to the ARMv8 Crypto Extensions
in ARM v8.2.

Backports commit cd270ade74ea86467f393a9fb9c54c4f1148c28f from qemu
2018-03-07 08:41:40 -05:00
Ard Biesheuvel
0ef74f6d6d
target/arm: implement SHA-512 instructions
This implements emulation of the new SHA-512 instructions that have
been added as an optional extensions to the ARMv8 Crypto Extensions
in ARM v8.2.

Backports commit 90b827d131812d7f0a8abb13dba1942a2bcee821 from qemu
2018-03-07 08:39:49 -05:00
Richard Henderson
16a0a3e156
target/arm: Use vector infrastructure for aa64 orr/bic immediate
Backports commit 064e265d5680e5c605d6ee8370fc1e8da094e66d from qemu
2018-03-06 16:17:42 -05:00
Richard Henderson
c5c8488928
target/arm: Use vector infrastructure for aa64 multiplies
Backports commit 0c7c55c492c918b6275baa3fee8b176c31465e3c from qemu
2018-03-06 16:14:47 -05:00
Richard Henderson
955fec9300
target/arm: Use vector infrastructure for aa64 compares
Backports commit 79d61de6bdc3980f0efef85f7539e129ab8a4a40 from qemu
2018-03-06 16:10:10 -05:00
Richard Henderson
5626cb714e
target/arm: Use vector infrastructure for aa64 constant shifts
Backports commit cdb45a6063feb5efbb5896795e791dd3db006fbb from qemu
2018-03-06 16:10:10 -05:00
Richard Henderson
a63be5a7aa
target/arm: Use vector infrastructure for aa64 dup/movi
Backports commit 861a1ded24917843b9a5a99ea0a6b37c2c9a1930 from qemu
2018-03-06 16:10:10 -05:00
Richard Henderson
84f848d876
target/arm: Use vector infrastructure for aa64 mov/not/neg
Backports commit 377ef731a85773788ae328e638698d27691bd77d from qemu
2018-03-06 16:10:10 -05:00
Richard Henderson
03b76589b4
target/arm: Use vector infrastructure for aa64 add/sub/logic
Backports commit bc48092f5865c20893bb19200a7a320feac99eeb from qemu
2018-03-06 16:10:10 -05:00
Richard Henderson
b0578edcf7
target/arm: Use pointers in crypto helpers
Rather than passing regnos to the helpers, pass pointers to the
vector registers directly. This eliminates the need to pass in
the environment pointer and reduces the number of places that
directly access env->vfp.regs[].

Backports commit 1a66ac61af45af04688d1d15896737310e366c06 from qemu
2018-03-06 10:10:06 -05:00
Richard Henderson
7fe5f620df
tcg: Dynamically allocate TCGOps
With no fixed array allocation, we can't overflow a buffer.
This will be important as optimizations related to host vectors
may expand the number of ops used.

Use QTAILQ to link the ops together.

Backports commit 15fa08f8451babc88d733bd411d4c94976f9d0f8 from qemu
2018-03-05 16:34:40 -05:00
Richard Henderson
5f074f09ab
tcg: Remove TCGV_UNUSED* and TCGV_IS_UNUSED*
These are now trivial sets and tests against NULL. Unwrap.

Backports commit f764718d0cb30af9f1f8e1d6a33622cc05ca4155 from qemu
2018-03-05 15:58:15 -05:00
Emilio G. Cota
208014df9e
arm/translate-a64: mark path as unreachable to eliminate warning
Fixes the following warning when compiling with gcc 5.4.0 with -O1
optimizations and --enable-debug:

target/arm/translate-a64.c: In function ‘aarch64_tr_translate_insn’:
target/arm/translate-a64.c:2361:8: error: ‘post_index’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (!post_index) {
^
target/arm/translate-a64.c:2307:10: note: ‘post_index’ was declared here
bool post_index;
^
target/arm/translate-a64.c:2386:8: error: ‘writeback’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (writeback) {
^
target/arm/translate-a64.c:2308:10: note: ‘writeback’ was declared here
bool writeback;
^

Note that idx comes from selecting 2 bits, and therefore its value
can be at most 3.

Backports commit 5ca66278c859bb1ded243755aeead2be6992ce73 from qemu
2018-03-05 11:40:11 -05:00
Stefano Stabellini
1212c9b73c
fix WFI/WFE length in syndrome register
WFI/E are often, but not always, 4 bytes long. When they are, we need to
set ARM_EL_IL_SHIFT in the syndrome register.

Pass the instruction length to HELPER(wfi), use it to decrement pc
appropriately and to pass an is_16bit flag to syn_wfx, which sets
ARM_EL_IL_SHIFT if needed.

Set dc->insn in both arm_tr_translate_insn and thumb_tr_translate_insn.

Backports commit 58803318e5a546b2eb0efd7a053ed36b6c29ae6f from qemu
2018-03-05 11:21:51 -05:00
Emilio G. Cota
5fae6dd433
tcg: remove addr argument from lookup_tb_ptr
It is unlikely that we will ever want to call this helper passing
an argument other than the current PC. So just remove the argument,
and use the pc we already get from cpu_get_tb_cpu_state.

This change paves the way to having a common "tb_lookup" function.

Backports commit 7f11636dbee89b0e4d03e9e2b96e14649a7db778 from qemu
2018-03-05 02:16:34 -05:00
Peter Maydell
f0569ba11a
target/arm: Remove out of date ARM ARM section references in A64 decoder
In the A64 decoder, we have a lot of references to section numbers
from version A.a of the v8A ARM ARM (DDI0487). This version of the
document is now long obsolete (we are currently on revision B.a),
and various intervening versions renumbered all the sections.

The most recent B.a version of the document doesn't assign
section numbers at all to the individual instruction classes
in the way that the various A.x versions did. The simplest thing
to do is just to delete all the out of date C.x.x references.

Backports commit 4ce31af4aeb8471f6a913de7c59d3bde1fc4f03d from qemu
2018-03-05 01:05:53 -05:00
Richard Henderson
c5e952978c
target/arm: Avoid an extra temporary for store_exclusive
Instead of copying addr to a local temp, reuse the value (which we
have just compared as equal) already saved in cpu_exclusive_addr.

Backports commit 37e29a64254bf82a1901784fcca17c25f8164c2f from qemu
2018-03-04 23:17:50 -05:00
Jaroslaw Pelczar
7fded6c15c
AArch64: Fix single stepping of ERET instruction
Previously when single stepping through ERET instruction via GDB
would result in debugger entering the "next" PC after ERET instruction.
When debugging in kernel mode, this will also cause unintended behavior,
because debugger will try to access memory from EL0 point of view.

Backports commit dddbba9943ef6a81c8702e4a50cb0a8b1a4201fe from qemu
2018-03-04 23:15:30 -05:00
Richard Henderson
dd36ec2bbf
target/arm: [a64] Move page and ss checks to init_disas_context
Since AArch64 uses a fixed-width ISA, we can pre-compute the number of
insns remaining on the page. Also, we can check for single-step once.

Backports commit dcc3a21209a8eeae0fe43966012f8e08d3566f98 from qemu
2018-03-04 20:32:45 -05:00
Lluís Vilanova
74d437827b
target/arm: [tcg] Port to generic translation framework
Backports commit 2316922420da6fd0d1ffb5557d0cdcc5958bcf44 from qemu
2018-03-04 20:28:06 -05:00
Lluís Vilanova
cc00feb2df
target/arm: [tcg,a64] Port to disas_log
Incrementally paves the way towards using the generic instruction translation
loop.

Backports commit 58350fa4b2852fede96cfebad0b26bf79bca419c from qemu
2018-03-04 20:09:39 -05:00
Lluís Vilanova
7a02cb360c
target/arm: [tcg,a64] Port to tb_stop
Incrementally paves the way towards using the generic instruction translation
loop.

Backports commit be4079641f1bc755fc5d3ff194cf505c506227d8 from qemu
2018-03-04 20:02:45 -05:00
Lluís Vilanova
665192d96f
target/arm: [tcg,a64] Port to translate_insn
Incrementally paves the way towards using the generic instruction translation
loop.

Backports commit 24299c892cbfe29120f051b6b7d0bcf3e0cc8e85 from qemu
2018-03-04 19:47:54 -05:00
Lluís Vilanova
7b89c4c813
target/arm: [tcg,a64] Port to breakpoint_check
Incrementally paves the way towards using the generic instruction translation
loop.

Backports commit 0cb56b373da70047979b61b042f59aaff4012e1b from qemu
2018-03-04 19:34:06 -05:00
Lluís Vilanova
67e0d99080
target/arm: [tcg,a64] Port to insn_start
Incrementally paves the way towards using the generic instruction translation
loop.

Backports commit a68956ad7f8510bdc0b54793c65c62c6a94570a4 from qemu
2018-03-04 19:31:22 -05:00
Lluís Vilanova
529c6c17f1
target/arm: [tcg,a64] Port to init_disas_context
Incrementally paves the way towards using the generic instruction translation
loop.

Backports commit 5c03990665aa9095e4d2734c8ca0f936a8e8f000 from qemu
2018-03-04 19:17:09 -05:00
Lluís Vilanova
8581e6f6fe
target/arm: [tcg] Port to DisasContextBase
Incrementally paves the way towards using the generic
instruction translation loop.

Backports commit dcba3a8d443842f7a30a2c52d50a6b50b6982b35 from qemu
2018-03-04 19:00:06 -05:00
Paolo Bonzini
6997a5a090
gen-icount: check cflags instead of use_icount global
Backports commit cd42d5b23691ad73edfd6dbcfc935a960a9c5a65 from qemu
2018-03-04 14:26:26 -05:00
Richard Henderson
4a5b1aec34
target/arm: Use DISAS_NORETURN
Fold DISAS_EXC and DISAS_TB_JUMP into DISAS_NORETURN.

In both cases all following code is dead. In the first
case because we have exited the TB via exception; in the
second case because we have exited the TB via goto_tb
and its associated machinery.

Backports commit a0c231e651b249960906f250b8e5eef5ed9888c4 from qemu
2018-03-04 13:57:18 -05:00
Alistair Francis
5d742aad0b
target/arm: Require alignment for load exclusive
According to the ARM ARM exclusive loads require the same alignment as
exclusive stores. Let's update the memops used for the load to match
that of the store. This adds the alignment requirement to the memops.

Backports commit 4a2fdb78e794c1ad93aa9e160235d6a61a2125de from qemu
2018-03-04 01:53:04 -05:00
Richard Henderson
4a8f556c29
target/arm: Correct load exclusive pair atomicity
We are not providing the required single-copy atomic semantics for
the 64-bit operation that is the 32-bit paired load.

At the same time, leave the entire 64-bit value in cpu_exclusive_val
and stop writing to cpu_exclusive_high. This means that we do not
have to re-assemble the 64-bit quantity when it comes time to store.

At the same time, drop a redundant temporary and perform all loads
directly into the cpu_exclusive_* globals.

Backports commit 19514cde3b92938df750acaecf2caaa85e1d36a6 from qemu
2018-03-04 01:49:35 -05:00
Alistair Francis
009a52dd13
target/arm: Correct exclusive store cmpxchg memop mask
When we perform the atomic_cmpxchg operation we want to perform the
operation on a pair of 32-bit registers. Previously we were just passing
the register size in which was set to MO_32. This would result in the
high register to be ignored. To fix this issue we hardcode the size to
be 64-bits long when operating on 32-bit pairs.

Backports commit 955fd0ad5d610f62ba2f4ce46a872bf50434dcf8 from qemu
2018-03-04 01:43:55 -05:00
Lluís Vilanova
32b3c3815d
tcg: Pass generic CPUState to gen_intermediate_code()
Needed to implement a target-agnostic gen_intermediate_code()
in the future.

Backports commit 9c489ea6bed134fecfd556b439c68bba48fbe102 from qemu
2018-03-03 23:34:18 -05:00
Alex Bennée
0bd8dc4e0a
target/arm: use DISAS_EXIT for eret handling
Previously DISAS_JUMP did ensure this but with the optimisation of
8a6b28c7 (optimize indirect branches) we might not leave the loop.
This means if any pending interrupts are cleared by changing IRQ flags
we might never get around to servicing them. You usually notice this
by seeing the lookup_tb_ptr() helper gainfully chaining TBs together
while cpu->interrupt_request remains high and the exit_request has not
been set.

This breaks amongst other things the OPTEE test suite which executes
an eret from the secure world after a non-secure world IRQ has gone
pending which then never gets serviced.

Instead of using the previously implied semantics of DISAS_JUMP we use
DISAS_EXIT which will always exit the run-loop.

Backports commit b29fd33db578decacd14f34933b29aece3e7c25e from qemu
2018-03-03 22:43:16 -05:00
Alex Bennée
65356210a8
target/arm: use gen_goto_tb for ISB handling
While an ISB will ensure any raised IRQs happen on the next
instruction it doesn't cause any to get raised by itself. We can
therefore use a simple tb exit for ISB instructions and rely on the
exit_request check at the top of each TB to deal with exiting if
needed.

Backports commit 0b609cc128ba5ef16cc841bcade898d1898f1dc3 from qemu
2018-03-03 22:42:33 -05:00
Alex Bennée
63d40e1a55
target/arm/translate: make DISAS_UPDATE match declared semantics
DISAS_UPDATE should be used when the wider CPU state other than just
the PC has been updated and we should therefore exit the TCG runtime
and return to the main execution loop rather assuming DISAS_JUMP would
do that.

Backports commit e8d5230221851e8933811f1579fd13371f576955 from qemu
2018-03-03 22:38:07 -05:00
Richard Henderson
42bb73fa96
target/arm: Exit after clearing aarch64 interrupt mask
Exit to cpu loop so we reevaluate cpu_arm_hw_interrupts.

Backports commit 8da54b2507c1cabf60c2de904cf0383b23239231 from qemu
2018-03-03 17:19:40 -05:00
Emilio G. Cota
baa0983ae3
target/aarch64: optimize indirect branches
Measurements:

[Baseline performance is that before applying this and the previous commit]

- NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz

1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
| |
| cross |
1.6x +cross+jr.................................................####...................................................+-+
| #++# |
| # # |
1.5x +-+...................................................*****..#...................................................+-+
| *+++* # |
| * * # |
1.4x +-+...................................................*...*..#...................................................+-+
| * * # |
| ##### * * # |
1.3x +-+................................****+++#...........*...*..#...................................................+-+
| *++* # * * # |
| * * # * * # |
1.2x +-+................................*..*...#...........*...*..#...................................................+-+
| * * # * * # |
| #### * * # * * # |
1.1x +-+.......................+++#..#..*..*...#...........*...*..#...................................................+-+
| **** # * * # * * # ****#### |
| * * # * * # * * # ****### +++#### ****### * * # |
1x +-++-++++++-++++****###++-*++*++#++*++*+-+#++****+++++*+++*++#++*++*-+#++*****++#++****###-++*++*-+#++*+-*+++#+-++-+
| *****### * * # * * # * * # *++*### * * # * * # * * # * *++# * * # * * # |
| * *++# * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # |
0.9x +-+---*****###--****###---****###--****####--****###--*****###--****###--*****###--****###---****###--****####---+-+
ASSIGNMENT BITFIELD FOURFP EMULATION HUFFMAN LU DECOMPOSITIONNEURAL NUMERIC SORSTRING SORT hmean
png: http://imgur.com/qO9ubtk
NB. cross here represents the previous commit.

- SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

1.5x +-+--------------------------------------------------------------------------------------------------------------+-+
| ***** |
| *+++* jr |
| * * |
1.4x +-+.....................................................................*...*.....................+++............+-+
| * * | |
| ***** * * | |
| * * * * ***** |
1.3x +-+....................................*...*............................*...*....................*.|.*...........+-+
| +++ * * * * * | * |
| ***** * * * * *+++* |
| * * * * * * * * |
1.2x +-+....................*...*...........*...*............................*...*...........*****....*...*...........+-+
| ***** * * * * * * * * * * +++ |
| * * * * * * * * * * * * ***** |
| * * * * ***** * * * * * * * * * * |
1.1x +-+...*...*............*...*...*...*...*...*............................*...*....+++....*...*....*...*...*...*...+-+
| * * * * * * * * * * ***** * * * * * * |
| * * * * * * * * ***** * * * * * * * * * * |
| * * ***** * * * * * * * * ****** * * * * * * * * * * |
1x +-++-+*+++*-++*+++*++++*+-+*+++*-++*+++*-++*+++*+++*++-*++++*-++*****+++*++-*+++*++-*+++*+-+*++++*+++*++-*+++*+-++-+
| * * * * * * * * * * * * * * *+++* * * * * * * * * * * |
| * * * * * * * * * * * * * * * * * * * * * * * * * * |
| * * * * * * * * * * * * * * * * * * * * * * * * * * |
0.9x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
astar bzip2 gcc gobmk h264ref hmmlibquantum mcf omnetpperlbench sjengxalancbmk hmean
png: http://imgur.com/3Dp4vvq

- SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
| |
| jr |
1.6x +-+...............................................................................................+++............+-+
| ***** |
| *+++* |
| * * |
1.5x +-+..............................................................................................*...*...........+-+
| +++ * * |
| ***** * * |
1.4x +-+.....................................................................*+++*....................*...*...........+-+
| * * * * |
| ***** * * * * |
| * * * * ***** * * |
1.3x +-+....................................*...*............................*...*...*...*............*...*...........+-+
| +++ * * * * * * * * |
| ***** * * * * * * ***** * * |
1.2x +-+....................*...*...........*...*............................*...*...*...*...*+++*....*...*...*****...+-+
| * * * * * * * * * * * * *+++* |
| ***** * * ***** * * * * * * * * * * * * |
| * * * * *+++* * * * * * * * * * * * * |
1.1x +-+...*...*............*...*...*...*...*...*............................*...*...*...*...*...*....*...*...*...*...+-+
| * * ***** * * * * * * ***** * * * * * * * * * * |
| * * * * * * * * * * +++ ****** *+++* * * * * * * * * * * |
1x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
astar bzip2 gcc gobmk h264ref hmmlibquantum mcf omnetpperlbench sjengxalancbmk hmean
png: http://imgur.com/vRrdc9j

Backports commit e75449a346bf558296966a44277bfd93412c6da6 from qemu
2018-03-03 14:22:12 -05:00
Emilio G. Cota
83ea5b72f2
target/aarch64: optimize cross-page direct jumps in softmmu
Perf numbers in next commit's log.

Backports commit e78722368c721f3c5b8109ed525adac1653ae97b from qemu
2018-03-03 14:20:55 -05:00
Peter Maydell
b7bf752d3c
arm: Add support for M profile CPUs having different MMU index semantics
The M profile CPU's MPU has an awkward corner case which we
would like to implement with a different MMU index.

We can avoid having to bump the number of MMU modes ARM
uses, because some of our existing MMU indexes are only
used by non-M-profile CPUs, so we can borrow one.
To avoid that getting too confusing, clean up the code
to try to keep the two meanings of the index separate.

Instead of ARMMMUIdx enum values being identical to core QEMU
MMU index values, they are now the core index values with some
high bits set. Any particular CPU always uses the same high
bits (so eventually A profile cores and M profile cores will
use different bits). New functions arm_to_core_mmu_idx()
and core_to_arm_mmu_idx() convert between the two.

In general core index values are stored in 'int' types, and
ARM values are stored in ARMMMUIdx types.

Backports commit 8bd5c82030b2cb09d3eef6b444f1620911cc9fc5 from qemu
2018-03-02 18:59:13 -05:00
Richard Henderson
13242af398
target/arm: Fix aa64 ldp register writeback
For "ldp x0, x1, [x0]", if the second load is on a second page and
the second page is unmapped, the exception would be raised with x0
already modified. This means the instruction couldn't be restarted.

Backports commit 2d1bbf51c2cb948da4b6fd5f91cf3ecc80b28156 from qemu
2018-03-02 14:35:46 -05:00
Nick Reilly
4114fb2c0e
Add missing fp_access_check() to aarch64 crypto instructions
The aarch64 crypto instructions for AES and SHA are missing the
check for if the FPU is enabled.

Backports commit a4f5c5b72380deeccd53a6890ea3782f10ca8054 from qemu
2018-03-02 10:39:16 -05:00