Commit Graph

7024 Commits

Author SHA1 Message Date
Peter Maydell
3eddb77327 target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree
Convert the fp-compare-with-zero insns in the Neon 2-reg-misc group to
decodetree.

Backports commit baa59323e841f76523f6ad4d746cdeb47ea574cd from qemu
2021-02-25 12:59:22 -05:00
Peter Maydell
6eb852ec1c target/arm: Convert simple fp Neon 2-reg-misc insns
Convert the Neon 2-reg-misc insns which are implemented with
simple calls to functions that take the input, output and
fpstatus pointer.

Backports commit 3e96b205286dfb8bbf363229709e4f8648fce379 from qemu
2021-02-25 12:56:28 -05:00
Peter Maydell
3dcee11013 target/arm: Convert Neon VQABS, VQNEG to decodetree
Convert the Neon VQABS and VQNEG insns to decodetree.
Since these are the only ones which need cpu_env passing to
the helper, we wrap the helper rather than creating a whole
new do_2misc_env() function.

Backports commit 4936f38abe6db0a9d23fd04e4cb0cf4d51cff174 from qemu
2021-02-25 12:53:18 -05:00
Peter Maydell
4033a3ca5c target/arm: Convert remaining simple 2-reg-misc Neon ops
Convert the remaining ops in the Neon 2-reg-misc group which
can be implemented simply with our do_2misc() helper.

Backports commit 84eae770af69c37a92496a4c4248875c070d5ee3 from qemu
2021-02-25 12:50:55 -05:00
Peter Maydell
88f8111500 target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
Convert the VREV32 and VREV16 insns in the Neon 2-reg-misc group
to decodetree.

Backports commit 8966808205b59d6c196b380b638475bcd1657ef4 from qemu
2021-02-25 12:49:16 -05:00
Peter Maydell
db1e503708 target/arm: Make gen_swap_half() take separate src and dest
Make gen_swap_half() take a source and destination TCGv_i32 rather
than modifying the input TCGv_i32; we're going to want to be able to
use it with the more flexible function signature, and this also
brings it into line with other functions like gen_rev16() and
gen_revsh().

Backports commit 8ec3de7018a8198624aae49eef5568256114a829 from qemu
2021-02-25 12:40:23 -05:00
Peter Maydell
3c1289c594 target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs
All the other typedefs like these spell "Op" with a lowercase 'p';
remane the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
match.

Backports commit 5de3fd045be11b74cd0fbf36c6d4fb8387d5463b from qemu
2021-02-25 12:38:30 -05:00
Peter Maydell
fa6727ebba target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).

Backports commit 039f4e809ad2772fb33de4511ff68a485d875618 from qemu
2021-02-25 12:34:51 -05:00
Peter Maydell
27e74962e5 target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
Convert the Neon-2-reg misc crypto ops (AESE, AESMC, SHA1H, SHA1SU1)
to decodetree.

Backports commit 0b30dd5b85e20aba259768cb7aaa952b3e319468 from qemu
2021-02-25 12:32:39 -05:00
Peter Maydell
4354448f57 target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
Convert to decodetree the insns in the Neon 2-reg-misc grouping which
we implement using gvec.

Backports commit 75153179e9928775d5333243ea4b278f438d75ae from qemu
2021-02-25 12:28:31 -05:00
Peter Maydell
6301f9acaa target/arm: Convert Neon VCVT f16/f32 insns to decodetree
Convert the Neon insns in the 2-reg-misc group which are
VCVT between f32 and f16 to decodetree.

Backports commit 654a517355e249435505ae5ff14a7520410cf7a4 from qemu
2021-02-25 12:25:32 -05:00
Peter Maydell
4ca33c54a2 target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
Convert the VSHLL insn in the 2-reg-misc Neon group to decodetree.

Backports commit 749e2be36d75f11d5fa8f8277e2a0569bd2a1c97 from qemu
2021-02-25 12:20:57 -05:00
Peter Maydell
48d57d0dc7 target/arm: Convert Neon narrowing moves to decodetree
Convert the Neon narrowing moves VMQNV, VQMOVN, VQMOVUN in the 2-reg-misc
group to decodetree.

Backports commit 3882bdacb0ad548864b9f2582a32bb5c785e3165 from qemu
2021-02-25 12:18:01 -05:00
Peter Maydell
35d8a3e83f target/arm: Convert VZIP, VUZP to decodetree
Convert the Neon VZIP and VUZP insns in the 2-reg-misc group to
decodetree.

Backports commit 567663a2af2457da8aa74f221b1f3f8a6d2eddf6 from qemu
2021-02-25 12:14:29 -05:00
Peter Maydell
d21fae82ba target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping
to decodetree.

At this point we can get rid of the weird CPU_V001 #define that was
used to avoid having to explicitly list all the arguments being
passed to some TCG gen/helper functions.

Backports commit 6106af3aa2304fccee91a3a90138352b0c2af998 from qemu
2021-02-25 12:12:11 -05:00
Peter Maydell
505923e676 target/arm: Convert Neon 2-reg-misc VREV64 to decodetree
Convert the Neon VREV64 insn from the 2-reg-misc grouping to decodetree.

Backports commit 353d2b85058711a5e44c2dc63eb5b620db50a602 from qemu
2021-02-25 12:07:06 -05:00
Alistair Francis
e1f49dc888 target/riscv: Implement checks for hfence
Call the helper_hyp_tlb_flush() function on hfence instructions which
will generate an illegal insruction execption if we don't have
permission to flush the Hypervisor level TLBs.

Backports commit 2761db5fc20943bbd606b6fd49640ac000398de6 from qemu
2021-02-25 12:03:57 -05:00
Alistair Francis
8eb8bc290f target/riscv: Move the hfence instructions to the rvh decode
Also correct the name of the VVMA instruction.

Backports commit b8429ded723ec52568e05f6a24ed78c93224687c from qemu
2021-02-25 11:59:49 -05:00
Alistair Francis
39ff690eff target/riscv: Report errors validating 2nd-stage PTEs
Backports commit 88914473e748db20d8e18b9735f647a683319fa6 from qemu
2021-02-25 11:55:53 -05:00
Alistair Francis
a6c323c912 target/riscv: Set access as data_load when validating stage-2 PTEs
Backports commit efe9f9c820d1322729957a60ff785c9527a79ddf from qemu
2021-02-25 11:54:31 -05:00
Ian Jiang
5c3a2f391c riscv: Add helper to make NaN-boxing for FP register
The function that makes NaN-boxing when a 32-bit value is assigned
to a 64-bit FP register is split out to a helper gen_nanbox_fpr().
Then it is applied in translating of the FLW instruction.

Backports commit 354908cee1f7ff761b5fedbdb6376c378c10f941 from qemu
2021-02-25 11:53:27 -05:00
LC
e5725839b9
Merge pull request #29 from MerryMage/master
arm/translate: Do not tracecode when in an IT block
2021-02-07 16:05:14 -05:00
MerryMage
92243aefd4 arm/translate: Do not tracecode when in an IT block 2021-02-07 19:14:32 +00:00
LC
43ec6383ef
Merge pull request #28 from sunho/fix-until
unicorn: fix uc_emu_start until if end instruction is in another tlb
2020-08-05 12:23:38 -04:00
Sunho Kim
d56c79776e unicorn: fix uc_emu_start until if end instruction is in another tlb 2020-08-05 03:18:51 +09:00
LC
cf84b2ecac
Merge pull request #26 from MerryMage/vec_internal
arm: Add missing file vec_internal.h
2020-06-19 21:09:28 -04:00
MerryMage
9ac17104b8 arm: Add missing file vec_internal.h
Missing from commit 1df7314dc3.

Ported from qemu a04b68e1d4c4f0cd5cd7542697b1b230b84532f5.
2020-06-20 00:12:09 +01:00
Philippe Mathieu-Daudé
4465ff9c93 fpu/softfloat: Silence 'bitwise negation of boolean expression' warning
When building with clang version 10.0.0-4ubuntu1, we get:

CC lm32-softmmu/fpu/softfloat.o
fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

fpu/softfloat.c:3423:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...

fpu/softfloat.c:4273:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
zSig1 &= ~ ( ( zSig2 + zSig2 == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix by rewriting the fishy bitwise AND of two bools as an int.

Backports commit 4066288694c3bdd175df813cad675a3b5191956b from qemu
2020-06-18 23:56:27 -04:00
Peter Maydell
709610e606 target/arm: Convert Neon VDUP (scalar) to decodetree
Convert the Neon VDUP (scalar) insn to decodetree. (Note that we
can't call this just "VDUP" as we used that already in vfp.decode for
the "VDUP (general purpose register" insn.)

Backports commit 9aaa23c2ae18e6fb9a291b81baf91341db76dfa0 from qemu
2020-06-17 00:43:19 -04:00
Peter Maydell
8de8a4500a target/arm: Convert Neon VTBL, VTBX to decodetree
Convert the Neon VTBL, VTBX instructions to decodetree. The actual
implementation of the insn is copied across to the new trans function
unchanged except for renaming 'tmp5' to 'tmp4'.

Backports commit 54e96c744b70a5d19f14b212a579dd3be8fcaad9 from qemu
2020-06-17 00:39:27 -04:00
Peter Maydell
4731a69d66 target/arm: Convert Neon VEXT to decodetree
Convert the Neon VEXT insn to decodetree. Rather than keeping the
old implementation which used fixed temporaries cpu_V0 and cpu_V1
and did the extraction with by-hand shift and logic ops, we use
the TCG extract2 insn.

We don't need to special case 0 or 8 immediates any more as the
optimizer is smart enough to throw away the dead code.

Backports commit 0aad761fb0aed40c99039eacac470cbd03d07019 from qemu
2020-06-17 00:29:04 -04:00
Peter Maydell
1aa9046120 target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree
Convert the Neon 2-reg-scalar long multiplies to decodetree.
These are the last instructions in the group.

Backports commit 77e576a9281825fc170f3b3af83f47e110549b5c from qemu
2020-06-17 00:24:12 -04:00
Peter Maydell
088a1e8ba9 target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree
Convert the VQRDMLAH and VQRDMLSH insns in the 2-reg-scalar
group to decodetree.

Backports commit aa318f5b9b4ab3b6744b5305dd8ae9b96676f20e from qemu
2020-06-17 00:15:18 -04:00
Peter Maydell
c0551804d4 target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree
Convert the VQDMULH and VQRDMULH insns in the 2-reg-scalar group
to decodetree.

Backports commit b2fc7be972b94872f6a6dd32d9bda1b88ddbcaad from qemu
2020-06-17 00:11:56 -04:00
Peter Maydell
2e8ae1130e target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree
Convert the float versions of VMLA, VMLS and VMUL in the Neon
2-reg-scalar group to decodetree.

Backports commit 85ac9aef9a5418de3168df569e21258e853840a2 from qemu
2020-06-17 00:09:32 -04:00
Peter Maydell
bf1b0374b9 target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree
Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
scalar" group to decodetree. These are 32x32->32 operations where
one of the inputs is the scalar, followed by a possible accumulate
operation of the 32-bit result.

The refactoring removes some of the oddities of the old decoder:
* operands to the operation and accumulation were often
reversed (taking advantage of the fact that most of these ops
are commutative); the new code follows the pseudocode order
* the Q bit in the insn was in a local variable 'u'; in the
new code it is decoded into a->q

Backports commit 96fc80f5f186decd1a649f6c04252faceb057ad2 from qemu
2020-06-17 00:04:29 -04:00
Peter Maydell
1817f28afd target/arm: Add missing TCG temp free in do_2shift_env_64()
In commit 37bfce81b10450071 we accidentally introduced a leak of a TCG
temporary in do_2shift_env_64(); free it.

Backports commit a4f67e180def790ff0bbb33fc93bb6e80382f041 from qemu
2020-06-16 23:57:17 -04:00
Peter Maydell
06dfc2ada6 target/arm: Add 'static' and 'const' annotations to VSHLL function arrays
Mark the arrays of function pointers in trans_VSHLL_S_2sh() and
trans_VSHLL_U_2sh() as both 'static' and 'const'.

Backports commit 448f0e5f3ecfbd089b934e5e3aa0ccd1f51a6174 from qemu
2020-06-16 23:56:30 -04:00
Peter Maydell
6383a2bd15 target/arm: Convert Neon 3-reg-diff polynomial VMULL
Convert the Neon 3-reg-diff insn polynomial VMULL. This is the last
insn in this group to be converted.

Backports commit 18fb58d588898550919392277787979ee7d0d84e from qemu
2020-06-16 23:54:51 -04:00
Peter Maydell
090426b120 target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
Convert the Neon 3-reg-diff insns VQDMULL, VQDMLAL and VQDMLSL:
these are all saturating doubling long multiplies with a possible
accumulate step.

These are the last insns in the group which use the pass-over-each
elements loop, so we can delete that code.

Backports commit 9546ca5998d3cbd98a81b2d46a2e92a11b0f78a4 from qemu
2020-06-16 23:51:56 -04:00
Peter Maydell
5464405d5c target/arm: Convert Neon 3-reg-diff long multiplies
Convert the Neon 3-reg-diff insns VMULL, VMLAL and VMLSL; these perform
a 32x32->64 multiply with possible accumulate.

Note that for VMLSL we do the accumulate directly with a subtraction
rather than doing a negate-then-add as the old code did.

Backports commit 3a1d9eb07b767a7592abca642af80906f9eab0ed from qemu
2020-06-16 23:47:28 -04:00
Peter Maydell
21044a1d11 target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
Convert the Neon 3-reg-diff insns VABAL and VABDL to decodetree.
Like almost all the remaining insns in this group, these are
a combination of a two-input operation which returns a double width
result and then a possible accumulation of that double width
result into the destination.

Backports commit f5b28401200ec95ba89552df3ecdcdc342f6b90b from qemu
2020-06-16 23:41:20 -04:00
Peter Maydell
34418f1998 target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
Convert the narrow-to-high-half insns VADDHN, VSUBHN, VRADDHN,
VRSUBHN in the Neon 3-registers-different-lengths group to
decodetree.

Backports commit 0fa1ab0302badabc3581aefcbb2f189ef52c4985 from qemu
2020-06-16 23:36:18 -04:00
Peter Maydell
d25998ba7d target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
Convert the "pre-widening" insns VADDL, VSUBL, VADDW and VSUBW
in the Neon 3-registers-different-lengths group to decodetree.
These insns work by widening one or both inputs to double their
size, performing an add or subtract at the doubled size and
then storing the double-size result.

As usual, rather than copying the loop of the original decoder
(which needs awkward code to avoid problems when source and
destination registers overlap) we just unroll the two passes.

Backports commit b28be09570d0827969b62b8f82b0f720a9915427 from qemu
2020-06-16 23:29:53 -04:00
Peter Maydell
a9d0e36bcf target/arm: Fix missing temp frees in do_vshll_2sh
The widenfn() in do_vshll_2sh() does not free the input 32-bit
TCGv, so we need to do this in the calling code.

Backports commit 9593a3988c3e788790aa107d778386b09f456a6d from qemu
2020-06-16 23:26:04 -04:00
Thomas Huth
6053203c1c target/i386: Remove obsolete TODO file
The last real change to this file is from 2012, so it is very likely
that this file is completely out-of-date and ignored today. Let's
simply remove it to avoid confusion if someone finds it by accident.

Backports commit 3575b0aea983ad57804c9af739ed8ff7bc168393 from qemu
2020-06-15 13:22:56 -04:00
Joseph Myers
18b0ae9ebd target/i386: correct fix for pcmpxstrx substring search
This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
/ pcmpestrm / pcmpistri / pcmpistrm substring search, commit
ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.

That commit fixed a bug that showed up in four GCC tests with one libc
implementation. The tests in question generate random inputs to the
intrinsics and compare results to a C implementation, but they only
test 1024 possible random inputs, and when the tests use the cases of
those instructions that work with word rather than byte inputs, it's
easy to have problematic cases that show up much less frequently than
that. Thus, testing with a different libc implementation, and so a
different random number generator, showed up a problem with the
previous patch.

When investigating the previous test failures, I found the description
of these instructions in the Intel manuals (starting from computing a
16x16 or 8x8 set of comparison results) confusing and hard to match up
with the more optimized implementation in QEMU, and referred to AMD
manuals which described the instructions in a different way. Those
AMD descriptions are very explicit that the whole of the string being
searched for must be found in the other operand, not running off the
end of that operand; they say "If the prototype and the SUT are equal
in length, the two strings must be identical for the comparison to be
TRUE.". However, that statement is incorrect.

In my previous commit message, I noted:

The operation in this case is a search for a string (argument d to
the helper) in another string (argument s to the helper); if a copy
of d at a particular position would run off the end of s, the
resulting output bit should be 0 whether or not the strings match in
the region where they overlap, but the QEMU implementation was
wrongly comparing only up to the point where s ends and counting it
as a match if an initial segment of d matched a terminal segment of
s. Here, "run off the end of s" means that some byte of d would
overlap some byte outside of s; thus, if d has zero length, it is
considered to match everywhere, including after the end of s.

The description "some byte of d would overlap some byte outside of s"
is accurate only when understood to refer to overlapping some byte
*within the 16-byte operand* but at or after the zero terminator; it
is valid to run over the end of s if the end of s is the end of the
16-byte operand. So the fix in the previous patch for the case of d
being empty was correct, but the other part of that patch was not
correct (as it never allowed partial matches even at the end of the
16-byte operand). Nor was the code before the previous patch correct
for the case of d nonempty, as it would always have allowed partial
matches at the end of s.

Fix with a partial revert of my previous change, combined with
inserting a check for the special case of s having maximum length to
determine where it is necessary to check for matches.

In the added test, test 1 is for the case of empty strings, which
failed before my 2017 patch, test 2 is for the bug introduced by my
2017 patch and test 3 deals with the case where a match of an initial
segment at the end of the string is not valid when the string ends
before the end of the 16-byte operand (that is, the case that would be
broken by a simple revert of the non-empty-string part of my 2017
patch).

Backports commit bc921b2711c4e2e8ab99a3045f6c0f134a93b535 from qemu
2020-06-15 13:20:48 -04:00
Joseph Myers
e79024e0cf target/i386: fix IEEE x87 floating-point exception raising
Most x87 instruction implementations fail to raise the expected IEEE
floating-point exceptions because they do nothing to convert the
exception state from the softfloat machinery into the exception flags
in the x87 status word. There is special-case handling of division to
raise the divide-by-zero exception, but that handling is itself buggy:
it raises the exception in inappropriate cases (inf / 0 and nan / 0,
which should not raise any exceptions, and 0 / 0, which should raise
"invalid" instead).

Fix this by converting the floating-point exceptions raised during an
operation by the softfloat machinery into exceptions in the x87 status
word (passing through the existing fpu_set_exception function for
handling related to trapping exceptions). There are special cases
where some functions convert to integer internally but exceptions from
that conversion are not always correct exceptions for the instruction
to raise.

There might be scope for some simplification if the softfloat
exception state either could always be assumed to be in sync with the
state in the status word, or could always be ignored at the start of
each instruction and just set to 0 then; I haven't looked into that in
detail, and it might run into interactions with the various ways the
emulation does not yet handle trapping exceptions properly. I think
the approach taken here, of saving the softfloat state, setting
exceptions there to 0 and then merging the old exceptions back in
after carrying out the operation, is conservatively safe

Backports commit 975af797f1e04e4d1b1a12f1731141d3770fdbce from qemu
2020-06-15 13:19:27 -04:00
Joseph Myers
cb50df6aae target/i386: fix fisttpl, fisttpll handling of out-of-range values
The fist / fistt family of instructions should all store the most
negative integer in the destination format when the rounded /
truncated integer result is out of range or the input is an invalid
encoding, infinity or NaN. The fisttpl and fisttpll implementations
(32-bit and 64-bit results, truncate towards zero) failed to do this,
producing the most positive integer in some cases instead. Fix this
by copying the code used to handle this issue for fistpl and fistpll,
adjusted to use the _round_to_zero functions for the actual
conversion (but without any other changes to that code).

Backports commit c8af85b10c818709755f5dc8061c69920611fd4c from qemu
2020-06-15 13:10:23 -04:00
Joseph Myers
ceaa77e576 target/i386: fix fbstp handling of out-of-range values
The fbstp implementation fails to check for out-of-range and invalid
values, instead just taking the result of conversion to int64_t and
storing its sign and low 18 decimal digits. Fix this by checking for
an out-of-range result (invalid conversions always result in INT64_MAX
or INT64_MIN from the softfloat code, which are large enough to be
considered as out-of-range by this code) and storing the packed BCD
indefinite encoding in that case.

Backports commit 374ff4d0a3c2cce2bc6e4ba8a77eaba55c165252 from qemu
2020-06-15 13:09:23 -04:00