Commit Graph

268 Commits

Author SHA1 Message Date
Richard Henderson
fcce8d4aa1 target/arm: Convert PMULL.8 to gvec
We still need two different helpers, since NEON and SVE2 get the
inputs from different locations within the source vector. However,
we can convert both to the same internal form for computation.

The sve2 helper is not used yet, but adding it with this patch
helps illustrate why the neon changes are helpful.

Backports commit e7e96fc5ec8c79dc77fef522d5226ac09f684ba5 from qemu
2020-03-21 19:35:46 -04:00
Richard Henderson
c00f72f74f target/arm: Convert PMULL.64 to gvec
The gvec form will be needed for implementing SVE2.

Backports commit b9ed510e46f2f9e31e5e8adb4661d5d1cbe9a459 from qemu
2020-03-21 19:27:38 -04:00
Richard Henderson
db8a935b44 target/arm: Convert PMUL.8 to gvec
The gvec form will be needed for implementing SVE2.

Extend the implementation to operate on uint64_t instead of uint32_t.
Use a counted inner loop instead of terminating when op1 goes to zero,
looking toward the required implementation for ARMv8.4-DIT.

Backports commit a21bb78e5817be3f494922e1dadd6455fe5d6318 from qemu
2020-03-21 19:22:18 -04:00
Richard Henderson
d3139f2f0a target/arm: Vectorize USHL and SSHL
These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift. This
requires several masks and selects in addition to the actual
shifts to form the complete answer.

That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.

Backports commit 87b74e8b6edd287ea2160caa0ebea725fa8f1ca1 from qemu
2020-03-21 19:14:17 -04:00
Richard Henderson
12b4e01d9c tcg: Add tcg_gen_gvec_5_ptr
Extend the vector generator infrastructure to handle
5 vector arguments.

Backports commit 2445971604c1cfd3ec484457159f4ac300fb04d2 from qemu
2020-03-21 16:54:01 -04:00
Richard Henderson
d6150127b4 target/arm: Add the hypervisor virtual counter
Backports commit 8c94b071a09c2183f032febff3112f2b7662156c from qemu
2020-03-21 15:35:36 -04:00
Beata Michalska
0716794d86 Memory: Enable writeback for given memory region
Add an option to trigger memory writeback to sync given memory region
with the corresponding backing store, case one is available.
This extends the support for persistent memory, allowing syncing on-demand.

Backports commit 61c490e25e081af39ff40556f6c1229b8b011585 from qemu
2020-01-14 07:44:24 -05:00
Beata Michalska
47776dc862 tcg: cputlb: Add probe_read
Add probe_read alongside the write probing equivalent.

Backports commit 9e70492b4389d4355ae9c9ee2ba6286cfdadc257 from qemu
2020-01-14 07:16:41 -05:00
David Hildenbrand
d9d91c1db6 tcg: Factor out probe_write() logic into probe_access()
Let's also allow to probe other access types.

Backports commit c25c283df0f08582df29f1d5d7be1516b851532d from qemu
2020-01-14 07:07:54 -05:00
Richard Henderson
07f30382c0 cputlb: Handle watchpoints via TLB_WATCHPOINT
The raising of exceptions from check_watchpoint, buried inside
of the I/O subsystem, is fundamentally broken. We do not have
the helper return address with which we can unwind guest state.

Replace PHYS_SECTION_WATCH and io_mem_watch with TLB_WATCHPOINT.
Move the call to cpu_check_watchpoint into the cputlb helpers
where we do have the helper return address.

This allows watchpoints on RAM to bypass the full i/o access path.

Backports commit 50b107c5d617eaf93301cef20221312e7a986701 from qemu
2020-01-14 06:58:33 -05:00
Tony Nguyen
da98d0da4e memory: Access MemoryRegion with endianness
Preparation for collapsing the two byte swaps adjust_endianness and
handle_bswap into the former.

Call memory_region_dispatch_{read|write} with endianness encoded into
the "MemOp op" operand.

This patch does not change any behaviour as
memory_region_dispatch_{read|write} is yet to handle the endianness.

Once it does handle endianness, callers with byte swaps can collapse
them into adjust_endianness.

Backports commit d5d680cacc66ef7e3c02c81dc8f3a34eabce6dfe from qemu
2020-01-07 18:54:11 -05:00
Marc Zyngier
868de52f69 target/arm: Handle trapping to EL2 of AArch32 VMRS instructions
HCR_EL2.TID3 requires that AArch32 reads of MVFR[012] are trapped to
EL2, and HCR_EL2.TID0 does the same for reads of FPSID.
In order to handle this, introduce a new TCG helper function that
checks for these control bits before executing the VMRC instruction.

Tested with a hacked-up version of KVM/arm64 that sets the control
bits for 32bit guests.

Backports commit 9ca1d776cb49c09b09579d9edd0447542970c834 from qemu
2020-01-07 18:04:16 -05:00
Peter Maydell
77d90985cc
target/sparc: Switch to do_transaction_failed() hook
Switch the SPARC target from the old unassigned_access hook to the
new do_transaction_failed hook.

This will cause the "if transaction failed" code paths added in
the previous commits to become active if the access is to an
unassigned address. In particular we'll now handle bus errors
during page table walks correctly (generating a translation
error with the right kind of fault status).

Backports commit f8c3db33a5e863291182f8862ddf81618a7c6194 from qemu
2019-11-28 02:56:50 -05:00
Richard Henderson
3d3d56056b
target/arm: Remove helper_double_saturate
Replace x = double_saturate(y) with x = add_saturate(y, y).
There is no need for a separate more specialized helper.

Backports commit 640581a06d14e2d0d3c3ba79b916de6bc43578b0 from qemu
2019-11-18 20:13:21 -05:00
Richard Henderson
2ea6dfbd63
tcg: Add support for vector compare select
Perform a per-element conditional move. This combination operation is
easier to implement on some host vector units than plain cmp+bitsel.
Omit the usual gvec interface, as this is intended to be used by
target-specific gvec expansion call-backs.

Backports commit f75da2988eb2457fa23d006d573220c5c680ec4e from qemu
2019-05-24 18:21:13 -04:00
Richard Henderson
ca58be9cb4
tcg: Add support for vector bitwise select
This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units. Include gvec expanders.

Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu
2019-05-24 18:15:10 -04:00
Richard Henderson
dab0061a0d
tcg: Use CPUClass::tlb_fill in cputlb.c
We can now use the CPUClass hook instead of a named function.

Create a static tlb_fill function to avoid other changes within
cputlb.c. This also isolates the asserts within. Remove the
named tlb_fill function from all of the targets.

Backports commit c319dc13579a92937bffe02ad2c9f1a550e73973 from qemu
2019-05-16 17:35:37 -04:00
Richard Henderson
5d83199931
target/sparc: Convert to CPUClass::tlb_fill
Backports commit e84942f2ceaa79430414f2cb68d77c044dadca96 from qemu
2019-05-16 17:29:35 -04:00
Richard Henderson
31ecdb5341
target/arm: Convert to CPUClass::tlb_fill
Backports commit 7350d553b5066abdc662045d7db5cdb73d0f9d53 from qemu
2019-05-16 16:55:12 -04:00
Richard Henderson
552e48f14e
target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs
Backports commit 4e027a710673f5d4dc6cff88728bcfd32e4c47b0 from qemu
2019-05-16 16:43:02 -04:00
Richard Henderson
6d5e7856ff
tcg: Add support for vector absolute value
Backports commit bcefc90208f8a1d6f619d61c2647281d92277015 from qemu
2019-05-16 16:33:43 -04:00
Richard Henderson
6d1730048d
tcg: Add support for integer absolute value
Remove a function of the same name from target/arm/.
Use a branchless implementation of abs gleaned from gcc.

Backports commit ff1f11f7f8710a768f9313f24bd7f509d3db27e5 from qemu
2019-05-16 16:25:15 -04:00
Richard Henderson
79b9dc559e
tcg: Add gvec expanders for vector shift by scalar
Allow expansion either via shift by scalar or by replicating
the scalar for shift by vector.

Backports commit b4578cd91cda4cef1c413304353ca6dc5b957b60 from qemu
2019-05-16 16:17:58 -04:00
Richard Henderson
8c17687934
tcg: Add gvec expanders for variable shift
The gvec expanders perform a modulo on the shift count. If the target
requires alternate behaviour, then it cannot use the generic gvec
expanders anyway, and will have to have its own custom code.

Backports commit 5ee5c14cacda27e904cd6b0d9e7ffe1acff42838 from qemu
2019-05-16 15:51:09 -04:00
Richard Henderson
66e6bea084
tcg: Add INDEX_op_dupm_vec
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.

Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:

VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.

VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.

Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.

All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.

Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
2019-05-16 15:38:02 -04:00
Richard Henderson
c54b2776f6
tcg: Specify optional vector requirements with a list
Replace the single opcode in .opc with a null-terminated
array in .opt_opc. We still require that all opcodes be
used with the same .vece.

Validate the contents of this list with CONFIG_DEBUG_TCG.
All tcg_gen_*_vec functions will check any list active
during .fniv expansion. Swap the active list in and out
as we expand other opcodes, or take control away from the
front-end function.

Convert all existing vector aware front ends.

Backports commit 53229a7703eeb2bbe101a19a33ef22aaf960c65b from qemu
2019-05-16 15:05:02 -04:00
David Hildenbrand
f3b4a64d27
tcg: Implement tcg_gen_gvec_3i()
Let's add tcg_gen_gvec_3i(), similar to tcg_gen_gvec_2i(), however
without introducing "gen_helper_gvec_3i *fnoi", as it isn't needed
for now.

Backports commit e1227bb6e59173117f094a6a13b998587b45c928 from qemu
2019-05-16 14:26:50 -04:00
Peter Maydell
77ae3982b4
target/arm: Implement VLLDM for v7M CPUs with an FPU
Implement the VLLDM instruction for v7M for the FPU present cas.

Backports commit 956fe143b4f254356496a0a1c479fa632376dfec from qemu
2019-04-30 11:27:54 -04:00
Peter Maydell
b483951046
target/arm: Implement VLSTM for v7M CPUs with an FPU
Implement the VLSTM instruction for v7M for the FPU present case.

Backports commit 019076b036da4444494de38388218040d9d3a26c from qemu
2019-04-30 11:25:44 -04:00
Peter Maydell
a976d7642a
target/arm: Implement M-profile lazy FP state preservation
The M-profile architecture floating point system supports
lazy FP state preservation, where FP registers are not
pushed to the stack when an exception occurs but are instead
only saved if and when the first FP instruction in the exception
handler is executed. Implement this in QEMU, corresponding
to the check of LSPACT in the pseudocode ExecuteFPCheck().

Backports commit e33cf0f8d8c9998a7616684f9d6aa0d181b88803 from qemu
2019-04-30 11:21:50 -04:00
David Hildenbrand
458942d94e
tcg: Implement tcg_gen_extract2_{i32,i64}
Will be helpful for s390x. Input 128 bit and output 64 bit only,
which is sufficient for now.

Backports commit 2089fcc9e7b4174d1c351eaa7d277c02188a6dd2 from qemu
2019-04-30 09:20:45 -04:00
Lioncash
d6b706a296
qemu/fpu: Synchronize with Qemu
Resolves a few formatting discrepancies
2019-03-09 18:27:31 -05:00
Lioncash
b6f752970b
target/riscv: Initial introduction of the RISC-V target
This ports over the RISC-V architecture from Qemu. This is currently a
very barebones transition. No code hooking or any fancy stuff.
Currently, you can feed it instructions and query the CPU state itself.

This also allows choosing whether or not RISC-V 32-bit or RISC-V 64-bit
is desirable through Unicorn's interface as well.

Extremely basic examples of executing a single instruction have been
added to the samples directory to help demonstrate how to use the basic
functionality.
2019-03-08 21:46:10 -05:00
Richard Henderson
f116560d2c
target/arm: Implement ARMv8.5-FRINT
Backports 6bea25631af92531027d3bf3ef972a4d51d62e7c from qemu.
2019-03-05 23:17:33 -05:00
Richard Henderson
45c297c99b
target/arm: Add set/clear_pstate_bits, share gen_ss_advance
We do not need an out-of-line helper for manipulating bits in pstate.
While changing things, share the implementation of gen_ss_advance.

Backports commit 22ac3c49641f6eed93dca5b852030b4d3eacf6c4 from qemu
2019-03-05 22:55:22 -05:00
Richard Henderson
60742608f5
target/arm: Split helper_msr_i_pstate into 3
The EL0+UMA check is unique to DAIF. While SPSel had avoided the
check by nature of already checking EL >= 1, the other post v8.0
extensions to MSR (imm) allow EL0 and do not require UMA. Avoid
the unconditional write to pc and use raise_exception_ra to unwind.

Backports commit ff730e9666a716b669ac4a8ca7c521177d1d2b15 from qemu
2019-03-05 22:45:11 -05:00
Richard Henderson
5473c3603f
target/arm: Add helpers for FMLAL
Note that float16_to_float32 rightly squashes SNaN to QNaN.
But of course pickNaNMulAdd, for ARM, selects SNaNs first.
So we have to preserve SNaN long enough for the correct NaN
to be selected. Thus float16_to_float32_by_bits.

Backports commit a4e943a716d5fac923d82df3eabc65d1e3624019 from qemu
2019-02-28 15:31:48 -05:00
Lioncash
4f210d0731
header_gen: Add float128_to_uint32 to the list
Avoids multiple definition errors.
2019-02-28 15:19:44 -05:00
Lioncash
b9da32241b
header_gen: Correct multiple definition errors 2019-02-27 17:03:28 -05:00
Richard Henderson
f3cb92c86c
target/arm: Use vector operations for saturation
For same-sign saturation, we have tcg vector operations. We can
compute the QC bit by comparing the saturated value against the
unsaturated value.

Backports commit 89e68b575e138d0af1435f11a8ffcd8779c237bd from qemu
2019-02-15 18:14:09 -05:00
Alex Bennée
babf31dfa0
target/arm: expose CPUID registers to userspace
A number of CPUID registers are exposed to userspace by modern Linux
kernels thanks to the "ARM64 CPU Feature Registers" ABI. For QEMU's
user-mode emulation we don't need to emulate the kernels trap but just
return the value the trap would have done. To avoid too much #ifdef
hackery we process ARMCPRegInfo with a new helper (modify_arm_cp_regs)
before defining the registers. The modify routine is driven by a
simple data structure which describes which bits are exported and
which are fixed.

Backports commit 6c5c0fec29bbfe36c64eca1edfd8455be46b77c6 from qemu
2019-02-15 17:27:30 -05:00
Lioncash
572252fcfd
header_gen: Remove deposit32/64 from the list
These are always inlined.
2019-01-30 14:05:52 -05:00
Richard Henderson
fb684825c8
tcg: Add opcodes for vector minmax arithmetic
Backports commit dd0a0fcdd8848c2a18970c44a62bd8f394c2b495 from qemu
2019-01-29 16:24:52 -05:00
Richard Henderson
e0266239ea
tcg: Add opcodes for vector saturated arithmetic
Backports commit 8afaf0506606f8003ef696df849c5a98637a7a83 from qemu
2019-01-29 16:14:34 -05:00
Richard Henderson
e08d0feee4
tcg: Add gvec expanders for nand, nor, eqv
Backports commit f550805d8309500d642f640af8d9928958465478 from qemu
2019-01-29 15:57:28 -05:00
Aaron Lindsay
001283c45b
target/arm: Reorganize PMCCNTR accesses
pmccntr_read and pmccntr_write contained duplicate code that was already
being handled by pmccntr_sync. Consolidate the duplicated code into two
functions: pmccntr_op_start and pmccntr_op_finish. Add a companion to
c15_ccnt in CPUARMState so that we can simultaneously save both the
architectural register value and the last underlying cycle count - this
ensures time isn't lost and will also allow us to access the 'old'
architectural register value in order to detect overflows in later
patches.

Backports commit 5d05b9d462666ed21b7fef61aa45dec9aaa9f0ff from qemu
2019-01-22 16:57:29 -05:00
Peter Maydell
92bf8ee620
target/arm: Correctly implement handling of HCR_EL2.{VI, VF}
In commit 8a0fc3a29fc2315325400 we tried to implement HCR_EL2.{VI,VF},
but we got it wrong and had to revert it.

In that commit we implemented them as simply tracking whether there
is a pending virtual IRQ or virtual FIQ. This is not correct -- these
bits cause a software-generated VIRQ/VFIQ, which is distinct from
whether there is a hardware-generated VIRQ/VFIQ caused by the
external interrupt controller. So we need to track separately
the HCR_EL2 bit state and the external virq/vfiq line state, and
OR the two together to get the actual pending VIRQ/VFIQ state.

Fixes: 8a0fc3a29fc2315325400c738f807d0d4ae0ab7f

Backports commit 89430fc6f80a5aef1d4cbd6fc26b40c30793786c from qemu
2018-11-16 21:53:53 -05:00
Marc-André Lureau
fc354aa464
memory: learn about non-volatile memory region
Add a new flag to mark memory region that are used as non-volatile, by
NVDIMM for example. That bit is propagated down to the flat view, and
reflected in HMP info mtree with a "nv-" prefix on the memory type.

This way, guest_phys_blocks_region_add() can skip the NV memory
regions for dumps and TCG memory clear in a following patch.

Backports commit c26763f8ec70b1011098cab0da9178666d8256a5 from qemu
2018-11-11 08:50:39 -05:00
Emilio G. Cota
dfb3954571
exec: introduce tlb_init
Paves the way for the addition of a per-TLB lock.

Backports commit 5005e2537d090bee87aca3b924dcd17920fd146a from qemu
2018-10-23 14:41:29 -04:00
Peter Maydell
ca5d7b8fd2
target/arm: Add v8M stack checks on ADD/SUB/MOV of SP
Add code to insert calls to a helper function to do the stack
limit checking when we handle these forms of instruction
that write to SP:
* ADD (SP plus immediate)
* ADD (SP plus register)
* SUB (SP minus immediate)
* SUB (SP minus register)
* MOV (register)

Backports commit 5520318939fea5d659bf808157cd726cb967b761 from qemu
2018-10-08 14:15:15 -04:00