target/arm: Allow M-profile CPUs with FP16 to set FPSCR.FP16

M-profile CPUs with half-precision floating point support should
be able to write to FPSCR.FZ16, but an M-profile specific masking
of the value at the top of vfp_set_fpscr() currently prevents that.
This is not yet an active bug because we have no M-profile
FP16 CPUs, but needs to be fixed before we can add any.

The bits that the masking is effectively preventing from being
set are the A-profile only short-vector Len and Stride fields,
plus the Neon QC bit. Rearrange the order of the function so
that those fields are handled earlier and only under a suitable
guard; this allows us to drop the M-profile specific masking,
making FZ16 writeable.

This change also makes the QC bit correctly RAZ/WI for older
no-Neon A-profile cores.

This refactoring also paves the way for the low-overhead-branch
LTPSIZE field, which uses some of the bits that are used for
A-profile Stride and Len.

Backports commit d31e2ce68d56f5bcc83831497e5fe4b8a7e18e85
This commit is contained in:
Peter Maydell 2021-03-01 20:33:20 -05:00 committed by Lioncash
parent 3ae5543825
commit 8a6e118a17

View File

@ -197,36 +197,45 @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
val &= ~FPCR_FZ16;
}
if (arm_feature(env, ARM_FEATURE_M)) {
vfp_set_fpscr_to_host(env, val);
if (!arm_feature(env, ARM_FEATURE_M)) {
/*
* M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits
* and also for the trapped-exception-handling bits IxE.
* Short-vector length and stride; on M-profile these bits
* are used for different purposes.
* We can't make this conditional be "if MVFR0.FPShVec != 0",
* because in v7A no-short-vector-support cores still had to
* allow Stride/Len to be written with the only effect that
* some insns are required to UNDEF if the guest sets them.
*
* TODO: if M-profile MVE implemented, set LTPSIZE.
*/
val &= 0xf7c0009f;
env->vfp.vec_len = extract32(val, 16, 3);
env->vfp.vec_stride = extract32(val, 20, 2);
}
vfp_set_fpscr_to_host(env, val);
if (arm_feature(env, ARM_FEATURE_NEON)) {
/*
* The bit we set within fpscr_q is arbitrary; the register as a
* whole being zero/non-zero is what counts.
* TODO: M-profile MVE also has a QC bit.
*/
env->vfp.qc[0] = val & FPCR_QC;
env->vfp.qc[1] = 0;
env->vfp.qc[2] = 0;
env->vfp.qc[3] = 0;
}
/*
* We don't implement trapped exception handling, so the
* trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
*
* If we exclude the exception flags, IOC|DZC|OFC|UFC|IXC|IDC
* (which are stored in fp_status), and the other RES0 bits
* in between, then we clear all of the low 16 bits.
* The exception flags IOC|DZC|OFC|UFC|IXC|IDC are stored in
* fp_status; QC, Len and Stride are stored separately earlier.
* Clear out all of those and the RES0 bits: only NZCV, AHP, DN,
* FZ, RMode and FZ16 are kept in vfp.xregs[FPSCR].
*/
env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7c80000;
env->vfp.vec_len = (val >> 16) & 7;
env->vfp.vec_stride = (val >> 20) & 3;
/*
* The bit we set within fpscr_q is arbitrary; the register as a
* whole being zero/non-zero is what counts.
*/
env->vfp.qc[0] = val & FPCR_QC;
env->vfp.qc[1] = 0;
env->vfp.qc[2] = 0;
env->vfp.qc[3] = 0;
}
void vfp_set_fpscr(CPUARMState *env, uint32_t val)