unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-10-21 01:38:21 +02:00

Emilio G. Cota cb92eea81a target-arm: emulate aarch64's LL/SC using cmpxchg helpers

Emulating LL/SC with cmpxchg is not correct, since it can suffer from the ABA problem. Portable parallel code, however, is written assuming only cmpxchg--and not LL/SC--is available. This means that in practice emulating LL/SC with cmpxchg is a viable alternative.

The appended emulates LL/SC pairs in aarch64 with cmpxchg helpers. This works in both user and system mode. In usermode, it avoids pausing all other CPUs to perform the LL/SC pair. The subsequent performance and scalability improvement is significant, as the plots below show. They plot the throughput of atomic_add-bench compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/JVc8Y

[Plots not reproduced: four ASCII plots of atomic_add-bench throughput, 1000000 ops/thread, for the ranges [0,1], [0,2], [0,128] and [0,1024]. X-axis: number of threads (0 to 60). Series: cmpxchg (E) and master (H).]

Backports commit 1dd089d0eec060dcd8478735114d98421d414805 from qemu		2018-02-28 00:21:27 -05:00
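The idea in the commit message can be illustrated with a minimal, standalone sketch in C11. This is not the code from helper-a64.c or translate-a64.c. The `ExclusiveMonitor` struct and the `emu_*` and `demo_aba_status` names are hypothetical, chosen only for illustration. The sketch assumes a load-exclusive records the address and the value it saw, and a store-exclusive then becomes a compare-and-swap against that recorded value. The last function shows the ABA case the message warns about: the value changes from A to B and back to A, and the emulated store-exclusive still succeeds, whereas a real LL/SC pair would be expected to fail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU exclusive monitor; field names are illustrative
 * and are not the fields used by QEMU or Unicorn. */
typedef struct {
    _Atomic uint64_t *addr; /* address marked by the last load-exclusive */
    uint64_t val;           /* value observed by that load-exclusive */
    bool valid;             /* true between a load- and store-exclusive */
} ExclusiveMonitor;

/* Emulated load-exclusive: remember where we loaded from and what we saw. */
static uint64_t emu_load_exclusive(ExclusiveMonitor *m, _Atomic uint64_t *addr)
{
    m->addr = addr;
    m->val = atomic_load(addr);
    m->valid = true;
    return m->val;
}

/* Emulated store-exclusive. Returns 0 on success and 1 on failure,
 * mirroring the status-register convention of a store-exclusive. */
static int emu_store_exclusive(ExclusiveMonitor *m, _Atomic uint64_t *addr,
                               uint64_t newval)
{
    if (!m->valid || m->addr != addr) {
        m->valid = false;
        return 1; /* no matching load-exclusive */
    }
    uint64_t expected = m->val;
    /* The store only lands if memory still holds the value we loaded. */
    bool ok = atomic_compare_exchange_strong(addr, &expected, newval);
    m->valid = false;
    return ok ? 0 : 1;
}

/* A guest-style retry loop: atomic add built from the emulated pair.
 * Returns the value that was in memory before the add. */
static uint64_t emu_atomic_add(ExclusiveMonitor *m, _Atomic uint64_t *addr,
                               uint64_t inc)
{
    uint64_t old;
    do {
        old = emu_load_exclusive(m, addr);
    } while (emu_store_exclusive(m, addr, old + inc) != 0);
    return old;
}

/* ABA demonstration: the cell goes 5 -> 7 -> 5 between the load- and
 * store-exclusive. The cmpxchg-based emulation cannot see the intermediate
 * writes, so the store-exclusive reports success (status 0). */
static int demo_aba_status(void)
{
    _Atomic uint64_t cell = 5;
    ExclusiveMonitor m = {0};

    uint64_t seen = emu_load_exclusive(&m, &cell);
    assert(seen == 5);
    (void)seen;

    atomic_store(&cell, 7); /* another "CPU" writes B ... */
    atomic_store(&cell, 5); /* ... and then writes A back */

    return emu_store_exclusive(&m, &cell, 6);
}
```

For code that only ever relies on compare-and-swap semantics, such as the retry loop in `emu_atomic_add`, the ABA case is harmless, which is the argument the commit message makes for this approach being viable in practice.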
arm_ldst.h	cpu: move exec-all.h inclusion out of cpu.h	2018-02-24 02:39:08 -05:00
cpu64.c	target-arm: Get rid of unused variable warnings	2018-02-23 12:43:09 -05:00
cpu-qom.h	target-arm: make cpu-qom.h not target specific	2018-02-24 00:48:59 -05:00
cpu.c	arm: add Cortex A7 CPU parameters	2018-02-26 03:44:24 -05:00
cpu.h	target-arm: Implement new HLT trap for semihosting	2018-02-26 15:28:45 -05:00
crypto_helper.c
helper-a64.c	target-arm: emulate aarch64's LL/SC using cmpxchg helpers	2018-02-28 00:21:27 -05:00
helper-a64.h	target-arm: emulate aarch64's LL/SC using cmpxchg helpers	2018-02-28 00:21:27 -05:00
helper.c	target-arm: Implement new HLT trap for semihosting	2018-02-26 15:28:45 -05:00
helper.h	target-arm: Implement MRS (banked) and MSR (banked) instructions	2018-02-21 21:50:42 -05:00
internals.h	Fix confusing argument names in some common functions	2018-02-25 03:58:27 -05:00
iwmmxt_helper.c
kvm-consts.h
Makefile.objs
neon_helper.c	target-arm: Fix warn about implicit conversion	2018-02-25 22:44:43 -05:00
op_addsub.h
op_helper.c	Fix masking of PC lower bits when doing exception returns	2018-02-26 08:09:28 -05:00
psci.c	Use #include "..." for our own headers, <...> for others	2018-02-25 04:10:33 -05:00
translate-a64.c	target-arm: emulate aarch64's LL/SC using cmpxchg helpers	2018-02-28 00:21:27 -05:00
translate.c	target-arm: emulate SWP with atomic_xchg helper	2018-02-28 00:11:23 -05:00
translate.h	target-arm: Infrastucture changes to enable handling of tagged address loading into PC	2018-02-26 07:58:17 -05:00
unicorn_aarch64.c	qemu-common: push cpu.h inclusion out of qemu-common.h	2018-02-24 01:50:56 -05:00
unicorn_arm.c	qemu-common: push cpu.h inclusion out of qemu-common.h	2018-02-24 01:50:56 -05:00
unicorn.h