Commit Graph

3440 Commits

Author SHA1 Message Date
Richard Henderson
f5a35908da
tcg: Add tcg_gen_mulsu2_{i32,i64,tl}
This multiply has one signed input and one unsigned input,
producing the full double-width result.
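
As a rough standalone illustration of the identity such an expansion can rely on (this is a sketch, not the backported TCG code): the unsigned interpretation of a negative multiplicand a is a + 2^64, so an unsigned full multiply overshoots by b * 2^64, and subtracting b from the high half of a mulu2-style result yields the signed-by-unsigned product. Assumes a compiler with __int128 standing in for the unsigned double-width multiply primitive.

#include <stdint.h>
#include <stdio.h>

/* Sketch: signed x unsigned 64x64 -> 128-bit multiply built from an
 * unsigned full multiply plus a correction term.  If a < 0, the
 * unsigned product treats a as a + 2^64, so the high half is too
 * large by exactly b. */
static void mulsu2(int64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
    unsigned __int128 u = (unsigned __int128)(uint64_t)a * b;  /* stand-in for mulu2 */
    uint64_t h = (uint64_t)(u >> 64);
    uint64_t l = (uint64_t)u;

    if (a < 0) {
        h -= b;                 /* correction for the signed input */
    }
    *lo = l;
    *hi = h;
}

int main(void)
{
    uint64_t lo, hi;
    mulsu2(-3, 5, &lo, &hi);    /* -15: hi = all ones, lo = 2^64 - 15 */
    printf("hi=%016llx lo=%016llx\n",
           (unsigned long long)hi, (unsigned long long)lo);
    return 0;
}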

Backports commit 5087abfb7dfd1d368ae6939420057036b4d8e509 from qemu
2018-03-01 08:39:37 -05:00
Richard Henderson
f9d91a81b5
target-sparc: Use tcg_gen_atomic_cmpxchg_tl
Backports commit 5a7267b6a9e94c264ca77a7ca5a239e70dac81da from qemu
2018-03-01 08:34:35 -05:00
Richard Henderson
47313adedd
target-sparc: Use tcg_gen_atomic_xchg_tl
Backports commit da1bcae65288bdd51e0a7203d1e6c9cde1be5b3d from qemu
2018-03-01 08:32:10 -05:00
Richard Henderson
6b09040e23
target-sparc: Remove MMU_MODE*_SUFFIX
The functions that these generate are no longer used.

Backports commit 47b2696b975b794c6fa7b9fa8ae4699e749d662c from qemu
2018-03-01 08:30:27 -05:00
Richard Henderson
00fc847229
target-sparc: Allow 4-byte alignment on fp mem ops
The cpu is allowed to require stricter alignment on these 8- and 16-byte
operations, and the OS is required to fix up the accesses as necessary,
so the previous code was not wrong.

However, we can easily handle this misalignment for all direct 8-byte
operations and for direct 16-byte loads.

We must retain 16-byte alignment for 16-byte stores, so that we don't have
to probe for writability of a second page before performing the first of
two 8-byte stores. We also retain 8-byte alignment for no-fault loads,
since they are rare and it's not worth extending the helpers for this.

Backports commit cb21b4da6cca1bb4e3f5fefb698fb9e4d00c8f66 from qemu
2018-03-01 08:29:11 -05:00
Richard Henderson
eec264526e
target-sparc: Implement ldqf and stqf inline
At the same time, fix a problem with stqf_asi, where
a write might access two pages.

Backports commit f939ffe5a022a8798824e2720ed5a14186fca6b6 from qemu
2018-03-01 08:20:36 -05:00
Richard Henderson
3a25695841
target-sparc: Remove asi helper code handled inline
Now that we never call out to helpers when direct accesses can
handle an asi, remove the corresponding code in those helpers.
For ldda, this removes the entire helper.

Backports commit 918d9a2c9d36378a3cf6636018900a4731c83b9d from qemu
2018-03-01 08:14:31 -05:00
Richard Henderson
15c8bf0b42
target-sparc: Implement BCOPY/BFILL inline
Backports commit 34810610acbde7a0745be3a88e99f2ef9282260f from qemu
2018-02-28 12:54:10 -05:00
Richard Henderson
3c48eb4aaf
target-sparc: Implement cas_asi/casx_asi inline
Backports commit 7268adebfda6548b8ae6865dc8337f116a5d266d from qemu
2018-02-28 12:47:26 -05:00
Richard Henderson
b28b5cd3d3
target-sparc: Implement ldstub_asi inline
Backports commit fbb4bbb62e5603c991b880e25dc4bb30d342b944 from qemu
2018-02-28 12:42:26 -05:00
Richard Henderson
adf9faf075
target-sparc: Implement swap_asi inline
Backports commit 4fb554bc6c88eb45270a3ad3cf6e6e2ad476aede from qemu
2018-02-28 12:39:55 -05:00
Richard Henderson
ebc292c174
target-sparc: Handle more twinx asis
As used by HelenOS, presumably for ultra 2 and 3,
prior to the sun4v platform and the current twinx names.

Backports commit 34a6e13da70b2c798630a8dbd03d09f201c0198f from qemu
2018-02-28 12:28:08 -05:00
Richard Henderson
ecbeea7c56
target-sparc: Use MMU_PHYS_IDX for bypass asis
Backports commit 7f87c90527d7363e8cecf1c6b5ad3d4cc85d3d28 from qemu
2018-02-28 12:26:29 -05:00
Richard Henderson
15eea419e5
target-sparc: Add MMU_PHYS_IDX
It's handy to have an mmu idx for physical addresses, so
that mmu-disabled and physical-access asis can use the
same path as normal accesses.
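
A generic sketch of the idea (illustrative names only, not the target-sparc code): reserve one MMU index that means "no translation", and have the index-selection logic return it whenever the MMU is off or a bypass access is requested, so all of these cases flow through the ordinary load/store path.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative sketch: one dedicated index for untranslated (physical)
 * accesses, alongside the normal translated indexes. */
enum {
    MMU_USER_IDX,
    MMU_KERNEL_IDX,
    MMU_PHYS_IDX,   /* bypass: virtual address == physical address */
};

static int select_mmu_idx(bool mmu_enabled, bool supervisor)
{
    if (!mmu_enabled) {
        return MMU_PHYS_IDX;    /* MMU off: same path, no translation */
    }
    return supervisor ? MMU_KERNEL_IDX : MMU_USER_IDX;
}

int main(void)
{
    printf("%d %d %d\n",
           select_mmu_idx(false, true),
           select_mmu_idx(true, true),
           select_mmu_idx(true, false));
    return 0;
}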

Backports commit af7a06bac7d3abb2da48ef3277d2a415772d2ae8 from qemu
2018-02-28 12:24:17 -05:00
Richard Henderson
9e60a8e432
target-sparc: Introduce cpu_raise_exception_ra
Several helpers call helper_raise_exception directly, which requires
in turn that their callers have performed save_state. The new function
allows a TCG return address to be passed in so that we can restore
PC + NPC + flags data from that.

This fixes a bug in the usage of helper_check_align, whose callers had
not been calling save_state. It fixes another bug in which the divide
helpers used GETPC at a level other than the direct callee from TCG.

This allows the translator to avoid save_state prior to SAVE, RESTORE,
and FLUSHW instructions.
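
The key constraint behind the GETPC fix is that the return address must be captured in the helper that TCG-generated code calls directly, not further down the call chain. A standalone sketch of that pattern (generic C; __builtin_return_address(0) stands in for QEMU's GETPC(), and the names and exception number are illustrative, not the backported code):

#include <stdint.h>
#include <stdio.h>

/* In QEMU, the equivalent of this function would restore guest
 * PC/NPC/flags from 'ra' and longjmp back to the main loop; here we
 * just report the captured address. */
static void raise_exception_ra(int excp, uintptr_t ra)
{
    printf("exception %d raised from host pc %#lx\n",
           excp, (unsigned long)ra);
}

/* The return address must be captured here, in the function called
 * directly from generated code; taking it a level deeper would point
 * at the wrong caller and break unwinding. */
static void helper_check_align(uint64_t addr, uint64_t mask)
{
    if (addr & mask) {
        raise_exception_ra(7 /* illustrative exception number */,
                           (uintptr_t)__builtin_return_address(0));
    }
}

int main(void)
{
    helper_check_align(0x1003, 0x3);   /* misaligned: raises */
    return 0;
}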

Backports commit 2f9d35fc4006122bad33f9ae3e2e51d2263e98ee from qemu
2018-02-28 12:15:06 -05:00
Richard Henderson
62ae2a5102
target-sparc: Use overalignment flags for twinx and block asis
This allows us to enforce 16 and 64-byte alignment
without any extra overhead.

Backports commit 808832277af11dafee5a55da2b9e41d019b879ca from qemu
2018-02-28 12:01:50 -05:00
Alex Bennée
da124da4b1
tcg: move locking for tb_invalidate_phys_page_range up
In the linux-user case all things that involve 'l1_map' and PageDesc
tweaks are protected by the memory lock (mmap_lock). For SoftMMU mode we
previously relied on single-threaded behaviour; with MTTCG we now use
the tb_lock().

As a result we need to do a little re-factoring and push the taking of
this lock up the call tree. This requires a slightly different entry for
the SoftMMU and user-mode cases from tb_invalidate_phys_range.

This also means user-mode breakpoint insertion needs to take two locks,
but it hadn't taken any previously, so this is an improvement.

Backports commit ba051fb5e56d5ff5e4fa672d37954452e58543b2 from qemu
2018-02-28 10:35:41 -05:00
Paolo Bonzini
9d64a89acf
tcg: comment on which functions have to be called with tb_lock held
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks. Probably the
same holds for user-mode emulation; it's just that no one has ever
tried to produce a coherent locking scheme there.

This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.

Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock. The next one will rectify this.

Backports commit 7d7500d99895f888f97397ef32bb536bb0df3b74 from qemu
2018-02-28 10:26:28 -05:00
Alex Bennée
7aab0bd9a6
translate-all: add DEBUG_LOCKING asserts
This adds asserts to check the locking of the various translation
engine structures. There are two sets of structures that are protected
by locks.

The first is the l1_map and PageDesc structures used to track which
translation blocks are associated with which physical addresses. In
user-mode this is covered by the mmap_lock.

The second is the TB-context-related structures, which are protected by
tb_lock; currently this lock is also user-mode only.

Currently the asserts do nothing in SoftMMU mode but this will change
for MTTCG.
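
A generic sketch of this kind of debug-only lock assertion (illustrative pattern, not the QEMU macros; have_mmap_lock() here is a local stub):

#include <assert.h>
#include <stdbool.h>

/* With DEBUG_LOCKING defined, callers that touch the page-tracking
 * structures must hold the lock; otherwise the check compiles away. */
static bool mmap_lock_held;

static bool have_mmap_lock(void)
{
    return mmap_lock_held;
}

#ifdef DEBUG_LOCKING
#define assert_memory_lock() assert(have_mmap_lock())
#else
#define assert_memory_lock() ((void)0)
#endif

static void tb_invalidate_pages_stub(void)
{
    assert_memory_lock();   /* fires under DEBUG_LOCKING if unlocked */
    /* ... modify l1_map / PageDesc structures here ... */
}

int main(void)
{
    mmap_lock_held = true;  /* pretend the caller took mmap_lock */
    tb_invalidate_pages_stub();
    return 0;
}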

Backports commit 301e40ed8005306c009978be295ed9a4b725178b from qemu
2018-02-28 08:56:15 -05:00
Alex Bennée
075aaad106
translate_all: DEBUG_FLUSH -> DEBUG_TB_FLUSH
Make the debug define consistent with the others. The flush operation is
all about invalidating TranslationBlocks on flush events.

Also fix up the commenting on the other DEBUG for the benefit of
checkpatch.

Backports commit 955939a2b51f72bea1c200b559ea39985df5a633 from qemu
2018-02-28 08:53:38 -05:00
Anand J
8278af45cd
clean-up: removed duplicate #includes
Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.

Backports commit 814bb12a561d36aeb5ae4440ad43d2b0761d76da from qemu
2018-02-28 08:51:56 -05:00
Wei Huang
bceed21d23
arm: Add an option to turn on/off vPMU support
This patch adds a pmu=[on/off] option to enable/disable vPMU support
in the guest vCPU. It allows virt tools, such as libvirt, to determine the
existence of vPMU and configure it. Note this option is only available
for cortex-a57/cortex-a53 host CPUs, but unavailable on ARMv7 and other
processors. Also, even though the "pmu=" option is available in TCG mode,
setting it doesn't turn the PMU on.

Backports commit 929e754d5a621cd53f30e69b766ccf381b58d124 from qemu
2018-02-28 08:49:23 -05:00
Laurent Vivier
5daf91ea48
target-m68k: immediate ops manage word and byte operands
Backports commit 92c62548f69cb4ba739d7d046e9caf9ea75753e4 from qemu
2018-02-28 08:42:22 -05:00
Laurent Vivier
f7c29f73b3
target-m68k: cmp manages word and byte operands
Backports commit ff99b952c8280853801fe14f7ae62d0f87464f7d from qemu
2018-02-28 08:37:46 -05:00
Laurent Vivier
fc28e8127f
target-m68k: add/sub manage word and byte operands
Backports commit 8a370c6cb770b618f7eb66628116c25e84588df8 from qemu
2018-02-28 07:18:25 -05:00
Laurent Vivier
bc27695926
target-m68k: add addressing modes to neg
Backports commit 227de713e0f4224a82c32991b4e4c4973381426b from qemu
2018-02-28 07:07:28 -05:00
Laurent Vivier
3558b93f11
target-m68k: introduce byte and word cc_ops
Backports commit db3d7945ae7992c91cc5705dccf60fec79b24dc4 from qemu
2018-02-28 06:52:16 -05:00
Laurent Vivier
4e257ffda9
target-m68k: some bit ops cleanup
Backports commit 3c980d2ef664e6d5a1a0c98aca4d11d33b17ca59 from qemu
2018-02-28 01:25:58 -05:00
Laurent Vivier
cfab571859
target-m68k: suba/adda can manage word operand
Backports commit 415f4b62eb4629bd3702e6fb8aa51437a92983ff from qemu
2018-02-28 01:20:23 -05:00
Laurent Vivier
99c297efe3
target-m68k: and can manage word and byte operands
Backports commit 52dc23c5956159a79a4e2d4193e44d2c4cf3883c from qemu
2018-02-28 01:19:02 -05:00
Laurent Vivier
41372b0cc9
target-m68k: or can manage word and byte operands
Backports commit 020a4659208a6f9a985881504fd4d3b44ab589be from qemu
2018-02-28 01:15:27 -05:00
Laurent Vivier
e140aac281
target-m68k: eor can manage word and byte operands
Backports commit eec37aec85af9f5fd59b534d20c86a775b8e7973 from qemu
2018-02-28 01:05:21 -05:00
Laurent Vivier
bc52777b00
target-m68k: add addressing modes to not
Backports commit ea4f2a844132c81f1e6b51fed7019686ce4e3bc5 from qemu
2018-02-28 01:03:38 -05:00
Richard Henderson
549e31cc72
target-m68k: Inline addx, subx, negx
And add opcodes for 680x0

Backports commit a665a820e5d46b1611f409fbc7a540fe1c6bf5c8 from qemu
2018-02-28 01:02:31 -05:00
Laurent Vivier
b796f934ff
target-m68k: add dbcc
Backports commit beff27ab3a60d8abab4a166670ca79b3c0970005 from qemu
2018-02-28 00:45:39 -05:00
Laurent Vivier
977c3fe6c4
target-m68k: add addressing modes to scc
Backports commit d5a3cf33f2f65069d2f79a6e349f0d8140f02bb4 from qemu
2018-02-28 00:43:30 -05:00
Laurent Vivier
77b1754376
target-m68k: add exg ops
Backports commit 29cf437da4eeacb46cd7076014d06c85ca47c91d from qemu
2018-02-28 00:37:41 -05:00
Laurent Vivier
56882899be
target-m68k: add linkl
Backports commit c630e436c0ed3adc3a858c328119daf6d1b3357f from qemu
2018-02-28 00:31:27 -05:00
Laurent Vivier
59d6a1a744
target-m68k: add bkpt instruction
Backports commit 71600eda7cc48f03ea306bc69ed7e52ef1d9dd91 from qemu
2018-02-28 00:29:41 -05:00
Emilio G. Cota
22be035e60
target-arm: remove EXCP_STREX + cpu_exclusive_{test, info}
The exception is not emitted anymore; remove it and the associated
TCG variables.

Backports commit 05188cc72f0399e99c92f608a8e7ca4c8e552c4b from qemu
2018-02-28 00:24:20 -05:00
Emilio G. Cota
cb92eea81a
target-arm: emulate aarch64's LL/SC using cmpxchg helpers
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
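
To make the ABA caveat concrete, here is a standalone C11 sketch (not the QEMU helpers): a store-exclusive emulated as a compare-and-swap succeeds even though the location was modified and restored between the "load exclusive" and the "store exclusive", whereas real LL/SC would fail.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint32_t mem = 42;

int main(void)
{
    /* "Load exclusive": remember the value we saw. */
    uint32_t exclusive_val = atomic_load(&mem);

    /* Another thread (simulated inline) changes the value away and
     * back again: 42 -> 7 -> 42.  Real LL/SC would lose the monitor. */
    atomic_store(&mem, 7);
    atomic_store(&mem, 42);

    /* "Store exclusive" emulated with cmpxchg: succeeds despite ABA. */
    uint32_t expected = exclusive_val;
    bool ok = atomic_compare_exchange_strong(&mem, &expected, 100);

    printf("store-exclusive %s (mem=%u)\n",
           ok ? "succeeded" : "failed", (unsigned)atomic_load(&mem));
    return 0;
}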

The appended patch emulates LL/SC pairs in aarch64 with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/JVc8Y

[Four ASCII throughput plots omitted (see hi-res link above): atomic_add-bench, 1000000 ops/thread, for ranges [0,1], [0,2], [0,128] and [0,1024]; series: cmpxchg vs. master; x-axis: number of threads, y-axis: throughput.]

Backports commit 1dd089d0eec060dcd8478735114d98421d414805 from qemu
2018-02-28 00:21:27 -05:00
Emilio G. Cota
3546558f66
target-arm: emulate SWP with atomic_xchg helper
Backports commit cf12bce088f22b92bf62ffa0d7f6a3e951e355a9 from qemu
2018-02-28 00:11:23 -05:00
Emilio G. Cota
ec14a00925
target-arm: emulate LL/SC using cmpxchg helpers
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.

The appended patch emulates LL/SC pairs in ARM with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/aNQpB

[Four ASCII throughput plots omitted (see hi-res link above): atomic_add-bench, 1000000 ops/thread, for ranges [0,1], [0,2], [0,128] and [0,1024]; series: cmpxchg vs. master; x-axis: number of threads, y-axis: throughput.]

Backports commit 354161b37c6465a32073eac5f16fa35939af2bb4 from qemu
2018-02-28 00:07:44 -05:00
Richard Henderson
fd9933fbd5
target-arm: Rearrange aa32 load and store functions
Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
a temp and expand with tcg_gen_extu_i32_tl. Split out gen_aa32_addr,
gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.

Backports commit 7f5616f53896a4e08ad37de3ac50d3a4cc8eff7a from qemu
2018-02-27 23:59:16 -05:00
Emilio G. Cota
3dc16ebca3
target-i386: remove helper_lock()
It's been superseded by the atomic helpers.

The use of the atomic helpers provides a significant performance and scalability
improvement. Below is the result of running the atomic_add-bench microbenchmark with:
$ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n
, where $n is the number of threads and $r is the allowed range for the additions.

The scenarios measured are:
- atomic: implements x86's ADDL with the atomic_add helper (i.e. this patchset)
- cmpxchg: implements x86's ADDL with a TCG loop using the cmpxchg helper
- master: before this patchset
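
A standalone C11 sketch of the two emulation strategies compared above (not the QEMU helpers): a direct atomic add versus a compare-and-swap retry loop on the same location.

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint32_t guest_word;

/* "atomic" scenario: a single atomic read-modify-write. */
static void addl_atomic(uint32_t val)
{
    atomic_fetch_add(&guest_word, val);
}

/* "cmpxchg" scenario: a retry loop built from compare-and-swap. */
static void addl_cmpxchg(uint32_t val)
{
    uint32_t old = atomic_load(&guest_word);
    while (!atomic_compare_exchange_weak(&guest_word, &old, old + val)) {
        /* 'old' was refreshed with the current value; try again. */
    }
}

int main(void)
{
    addl_atomic(3);
    addl_cmpxchg(4);
    printf("guest_word = %u\n", (unsigned)atomic_load(&guest_word));
    return 0;
}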

Results sorted in ascending range, i.e. descending degree of contention.
Y axis is Throughput in Mops/s. Tests are run on an AMD machine with 64
Opteron 6376 cores.

[Five ASCII throughput plots omitted (hi-res link below): atomic_add-bench, 5000000 ops/thread, for ranges [0,1], [0,2], [0,8], [0,128] and [0,1024]; series: atomic, cmpxchg and master; x-axis: number of threads, y-axis: throughput in Mops/s.]

hi-res: http://imgur.com/a/fMRmq

For master I stopped measuring after 8 threads, because there is little
point in measuring the well-known performance collapse of a contended lock.

Backports commit 37b995f6e7a1cb6fa378c5cd4217b9dd9e1fc98b from qemu
2018-02-27 23:43:22 -05:00
Emilio G. Cota
9d9b7dedac
target-i386: emulate XCHG using atomic helper
Backports commit ea97ebe89f7a879ea9aba90140e40c29b5cbd653 from qemu
2018-02-27 23:40:20 -05:00
Emilio G. Cota
8f96b6beb9
target-i386: emulate LOCK'ed BTX ops using atomic helpers
Backports commit cfe819d309d472f75fd129faf1d1064a2498326c from qemu
2018-02-27 23:39:21 -05:00
Emilio G. Cota
089965fa8d
target-i386: emulate LOCK'ed XADD using atomic helper
Backports commit f53b01817f95781d2bcc8a82e057d1416601e13b from qemu
2018-02-27 23:06:28 -05:00
Emilio G. Cota
f9ed728f27
target-i386: emulate LOCK'ed NEG using cmpxchg helper
Backports commit 8eb8c7385608b99bed6055a22d897ff727a6cb8e from qemu
2018-02-27 23:03:28 -05:00
Emilio G. Cota
fedeb0f93e
target-i386: emulate LOCK'ed NOT using atomic helper
Backports commit 2a5fe8ae145ef7a3ab480922116d27efcc97b85d from qemu
2018-02-27 23:00:33 -05:00