unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-10-20 14:58:16 +02:00

Author	SHA1	Message	Date
Eduardo Habkost	c35e9eb9af	target-i386: xsave: Simplify CPUID[0xD,0].{EAX,EDX} calculation Instead of assigning individual bits in a loop, just copy the values from ena_mask. Backports commit 8057c621b1b17cbcb35fe67d1a09ada9055873a9 from qemu	2018-02-26 04:35:14 -05:00
Eduardo Habkost	c7195afd32	target-i386: xsave: Calculate enabled components only once Instead of checking both env->features and ena_mask at two different places in the CPUID code, initialize ena_mask based on the features that are enabled for the CPU, and then clear unsupported bits based on kvm_arch_get_supported_cpuid(). The results should be exactly the same, but it will make it easier to move the mask calculation elsewhare, and reuse x86_cpu_filter_features() for the kvm_arch_get_supported_cpuid() check. Backports commit 4928cd6de6b4211a79f98c8dc39115be1e815c2b from qemu	2018-02-26 04:33:18 -05:00
Eduardo Habkost	c3a0cba5b1	target-i386: Don't try to enable PT State xsave component The code that calculates the set of supported XSAVE components on CPUID looks at ext_save_areas to find out which components should be enabled. However, if there are zeroed entries in the ext_save_areas array, the ((env->features[esa->feature] & esa->bits) == esa->bits) check will always succeed and QEMU will unconditionally try to enable the component. Luckily this never caused any problems because the only missing entry in ext_save_areas is the PT State component (bit 8), and KVM currently doesn't support it (so it was cleared on ena_mask). But the code was still incorrect and would break if KVM starts returning CPUID[EAX=0xD,ECX=0].EAX[bit 8] as supported on GET_SUPPORTED_CPUID. Fix the problem by changing the code to not enable a XSAVE component if ExtSaveArea::bits is zero. Backports commit 9646f4927faf68e8690588c2fd6dc9834c440b58 from qemu	2018-02-26 04:30:35 -05:00
Eduardo Habkost	6188c6d6e4	target-i386: Move feature name arrays inside FeatureWordInfo It makes it easier to guarantee the arrays are the right size, and to find information when looking at the code. Backports commit 2d5312da566e4424a807d078da05f92ee7be3eec from qemu	2018-02-26 04:29:47 -05:00
Eduardo Habkost	74ae087743	target-i386: Enable CPUID[0x8000000A] if SVM is enabled SVM needs CPUID[0x8000000A] to be available. So if SVM is enabled in a CPU model or explicitly in the command-line, adjust CPUID xlevel to expose the CPUID[0x8000000A] leaf. Backports commit 0c3d7c0051576d220e6da0a8ac08f2d8482e2f0b from qemu	2018-02-26 04:05:47 -05:00
Eduardo Habkost	37406874ea	target-i386: Automatically set level/xlevel/xlevel2 when needed Instead of requiring users and management software to be aware of required CPUID level/xlevel/xlevel2 values for each feature, automatically increase those values when features need them. This was already done for CPUID[7].EBX, and is now made generic for all CPUID feature flags. Unit test included, to make sure we don't break ABI on older machine-types and don't mess with the CPUID level values if they are explicitly set by the user. Backports commit c39c0edf9bb3b968ba95484465a50c7b19f4aa3a from qemu	2018-02-26 04:03:09 -05:00
Eduardo Habkost	6861fe80cf	target-i386: Add a marker to end of the region zeroed on reset Instead of using cpuid_level, use an empty struct as a marker (like we already did with {start,end}_init_save). This will avoid accidentaly resetting the wrong fields if we change the field ordering on CPUX86State. Backports commit 5e992a8e337e710ea2d02f35668ac55a80e15f99 from qemu	2018-02-26 03:59:03 -05:00
Eduardo Habkost	c78d24b93c	target-i386: Remove unused X86CPUDefinition::xlevel2 field No CPU model in builtin_x86_defs has xlevel2 set, so it is always zero. Delete the field. Note that this is not an user-visible change. It doesn't remove the ability to set xlevel2 on the command-line, it just removes an unused field in builtin_x86_defs. Backports commit 0456441b5eb6694a561ad5bb8dad52483e6a08d0 from qemu	2018-02-26 03:57:02 -05:00
Leon Alrae	f60eca6930	target-mips: generate fences Make use of memory barrier TCG opcode in MIPS front end. Backports commit d208ac0c2e4cb43b74153bd584fc63c7b8a93ed6 from qemu	2018-02-26 03:52:35 -05:00
André Draszik	f14ece4aa1	target-mips: add 24KEc CPU definition Define a new CPU definition supporting 24KEc cores, similar to the existing 24Kc, but with added support for DSP instructions and MIPS16e (and without FPU). Backports commit e9deaad8a58c899dc32e9fdeff9e533070e79dca from qemu	2018-02-26 03:50:22 -05:00
Andrey Yurovsky	e24890a580	arm: add Cortex A7 CPU parameters Add the "cortex-a7" CPU with features and registers matching the Cortex-A7 MPCore Technical Reference Manual and the Cortex-A7 Floating-Point Unit Technical Reference Manual. The A7 is very similar to the A15. Backports commit dcf578ed8cec89543158b103940e854ebd21a8cf from qemu	2018-02-26 03:44:24 -05:00
Richard Henderson	552ef4b3e6	target-i386: Use struct X86XSaveArea in fpu_helper.c This avoids a double hand-full of magic numbers in the xsave and xrstor helper functions. Backports commit 3f32bd21df655e62eb271182a5c63280d631c7b3 from qemu	2018-02-26 03:38:53 -05:00
Richard Henderson	2ab4b8fa4d	tcg/i386: Extend TARGET_PAGE_MASK to the proper type TARGET_PAGE_MASK, as defined, has type "int". We need to extend that to the proper target width before oring in an "unsigned". Backports commit ebb90a005da67147245cd38fb04a965a87a961b7 from qemu	2018-02-26 03:32:38 -05:00
Pranith Kumar	16d71f0f10	tcg: Optimize fence instructions This commit optimizes fence instructions. Two optimizations are currently implemented: (1) unnecessary duplicate fence instructions, and (2) merging weaker fences into a stronger fence. [rth: Merge tcg_optimize_mb back into tcg_optimize, so that we only loop over the opcode stream once. Merge "unrelated" weaker barriers into one stronger barrier.] Backports commit 34f939218ce78163171addd63750e1e0300376ab from qemu	2018-02-26 03:29:59 -05:00
Pranith Kumar	533e083495	target-i386: Generate fences for x86 Backports commit cc19e497a047193db5083425957d7292c8dd3226 from qemu	2018-02-26 03:28:31 -05:00
Pranith Kumar	32b7cee81e	target-aarch64: Generate fences for aarch64 Backports commit ce1bd93f94e8d4b7117744e49652d2f907bed99f from qemu	2018-02-26 03:26:35 -05:00
Pranith Kumar	7849f8d72a	target-arm: Generate fences in ARMv7 frontend Backports commit 61e4c432ab26526bab0f3ef746c1861415b6da29 from qemu	2018-02-26 03:22:53 -05:00
Pranith Kumar	65a73763e3	tcg/sparc: Add support for fence Backports commit f8f03b3707b49898052fb8cd75ee31d19c8161fc from qemu	2018-02-26 03:20:39 -05:00
Pranith Kumar	a6fdc24e28	tcg/s390: Add support for fence Backports commit c9314d610e0e5da4d2cd5a36f3563d102b3294e0 from qemu	2018-02-26 03:19:41 -05:00
Pranith Kumar	bdd9cad15c	tcg/ppc: Add support for fence Backports commit 7b4af5ee8a1336bc39714b6de47924ee71fba761 from qemu	2018-02-26 03:18:43 -05:00
Pranith Kumar	5f10101245	tcg/mips: Add support for fence Backports commit 6f0b99104a396905870edc3049310ece29b6b8d6 from qemu	2018-02-26 03:17:34 -05:00
Pranith Kumar	e29cbe9640	tcg/arm: Add support for fence Backports commit 40f191ab8226fdada185efa49c44b60d8f494890 from qemu	2018-02-26 03:13:17 -05:00
Pranith Kumar	907060b865	tcg/aarch64: Add support for fence Backports commit c7a59c2a92592e556b9361437c9c4229917bd1e3 from qemu	2018-02-26 03:11:03 -05:00
Pranith Kumar	d49bd55f52	tcg/i386: Add support for fence Generate a 'lock orl $0,0(%esp)' instruction for ordering instead of mfence which has similar ordering semantics. Backports commit a7d00d4effb58889ac6df64f98ac50c9d1594149 from qemu	2018-02-26 03:10:58 -05:00
Pranith Kumar	5e44ce9be8	Introduce TCGOpcode for memory barrier This commit introduces the TCGOpcode for memory barrier instruction. This opcode takes an argument which is the type of memory barrier which should be generated. Backports commit f65e19bc2c9e8358e634d309606144ac2a3c2936 from qemu	2018-02-26 03:02:41 -05:00
Richard Henderson	66d79ac959	tcg: Merge GETPC and GETRA The return address argument to the softmmu template helpers was confused. In the legacy case, we wanted to indicate that there is no return address, and so passed in NULL. However, we then immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero value, indicating the presence of an (invalid) return address. Push the GETPC_ADJ subtraction down to the only point it's required: immediately before use within cpu_restore_state_from_tb, after all NULL pointer checks have been completed. This makes GETPC and GETRA identical. Remove GETRA as the lesser used macro, replacing all uses with GETPC. Backports commit 01ecaf438b1eb46abe23392c8ce5b7628b0c8cf5 from qemu	2018-02-26 02:54:44 -05:00
Richard Henderson	91f5cf0417	tcg: Support arbitrary size + alignment Previously we allowed fully unaligned operations, but not operations that are aligned but with less alignment than the operation size. In addition, arm32, ia64, mips, and sparc had been omitted from the previous overalignment patch, which would have led to that alignment being enforced. Backports commit 85aa80813dd9f5c1f581c743e45678a3bee220f8 from qemu	2018-02-26 02:47:26 -05:00
Stanislav Shmarov	5f9552657e	target-i386: Fixed syscall posssible segfault In user-mode emulation env->idt.base memory is allocated in linux-user/main.c with size 8512 = 4096 (for 64-bit). When fake interrupt EXCP_SYSCALL is thrown do_interrupt_user checks destination privilege level for this fake exception, and tries to read 4 bytes at address base + (256 2^4)=4096, that causes segfault. Privlege level was checked only for int's, so lets read dpl from memory only for this case. Backports commit 885b7c44e4f8b7a012a92770a0dba8b238662caa from qemu	2018-02-26 02:36:09 -05:00
Paolo Bonzini	d8d0d08262	target-i386: fix ordering of fields in CPUX86State Make sure reset zeroes TSC_AUX, XCR0, PKRU. Move XSTATE_BV from the "vmstate only" section to the "KVM only" section. Backports commit 7616f1c2da1c0f336a474a56ad6d32e15ccd666e from qemu	2018-02-26 02:34:22 -05:00
Ladi Prosek	7acc14da16	Remove unused function declarations Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources. Backports commit d4b84d564ee3eb7a58e4585d671fb3c220b6c3b9 from qemu	2018-02-26 02:31:46 -05:00
Thomas Huth	b581d4033f	tcg: Remove duplicate header includes host-utils.h and timer.h are included twice in tcg.c. One time should be enough. Backports commit 347519eb9d68303a6c23a7663c0fa6c20a225191 from qemu	2018-02-26 02:29:38 -05:00
Lioncash	1ff9724b46	cutils: Remove unused vector ifdef block	2018-02-26 02:28:50 -05:00
Andrew Dutcher	26b36e5ff8	fpu: add mechanism to check for invalid long double formats All operations that take a floatx80 as an operand need to have their inputs checked for malformed encodings. In all of these cases, use the function floatx80_invalid_encoding to perform the check. If an invalid operand is found, raise an invalid operation exception, and then return either NaN (for fp-typed results) or the integer indefinite value (the minimum representable signed integer value, for int-typed results). For the non-quiet comparison operations, this touches adjacent code in order to pass style checks. Backports cast correction portion of commit d1eb8f2acba579830cf3798c3c15ce51be852c56m from qemu	2018-02-26 02:27:40 -05:00
Pranith Kumar	9e6fec8741	atomics: Use __atomic__n() variant primitives Use the __atomic__n() primitives which take the value as argument. It is not necessary to store the value locally before calling the primitive, hence saving us a stack store and load. Backports commit 89943de17c4e276f2c47f05b4604e8816a6a636c from qemu	2018-02-26 02:16:48 -05:00
Fam Zheng	1a2c30abbf	rules.mak: Don't extract libs from .mo-libs in link command For module build, .mo objects are passed to LINK and consumed in process-archive-undefs. The reason behind that is documented in the comment above process-archive-undefs. Similarly, extract-libs should be called with .mo filtered out too. Otherwise, the .mo-libs are added to the link command incorrectly, spoiling the purpose of modularization. Currently we don't have any .mo-libs usage, but it will be used soon when we modularize more multi-source objects, like sdl and gtk. Backports commit 5b1b6dbd94e2e2e98920f886cb32fcf4a1520b50 from qemu	2018-02-26 02:08:03 -05:00
Sergey Fedorov	58ff618708	tcg: rename tb_find_physical() In fact, this function does not exactly perform a lookup by physical address as it is descibed for comment on get_page_addr_code(). Thus it may be a bit confusing to have "physical" in it's name. So rename it to tb_htable_lookup() to better reflect its actual functionality. Backports commit b34de45fc40d01c14b31d3a682e284180a2ed8c5 from qemu	2018-02-26 02:07:06 -05:00
Sergey Fedorov	ab0c87bc6f	tcg: Merge tb_find_slow() and tb_find_fast() These functions are not too big and can be merged together. This makes locking scheme more clear and easier to follow. Backports commit bd2710d5da06ad7706d4864f65b3f0c9f7cb4d7f from qemu	2018-02-26 02:05:19 -05:00
Sergey Fedorov	9b6f287488	tcg: Avoid bouncing tb_lock between tb_gen_code() and tb_add_jump() Backports commit 74d356dd48b64eaa2a6104ac1493ca64cb31fa16 from qemu	2018-02-26 02:01:40 -05:00
Alex Bennée	09c3ef656e	tcg: cpu-exec: remove tb_lock from the hot-path Lock contention in the hot path of moving between existing patched TranslationBlocks is the main drag in multithreaded performance. This patch pushes the tb_lock() usage down to the two places that really need it: - code generation (tb_gen_code) - jump patching (tb_add_jump) The rest of the code doesn't really need to hold a lock as it is either using per-CPU structures, atomically updated or designed to be used in concurrent read situations (qht_lookup). To keep things simple I removed the #ifdef CONFIG_USER_ONLY stuff as the locks become NOPs anyway until the MTTCG work is completed. Backports commit 518615c6503ad78d3bb67ddf1cd848c4a41de02e from qemu	2018-02-26 01:58:33 -05:00
Alex Bennée	62aa0abd02	tcg: set up tb->page_addr before insertion This ensures that if we find the TB on the slow path that tb->page_addr is correctly set before being tested. Backports commit 2e1ae44a4f4a6149fbb9dc812243522f07284700 from qemu	2018-02-26 01:50:04 -05:00
Paolo Bonzini	30845ae475	tcg: Prepare TB invalidation for lockless TB lookup When invalidating a translation block, set an invalid flag into the TranslationBlock structure first. It is also necessary to check whether the target TB is still valid after acquiring 'tb_lock' but before calling tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in future. Note that we don't have to check 'last_tb'; an already invalidated TB will not be executed anyway and it is thus safe to patch it. Backports commit 6d21e4208f382dd8ca1f7995a6dd9ea7ca281163 from qemu	2018-02-26 01:48:13 -05:00
Sergey Fedorov	c0dda5fbe9	tcg: Prepare safe access to tb_flushed out of tb_lock Ensure atomicity and ordering of CPU's 'tb_flushed' access for future translation block lookup out of 'tb_lock'. This field can only be touched from another thread by tb_flush() in user mode emulation. So the only access to be sequential atomic is: * a single write in tb_flush(); * reads/writes out of 'tb_lock'. In future, before enabling MTTCG in system mode, tb_flush() must be safe and this field becomes unnecessary. Backports commit 118b07308a8cedc16ef63d7ab243a95f1701db40 from qemu	2018-02-25 23:33:58 -05:00
Sergey Fedorov	9eb02a540d	tcg: Prepare safe tb_jmp_cache lookup out of tb_lock Ensure atomicity of CPU's 'tb_jmp_cache' access for future translation block lookup out of 'tb_lock'. Note that this patch does not make CPU's TLB invalidation safe if it is done from some other thread while the CPU is in its execution loop. Backports commit 89a16b1e4294e3664667a151c2f70c84dfac6fd9 from qemu	2018-02-25 23:29:18 -05:00
Sergey Fedorov	371101a184	tcg: Pass last_tb by value to tb_find_fast() This is a small clean up. tb_find_fast() is a final consumer of this variable so no need to pass it by reference. 'last_tb' is always updated by subsequent cpu_loop_exec_tb() in cpu_exec(). This change also simplifies calling cpu_exec_nocache() in cpu_handle_exception(). Backports commit 4b7e69509df2fcbfdab8c62c294dbfcfdab8a6e1 from qemu	2018-02-25 23:23:22 -05:00
Cao jin	cc45b82472	timer/cpus: fix some typos and update some comments Backports commit 3224e8786fcbe531746f1530c37210c425625213 from qemu	2018-02-25 23:21:57 -05:00
Paolo Bonzini	57fff7a94b	target-m68k: fix get_mac_extf helper val is assigned twice; the second one should be combined with "\|". Reported by Coverity. Backports commit 5ce747cfac697f61668ab4fa4a71c1dba15cc272 from qemu	2018-02-25 23:21:05 -05:00
Thomas Huth	aed5df31b7	sparc: Use g_memdup() instead of g_new0() + memcpy() There is no need to make sure that the memory is zeroed after the allocation if we also immediatly fill the whole buffer afterwards with memcpy(). Thus g_new0 should be g_new instead. But since we are also doing a memcpy() here, we can also simply replace both with g_memdup() instead. Backports commit a337f295defad7eb977da4d6317cf70f7f2fa4b4 from qemu	2018-02-25 23:19:44 -05:00
Peter Maydell	eb77f61bea	configure: Always compile with -fwrapv QEMU's code relies on left shifts of signed integers always being defined behaviour with the obvious 2s-complement semantics. The only way to tell the compiler (and any associated undefined-behaviour sanitizer) that we require a C dialect with these semantics is to use the -fwrapv option. This is a bit of a heavy hammer for the job as it also gives us guaranteed semantics on integer arithmetic overflow which in theory we don't require. In an ideal world this would allow us to drop the warning flag -Wno-shift-negative-value, but we must retain this to avoid spurious warnings on clang versions predating the fix to https://llvm.org/bugs/show_bug.cgi?id=25552. Backports commit 2d31515bc0880a1cea86ce638d2a109f4f4e6f7d from qemu	2018-02-25 23:17:41 -05:00
Longpeng(Mike)	8b5400d675	target-i386: present virtual L3 cache info for vcpus Some software algorithms are based on the hardware's cache info, for example, for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger a resched IPI and told cpu2 to do the wakeup if they don't share low level cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc. The relevant linux-kernel code as bellow: static void ttwu_queue(struct task_struct p, int cpu) { struct rq rq = cpu_rq(cpu); ...... if (... && !cpus_share_cache(smp_processor_id(), cpu)) { ...... ttwu_queue_remote(p, cpu); /* will trigger RES IPI / return; } ...... ttwu_do_activate(rq, p, 0); / access target's rq directly / ...... } In real hardware, the cpus on the same socket share L3 cache, so one won't trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs under some workloads even if the virtual cpus belongs to the same virtual socket. For KVM, there will be lots of vmexit due to guest send IPIs. The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates) and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during the period: No-L3 With-L3(applied this patch) cpu0: 363890 44582 cpu1: 373405 43109 cpu2: 340783 43797 cpu3: 333854 43409 cpu4: 327170 40038 cpu5: 325491 39922 cpu6: 319129 42391 cpu7: 306480 41035 cpu8: 161139 32188 cpu9: 164649 31024 cpu10: 149823 30398 cpu11: 149823 32455 cpu12: 164830 35143 cpu13: 172269 35805 cpu14: 179979 33898 cpu15: 194505 32754 avg: 268963.6 40129.8 The VM's topology is "1socket 8cores 2threads". After present virtual L3 cache info for VM, the amounts of RES IPIs in guest reduce 85%. For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause severe performance degradation. We had tested the overall system performance if vcpus actually run on sparate physical socket. With L3 cache, the performance improves 7.2%~33.1%(avg:15.7%). Backports commit 14c985cffa6cb177fc01a163d8bcf227c104718c from qemu	2018-02-25 23:16:14 -05:00
Lioncash	2d87095858	glib_compat: Amend header guard	2018-02-25 23:12:20 -05:00

1 2 3 4 5 ...

3298 Commits