Commit Graph

336 Commits

Author SHA1 Message Date
bunnei
ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei
daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
Subv
c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi
63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
bunnei
cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv
bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27
a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
FernandoS27
3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27
2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27
e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
Markus Wick
c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei
49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei
e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27
00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei
223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
FernandoS27
073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
FernandoS27
82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv
fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei
fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei
77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
FernandoS27
e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick
d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
bunnei
325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei
89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei
177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei
9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec
60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec
2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec
6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec
b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec
948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec
ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman
f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash
4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
Hexagon12
d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku
915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei
d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me
a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei
b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei
2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei
be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
Lioncash
20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku
36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
tech4me
ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei
2a472ff54d maxwell_3d: Update to include additional stencil registers. 2018-08-23 11:08:47 -04:00
Laku
8e8326595f implement lop3 2018-08-22 10:09:44 +03:00
bunnei
125d7122ac
Merge pull request #1124 from Subv/logic_ops
GPU: Implemented logic ops.
2018-08-22 01:05:25 -04:00
Lioncash
a0e2bd85a5 shader_bytecode: Parenthesize conditional expression within GetTextureType()
Resolves a -Wlogical-op-parentheses warning.
2018-08-21 15:08:35 -04:00
bunnei
2ae88feea7 shader_bytecode: Replace some UNIMPLEMENTED logs. 2018-08-20 21:53:49 -04:00
Subv
6bcdf37d4f GPU: Added registers for the logicop functionality. 2018-08-20 18:42:36 -05:00
bunnei
028d90eb79
Merge pull request #1104 from Subv/instanced_arrays
GLRasterizer: Implemented instanced vertex arrays.
2018-08-20 14:32:50 -04:00
bunnei
b20ed93884
Merge pull request #1112 from Subv/sampler_types
Shaders: Use the correct shader type when sampling textures.
2018-08-20 14:30:45 -04:00
bunnei
51ddb130c5
Merge pull request #1089 from Subv/neg_bits
Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
2018-08-19 17:01:48 -04:00
Subv
f7edbcd7a3 Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 14:00:12 -05:00
Subv
73b937b190 Shader: Added bitfields for the texture type of the various sampling instructions. 2018-08-19 12:57:51 -05:00
Subv
656758fd81 Shaders: Added decodings for TLD4 and TLD4S 2018-08-19 12:57:08 -05:00
bunnei
29d4f8c2dd
Merge pull request #1109 from Subv/ldg_decode
Shaders: Added decodings for  the LDG and STG instructions.
2018-08-19 13:31:19 -04:00
bunnei
9baf5de90c
Merge pull request #1108 from Subv/front_facing
Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
2018-08-19 13:21:14 -04:00
Subv
1b92ae136f Shaders: Added decodings for the LDG and STG instructions. 2018-08-19 00:46:34 -05:00
Subv
731701a2d2 Shaders: Implemented the gl_FrontFacing input attribute (attr 63). 2018-08-19 00:14:34 -05:00
Subv
e0f66c1fbf GLRasterizer: Implemented instanced vertex arrays.
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
2018-08-18 14:42:26 -05:00
Subv
8335b2f115 Shader: Implemented the predicate and mode arguments of LOP.
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)).

This is used by Super Mario Odyssey.
2018-08-18 14:36:37 -05:00
Subv
2e95ba2e9c Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
We should definitely audit our shader generator for more errors like this.
2018-08-18 10:22:42 -05:00
David Marcec
63dff47e22 Added predcondition GreaterThanWithNan 2018-08-18 17:49:59 +10:00
Subv
c5284efd4f Rasterizer: Implemented instanced rendering.
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are.

Instanced vertex arrays are not yet implemented.
2018-08-14 22:25:07 -05:00
bunnei
534abf9d97 gl_shader_decompiler: Implement XMAD instruction. 2018-08-12 18:30:24 -04:00
bunnei
f2c7b5dcd6
Merge pull request #1024 from Subv/blend_gl
GPU/Maxwell3D: Implemented an alternative set of blend factors.
2018-08-11 22:39:02 -04:00
Subv
969326bd58 GPU/Maxwell3D: Implemented an alternative set of blend factors.
These are used by nouveau and some games like SMO.
2018-08-11 20:57:16 -05:00
Subv
2dad1204e8 RasterizerGL: Ignore invalid/unset vertex attributes.
This should make the es2gears example not crash anymore.
2018-08-11 20:36:40 -05:00
bunnei
403dfd68fc
Merge pull request #1010 from bunnei/unk-vert-attrib-shader
gl_shader_decompiler: Improve handling of unknown input/output attributes.
2018-08-11 19:56:28 -04:00
bunnei
0b668d5ff3 gl_shader_decompiler: Improve handling of unknown input/output attributes. 2018-08-11 19:26:45 -04:00
bunnei
670a2c1f80
Merge pull request #1018 from Subv/ssy_sync
GPU/Shader: Implemented SSY and SYNC as a set_target/jump pair.
2018-08-11 19:10:02 -04:00
Subv
c1ad973881 GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY). 2018-08-11 16:00:14 -05:00
Lioncash
b8c43b6080 video_core: Use variable template variants of type_traits interfaces where applicable 2018-08-09 20:45:48 -04:00
bunnei
efe6b473c5 maxwell_3d: Ignore macros that have not been uploaded yet.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:25:37 -04:00
bunnei
25ba4d1b68
Merge pull request #982 from bunnei/stub-unk-63
gl_shader_decompiler: Stub input attribute Unknown_63.
2018-08-08 22:28:18 -04:00
bunnei
cf917a5e93
Merge pull request #976 from bunnei/shader-imm
gl_shader_decompiler: Let OpenGL interpret floats.
2018-08-08 19:17:01 -04:00
bunnei
7f0d0a93f7 gl_shader_decompiler: Stub input attribute Unknown_63. 2018-08-08 02:35:59 -04:00
bunnei
57982df105 maxwell_3d: Use correct const buffer size and check bounds.
- Fixes mem corruption with Super Mario Odyssey and Pokkén Tournament DX.
2018-08-08 02:10:25 -04:00
bunnei
e542356d0c gl_shader_decompiler: Let OpenGL interpret floats.
- Accuracy is lost in translation to string, e.g. with NaN.
- Needed for Super Mario Odyssey.
2018-08-08 01:45:23 -04:00
bunnei
904d7eaa94 maxwell_3d: Remove outdated assert. 2018-08-05 23:57:19 -04:00
Lioncash
6030c5ce41 video_core: Eliminate the g_renderer global variable
We move the initialization of the renderer to the core class, while
keeping the creation of it and any other specifics in video_core. This
way we can ensure that the renderer is initialized and doesn't give
unfettered access to the renderer. This also makes dependencies on types
more explicit.

For example, the GPU class doesn't need to depend on the
existence of a renderer, it only needs to care about whether or not it
has a rasterizer, but since it was accessing the global variable, it was
also making the renderer a part of its dependency chain. By adjusting
the interface, we can get rid of this dependency.
2018-08-04 02:36:57 -04:00
Subv
8f2c4191ab GPU: Remove the assert that required the CODE_ADDRESS to be 0.
Games usually just leave it at 0 but nouveau sets it to something else. This already works fine, the assert is useless.
2018-07-24 13:54:12 -05:00
bunnei
148a5bef7e shader_bytecode: Implement other TEXS masks. 2018-07-22 03:23:15 -04:00
bunnei
c43eaa94f3 gl_shader_decompiler: Implement SEL instruction. 2018-07-22 00:37:12 -04:00
bunnei
5287991a36 maxwell_3d: Add depth buffer enable, width, and height registers. 2018-07-21 21:51:05 -04:00
Lioncash
bb960c8cb4 video_core: Use nested namespaces where applicable
Compresses a few namespace specifiers to be more compact.
2018-07-20 18:23:54 -04:00
Lioncash
8b08f82dc7 maxwell_3d: Remove unused variable within GetStageTextures() 2018-07-19 22:38:28 -04:00
Subv
3d3b10adc7 GPU: Added register definitions for the stencil parameters. 2018-07-17 15:00:21 -05:00
bunnei
8aeff9cf8e gl_rasterizer: Fix check for if a shader stage is enabled. 2018-07-12 22:57:57 -04:00
bunnei
64b5e5d5d9
Merge pull request #655 from bunnei/pred-lt-nan
gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
2018-07-12 18:59:15 -07:00
bunnei
49c0c081c4 gl_shader_decompiler: Implement PredCondition::LessThanWithNan. 2018-07-12 20:04:35 -04:00
bunnei
4757ffdcce gl_shader_decompiler: Use FlowCondition field in EXIT instruction. 2018-07-12 20:00:37 -04:00
Sebastian Valle
274d1fb0fc
Merge pull request #652 from Subv/fadd32i
GPU: Implement the FADD32I shader instruction.
2018-07-12 17:36:51 -05:00
bunnei
3ff21345b4
Merge pull request #651 from Subv/ffma_decode
GPU: Corrected the decoding of FFMA for immediate operands.
2018-07-12 12:42:58 -07:00
Subv
c1ae841f47 GPU: Implement the FADD32I shader instruction. 2018-07-12 12:00:31 -05:00
Subv
0cad310e12 GPU: Corrected the decoding of FFMA for immediate operands. 2018-07-12 10:15:48 -05:00
bunnei
639346bcfb
Merge pull request #625 from Subv/imnmx
GPU: Implemented the IMNMX shader instruction.
2018-07-07 19:33:50 -07:00
bunnei
51bd76a5fd
Merge pull request #629 from Subv/depth_test
GPU: Allow using the old NV04 values for the depth test function.
2018-07-05 16:43:10 -04:00
Subv
9f6a5660e8 GPU: Allow using the old NV04 values for the depth test function.
These seem to be just a valid as the GL token values. Thanks @ReinUsesLisp

This restores graphical output to Disgaea 5
2018-07-05 13:01:31 -05:00