Commit Graph

2052 Commits

Author SHA1 Message Date
raven02
b91f7d5d67 Add 1D sampler for TLDS - TexelFetch (Mario Rabbids) 2018-09-17 23:25:18 +08:00
bunnei
076add4ccd
Merge pull request #1326 from FearlessTobi/port-4182
Port #4182 from Citra: "Prefix all size_t with std::"
2018-09-17 09:51:47 -04:00
bunnei
3be048e50a
Merge pull request #1329 from raven02/bgr5a1u
Implement RenderTargetFormat::BGR5A1_UNORM
2018-09-17 09:49:00 -04:00
raven02
2845348608 Implement ASTC_2D_8X8 (Bayonetta 2) 2018-09-17 01:04:27 +08:00
bunnei
ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei
daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
raven02
0019a36b41 Implement RenderTargetFormat::BGR5A1_UNORM (Pokken Tournament DX) 2018-09-16 00:21:42 +08:00
Subv
c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi
63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
FernandoS27
f8e994354f Optimized Texture Swizzling 2018-09-14 12:45:49 -04:00
Lioncash
ae128f0375 gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
These variables are already defined within an outer scope.
2018-09-13 21:53:23 -04:00
ReinUsesLisp
a42376dfad Use ARB_multi_bind for uniform buffers (#1287)
* gl_rasterizer: use ARB_multi_bind for uniform buffers

* address feedback
2018-09-12 20:27:43 -04:00
bunnei
4a43fb7e1d gl_rasterizer_cache: B5G6R5U should use GL_RGB8 as an internal format.
- Fixes a regression with Sonic Mania with ARB_texture_storage.
2018-09-12 18:09:31 -04:00
bunnei
cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv
bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27
a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
bunnei
89825766ee
Merge pull request #1278 from tech4me/bg-color-fix
Port Citra #4047 & #4052: add change background color support
2018-09-11 23:13:11 -04:00
bunnei
522a11a11f
Merge pull request #1295 from bunnei/accurate-copies
gl_rasterizer_cache: Improve accuracy of caching and copies.
2018-09-11 23:12:15 -04:00
bunnei
4a9acc87f9
Merge pull request #1294 from degasus/optimizations
gl_rasterizer: Use ARB_texture_storage.
2018-09-11 23:11:36 -04:00
bunnei
7bb226f22d gl_rasterizer_cache: Always blit on recreate, regardless of format.
- Fixes several rendering issues with Super Mario Odyssey.
2018-09-11 22:54:46 -04:00
bunnei
cdddd71d08 gl_shader_cache: Remove cache_width/cache_height.
- This was once an optimization, but we no longer need it with the cache reserve.
- This is also inaccurate.
2018-09-11 20:12:29 -04:00
Markus Wick
3e973bc4c6 gl_rasterizer: Use ARB_texture_storage.
It allows us to use texture views and it reduces the overhead within the GPU driver.

But it disallows us to reallocate the texture, but we don't do so anyways.

In the end, it is the new way to allocate textures, so there is no need to use the old way.
2018-09-11 22:18:46 +02:00
FernandoS27
5c676dc884 Implemented LEA and PSET 2018-09-11 12:50:52 -04:00
FernandoS27
3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27
2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27
e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
David Marcec
4c3bd33be2 Fixed renderdoc input/output textures not working due to render targets 2018-09-11 13:31:20 +10:00
bunnei
d6e8e16a66
Merge pull request #1286 from bunnei/multi-clear
gl_rasterizer: Implement clear for non-zero render targets.
2018-09-10 20:30:14 -04:00
bunnei
12445b476d
Merge pull request #1285 from bunnei/depth-fix
gl_rasterizer_cache: Only use depth for applicable texture formats.
2018-09-10 20:28:40 -04:00
bunnei
d884e805c5
Merge pull request #1284 from bunnei/bgra8_srgb
gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
2018-09-10 20:28:00 -04:00
Markus Wick
c1b8cd9058 video_core: Refactor command_processor.
Inline the WriteReg helper as it is called ~20k times per frame.
2018-09-10 22:06:16 +02:00
Markus Wick
0cfb0bacb2 video_core: Move command buffer loop.
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
2018-09-10 22:06:13 +02:00
Markus Wick
c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei
4c0b1cc1ae gl_rasterizer_cache: Only use depth for applicable texture formats.
- Fixes an issue with Octopath Traveler leaving stale data here.
2018-09-10 00:50:38 -04:00
bunnei
035e6bd407 gl_rasterizer: Implement clear for non-zero render targets.
- Several misc. changes to ConfigureFramebuffers in support of this.
2018-09-10 00:41:20 -04:00
bunnei
1c34498368 gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
- Used by Octopath Traveler (with multiple render targets).
2018-09-10 00:37:52 -04:00
bunnei
49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei
e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27
00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei
223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
bunnei
fcf81147e7
Merge pull request #1280 from zero334/improvements
video_core: fixed arithmetic overflow warnings & improved code style
2018-09-09 19:51:46 -04:00
FernandoS27
073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
Patrick Elsässer
64e45b04e0 video_core: fixed arithmetic overflow warnings & improved code style
- Fixed all warnings, for renderer_opengl items, which were indicating a
possible incorrect behavior from integral promotion rules and types
larger than those in which arithmetic is typically performed.
- Added const for variables where possible and meaningful.
- Added constexpr where possible.
2018-09-09 17:51:43 +02:00
tech4me
3dcedb36b4 Port Citra #4047 & #4052: add change background color support 2018-09-08 17:00:21 -07:00
FernandoS27
82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv
fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei
af074ee422
Merge pull request #1256 from bunnei/tex-target-support
Initial support for non-2D textures
2018-09-08 16:14:46 -04:00
bunnei
8b08cb925b gl_rasterizer: Use baseInstance instead of moving the buffer points.
This hopefully helps our cache not to redundant upload the vertex buffer.

# Conflicts:
#	src/video_core/renderer_opengl/gl_rasterizer.cpp
2018-09-08 04:05:56 -04:00
Patrick Elsässer
a8974f0556 video_core: Arithmetic overflow warning fix for gl_rasterizer (#1262)
* video_core: Arithmetic overflow fix for gl_rasterizer

- Fixed warnings, which were indicating incorrect behavior from integral
promotion rules and types larger than those in which arithmetic is
typically performed.

- Added const for variables where possible and meaningful.

* Changed the casts from C to C++ style

Changed the C-style casts to C++ casts as proposed.
Took also care about signed / unsigned behaviour.
2018-09-08 02:59:59 -04:00
bunnei
23ae7cf9db gl_rasterizer_cache: Improve accuracy of RecreateSurface for non-2D textures. 2018-09-08 02:53:39 -04:00
bunnei
fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei
f165a85398 gl_rasterizer_cache: Partially implement several non-2D texture types. 2018-09-08 02:53:38 -04:00
bunnei
0731383124 gl_shader_decompiler: Partially implement several non-2D texture types (Subv). 2018-09-08 02:53:38 -04:00
bunnei
05f6f59ffb gl_rasterizer: Implement texture wrap mode p. 2018-09-08 02:53:38 -04:00
bunnei
ce8291f6c5 gl_rasterizer_cache: Track texture depth. 2018-09-08 02:53:38 -04:00
bunnei
9dccf7e1fa gl_rasterizer_cache: Remove impl. of FlushGLBuffer.
- Will not work for non-2d textures, and was not used anyways.
2018-09-08 02:53:37 -04:00
bunnei
030676b95d gl_rasterizer_cache: Keep track of texture type per surface. 2018-09-08 02:53:37 -04:00
bunnei
a439f7b6e1 gl_rasterizer_cache: Remove unused DownloadGLTexture. 2018-09-08 02:53:37 -04:00
bunnei
b56e5edafc gl_state: Keep track of texture target. 2018-09-08 02:53:37 -04:00
bunnei
9947c6ad59
Merge pull request #1252 from lioncash/header
video_core/CMakeLists: Add missing gl_buffer_cache.h
2018-09-06 19:19:43 -04:00
bunnei
9b50dca2bb
Merge pull request #1253 from lioncash/explicit
video_core/gl_buffer_cache: Minor tidying changes
2018-09-06 19:19:35 -04:00
bunnei
009a2cc9cc
Merge pull request #1255 from bunnei/minor-opt
gl_rasterizer: Call state.Apply only once on SetupShaders.
2018-09-06 19:19:16 -04:00
bunnei
820f646458 gl_rasterizer: Call state.Apply only once on SetupShaders. 2018-09-06 17:41:53 -04:00
bunnei
948f6c0738 gl_shader_decompiler: Implement saturate mode for IPA. 2018-09-06 17:40:03 -04:00
Lioncash
ddcdbce067 gl_buffer_cache: Default initialize member variables
Ensures that the cache always has a deterministic initial state.
2018-09-06 15:07:15 -04:00
Lioncash
8d685a29bc gl_buffer_cache: Make GetHandle() a const member function
GetHandle() internally calls GetHandle() on the stream_buffer instance,
which is a const member function, so this can be made const as well.
2018-09-06 15:07:15 -04:00
Lioncash
14230fe2af gl_buffer_cache: Remove unnecessary includes 2018-09-06 15:05:52 -04:00
Lioncash
68296d9474 gl_buffer_cache: Make constructor explicit
Implicit conversions during construction isn't desirable here.
2018-09-06 14:54:49 -04:00
Lioncash
8f4e09ba07 video_core/CMakeLists: Add missing gl_buffer_cache.h
Without this, the header file won't show up by default within IDEs such
as Visual Studio.
2018-09-06 14:49:51 -04:00
Markus Wick
a781042700 gl_shader_gen: Initialize position.
IMO the old code is fine, but nvidia raises shader compiler warnings.
Trivial fix through...
2018-09-06 13:37:50 +02:00
bunnei
77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
bunnei
6f09c5b128
Merge pull request #1244 from FernandoS27/ipa
shader_decompiler: Implemented IPA Properly (Stage 1)
2018-09-05 21:20:40 -04:00
FernandoS27
e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick
7f15306f78 gl_rasterizer: Skip TODO log.
This is called ~3k times per frame in SMO ingame.
My laptop spends ~3ms per frame on allocating and freeing this string.

Let's just stop printing this kind of redundant information.
2018-09-05 20:20:20 +02:00
Markus Wick
d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
Markus Wick
50a806ea67 renderer_opengl: Implement a buffer cache.
The idea of this cache is to avoid redundant uploads. So we are going
to cache the uploaded buffers within the stream_buffer and just reuse
the old pointers.
The next step is to implement a VBO cache on GPU memory, but for now,
I want to check the overhead of the cache management. Fetching the
buffer over PCI-E should be quite fast.
2018-09-05 08:03:50 +02:00
Markus Wick
99a71580c4 gl_shader_cache: Use an u32 for the binding point cache.
The std::string generation with its malloc and free requirement
was a noticeable overhead. Also switch to an ordered_map to
avoid the std::hash call. As those maps usually have a size of
two elements, the lookup time shall not matter.
2018-09-04 21:04:41 +02:00
bunnei
9a07e9f805
Merge pull request #1237 from degasus/optimizations
Optimizations
2018-09-04 12:16:06 -04:00
bunnei
26e96d16d0
Merge pull request #1232 from lioncash/copy
gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
2018-09-04 11:52:25 -04:00
Markus Wick
2081ed7db2 command_processor: Use std::array for bound_engines.
subchannel is a 3 bit field. So there must not be more than 8 bound engines.
And using a hashmap for up to 8 values is a bit overpowered.
2018-09-04 14:10:05 +02:00
Markus Wick
10bc725944 Update microprofile scopes.
Blame the subsystems which deserve the blame :)

The updated list is not complete, just the ones I've spotted on random sampling the stack trace.
2018-09-04 11:04:26 +02:00
Lioncash
18a89931a9 gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
Using the getter function intended for external code here makes an
unnecessary copy of the already-accessible used_shaders vector.
2018-09-02 13:10:11 -04:00
bunnei
325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei
89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei
177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei
9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec
60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec
2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec
2bc6abb9a1 Changed tab5980_0 default from 0 -> 1 2018-09-01 19:15:03 +10:00
David Marcec
6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec
b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec
948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec
ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman
f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash
4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
bunnei
7f7eb29323 gl_rasterizer_cache: Use accurate framebuffer setting for accurate copies. 2018-08-31 13:07:28 -04:00
bunnei
123c065086 gl_rasterizer_cache: Also use reserve cache for RecreateSurface. 2018-08-31 13:07:28 -04:00
bunnei
9bc71fcc5f rasterizer_cache: Use boost::interval_map for a more accurate cache. 2018-08-31 13:07:28 -04:00
bunnei
d647d9550c gl_renderer: Cache textures, framebuffers, and shaders based on CPU address. 2018-08-31 13:07:27 -04:00
bunnei
16d65182f9 gl_rasterizer: Fix issues with the rasterizer cache.
- Use a single cached page map.
- Fix calculation of ending page.
2018-08-31 13:07:27 -04:00
greggameplayer
06578e89b2 Implement BC6H_UF16 & BC6H_SF16 (#1092)
* Implement BC6H_UF16 & BC6H_SF16
Require by ARMS

* correct coding style

* correct coding style part 2
2018-08-31 12:11:19 -04:00
bunnei
f08d24e9c0
Merge pull request #1204 from lioncash/pimpl
core: Make the main System class use the PImpl idiom
2018-08-31 11:31:20 -04:00
bunnei
6683bf50b5
Merge pull request #1207 from degasus/hotfix
Report correct shader size.
2018-08-31 11:21:15 -04:00
Lioncash
e2457418da core: Make the main System class use the PImpl idiom
core.h is kind of a massive header in terms what it includes within
itself. It includes VFS utilities, kernel headers, file_sys header,
ARM-related headers, etc. This means that changing anything in the
headers included by core.h essentially requires you to rebuild almost
all of core.

Instead, we can modify the System class to use the PImpl idiom, which
allows us to move all of those headers to the cpp file and forward
declare the bulk of the types that would otherwise be included, reducing
compile times. This change specifically only performs the PImpl portion.
2018-08-31 07:16:57 -04:00
Markus Wick
5be8b7a362 Report correct shader size.
Seems like this was an oversee in regards to 1fd979f50a
It changed GLShader::ProgramCode to a std::vector, so sizeof is wrong.
2018-08-31 09:56:37 +02:00
Hexagon12
d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku
915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei
d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me
a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei
b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei
4d7e1662c8
Merge pull request #1193 from lioncash/priv
gpu: Make memory_manager private
2018-08-28 12:28:57 -04:00
bunnei
eb4f2d5596
Merge pull request #1192 from lioncash/unused
gl_rasterizer: Remove unused variables
2018-08-28 12:28:13 -04:00
Lioncash
2e7dc4cac9 gl_shader_cache: Remove unused program_code vector in GetShaderAddress()
Given std::vector is a type with a non-trivial destructor, this
variable cannot be optimized away by the compiler, even if unused.
Because of that, something that was intended to be fairly lightweight,
was actually allocating 32KB and deallocating it at the end of the
function.
2018-08-28 11:20:41 -04:00
Lioncash
45fb74d262 gpu: Make memory_manager private
Makes the class interface consistent and provides accessors for
obtaining a reference to the memory manager instance.

Given we also return references, this makes our more flimsy uses of
const apparent, given const doesn't propagate through pointers in the
way one would typically expect. This makes our mutable state more
apparent in some places.
2018-08-28 11:11:50 -04:00
Lioncash
6771a18c6c gl_rasterizer: Remove unused variables 2018-08-28 10:46:29 -04:00
bunnei
b55d8111e6 renderer_opengl: Implement a new shader cache. 2018-08-27 18:26:46 -04:00
bunnei
a0e1566dc5 gl_rasterizer_cache: Update to use RasterizerCache base class. 2018-08-27 18:26:46 -04:00
bunnei
382852418b video_core: Add RasterizerCache class for common cache management code. 2018-08-27 18:26:45 -04:00
bunnei
2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei
f96ded9815
Merge pull request #1174 from lioncash/debug
debug_utils: Minor individual interface changes
2018-08-27 15:44:29 -04:00
bunnei
be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
bunnei
23b86fd3ea
Merge pull request #1167 from lioncash/assert
gl_rasterizer: Correct assertion condition in SyncLogicOpState()
2018-08-25 10:50:59 -04:00
Lioncash
c65713832c debug_utils: Remove unused includes
Quite a bit of these aren't necessary directly within the debug_utils
header and can be removed or included where actually necessary.
2018-08-24 20:49:14 -04:00
Lioncash
1e6a209649 debug_utils: Make BreakpointObserver class' constructor explicit
Avoids implicit conversions.
2018-08-24 20:49:14 -04:00
Lioncash
b6425c0511 debug_utils: Initialize active_breakpoint member of DebugContext
Ensures that all class members are initialized.
2018-08-24 20:15:50 -04:00
Lioncash
20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku
36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
Lioncash
8fd9eb71b4 gl_rasterizer: Correct assertion condition in SyncLogicOpState()
Previously the assert would always be hit, since it was the equivalent
of: array == nullptr, which is never true.
2018-08-23 23:00:54 -04:00
tech4me
ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei
0dce6d7008
Merge pull request #1160 from bunnei/surface-reserve
gl_rasterizer_cache: Several improvements
2018-08-23 12:04:37 -04:00
bunnei
d65f079cc1 gl_rasterizer_cache: Blit when possible on RecreateSurface. 2018-08-23 11:27:01 -04:00
bunnei
fee8bdd90c gl_rasterizer_cache: Reserve surfaces that have already been created for later use. 2018-08-23 11:27:01 -04:00
bunnei
fde2017a3f gl_rasterizer_cache: Remove assert for RecreateSurface type. 2018-08-23 11:27:00 -04:00
bunnei
ebf5768340 gl_rasterizer_cache: Implement compressed texture copies. 2018-08-23 11:27:00 -04:00
bunnei
a4ac3bed6c gl_rasterizer: Implement stencil test.
- Used by Splatoon 2.
2018-08-23 11:08:49 -04:00
bunnei
da3da6be90 gl_rasterizer: Implement partial color clear and stencil clear. 2018-08-23 11:08:48 -04:00
bunnei
2a472ff54d maxwell_3d: Update to include additional stencil registers. 2018-08-23 11:08:47 -04:00
bunnei
c4ed0b16b1 gl_state: Update to handle stencil front/back face separately. 2018-08-23 11:08:46 -04:00
bunnei
c7f2fb2151
Merge pull request #1157 from lioncash/vec
gl_shader_gen: Use a std::vector to represent program code instead of std::array
2018-08-23 02:19:00 -04:00
bunnei
232b0d9d2a
Merge pull request #1156 from Lakumakkara/lop3
gl_shader_decompiler: Implement LOP3
2018-08-23 02:16:49 -04:00
Lioncash
12ba80a86c gl_shader_gen: Make ShaderSetup's constructor explicit
Prevents implicit conversions.
2018-08-22 17:04:44 -04:00
Lioncash
1fd979f50a gl_shader_gen: Use a std::vector to represent program code instead of std::array
While convenient as a std::array, it's also quite a large set of data as
well (32KB). It being an array also means data cannot be std::moved. Any
situation where the code is being set or relocated means that a full
copy of that 32KB data must be done.

If we use a std::vector we do need to allocate on the heap, however, it
does allow us to std::move the data we have within the std::vector into
another std::vector instance, eliminating the need to always copy the
program data (as std::move in this case would just transfer the pointers
and bare necessities over to the new vector instance).
2018-08-22 17:04:44 -04:00
Laku
b2ca8089ce more fixes 2018-08-23 00:01:40 +03:00
Laku
e70a3c5a5d fixes 2018-08-22 21:33:32 +03:00
Lioncash
dd35b4b18a renderer_opengl: Namespace OpenGL code
Namespaces all OpenGL code under the OpenGL namespace.

Prevents polluting the global namespace and allows clear distinction
between other renderers' code in the future.
2018-08-22 06:14:47 -04:00
Laku
4877e6c2f6 remove debug logging 2018-08-22 11:45:28 +03:00
Laku
8e8326595f implement lop3 2018-08-22 10:09:44 +03:00
bunnei
cea627b0fc
Merge pull request #840 from FearlessTobi/port-3353
Port #3353 from Citra: "citra-qt: Add customizable speed limit target "
2018-08-22 01:19:50 -04:00
bunnei
5abf71fe65
Merge pull request #1154 from OatmealDome/topology-lines
maxwell_to_gl: Implement PrimitiveTopology::Lines
2018-08-22 01:08:34 -04:00
bunnei
125d7122ac
Merge pull request #1124 from Subv/logic_ops
GPU: Implemented logic ops.
2018-08-22 01:05:25 -04:00
OatmealDome
ad1220e1b3
maxwell_to_gl: Implement PrimitiveTopology::Lines
Used by Splatoon 2's debug menu.
2018-08-22 01:01:06 -04:00
bunnei
d63b1d21f1 Revert "Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions."
- This reverts commit 3ef4b3d4b4.
- This commit had broken a lot of games. We really should do a full implementation of this in one change.
2018-08-21 20:07:40 -04:00
Lioncash
a0e2bd85a5 shader_bytecode: Parenthesize conditional expression within GetTextureType()
Resolves a -Wlogical-op-parentheses warning.
2018-08-21 15:08:35 -04:00
bunnei
bf89a99839
Merge pull request #1123 from lioncash/screen
rasterizer_interface: Remove renderer-specific ScreenInfo type from AccelerateDraw() in RasterizerInterface
2018-08-21 01:18:34 -04:00
bunnei
b0f7713fce
Merge pull request #1132 from Subv/gl_FragDepth
Shaders: Implement depth writing in fragment shaders.
2018-08-21 01:17:53 -04:00
bunnei
8c9abe1d41
Merge pull request #1134 from lioncash/log
renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logs
2018-08-21 01:17:31 -04:00
bunnei
ca58929eb0
Merge pull request #1121 from Subv/tex_reinterpret
Rasterizer: Use PBOs to reinterpret texture formats when games re-use the same memory.
2018-08-21 01:06:40 -04:00
Lioncash
523e4be02c renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logs
LOG_TRACE is only enabled on debug builds which can be quite slow when
trying to debug graphics issues. Instead we can log the messages to the
debug log, which is available on both release and debug builds.
2018-08-21 00:23:09 -04:00
bunnei
fde3b1b6f2
Merge pull request #1133 from lioncash/guard
gl_stream_buffer: Add missing header guard
2018-08-20 23:37:55 -04:00
Lioncash
93a4097e9d gl_stream_buffer: Add missing header guard
Prevents potential compilation errors from occuring due to multiple
inclusions
2018-08-20 23:25:08 -04:00
Subv
e3bddf8137 Shaders: Implement depth writing in fragment shaders.
We'll write <last color output reg + 2> to gl_FragDepth.
2018-08-20 21:57:56 -05:00
bunnei
e33452f7e8
Merge pull request #1131 from bunnei/impl-tex3d-texcube
gl_shader_decompiler: Implement TextureCube/Texture3D for TEX/TEXS.
2018-08-20 22:15:18 -04:00
bunnei
5aaee2ff8d
Merge pull request #1106 from Subv/multiple_rendertargets
Shaders: Write all the enabled color outputs when a fragment shader exits.
2018-08-20 21:56:06 -04:00
bunnei
2ae88feea7 shader_bytecode: Replace some UNIMPLEMENTED logs. 2018-08-20 21:53:49 -04:00
bunnei
16db8b9d9f gl_shader_decompiler: Implement Texture3D for TEXS. 2018-08-20 21:53:18 -04:00
bunnei
948002635f gl_shader_decompiler: Implement TextureCube for TEX. 2018-08-20 21:53:00 -04:00
Subv
eac3cf301c Shaders: Fixed the coords in TEX with Texture2D.
The X and Y coordinates should be in gpr8 and gpr8+1, respectively.

This fixes the cutscene rendering in Sonic Mania.
2018-08-20 20:45:46 -05:00
Subv
fc5b489b0f Shaders: Log and crash when using an unimplemented texture type in a texture sampling instruction. 2018-08-20 20:44:56 -05:00
Subv
2b9eee4d1e GPU: Implemented the logic op functionality of the GPU.
This will ASSERT if blending is enabled at the same time as logic ops.
2018-08-20 18:44:47 -05:00
Subv
f24ab6d9e6 GLState: Allow enabling/disabling GL_COLOR_LOGIC_OP independently from blending. 2018-08-20 18:43:11 -05:00
Lioncash
46ef072cf9 rasterizer_interface: Remove ScreenInfo from AccelerateDraw()'s signature
This is an OpenGL renderer-specific data type. Given that, this type
shouldn't be used within the base interface for the rasterizer. Instead,
we can pass this information to the rasterizer via reference.
2018-08-20 19:43:05 -04:00
Subv
6bcdf37d4f GPU: Added registers for the logicop functionality. 2018-08-20 18:42:36 -05:00
Lioncash
bc16f7f3cc renderer_base: Make creation of the rasterizer, the responsibility of the renderers themselves
Given we use a base-class type within the renderer for the rasterizer
(RasterizerInterface), we want to allow renderers to perform more
complex initialization if they need to do such a thing. This makes it
important to reserve type information.

Given the OpenGL renderer is quite simple settings-wise, this is just a
simple shuffling of the initialization code. For something like Vulkan
however this might involve doing something like:

// Initialize and call rasterizer-specific function that requires
// the full type of the instance created.
auto raster = std::make_unique<VulkanRasterizer>(some, params);
raster->CallSomeVulkanRasterizerSpecificFunction();

// Assign to base class variable
rasterizer = std::move(raster)
2018-08-20 19:28:00 -04:00
fearlessTobi
ba8ff096fd Port #3353 from Citra 2018-08-21 01:14:06 +02:00
Subv
7784ce1854 Shaders: Write all the enabled color outputs when a fragment shader exits.
We were only writing to the first render target before.
Note that this is only the GLSL side of the implementation, supporting multiple render targets requires more changes in the OpenGL renderer.

Dual Source blending is not implemented and stuff that uses it might not work at all.
2018-08-20 17:31:25 -05:00
Subv
d7c68fbb12 Rasterizer: Reinterpret the raw texture bytes instead of blitting (and thus doing format conversion) to a new texture when a game requests an old texture address with a different format. 2018-08-20 15:20:35 -05:00
Subv
3fe77be392 Rasterizer: Don't attempt to copy over the old texture's data when doing a format reinterpretation if we're only going to clear the framebuffer. 2018-08-20 15:20:35 -05:00
bunnei
028d90eb79
Merge pull request #1104 from Subv/instanced_arrays
GLRasterizer: Implemented instanced vertex arrays.
2018-08-20 14:32:50 -04:00
bunnei
296e57fa0e
Merge pull request #1115 from Subv/texs_mask
Shaders/TEXS: Write to the correct output register when swizzling.
2018-08-20 14:31:33 -04:00
bunnei
b20ed93884
Merge pull request #1112 from Subv/sampler_types
Shaders: Use the correct shader type when sampling textures.
2018-08-20 14:30:45 -04:00
David Marcec
23d45715dc Implemented RGBA8_UINT
Needed by kirby
2018-08-20 22:26:54 +10:00
Subv
6cf719a4ab Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 17:09:40 -05:00
bunnei
51ddb130c5
Merge pull request #1089 from Subv/neg_bits
Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
2018-08-19 17:01:48 -04:00
bunnei
9b17486be6
Merge pull request #1105 from Subv/convert_neg
Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions.
2018-08-19 17:01:20 -04:00
bunnei
0a1d4fbc5c
Merge pull request #1113 from Subv/texs_mask
Shaders/TEXS: Fixed the component mask in the TEXS instruction.
2018-08-19 17:00:59 -04:00
Subv
f7edbcd7a3 Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 14:00:12 -05:00
bunnei
b0eb580931
Merge pull request #1102 from ogniK5377/mirror-clamp-edge
Added WrapMode MirrorOnceClampToEdge
2018-08-19 13:59:41 -04:00
bunnei
85da529f15
Merge pull request #1101 from Subv/ssy_stack
Shaders: Implemented a stack for the SSY/SYNC instructions.
2018-08-19 13:58:45 -04:00
Subv
7fb406c3fc Shader: Implemented the TLD4 and TLD4S opcodes using GLSL's textureGather.
It is unknown how TLD4S determines the sampler type, more research is needed.
2018-08-19 12:57:58 -05:00
Subv
3ef4b3d4b4 Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions.
Different sampler types have their parameters in different registers.
2018-08-19 12:57:54 -05:00
Subv
73b937b190 Shader: Added bitfields for the texture type of the various sampling instructions. 2018-08-19 12:57:51 -05:00
Subv
656758fd81 Shaders: Added decodings for TLD4 and TLD4S 2018-08-19 12:57:08 -05:00
bunnei
29d4f8c2dd
Merge pull request #1109 from Subv/ldg_decode
Shaders: Added decodings for  the LDG and STG instructions.
2018-08-19 13:31:19 -04:00
bunnei
9baf5de90c
Merge pull request #1108 from Subv/front_facing
Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
2018-08-19 13:21:14 -04:00
Subv
1b92ae136f Shaders: Added decodings for the LDG and STG instructions. 2018-08-19 00:46:34 -05:00
Subv
731701a2d2 Shaders: Implemented the gl_FrontFacing input attribute (attr 63). 2018-08-19 00:14:34 -05:00
Subv
9b1c49a9cf Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions. 2018-08-18 14:48:05 -05:00
Subv
e0f66c1fbf GLRasterizer: Implemented instanced vertex arrays.
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
2018-08-18 14:42:26 -05:00
Subv
8335b2f115 Shader: Implemented the predicate and mode arguments of LOP.
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)).

This is used by Super Mario Odyssey.
2018-08-18 14:36:37 -05:00
David Marcec
71cc482bbd Added WrapMode MirrorOnceClampToEdge
Used by splatoon 2
2018-08-19 02:26:50 +10:00
Subv
ff358d97e8 Shaders: Implemented a stack for the SSY/SYNC instructions.
The SSY instruction pushes an address into the stack, and the SYNC instruction pops it. The current stack depth is 20, we should figure out if this is enough or not.
2018-08-18 10:48:12 -05:00
Subv
2e95ba2e9c Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
We should definitely audit our shader generator for more errors like this.
2018-08-18 10:22:42 -05:00
David Marcec
63dff47e22 Added predcondition GreaterThanWithNan 2018-08-18 17:49:59 +10:00
bunnei
504cff2b7a
Merge pull request #1096 from bunnei/supported-blits
gl_rasterizer_cache: Remove asserts for supported blits.
2018-08-17 22:41:53 -04:00
bunnei
e341d868ee gl_rasterizer_cache: Remove asserts for supported blits. 2018-08-17 00:10:08 -04:00
bunnei
da7226442f renderer_opengl: Treat OpenGL errors as critical. 2018-08-17 00:09:27 -04:00
bunnei
727136a9c9
Merge pull request #1019 from Subv/vertex_divisor
Rasterizer: Manually implemented instanced rendering.
2018-08-17 00:07:06 -04:00
bunnei
89c3d6a2a3 gl_rasterizer_cache: Treat Depth formats differently from DepthStencil. 2018-08-15 21:24:04 -04:00
Subv
91140f6c0a Shader/Conversion: Implemented the negate bit in F2F and I2I instructions. 2018-08-15 09:27:43 -05:00
Subv
38592a3b5e Shader/I2F: Implemented the negate I2F_C instruction variant. 2018-08-15 09:25:02 -05:00
Subv
40ecdda19e Shader/F2I: Implemented the negate bit in the I2F instruction 2018-08-15 09:18:55 -05:00
Subv
5ef447cc0e Shader/F2I: Implemented the F2I_C instruction variant. 2018-08-15 09:16:35 -05:00
Subv
11c221cc62 Shader/F2I: Implemented the negate bit in the F2I instruction. 2018-08-15 09:15:55 -05:00
bunnei
40f83fee6a
Merge pull request #1077 from bunnei/rgba16u
gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.
2018-08-15 09:25:15 -04:00
bunnei
b1148d269d gl_rasterizer_cache: Cleanup some PixelFormat names and logging. 2018-08-14 23:31:45 -04:00
Subv
c5284efd4f Rasterizer: Implemented instanced rendering.
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are.

Instanced vertex arrays are not yet implemented.
2018-08-14 22:25:07 -05:00
bunnei
8599e1e4fc gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.
- Used by Breath of the Wild.
2018-08-14 23:18:34 -04:00
bunnei
3aad82b1a3
Merge pull request #1069 from bunnei/vtx-sz
maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes.
2018-08-14 23:14:44 -04:00
bunnei
2a42dea568
Merge pull request #1070 from bunnei/cbuf-sz
gl_rasterizer: Fix upload size for constant buffers.
2018-08-14 23:14:24 -04:00
bunnei
c8cd1785e6
Merge pull request #1071 from bunnei/fix-ldc
gl_shader_decompiler: Several fixes for indirect constant buffer loads.
2018-08-14 23:14:09 -04:00
bunnei
991eb4824c
Merge pull request #1068 from bunnei/g8r8s
gl_rasterizer_cache: Implement G8R8S format.
2018-08-14 23:13:43 -04:00
greggameplayer
6eda9ebbdb Implement Z16_UNORM in PixelFormatFromTextureFormat function
Require by Zelda Breath Of The Wild
2018-08-15 04:14:15 +02:00
bunnei
5e66a24423 gl_shader_decompiler: Several fixes for indirect constant buffer loads. 2018-08-14 20:47:50 -04:00
bunnei
290439a6a5 gl_rasterizer: Fix upload size for constant buffers. 2018-08-14 20:44:19 -04:00
bunnei
dc876fd63a maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes. 2018-08-14 20:43:02 -04:00
bunnei
d8fd3ef4fe gl_rasterizer_cache: Implement G8R8S format.
- Used by Super Mario Odyssey.
2018-08-14 20:41:49 -04:00
bunnei
4dacb8a4b1
Merge pull request #1058 from greggameplayer/BC7U_Fix
Fix BC7U
2018-08-14 08:03:07 -04:00
greggameplayer
6bfcf13187 Fix BC7U 2018-08-14 02:36:00 +02:00
bunnei
6e52f37d5b renderer_opengl: Implement RenderTargetFormat::RGBA16_UNORM.
- Used by Breath of the Wild.
2018-08-13 18:20:07 -04:00
bunnei
46fbf6dd92
Merge pull request #1052 from ogniK5377/xeno
Implement RG32UI and R32UI
2018-08-13 12:31:39 -04:00
David Marcec
45cc022ea9 Implement RG32UI and R32UI
Needed for xenoblade
2018-08-13 22:55:16 +10:00
bunnei
41b77c4e0a maxwell_to_gl: Implement VertexAttribute::Size::Size_8.
- Used by Breath of the Wild.
2018-08-13 01:34:21 -04:00
bunnei
bdf17fe0cc renderer_opengl: Implement RenderTargetFormat::RGBA16_UINT.
- Used by Breath of the Wild.
2018-08-13 00:06:22 -04:00
bunnei
54ef9302a2
Merge pull request #1045 from bunnei/rg8-unorm
renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.
2018-08-13 00:05:25 -04:00
bunnei
8fe118bcaa maxwell_to_gl: Implement PrimitiveTopology::LineStrip.
- Used by Breath of the Wild.
2018-08-12 23:09:32 -04:00
bunnei
c56a0e3c34 renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.
- Used by Breath of the Wild.
2018-08-12 23:08:50 -04:00
bunnei
534abf9d97 gl_shader_decompiler: Implement XMAD instruction. 2018-08-12 18:30:24 -04:00
Markus Wick
0eb39922f6 gl_rasterizer: Use a shared helper to upload from CPU memory. 2018-08-12 16:10:26 +02:00
Markus Wick
0af7e93763 gl_state: Don't track constant buffer mappings. 2018-08-12 16:10:26 +02:00
Markus Wick
6ff7906ddc gl_rasterizer: Use the stream buffer for constant buffers. 2018-08-12 16:10:26 +02:00
Markus Wick
ce722e317b gl_rasterizer: Use the streaming buffer itself for the constant buffer.
Don't emut copies, especially not for data, which is used once. They just end in a huge GPU overhead.
2018-08-12 15:48:59 +02:00
Markus Wick
6f6bba3ff1 gl_rasterizer: Use a helper for aligning the buffer. 2018-08-12 15:47:35 +02:00
Markus Wick
d7298ec262 Update the stream_buffer helper from Citra.
Please see https://github.com/citra-emu/citra/pull/3666 for more details.
2018-08-12 15:47:35 +02:00
bunnei
639ebb39f6 gl_shader_decompiler: Fix SetOutputAttributeToRegister empty check. 2018-08-12 02:22:42 -04:00
bunnei
c68aa65226 gl_shader_decompiler: Fix GLSL compiler error with KIL instruction. 2018-08-12 00:06:48 -04:00
bunnei
ee07041b3a
Merge pull request #1020 from lioncash/namespace
core: Namespace EmuWindow
2018-08-11 22:40:08 -04:00
bunnei
9c977d2215
Merge pull request #1021 from lioncash/warn
gl_rasterizer: Silence implicit truncation warning in SetupShaders()
2018-08-11 22:39:46 -04:00
bunnei
f2c7b5dcd6
Merge pull request #1024 from Subv/blend_gl
GPU/Maxwell3D: Implemented an alternative set of blend factors.
2018-08-11 22:39:02 -04:00
bunnei
d37da52cb3
Merge pull request #1023 from Subv/invalid_attribs
RasterizerGL: Ignore invalid/unset vertex attributes.
2018-08-11 22:18:40 -04:00
Subv
969326bd58 GPU/Maxwell3D: Implemented an alternative set of blend factors.
These are used by nouveau and some games like SMO.
2018-08-11 20:57:16 -05:00