This repository was archived by the owner on Dec 25, 2023. It is now read-only.

Commit 53a9cdc

Updated readme. Error message fix for devices that do not support ID3D12Device8, which is required for sampler feedback.
1 parent a3178f4 commit 53a9cdc

File tree

2 files changed

+69
-61
lines changed

README.md

Lines changed: 56 additions & 57 deletions
Original file line numberDiff line numberDiff line change
@@ -1,91 +1,73 @@
11
# Sampler Feedback Streaming
22

3-
This repository contains a demo of `DirectX12 Sampler Feedback Streaming`, a technique using [DirectX12 Sampler Feedback](https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html) to guide continuous loading and eviction of small portions (tiles) of assets. Sampler Feedback Streaming allows scenes consisting of 100s of gigabytes of resources to be drawn on GPUs containing much less physical memory. The scene below uses just ~200MB of a 1GB heap, despite over 350GB of total texture resources.
3+
## Introduction
44

5-
The demo requires **`Windows 10 20H1 (aka May 2020 Update, build 19041)`** or later and a GPU with Sampler Feedback Support.
5+
This repository contains an [MIT licensed](LICENSE) demo of _DirectX12 Sampler Feedback Streaming_, a technique using [DirectX12 Sampler Feedback](https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html) to guide continuous loading and eviction of small portions (tiles) of assets. Sampler Feedback Streaming allows scenes consisting of 100s of gigabytes of resources to be drawn on GPUs containing much less physical memory. The scene below uses just ~200MB of a 1GB heap, despite over 350GB of total texture resources.
6+
7+
The demo requires ***Windows 10 20H1 (aka May 2020 Update, build 19041)*** or later and a GPU with Sampler Feedback Support, such as Intel Iris Xe Graphics as found in 11th Generation Intel® Core™ processors and discrete GPUs (driver version **[30.0.100.9667](https://downloadcenter.intel.com/product/80939/Graphics) or later**).
8+
9+
This repository will be updated when DirectStorage for Windows® becomes available.
610

711
See also:
812

913
* [GDC 2021 video](https://software.intel.com/content/www/us/en/develop/events/gdc.html?videoid=6264595860001) [(alternate link)](https://www.youtube.com/watch?v=VDDbrfZucpQ) which provides an overview of Sampler Feedback and discusses this sample starting at about 15:30.
1014

11-
* [GDC 2021 presentation](https://software.intel.com/content/dam/develop/external/us/en/documents/pdf/july-gdc-2021-sampler-feedback-texture-space-shading-direct-storage.pdf)
12-
13-
Sampler Feedback is supported in hardware on Intel Iris Xe Graphics, as can be found in 11th Generation Intel® Core™ processors and discrete GPUs. This sample requires driver version ***[30.0.100.9667](https://downloadcenter.intel.com/product/80939/Graphics) or later***.
15+
* [GDC 2021 presentation](https://software.intel.com/content/dam/develop/external/us/en/documents/pdf/july-gdc-2021-sampler-feedback-texture-space-shading-direct-storage.pdf) in PDF form
1416

1517
![Sample screenshot](./readme-images/sampler-feedback-streaming.jpg "Sample screenshot")
1618
Textures derived from [Hubble Images](https://www.nasa.gov/mission_pages/hubble/multimedia/index.html), see the [Hubble Copyright](https://hubblesite.org/copyright)
1719

18-
## License
19-
20-
Copyright 2021 Intel Corporation
21-
22-
Permission is hereby granted, free of charge, to any person obtaining a copy of
23-
this software and associated documentation files (the "Software"), to deal in
24-
the Software without restriction, including without limitation the rights to
25-
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
26-
of the Software, and to permit persons to whom the Software is furnished to do
27-
so, subject to the following conditions:
2820

29-
The above copyright notice and this permission notice shall be included in all
30-
copies or substantial portions of the Software.
21+
Note that the textures shown above, which total over 13GB, are not part of the repo. A few 16k x 16k textures are available as a [release](https://github.com/GameTechDev/SamplerFeedbackStreaming/releases/tag/1).
3122

32-
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
33-
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
34-
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
35-
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
36-
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
37-
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
38-
SOFTWARE.
39-
40-
## Requirements
41-
42-
The demo requires **`Windows 10 20H1 (aka May 2020 Update, build 19041)`** or later and a GPU with Sampler Feedback Support.
43-
44-
Intel Iris Xe Graphics, as can be found in 11th Generation Intel® Core™ processors and future discrete GPUs, requires BETA driver [30.0.100.9667](https://downloadcenter.intel.com/product/80939/Graphics) or later.
45-
46-
Note this repository does not contain the textures shown above, which total over 13GB. A link to these textures will hopefully be provided soon. Test textures are provided, as is a mechanism to convert from BCx format DDS files into the custom .XET format. UPDATE: The first "release" package (labeled "1") contains a .zip with a few hi-res textures.
47-
48-
This repository will be updated when DirectStorage for Windows® becomes available.
23+
Test textures are provided, as is a mechanism to convert from BCx format DDS files into the custom .XET format.
4924

5025
## Build Instructions
5126

52-
Download the source. Build the solution file with Visual Studio 2019.
53-
54-
## Running
27+
Download the source. Build the solution file [SamplerFeedbackStreaming.sln](SamplerFeedbackStreaming.sln) (tested with Visual Studio 2019).
5528

56-
All executables and .bat files will be found in the x64/Release or x64/Debug directories.
29+
All executables, scripts, configurations, and media files will be found in the x64/Release or x64/Debug directories.
5730

5831
To run within Visual Studio, change the working directory to $(TargetDir) under Properties/Debugging:
5932

6033
![set working directory to $(TargetDir)](./readme-images/project-settings.png "set Working Directory to \$(TargetDir)")
6134

62-
By default (no command line options) the application starts looking at a single object, "terrain", which allows for exploring sampler feedback streaming. In the top right find 2 windows: on the left is the raw GPU min mip feedback, on the right is the min mip map generated by the application. Across the bottom are the mips of the texture, with mip 0 in the bottom left. Left-click drag the terrain to see sampler feedback streaming in action.
35+
Or cd to the build directory (x64/Release or x64/Debug) and run from the command line:
36+
6337

6438
c:\SamplerFeedbackStreaming\x64\Release> expanse.exe
6539

40+
By default (no command line options) there will be a single object, "terrain", which allows for exploring sampler feedback streaming. In the top right are 2 windows: on the left is the raw GPU min mip feedback, and on the right is the min mip map (the "residency map") generated by the application. Across the bottom are the mips of the texture, with mip 0 in the bottom left. Left-click and drag the terrain to see sampler feedback streaming in action.
6641
![default startup](./readme-images/default-startup.jpg "default startup")
6742

68-
The batch file `demo.bat` starts in a more interesting state. Note keyboard controls are inactive while `Camera` animation is non-zero.
43+
The batch file _demo.bat_ starts in a more interesting state. Note keyboard controls are inactive while _Camera_ animation is non-zero.
6944

7045
c:\SamplerFeedbackStreaming\x64\Release> demo.bat
7146

7247
![demo batch file](./readme-images/demo-bat.jpg "demo.bat")
7348

74-
The textures in the first "release" package, hubble-16k.zip, work with "demo-hubble.bat". Make sure the mediadir in the batch file is set properly, or override it on the command line as follows:
49+
The high-resolution textures in the first "release" package, [hubble-16k.zip](https://github.com/GameTechDev/SamplerFeedbackStreaming/releases/tag/1), work with "demo-hubble.bat", including a sky and earth. Make sure the mediadir in the batch file is set properly, or override it on the command line as follows:
7550

7651
c:\SamplerFeedbackStreaming\x64\Release> demo-hubble.bat -mediadir c:\hubble-16k
7752

7853
## Configurations
7954

80-
By default, the application loads `config.json`.
55+
By default, the application loads [config.json](config/config.json).
8156

8257
However, it has been observed that performance decays over time on earlier nvidia devices/drivers (as the tiles in the heap become fragmented relative to resources). Specifically, the CPU time for [UpdateTileMappings](https://docs.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12commandqueue-updatetilemappings) limits the system throughput.
8358

84-
If you observe this issue (most obvious with stress.bat using large textures), run the included batch files with the addition of `-config nvidia.json`, which distributes resources across many small heaps. E.g.:
59+
If you observe this issue (most obvious in demo mode or with [stress.bat](scripts/stress.bat)), add `-config nvidia.json` to the command line.
60+
61+
E.g.:
8562

8663
c:\SamplerFeedbackStreaming\x64\Release> demo.bat -config nvidia.json
8764
c:\SamplerFeedbackStreaming\x64\Release> stress.bat -mediadir c:\hubble-16k -config nvidia.json
8865

66+
The main difference in [nvidia.json](configs/nvidia.json) is that textures are distributed across 127 heaps, each sized 32MB (512 tiles * 64KB per tile), for a total of ~4GB of GPU physical memory. Since this implementation restricts each resource to a single heap, it is possible for the small heaps to fill, resulting in visual artifacts. However, mapping and unmapping of small heaps appears to be significantly faster on some GPUs.
67+
68+
"heapSizeTiles": 512, // size for each heap. 64KB per tile * 512 tiles -> 32MB heap
69+
"numHeaps": 127, // number of heaps. objects will be distributed among heaps
70+
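The heap arithmetic above can be checked with a short C++ sketch. The constant and function names are illustrative assumptions, not identifiers from this repository; only the values (64KB tiles, 512 tiles per heap, 127 heaps) come from the nvidia.json description.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of the sample's source.
// 64KB is the tile size of D3D12 reserved (tiled) resources.
constexpr std::uint64_t kTileSizeBytes = 64 * 1024;

// Size in bytes of one heap, given the "heapSizeTiles" value from the JSON config.
constexpr std::uint64_t HeapSizeBytes(std::uint64_t heapSizeTiles)
{
    return heapSizeTiles * kTileSizeBytes;
}

// Total GPU physical memory reserved across "numHeaps" equally sized heaps.
constexpr std::uint64_t TotalHeapBytes(std::uint64_t heapSizeTiles, std::uint64_t numHeaps)
{
    return HeapSizeBytes(heapSizeTiles) * numHeaps;
}

// 512 tiles per heap gives 32MB; 16384 tiles per heap gives 1GB.
static_assert(HeapSizeBytes(512) == 32ull * 1024 * 1024, "512 tiles -> 32MB");
static_assert(HeapSizeBytes(16384) == 1024ull * 1024 * 1024, "16384 tiles -> 1GB");
```

With these values, 127 heaps of 32MB come to 4064MB, which is the "~4GB" quoted above.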
8971
## Keyboard controls
9072

9173
There are a lot of keyboard controls - a function of giving many demos:
@@ -107,7 +89,9 @@ There are a lot of keyboard controls - a function of giving many demos:
10789

10890
## JSON configuration files and command lines
10991

110-
For a full list of command line options, pass the command line "?"
92+
For a full list of command line options, pass the command line "?", e.g.
93+
94+
c:> expanse.exe ?
11195

11296
Most of the detailed controls for the system can be found in a *json* file. The options in the json have corresponding command lines, e.g.:
11397

@@ -125,7 +109,7 @@ The executable `DdsToXet.exe` converts BCn DDS textures to the custom XET format
125109

126110
c:> ddstoxet.exe -in myfile.dds -out myfile.xet
127111

128-
The batch file `convert.bat` will read all the DDS files in one directory and write XET files to a second directory. The output directory must exist.
112+
The batch file [convert.bat](scripts/convert.bat) will read all the DDS files in one directory and write XET files to a second directory. The output directory must exist.
129113

130114
c:> convert c:\myDdsFiles c:\myXetFiles
131115

@@ -151,7 +135,7 @@ There are also a few known bugs:
151135

152136
This implementation of Sampler Feedback Streaming uses DX12 Sampler Feedback in combination with DX12 Reserved Resources, aka Tiled Resources. A multi-threaded CPU library processes feedback from the GPU, makes decisions about which tiles to load and evict, loads data from disk storage, and submits mapping and uploading requests via GPU copy queues. There is no explicit GPU-side synchronization between the queues, so rendering frame rate is not dependent on completion of copy commands (on GPUs that support concurrent multi-queue operation). The CPU threads run continuously and asynchronously from the GPU (pausing when there's no work to do), polling fence completion states to determine when feedback is ready to process or copies and memory mapping has completed.
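The non-blocking fence polling described above can be sketched in plain C++. This is a minimal model under stated assumptions, not the TileUpdateManager code: `MockFence` stands in for `ID3D12Fence` (whose `GetCompletedValue()` is the real polling call), and `InFlightBatch` and `RetireCompleted` are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <deque>

// Stand-in for ID3D12Fence: whoever plays the GPU signals increasing values.
class MockFence
{
public:
    void Signal(std::uint64_t value) { m_completed.store(value, std::memory_order_release); }
    std::uint64_t GetCompletedValue() const { return m_completed.load(std::memory_order_acquire); }

private:
    std::atomic<std::uint64_t> m_completed{ 0 };
};

// Hypothetical record of work submitted to a copy queue.
struct InFlightBatch
{
    std::uint64_t fenceValue; // value the queue signals when this batch finishes
};

// Non-blocking poll: retire every batch whose fence value has been reached.
// Batches are assumed to be queued in increasing fence-value order.
// Returns the number retired; a worker thread can pause when nothing is in flight.
inline std::size_t RetireCompleted(std::deque<InFlightBatch>& inFlight, const MockFence& fence)
{
    const std::uint64_t completed = fence.GetCompletedValue();
    std::size_t retired = 0;
    while (!inFlight.empty() && inFlight.front().fenceValue <= completed)
    {
        inFlight.pop_front();
        ++retired;
    }
    return retired;
}
```

Because the check never waits on the fence, the polling thread cannot stall the thread that submits rendering work.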
153137

154-
All the magic can be found in the **TileUpdateManager** library (see TileUpdateManager.h), which abstracts the creation of StreamingResources and heaps while internally managing feedback resources, file I/O, and GPU memory mapping.
138+
All the magic can be found in the **TileUpdateManager** library (see [TileUpdateManager.h](TileUpdateManager/TileUpdateManager.h)), which abstracts the creation of StreamingResources and heaps while internally managing feedback resources, file I/O, and GPU memory mapping.
155139

156140
The technique works as follows:
157141

@@ -184,7 +168,7 @@ Below, the Visualization mode was set to "Color = Mip" and labels were added. Ti
184168

185169
To reduce GPU memory, a single combined buffer contains all the residency maps for all the resources. The pixel shader samples the corresponding residency map to clamp the sampling function to the minimum texture data available, thereby avoiding sampling tiles that have not been mapped.
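As a rough CPU-side illustration of that clamp, the sketch below models one resource's residency map in C++. The layout (one byte per region, row-major) and the names `ResidencyMap`, `Lookup`, and `ClampMip` are assumptions made for this example; the actual logic lives in the pixel shader.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model of a residency map: one byte per texture region holding
// the finest (lowest-numbered) mip that is currently resident for that region.
struct ResidencyMap
{
    std::uint32_t width = 0;                  // regions in x
    std::uint32_t height = 0;                 // regions in y
    std::vector<std::uint8_t> minResidentMip; // row-major, width * height entries

    // Look up the min resident mip for normalized coordinates u, v in [0, 1].
    std::uint8_t Lookup(float u, float v) const
    {
        const std::uint32_t x = std::min(static_cast<std::uint32_t>(u * width), width - 1);
        const std::uint32_t y = std::min(static_cast<std::uint32_t>(v * height), height - 1);
        return minResidentMip[y * width + x];
    }
};

// Clamp the mip the sampler would like to use so that it is never finer
// (numerically lower) than the finest mip resident for that region.
inline float ClampMip(float desiredMip, const ResidencyMap& map, float u, float v)
{
    return std::max(desiredMip, static_cast<float>(map.Lookup(u, v)));
}
```

A request for mip 0.5 in a region whose finest resident mip is 3 is served from mip 3, while a request for mip 4 is left unchanged.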
186170

187-
We can see this in the shader "terrainPS.hlsl". Resources are defined at the top of the shader, including the reserved buffer, the residency resource, and the sampler:
171+
We can see this in the shader [terrainPS.hlsl](src/shaders/terrainPS.hlsl). Resources are defined at the top of the shader, including the reserved buffer, the residency resource, and the sampler:
188172

189173
```cpp
190174
Texture2D g_streamingTexture : register(t0);
@@ -205,16 +189,22 @@ The sampling operation is clamped to the minimum mip resident (mipLevel).
205189

206190
### 4. Draw Objects While Recording Feedback
207191

208-
For expanse, there is a "normal" non-feedback shader named terrainPS.hlsl and a "feedback-enabled" version of the same shader, terrainPS-FB.hlsl. The latter simply writes feedback using [WriteSamplerFeedback](https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html) HLSL intrinsic, using the same sampler and texture coordinates, then calls the prior shader. Compare the WriteSamplerFeedback() call below to to the Sample() call above.
192+
For expanse, there is a "normal" non-feedback shader named [terrainPS.hlsl](src/shaders/terrainPS.hlsl) and a "feedback-enabled" version of the same shader, [terrainPS-FB.hlsl](src/shaders/terrainPS-FB.hlsl). The latter simply writes feedback using the [WriteSamplerFeedback](https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html) HLSL intrinsic, using the same sampler and texture coordinates, then calls the prior shader. Compare the WriteSamplerFeedback() call below to the Sample() call above.
193+
194+
To add feedback to an existing shader:
195+
196+
1. include the original shader hlsl
197+
2. add binding for the paired feedback resource
198+
3. call the WriteSamplerFeedback intrinsic with the resource and sampler defined in the original shader
199+
4. call the original shader
209200

210-
Include the normal pixel shader:
211201
```cpp
212202
#include "terrainPS.hlsl"
203+
213204
FeedbackTexture2D<SAMPLER_FEEDBACK_MIN_MIP> g_feedback : register(u0);
214205

215206
float4 psFB(VS_OUT input) : SV_TARGET0
216207
{
217-
218208
g_feedback.WriteSamplerFeedback(g_streamingTexture, g_sampler, input.tex.xy);
219209

220210
return ps(input);
@@ -225,19 +215,22 @@ Resolving feedback for one resource is inexpensive, but adds up when there are 1
225215
226216
As an optimization, Expanse tells streaming resources to evict all tiles if they are behind the camera. This could potentially be improved to include any object not in the view frustum.
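One way to express a "behind the camera" test is sketched below. It assumes each object has a bounding sphere and that the camera forward vector is normalized; `Vec3` and `IsBehindCamera` are hypothetical names, and the test Expanse actually uses may differ.

```cpp
// Minimal vector type for this example only.
struct Vec3 { float x, y, z; };

inline Vec3 Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// True when the bounding sphere lies entirely behind the plane that passes
// through the camera position and is perpendicular to the view direction.
// cameraForward is assumed to be unit length.
inline bool IsBehindCamera(const Vec3& cameraPos, const Vec3& cameraForward,
                           const Vec3& center, float radius)
{
    return Dot(Sub(center, cameraPos), cameraForward) < -radius;
}
```

An object that passes this test cannot contribute visible pixels, so its tiles are candidates for eviction.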
227217
228-
You can find the time limit estimation, the eviction optimization, and the request to gather sampler feedback by searching Scene.cpp for the following:
218+
You can find the time limit estimation, the eviction optimization, and the request to gather sampler feedback by searching [Scene.cpp](src/Scene.cpp) for the following:
229219
230-
* DetermineMaxNumFeedbackResolves
231-
* QueueEviction
232-
* SetFeedbackEnabled
220+
- **DetermineMaxNumFeedbackResolves** determines how many resources to gather feedback for
221+
- **QueueEviction** tells the runtime to evict tiles for this resource (as soon as possible)
222+
- **SetFeedbackEnabled** results in 2 actions:
223+
1. tell the runtime to collect feedback for this object via TileUpdateManager::QueueFeedback(), which results in clearing and resolving the feedback resource for this resource for this frame
224+
2. use the feedback-enabled pixel shader for this object
233225
234226
### 5. Determine Which Tiles to Load & Evict
235227
236228
Once the draw command is complete, the feedback is ready to read on the CPU - either by copying the feedback to a readback resource, or by resolving directly to a readback resource.
237229
238230
Min mip feedback tells us the minimum mip tile that should be loaded. The min mip feedback is traversed, updating an internal reference count for each tile. If a tile previously was unused (ref count = 0), it is queued for loading from the bottom (highest mip) up. If a tile is not needed for a particular region, its ref count is decreased (from the top down). When its ref count reaches 0, it might be ready to evict.
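The reference counting described above can be sketched as follows. This is a simplified model, not the repository's TileMappingState: it assumes each mip halves the tile grid, ignores packed mips, and uses hypothetical names (`TileRefCounts`, `SetMinMip`, `TileCoord`).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct TileCoord { std::uint32_t x, y, mip; };

// Per-tile reference counts driven by per-region min mip feedback.
class TileRefCounts
{
public:
    // width/height: feedback regions at mip 0 tile granularity.
    TileRefCounts(std::uint32_t width, std::uint32_t height, std::uint32_t numMips)
        : m_width(width), m_height(height)
        , m_prevMinMip(width * height, numMips) // numMips means "nothing requested yet"
    {
        for (std::uint32_t mip = 0; mip < numMips; ++mip)
            m_refs.emplace_back(MipWidth(mip) * MipHeight(mip), 0u);
    }

    // Apply new feedback for one region (newMinMip must be in [0, numMips]).
    // Tiles whose count rises from 0 are appended to loads, coarsest first.
    // Tiles whose count falls to 0 are appended to evictionCandidates, finest first.
    void SetMinMip(std::uint32_t x, std::uint32_t y, std::uint32_t newMinMip,
                   std::vector<TileCoord>& loads, std::vector<TileCoord>& evictionCandidates)
    {
        std::uint32_t& prev = m_prevMinMip[y * m_width + x];
        for (std::uint32_t mip = prev; mip > newMinMip; --mip) // finer data wanted
        {
            const TileCoord tile = TileFor(x, y, mip - 1);
            if (Ref(tile)++ == 0) loads.push_back(tile);
        }
        for (std::uint32_t mip = prev; mip < newMinMip; ++mip) // coarser data suffices
        {
            const TileCoord tile = TileFor(x, y, mip);
            if (--Ref(tile) == 0) evictionCandidates.push_back(tile);
        }
        prev = newMinMip;
    }

private:
    std::uint32_t MipWidth(std::uint32_t mip) const { return std::max<std::uint32_t>(1, m_width >> mip); }
    std::uint32_t MipHeight(std::uint32_t mip) const { return std::max<std::uint32_t>(1, m_height >> mip); }

    // The tile at a given mip that covers region (x, y).
    TileCoord TileFor(std::uint32_t x, std::uint32_t y, std::uint32_t mip) const
    {
        return { std::min<std::uint32_t>(x >> mip, MipWidth(mip) - 1),
                 std::min<std::uint32_t>(y >> mip, MipHeight(mip) - 1), mip };
    }

    std::uint32_t& Ref(const TileCoord& t) { return m_refs[t.mip][t.y * MipWidth(t.mip) + t.x]; }

    std::uint32_t m_width;
    std::uint32_t m_height;
    std::vector<std::uint32_t> m_prevMinMip;        // last feedback value per region
    std::vector<std::vector<std::uint32_t>> m_refs; // ref count per tile, per mip
};
```

A coarse tile shared by several regions stays referenced until the last of those regions stops needing it, which is why a count of 0 only means the tile "might" be ready to evict.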
239231
240-
Data structures for tracking reference count, residency state, and heap usage can be found in StreamingResource.cpp/h, look for TileMappingState. This class also has methods for interpreting the feedback buffer (ProcessFeedback) and updating the residency map (UpdateMinMipMap).
232+
Data structures for tracking reference count, residency state, and heap usage can be found in [StreamingResource.cpp](TileUpdateManager/StreamingResource.cpp) and [StreamingResource.h](TileUpdateManager/StreamingResource.h); look for TileMappingState. This class also has methods for interpreting the feedback buffer (ProcessFeedback) and updating the residency map (UpdateMinMipMap), which execute concurrently in separate CPU threads.
233+
241234
```cpp
242235
class TileMappingState
243236
{
@@ -255,15 +248,21 @@ Tiles can only be evicted if there are no lower-mip-level tiles that depend on t
255248

256249
A tile also cannot be evicted if it is being used by an outstanding draw command. We prevent this by delaying evictions a frame or two depending on double or triple buffering of the swap chain. If a tile is needed before the delay completes, the tile is simply rescued from the pending eviction data structure instead of being re-loaded.
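The delayed eviction and rescue behavior can be modeled with a small queue. This is a sketch under assumptions: `PendingEvictions`, `TileId`, and the frame-counter interface are hypothetical and only illustrate the idea, not the data structure used in the repository.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <vector>

using TileId = std::uint32_t; // hypothetical tile handle

class PendingEvictions
{
public:
    // delayFrames would be chosen from the swap chain buffering (e.g. 2 or 3).
    explicit PendingEvictions(std::uint32_t delayFrames) : m_delay(delayFrames) {}

    // Schedule a tile for eviction once enough frames have passed.
    void Queue(TileId tile, std::uint64_t currentFrame)
    {
        m_pending.push_back({ tile, currentFrame + m_delay });
    }

    // If the tile is needed again before it was evicted, remove it from the
    // queue. Returns true when the tile was rescued, so no reload is required.
    bool Rescue(TileId tile)
    {
        auto it = std::find_if(m_pending.begin(), m_pending.end(),
                               [tile](const Entry& e) { return e.tile == tile; });
        if (it == m_pending.end()) return false;
        m_pending.erase(it);
        return true;
    }

    // Return (and remove) the tiles whose delay has elapsed.
    // Assumes currentFrame never decreases between calls.
    std::vector<TileId> Collect(std::uint64_t currentFrame)
    {
        std::vector<TileId> ready;
        while (!m_pending.empty() && m_pending.front().evictFrame <= currentFrame)
        {
            ready.push_back(m_pending.front().tile);
            m_pending.pop_front();
        }
        return ready;
    }

private:
    struct Entry { TileId tile; std::uint64_t evictFrame; };
    std::uint32_t m_delay;
    std::deque<Entry> m_pending;
};
```

Rescuing a tile is just a queue removal, which is far cheaper than reading the tile from disk and mapping it again.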
257250

258-
The mechanics of loading, mapping, and unmapping tiles is all contained within the DataUploader class, which depends on a FileStreamer class to do the actual tile loads. The latter implementation (FileStreamerReference) can easily be exchanged with DirectStorage for Windows.
251+
The mechanics of loading, mapping, and unmapping tiles are all contained within the DataUploader class, which depends on a [FileStreamer](TileUpdateManager/FileStreamer.h) class to do the actual tile loads. The latter implementation ([FileStreamerReference](TileUpdateManager/FileStreamerReference.h)) can easily be exchanged with DirectStorage for Windows.
259252

260253
### 6. Putting it all Together
261254

262-
There is some work that needs to be done before drawing objects that use feedback (clearing feedback resources), and some work that needs to be done after (resolving feedback resources). TileUpdateManager creates theses commands, but does not execute them. Each frame, these command lists must be built and submitted with application draw commands, which you can find just before the call to Present() as follows:
255+
There is some work that needs to be done before drawing objects that use feedback (clearing feedback resources), and some work that needs to be done after (resolving feedback resources). TileUpdateManager creates these commands, but does not execute them. Each frame, these command lists must be built and submitted with application draw commands, which you can find just before the call to Present() in [Scene.cpp](src/Scene.cpp) as follows:
263256

264257
```cpp
265258
auto commandLists = m_pTileUpdateManager->EndFrame();
266259

267260
ID3D12CommandList* pCommandLists[] = { commandLists.m_beforeDrawCommands, m_commandList.Get(), commandLists.m_afterDrawCommands };
268261
m_commandQueue->ExecuteCommandLists(_countof(pCommandLists), pCommandLists);
269262
```
263+
264+
## License
265+
266+
Sample and its code provided under MIT license, please see [LICENSE](/LICENSE). All third-party source code provided under their own respective and MIT-compatible Open Source licenses.
267+
268+
Copyright (C) 2021, Intel Corporation
