This repository was archived by the owner on Dec 25, 2023. It is now read-only.
README.md: 46 additions & 26 deletions
@@ -2,7 +2,7 @@

 ## Introduction

-This repository contains an [MIT licensed](LICENSE) demo of _DirectX12 Sampler Feedback Streaming_, a technique using [DirectX12 Sampler Feedback](https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html) to guide continuous loading and eviction of small portions (tiles) of assets. Sampler Feedback Streaming allows scenes consisting of 100s of gigabytes of resources to be drawn on GPUs containing much less physical memory. The scene below uses just ~200MB of a 1GB heap, despite over 350GB of total texture resources.
+This repository contains an [MIT licensed](LICENSE) demo of _DirectX12 Sampler Feedback Streaming_, a technique using [DirectX12 Sampler Feedback](https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html) to guide continuous loading and eviction of small portions (tiles) of assets, allowing for much higher visual quality than previously possible by making better use of GPU memory capacity. Sampler Feedback Streaming allows scenes consisting of 100s of gigabytes of resources to be drawn on GPUs containing much less physical memory. The scene below uses just ~200MB of a 1GB heap, despite over 350GB of total texture resources.

 The demo requires ***Windows 10 20H1 (aka May 2020 Update, build 19041)*** or later and a GPU with Sampler Feedback Support, such as Intel Iris Xe Graphics as found in 11th Generation Intel® Core™ processors and discrete GPUs (driver version **[30.0.100.9667](https://downloadcenter.intel.com/product/80939/Graphics) or later**).

@@ -37,6 +37,10 @@ Or cd to the build directory (x64/Release or x64/Debug) and run from the command
On nvidia drivers **prior to 496.13**, it is recommended to add `-config nvidia.json` to the command line. See the description of json files and configurations below.
By default (no command line options) there will be a single object, "terrain", which allows for exploring sampler feedback streaming. In the top right find 2 windows: on the left is the raw GPU min mip feedback, on the right is the min mip map "residency map" generated by the application. Across the bottom are the mips of the texture, with mip 0 in the bottom left. Left-click drag the terrain to see sampler feedback streaming in action.
By default, the application loads [config.json](config/config.json).
-
-However, it has been observed that performance decays over time on earlier nvidia devices/drivers (as the tiles in the heap become fragmented relative to resources). Specifically, the CPU time for [UpdateTileMappings](https://docs.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12commandqueue-updatetilemappings) limits the system throughput.
-
-If you observe this issue, obvious in demo mode or with [stress.bat](scripts/stress.bat), add `-config nvidia.json` to the command line.
The main differences in [nvidia.json](configs/nvidia.json) distribute textures across 127 heaps each sized 32MB (512 tiles * 64KB per tile), for a total of ~4GB of GPU physical memory. Since this implementation restricts resources to a single heap, it is possible for the small heaps to fill resulting in visual artifacts. However, mapping and unmapping of small heaps appears to be significantly faster on some GPUs.
-
-"heapSizeTiles": 512, // size for each heap. 64KB per tile * 16384 tiles -> 1GB heap
-"numHeaps": 127, // number of heaps. objects will be distributed among heaps
-
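The heap sizes quoted for nvidia.json can be checked with a little arithmetic. The sketch below is only an illustration: `kTileSizeBytes`, `HeapSizeBytes`, and `TotalHeapBytes` are hypothetical names, not part of the sample. It also shows that the inline comment on `heapSizeTiles` describes the 16384-tile (1GB) case, whereas the value 512 yields a 32MB heap.

```cpp
#include <cstdint>

// 64KB per tile, as stated in the text above.
constexpr std::uint64_t kTileSizeBytes = 64ull * 1024ull;
constexpr std::uint64_t kMB = 1024ull * 1024ull;

// Hypothetical helper: physical memory backing one heap of heapSizeTiles tiles.
constexpr std::uint64_t HeapSizeBytes(std::uint64_t heapSizeTiles)
{
    return heapSizeTiles * kTileSizeBytes;
}

// Hypothetical helper: physical memory across numHeaps equally sized heaps.
constexpr std::uint64_t TotalHeapBytes(std::uint64_t heapSizeTiles, std::uint64_t numHeaps)
{
    return HeapSizeBytes(heapSizeTiles) * numHeaps;
}

// nvidia.json values: 512 tiles per heap, 127 heaps.
static_assert(HeapSizeBytes(512) == 32 * kMB, "512 tiles * 64KB = 32MB per heap");
static_assert(TotalHeapBytes(512, 127) == 4064 * kMB, "127 heaps * 32MB = 4064MB, i.e. ~4GB");

// The case the inline comment describes: 16384 tiles in a single heap.
static_assert(HeapSizeBytes(16384) == 1024 * kMB, "16384 tiles * 64KB = 1GB");
```

Because everything is `constexpr`, the checks run at compile time; a configuration loader could use the same helpers to report the memory budget implied by a json file.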
 ## Keyboard controls

 There are a lot of keyboard controls - a function of giving many demos:
@@ -87,13 +73,15 @@ There are a lot of keyboard controls - a function of giving many demos:
 * `insert` : toggles frustum. This behaves a little wonky.
 * `esc` : while windowed, exit. while full-screen, return to windowed mode

-## JSON configuration files and command lines
+## Configuration files and command lines

 For a full list of command line options, pass the command line "?", e.g.

 c:> expanse.exe ?

-Most of the detailed controls for the system can be find in a *json* file. The options in the json have corresponding command lines, e.g.:
+Most of the detailed controls for the system can be found in a *json* file. By default, the application loads [config.json](config/config.json).
+
+The options in the json have corresponding command lines, e.g.:

 json:

@@ -103,19 +91,48 @@ command line:

 -mediadir c:\myMedia

+
+On nvidia devices using drivers prior to 496.13, it is recommended to add `-config nvidia.json` to the command line, e.g.:
This config works around an issue where performance decays over time as the tiles in the heap become fragmented relative to resources. Specifically, the CPU time for [UpdateTileMappings](https://docs.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12commandqueue-updatetilemappings) limits the system throughput. The workaround distributes textures across many small heaps, which can result in artifacts if the small heaps fill.
+
 ## Creating Your Own Textures

 The executable `DdsToXet.exe` converts BCn DDS textures to the custom XET format. Only BC1 and BC7 textures have been tested. Usage:

-c:> ddstoxet.xet -in myfile.dds -out myfile.xet
+c:> ddstoxet.exe -in myfile.dds -out myfile.xet

 The batch file [convert.bat](scripts/convert.bat) will read all the DDS files in one directory and write XET files to a second directory. The output directory must exist.

 c:> convert c:\myDdsFiles c:\myXetFiles

 ## TileUpdateManager: a library for streaming textures

-Within the source, there is a *TileUpdateManager* library that aspires to be stand-alone. The central object, *TileUpdateManager*, allows for the creation of streaming textures and heaps to contain them. These objects handle all the feedback resource creation, readback, processing, and file/IO.
+The sample includes a library *TileUpdateManager* with a minimal set of APIs defined in [SamplerFeedbackStreaming.h](TileUpdateManager/SamplerFeedbackStreaming.h). The central object, *TileUpdateManager*, allows for the creation of streaming textures and heaps to contain them. These objects handle all the feedback resource creation, readback, processing, and file/IO.
+
+The application creates a TileUpdateManager and 1 or more heaps in Scene.cpp:
Each SceneObject creates its own StreamingResource. Note **a StreamingResource can be used by multiple objects**, but this sample was designed to emphasize the ability to manage many resources and so objects are 1:1 with StreamingResources.
@@ -135,7 +152,7 @@ There are also a few known bugs:

 This implementation of Sampler Feedback Streaming uses DX12 Sampler Feedback in combination with DX12 Reserved Resources, aka Tiled Resources. A multi-threaded CPU library processes feedback from the GPU, makes decisions about which tiles to load and evict, loads data from disk storage, and submits mapping and uploading requests via GPU copy queues. There is no explicit GPU-side synchronization between the queues, so rendering frame rate is not dependent on completion of copy commands (on GPUs that support concurrent multi-queue operation). The CPU threads run continuously and asynchronously from the GPU (pausing when there's no work to do), polling fence completion states to determine when feedback is ready to process or copies and memory mapping have completed.

-All the magic can be found in the **TileUpdateManager** library (see [TileUpdateManager.h](TileUpdateManager/TileUpdateManager.h)), which abstracts the creation of StreamingResources and heaps while internally managing feedback resources, file I/O, and GPU memory mapping.
+All the magic can be found in the **TileUpdateManager** library (see the internal file [TileUpdateManager.h](TileUpdateManager/TileUpdateManager.h) - applications should include [SamplerFeedbackStreaming.h](TileUpdateManager/SamplerFeedbackStreaming.h)), which abstracts the creation of StreamingResources and heaps while internally managing feedback resources, file I/O, and GPU memory mapping.
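The non-blocking fence polling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: `PendingUpdate` and `CountRetirable` are hypothetical names, and `completedValue` stands in for the result of `ID3D12Fence::GetCompletedValue()`. The point is that the CPU thread compares values and moves on rather than waiting on the GPU.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record of work submitted to a copy queue (not the sample's type).
struct PendingUpdate
{
    std::uint64_t fenceValue; // value the queue signals once this work has finished
};

// Non-blocking completion check: true once the fence has reached the update's value.
constexpr bool IsComplete(const PendingUpdate& update, std::uint64_t completedValue)
{
    return completedValue >= update.fenceValue;
}

// Updates are assumed to be submitted in order, so only a prefix can be complete.
// Returns how many updates at the front of the list may be retired right now.
constexpr std::size_t CountRetirable(const PendingUpdate* updates, std::size_t count,
                                     std::uint64_t completedValue)
{
    std::size_t retirable = 0;
    while (retirable < count && IsComplete(updates[retirable], completedValue))
    {
        ++retirable;
    }
    return retirable;
}
```

A worker thread would call something like `CountRetirable` each time it wakes, retire the completed prefix, and go back to sleep when nothing is pending, so the render thread never stalls on copy work.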

 The technique works as follows:

@@ -168,21 +185,24 @@ Below, the Visualization mode was set to "Color = Mip" and labels were added. Ti

 To reduce GPU memory, a single combined buffer contains all the residency maps for all the resources. The pixel shader samples the corresponding residency map to clamp the sampling function to the minimum available texture data, thereby avoiding sampling tiles that have not been mapped.

-We can see this in the shader [terrainPS.hlsl](src/shaders/terrainPS.hlsl). Resources are defined at the top of the shader, including the reserved buffer, the residency resource, and the sampler:
+We can see the lookup into the residency map in the pixel shader [terrainPS.hlsl](src/shaders/terrainPS.hlsl). Resources are defined at the top of the shader, including the reserved (tiled) resource g_streamingTexture, the residency map g_minmipmap, and the sampler:

 ```cpp
 Texture2D g_streamingTexture : register(t0);
 Buffer<uint> g_minmipmap: register(t1);
 SamplerState g_sampler : register(s0);
 ```

-The shader offsets into its region of the residency buffer (g_minmipmapOffset) and loads the minimum mip value for the region to be sampled.
+The shader offsets into its region of the residency map (g_minmipmapOffset) and loads the minimum mip value for the region to be sampled.
+
 ```cpp
 int2 uv = input.tex * g_minmipmapDim;
 uint index = g_minmipmapOffset + uv.x + (uv.y * g_minmipmapDim.x);
 uint mipLevel = g_minmipmap.Load(index);
 ```
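The index math in the snippet above can be mirrored on the CPU to see which entry of the combined buffer a given texture coordinate selects. This is a minimal sketch, assuming texture coordinates in [0, 1) and a row-major residency map; `ResidencyIndex` and its parameter names are hypothetical and do not appear in the sample.

```cpp
#include <cstdint>

// Hypothetical CPU-side mirror of the shader's index calculation.
// u, v       : texture coordinates, assumed to lie in [0, 1)
// dimX, dimY : residency map dimensions in tiles (g_minmipmapDim in the shader)
// offset     : start of this resource's region in the combined buffer
//              (g_minmipmapOffset in the shader)
constexpr std::uint32_t ResidencyIndex(float u, float v,
                                       std::uint32_t dimX, std::uint32_t dimY,
                                       std::uint32_t offset)
{
    // The casts truncate toward zero, like the float-to-int conversion in the shader.
    return offset
         + static_cast<std::uint32_t>(u * static_cast<float>(dimX))
         + static_cast<std::uint32_t>(v * static_cast<float>(dimY)) * dimX;
}

// Example: a 4x4 residency map stored at offset 16 in the combined buffer.
// (0.6, 0.3) falls in tile column 2, row 1, so the index is 16 + 2 + 1 * 4 = 22.
static_assert(ResidencyIndex(0.6f, 0.3f, 4, 4, 16) == 22, "column 2, row 1");
```

Each resource occupies `dimX * dimY` consecutive entries starting at its offset, which is how one buffer can hold the residency maps of every resource.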
+
 The sampling operation is clamped to the minimum mip resident (mipLevel).
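The effect of the clamp can be mimicked on the CPU. In HLSL, the clamp argument of `Sample` is a lower bound on the level of detail, so no mip more detailed than the clamp value is accessed. The sketch below is an illustration only: `ClampLod` and `hardwareLod` are hypothetical names, and the real LOD selection happens inside the sampler hardware.

```cpp
#include <cstdint>

// Hypothetical stand-in for the sampler's LOD clamp.
// hardwareLod    : the LOD the sampler would choose from the UV derivatives
// minResidentMip : the most detailed mip currently resident (mipLevel in the shader)
constexpr float ClampLod(float hardwareLod, std::uint32_t minResidentMip)
{
    // Lower mip numbers are more detailed; never go below the resident mip.
    return hardwareLod < static_cast<float>(minResidentMip)
               ? static_cast<float>(minResidentMip)
               : hardwareLod;
}

// The sampler wants mip 0.5, but only mip 2 and coarser are resident: use mip 2.
static_assert(ClampLod(0.5f, 2) == 2.0f, "clamped up to the resident mip");
// The sampler wants a coarser mip than the resident minimum: left unchanged.
static_assert(ClampLod(3.25f, 2) == 3.25f, "already coarser than the clamp");
```

This is why an unmapped tile is never read: the residency map always reports a mip whose tiles are mapped for the region being sampled.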
+
 ```cpp
 float3 color = g_streamingTexture.Sample(g_sampler, input.tex, 0, mipLevel).rgb;
 ```