Commit 45cdc31

Profile layer: Add per-frame counter sampling (#164)
Adds per-frame performance measurement capabilities to layer_gpu_profile, including support for VK_EXT_frame_boundary to manually demarcate frame boundaries. In addition, serialization around per-frame sample points is configurable, letting users choose between lower overhead and stronger data isolation.
1 parent 3c694f5 commit 45cdc31

19 files changed, 669 additions and 157 deletions
.github/workflows/native_test.yaml

Lines changed: 2 additions & 0 deletions
@@ -11,12 +11,14 @@ on:
     paths-ignore:
       - 'lglpy/**'
       - '**/*.md'
+      - '**/*.json'
   pull_request:
     branches:
       - main
     paths-ignore:
       - 'lglpy/**'
       - '**/*.md'
+      - '**/*.json'

 env:
   CMAKE_BUILD_PARALLEL_LEVEL: '8'

.github/workflows/python_test.yaml

Lines changed: 2 additions & 0 deletions
@@ -10,11 +10,13 @@ on:
       - '*'
     paths-ignore:
       - '**/*.md'
+      - '**/*.json'
   pull_request:
     branches:
       - main
     paths-ignore:
       - '**/*.md'
+      - '**/*.json'

 jobs:
   python-test:

layer_gpu_profile/README_LAYER.md

Lines changed: 31 additions & 10 deletions
@@ -113,19 +113,40 @@ application under test and the capture process. For full instructions see the

 ## Layer configuration

-The current layer supports two `sampling_mode` values:
+### Setting frame selection mode

-* `periodic_frame`: Sample every N frames.
-* `frame_list`: Sample specific frames.
+The current layer supports the following ways to select frames to profile using
+the `frame_mode` config option:

-When `mode` is `periodic_frame` the integer value of the `periodic_frame` key
-defines the frame sampling period. The integer value of the
-`periodic_min_frame` key defines the first possible frame that could be
-profiled, allowing profiles to skip over any loading frames. By default frame 0
-is ignored.
+* `disabled`: Sampling is disabled.
+* `periodic`: Sample every N frames.
+* `list`: Sample specific frames.

-When `mode` is `frame_list` the value of the `frame_list` key defines a list
-of integers giving the specific frames to capture.
+When frame selection mode is `periodic` the integer value of the
+`periodic_frame` key defines the frame sampling period. The integer value of
+the `periodic_min_frame` key defines the first possible frame that could be
+profiled, allowing profiles to skip over any loading frames. By default frame
+0 is ignored.
+
+When frame selection mode is `list` the value of the `frame_list` key defines
+a list of integers giving the specific frames to capture.
+
+### Setting counter sampling mode
+
+The current layer supports the following ways to select how to sample counters
+to profile using the `sample_mode` config option:
+
+* `disabled`: Sampling is disabled.
+* `workload`: Sample every workload in each frame of interest.
+* `frame`: Sample at the end of each frame of interest.
+
+By default per-frame samples are isolated from other frames by inserting a
+`vkDeviceWaitIdle()` before and after the frame to ensure that workload
+in the sampled region does not overlap neighboring frames. Setting the
+`frame_serialization` config option to `false` will allow frames to overlap
+without serialization, but can add noise to the returned counter values. This
+option has no effect for per-workload sampling, which must always use
+serialization.

 ## Layer counters
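
To make the frame selection rules in the README change concrete, here is a minimal standalone sketch of how the `periodic` and `list` modes could pick frames. The `FrameSelection` struct, the `FrameMode` enum, and the free function `isFrameOfInterest` are hypothetical stand-ins for illustration, not the layer's actual types. The guard against a zero period is an added assumption that does not appear in the commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the values of the frame_mode config option.
enum class FrameMode { Disabled, Periodic, List };

// Hypothetical mirror of the periodic_frame, periodic_min_frame and
// frame_list config keys.
struct FrameSelection
{
    FrameMode mode {FrameMode::Disabled};
    uint64_t periodicFrame {0};
    uint64_t periodicMinFrame {0};
    std::vector<uint64_t> frameList;
};

// Return true if the given frame index should be profiled.
bool isFrameOfInterest(const FrameSelection& cfg, uint64_t frameID)
{
    switch (cfg.mode)
    {
    case FrameMode::Disabled:
        return false;
    case FrameMode::Periodic:
        // Assumption: treat a period of 0 as "never" to avoid a modulo by zero.
        return (cfg.periodicFrame != 0) &&
               (frameID >= cfg.periodicMinFrame) &&
               ((frameID % cfg.periodicFrame) == 0);
    case FrameMode::List:
        return std::find(cfg.frameList.begin(), cfg.frameList.end(), frameID)
               != cfg.frameList.end();
    }

    return false;
}
```

With the values from the sample config in this commit (`periodic_frame` of 600 and `periodic_min_frame` of 1), frame 0 is skipped because it is below the minimum frame, and frames 600, 1200, and so on are selected.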

layer_gpu_profile/android_build.sh

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ cmake \
     -DCMAKE_WARN_DEPRECATED=OFF \
     ..

-cmake --build . -j4
+cmake --build .

 popd

Lines changed: 4 additions & 2 deletions
@@ -1,7 +1,9 @@
 {
     "layer": "VK_LAYER_LGL_gpu_profile",
-    "sample_mode": "periodic_frame",
+    "frame_mode": "periodic",
+    "sample_mode": "frame",
     "periodic_min_frame": 1,
     "periodic_frame": 600,
-    "frame_list": []
+    "frame_list": [],
+    "frame_serialization": true
 }

layer_gpu_profile/source/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ add_library(
     layer_device_functions_render_pass.cpp
     layer_device_functions_trace_rays.cpp
     layer_device_functions_transfer.cpp
+    layer_instance_functions.cpp
     submit_visitor.cpp)

 target_include_directories(

layer_gpu_profile/source/device_utils.hpp

Lines changed: 2 additions & 1 deletion
@@ -58,7 +58,8 @@
     VkCommandBuffer commandBuffer
 ) {
     // Don't instrument outside of active frame of interest
-    if(!layer.isFrameOfInterest)
+    bool isEnabled = layer.instance->config.isSamplingWorkloads();
+    if(!layer.isFrameOfInterest || !isEnabled)
     {
         return;
     }

layer_gpu_profile/source/instance.cpp

Lines changed: 3 additions & 1 deletion
@@ -46,7 +46,9 @@ const std::vector<std::string> Instance::requiredDriverExtensions {
 const std::vector<std::pair<std::string, uint32_t>> Instance::injectedInstanceExtensions {};

 /* See header for documentation. */
-std::vector<std::pair<std::string, uint32_t>> Instance::injectedDeviceExtensions {};
+std::vector<std::pair<std::string, uint32_t>> Instance::injectedDeviceExtensions {
+    {VK_EXT_FRAME_BOUNDARY_EXTENSION_NAME, VK_EXT_FRAME_BOUNDARY_SPEC_VERSION}
+};

 /* See header for documentation. */
 void Instance::store(VkInstance handle, std::unique_ptr<Instance>& instance)

layer_gpu_profile/source/layer_config.cpp

Lines changed: 77 additions & 17 deletions
@@ -43,45 +43,78 @@
 /* See header for documentation. */
 void LayerConfig::parseSamplingOptions(const json& config)
 {
-    // Decode top level options
-    std::string rawMode = config.at("sample_mode");
+    // Decode frame selection mode
+    std::string rawFrameMode = config.at("frame_mode");

-    if (rawMode == "disabled")
+    if (rawFrameMode == "disabled")
     {
-        mode = MODE_DISABLED;
+        frameMode = FRAME_SELECTION_DISABLED;
     }
-    else if (rawMode == "periodic_frame")
+    else if (rawFrameMode == "periodic")
     {
-        mode = MODE_PERIODIC_FRAME;
+        frameMode = FRAME_SELECTION_PERIODIC;
         periodicFrame = config.at("periodic_frame");
         periodicMinFrame = config.at("periodic_min_frame");
     }
-    else if (rawMode == "frame_list")
+    else if (rawFrameMode == "list")
     {
-        mode = MODE_FRAME_LIST;
+        frameMode = FRAME_SELECTION_LIST;
         specificFrames = config.at("frame_list").get<std::vector<uint64_t>>();
     }
     else
     {
-        LAYER_ERR("Unknown counter sample_mode: %s", rawMode.c_str());
-        rawMode = "disabled";
+        LAYER_ERR("Unknown frame_mode: %s", rawFrameMode.c_str());
+        frameMode = FRAME_SELECTION_DISABLED;
+        rawFrameMode = "disabled";
     }

+    // Decode counter sampling mode
+    std::string rawSampleMode = config.at("sample_mode");
+
+    if (rawSampleMode == "disabled")
+    {
+        samplingMode = COUNTER_SAMPLING_DISABLED;
+    }
+    else if (rawSampleMode == "frame")
+    {
+        samplingMode = COUNTER_SAMPLING_FRAMES;
+    }
+    else if (rawSampleMode == "workload")
+    {
+        samplingMode = COUNTER_SAMPLING_WORKLOADS;
+    }
+    else
+    {
+        LAYER_ERR("Unknown sample_mode: %s", rawSampleMode.c_str());
+        samplingMode = COUNTER_SAMPLING_DISABLED;
+        rawSampleMode = "disabled";
+    }
+
+    // Decode frame serialization mode
+    frameSerialization = config.at("frame_serialization");
+
     LAYER_LOG("Layer sampling configuration");
     LAYER_LOG("============================");
-    LAYER_LOG(" - Sample mode: %s", rawMode.c_str());
+    LAYER_LOG(" - Frame selection mode: %s", rawFrameMode.c_str());

-    if (mode == MODE_PERIODIC_FRAME)
+    if (frameMode == FRAME_SELECTION_PERIODIC)
     {
         LAYER_LOG(" - Frame period: %" PRIu64, periodicFrame);
         LAYER_LOG(" - Minimum frame: %" PRIu64, periodicMinFrame);
     }
-    else if (mode == MODE_FRAME_LIST)
+    else if (frameMode == FRAME_SELECTION_LIST)
     {
         std::stringstream result;
         std::copy(specificFrames.begin(), specificFrames.end(), std::ostream_iterator<uint64_t>(result, " "));
         LAYER_LOG(" - Frames: %s", result.str().c_str());
     }
+
+    LAYER_LOG(" - Counter sampling mode: %s", rawSampleMode.c_str());
+
+    if (samplingMode == COUNTER_SAMPLING_FRAMES)
+    {
+        LAYER_LOG(" - Frame serialization: %u", frameSerialization);
+    }
 }

 /* See header for documentation. */
@@ -131,18 +164,45 @@ LayerConfig::LayerConfig()
 bool LayerConfig::isFrameOfInterest(
     uint64_t frameID
 ) const {
-    switch(mode)
+    switch(frameMode)
     {
-    case MODE_DISABLED:
+    case FRAME_SELECTION_DISABLED:
         return false;
-    case MODE_PERIODIC_FRAME:
+    case FRAME_SELECTION_PERIODIC:
         return (frameID >= periodicMinFrame) &&
                ((frameID % periodicFrame) == 0);
-    case MODE_FRAME_LIST:
+    case FRAME_SELECTION_LIST:
         return isIn(frameID, specificFrames);
     }

     // Should never reach here
     return false;
 }

+/* See header for documentation. */
+bool LayerConfig::isSamplingWorkloads() const
+{
+    return frameMode != FRAME_SELECTION_DISABLED &&
+           samplingMode == COUNTER_SAMPLING_WORKLOADS;
+}
+
+/* See header for documentation. */
+bool LayerConfig::isSamplingFrames() const
+{
+    return frameMode != FRAME_SELECTION_DISABLED &&
+           samplingMode == COUNTER_SAMPLING_FRAMES;
+}
+
+/* See header for documentation. */
+bool LayerConfig::isSamplingAny() const
+{
+    return frameMode != FRAME_SELECTION_DISABLED &&
+           samplingMode != COUNTER_SAMPLING_DISABLED;
+}
+
+/* See header for documentation. */
+bool LayerConfig::isSerializingFrames() const
+{
+    return isSamplingWorkloads() ||
+           (isSamplingFrames() && frameSerialization);
+};
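
The predicates added in this file combine two independent settings, so a few worked cases help confirm the intended behavior: per-workload sampling always serializes, while per-frame sampling serializes only when `frame_serialization` is true. The following is a minimal sketch using hypothetical types and free functions (`SamplingModel`, `samplesWorkloads`, `samplesFrames`, `serializesFrames`) that mirror the logic in the diff. It is an illustration, not the layer's code.

```cpp
// Hypothetical stand-ins for the frame_mode and sample_mode config values.
enum class FrameSelect { Disabled, Periodic, List };
enum class CounterSample { Disabled, Workload, Frame };

// Hypothetical model of the three settings that drive serialization.
struct SamplingModel
{
    FrameSelect frameMode;
    CounterSample sampleMode;
    bool frameSerialization;
};

// True if counters are sampled around every workload in a frame of interest.
bool samplesWorkloads(const SamplingModel& m)
{
    return m.frameMode != FrameSelect::Disabled &&
           m.sampleMode == CounterSample::Workload;
}

// True if counters are sampled once per frame of interest.
bool samplesFrames(const SamplingModel& m)
{
    return m.frameMode != FrameSelect::Disabled &&
           m.sampleMode == CounterSample::Frame;
}

// True if the sampled region is isolated from neighboring work.
// Per-workload sampling ignores the frameSerialization flag.
bool serializesFrames(const SamplingModel& m)
{
    return samplesWorkloads(m) ||
           (samplesFrames(m) && m.frameSerialization);
}
```

Under this model, setting `frame_serialization` to `false` only changes the result when `sample_mode` is `frame`, and disabling frame selection turns off serialization regardless of the other two settings.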

layer_gpu_profile/source/layer_config.hpp

Lines changed: 56 additions & 8 deletions
@@ -54,19 +54,57 @@ class LayerConfig
      *
      * @param frameID   The index of the next frame.
      *
-     * @return True if profiling should be enabled, False otherwise.
+     * @return @c true if profiling should be enabled, @c false otherwise.
      */
     bool isFrameOfInterest(uint64_t frameID) const;

+    /**
+     * @brief Test if we are sampling workloads.
+     *
+     * @return @c true if profiling workloads, @c false otherwise.
+     */
+    bool isSamplingWorkloads() const;
+
+    /**
+     * @brief Test if we are sampling frames.
+     *
+     * @return @c true if profiling frames, @c false otherwise.
+     */
+    bool isSamplingFrames() const;
+
+    /**
+     * @brief Test if any kind of sampling is active.
+     *
+     * @return @c true if profiling, @c false otherwise.
+     */
+    bool isSamplingAny() const;
+
+    /**
+     * @brief Test if we are serializing frames.
+     *
+     * @return @c true if serializing, @c false otherwise.
+     */
+    bool isSerializingFrames() const;
+
 private:
     /**
-     * @brief Supported sampling modes.
+     * @brief Supported frame selection modes.
+     */
+    enum FrameSelectionMode
+    {
+        FRAME_SELECTION_DISABLED,
+        FRAME_SELECTION_LIST,
+        FRAME_SELECTION_PERIODIC
+    };
+
+    /**
+     * @brief Supported counter sampling modes.
      */
-    enum SamplingMode
+    enum CounterSamplingMode
     {
-        MODE_DISABLED,
-        MODE_FRAME_LIST,
-        MODE_PERIODIC_FRAME
+        COUNTER_SAMPLING_DISABLED,
+        COUNTER_SAMPLING_WORKLOADS,
+        COUNTER_SAMPLING_FRAMES
     };

     /**
/**
@@ -79,9 +117,19 @@ class LayerConfig
     void parseSamplingOptions(const json& config);

     /**
-     * @brief The sampling mode.
+     * @brief The frame selection mode.
+     */
+    FrameSelectionMode frameMode {FRAME_SELECTION_DISABLED};
+
+    /**
+     * @brief The counter sampling mode.
+     */
+    CounterSamplingMode samplingMode {COUNTER_SAMPLING_DISABLED};
+
+    /**
+     * @brief The frame sample serialization mode.
      */
-    SamplingMode mode {MODE_DISABLED};
+    bool frameSerialization {true};

     /**
      * @brief The sampling period in frames, or 0 if disabled.
