add control to force non-profiling command queues (#162)
This can be used for performance analysis, but may cause errors if
an application requires event profiling. If this control is set
along with another control that requires event profiling, such as
DevicePerformanceTiming, then the other control will override this
control, but this behavior may change in the future and should not
be relied upon!
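
The precedence described above can be sketched as follows. This is a minimal illustration, not the Intercept Layer's actual implementation: the `Controls` struct, the `adjustQueueProperties` helper, and the `k`-prefixed constants are hypothetical names, and the bit values are copied from the OpenCL headers so the sketch compiles without an OpenCL SDK.

```cpp
#include <cstdint>

// Stand-in for cl_command_queue_properties (a 64-bit bitfield in cl.h).
using QueueProperties = std::uint64_t;

// Same bit values as the OpenCL header constants named in the comments.
constexpr QueueProperties kOutOfOrder = 1u << 0;  // CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
constexpr QueueProperties kProfiling  = 1u << 1;  // CL_QUEUE_PROFILING_ENABLE

// Hypothetical subset of the controls, for illustration only.
struct Controls {
    bool NoProfilingQueue = false;
    bool DevicePerformanceTiming = false;
};

// Hypothetical helper: computes the properties a queue would be created with.
constexpr QueueProperties adjustQueueProperties(QueueProperties requested,
                                                const Controls& controls) {
    QueueProperties props = requested;
    if (controls.NoProfilingQueue) {
        props &= ~kProfiling;  // strip event profiling support
    }
    if (controls.DevicePerformanceTiming) {
        props |= kProfiling;   // a control that needs profiling wins (for now)
    }
    return props;
}

// NoProfilingQueue alone removes the profiling bit and keeps the others.
static_assert(adjustQueueProperties(kProfiling | kOutOfOrder,
                                    Controls{true, false}) == kOutOfOrder,
              "profiling bit should be cleared");
// With both controls set, profiling is re-enabled.
static_assert(adjustQueueProperties(kOutOfOrder, Controls{true, true}) ==
                  (kOutOfOrder | kProfiling),
              "DevicePerformanceTiming should override NoProfilingQueue");
```

Applying the override last keeps the sketch in line with the commit message, which also warns that this ordering may change and should not be relied upon.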
docs/controls.md: 4 additions & 0 deletions
@@ -584,6 +584,10 @@ If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a
If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created in-order. This can be used for performance analysis, but may lead to deadlocks in some cases.
+##### `NoProfilingQueue` (bool)
+
+If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created without event profiling support. This can be used for performance analysis, but may lead to errors if the application requires event profiling.
+
##### `NullEnqueue` (bool)
If set to a nonzero value, the Intercept Layer for OpenCL Applications will silently ignore any enqueue. This can be used for performance analysis, but will likely cause errors if the application relies on any sort of information from OpenCL events and should be used carefully.
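
To make the `NoProfilingQueue` warning about errors concrete: per the OpenCL specification, `clGetEventProfilingInfo()` returns `CL_PROFILING_INFO_NOT_AVAILABLE` when the event's queue was created without `CL_QUEUE_PROFILING_ENABLE`. The sketch below mocks that rule so it compiles without an OpenCL SDK; `mockProfilingStatus` and `canReadTimestamps` are hypothetical names, and only the numeric constants are taken from the OpenCL headers.

```cpp
#include <cstdint>

// Stand-in for cl_command_queue_properties.
using QueueProperties = std::uint64_t;

constexpr QueueProperties kProfiling = 1u << 1;  // CL_QUEUE_PROFILING_ENABLE
constexpr int kSuccess = 0;                      // CL_SUCCESS
constexpr int kProfilingInfoNotAvailable = -7;   // CL_PROFILING_INFO_NOT_AVAILABLE

// Mock of the status clGetEventProfilingInfo() reports for a completed,
// non-user event, based only on the properties of the queue it came from.
constexpr int mockProfilingStatus(QueueProperties queueProps) {
    return (queueProps & kProfiling) ? kSuccess : kProfilingInfoNotAvailable;
}

// Hypothetical application-side check: treat missing profiling data as
// "no timestamps" instead of a fatal error.
constexpr bool canReadTimestamps(QueueProperties queueProps) {
    return mockProfilingStatus(queueProps) == kSuccess;
}

// A queue created with profiling yields timestamps ...
static_assert(canReadTimestamps(kProfiling), "profiling queue has timestamps");
// ... but the same request with the profiling bit stripped does not.
static_assert(!canReadTimestamps(kProfiling & ~kProfiling),
              "non-profiling queue has no timestamps");
```

An application that checks the returned status this way should tolerate the control; one that assumes profiling queries always succeed is the case the documentation warns about.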
intercept/src/controls.h: 1 addition & 0 deletions
@@ -156,6 +156,7 @@ CLI_CONTROL( bool, FinishAfterEnqueue, false, "If s
CLI_CONTROL( bool, FlushAfterEnqueue, false, "If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFlush() after every enqueue. The command queue that the command was just enqueued to is passed to clFlush(). This can also be used to debug possible timing or resource management issues and is slightly less obtrusive than FinishAfterEnqueue but still will likely impact performance. If both FinishAfterEnqueue and FlushAfterEnqueue are nonzero then the Intercept Layer for OpenCL Applications will only insert a call to clFinish() after every enqueue, because clFinish() implies clFlush()." )
CLI_CONTROL( bool, FlushAfterEnqueueBarrier, false, "If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFlush() after every barrier enqueue. The command queue that the command was just enqueued to is passed to clFlush(). This has been useful to debug out-of-order queue issues." )
CLI_CONTROL( bool, InOrderQueue, false, "If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created in-order. This can be used for performance analysis, but may lead to deadlocks in some cases." )
+CLI_CONTROL( bool, NoProfilingQueue, false, "If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created without event profiling support. This can be used for performance analysis, but may lead to errors if the application requires event profiling." )
CLI_CONTROL( bool, NullEnqueue, false, "If set to a nonzero value, the Intercept Layer for OpenCL Applications will silently ignore any enqueue. This can be used for performance analysis, but will likely cause errors if the application relies on any sort of information from OpenCL events and should be used carefully." )
CLI_CONTROL( bool, NullLocalWorkSize, false, "If set to a nonzero value, the Intercept Layer for OpenCL Applications will force the local work size argument to clEnqueueNDRangeKernel() to be NULL, which causes the OpenCL implementation to pick the local work size. Note that this control takes effect before NullLocalWorkSizeX / NullLocalWorkSizeY / NullLocalWorkSizeZ (see below), so enabling both controls will have the effect of forcing a specific local work size." )
CLI_CONTROL( size_t, NullLocalWorkSizeX, 0, "If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect." )
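
Each `CLI_CONTROL( type, name, default, description )` entry above looks like an X-macro: a single list that can be expanded several ways by redefining the macro. The diff does not show how the Intercept Layer itself expands it, so the following is only a sketch of the general technique. `HYPOTHETICAL_CONTROLS`, the `Controls` struct, the `ControlInfo` table, and the shortened descriptions are all assumptions made for illustration.

```cpp
// Hypothetical control list; the real list lives in controls.h and is
// not reproduced here. Each row: type, name, default value, description.
#define HYPOTHETICAL_CONTROLS(X)                                          \
    X(bool, InOrderQueue, false, "Force in-order queues.")                \
    X(bool, NoProfilingQueue, false, "Force non-profiling queues.")       \
    X(bool, NullEnqueue, false, "Silently ignore enqueues.")

// Expansion 1: one struct member per control, initialized to its default.
struct Controls {
#define DECLARE_MEMBER(type, name, init, desc) type name = init;
    HYPOTHETICAL_CONTROLS(DECLARE_MEMBER)
#undef DECLARE_MEMBER
};

// Expansion 2: count the controls by emitting "+1" for each row.
#define COUNT_ONE(type, name, init, desc) +1
constexpr int kNumControls = 0 HYPOTHETICAL_CONTROLS(COUNT_ONE);
#undef COUNT_ONE

// Expansion 3: a name/description table, e.g. for generating help text.
struct ControlInfo {
    const char* name;
    const char* description;
};
#define DESCRIBE(type, name, init, desc) {#name, desc},
constexpr ControlInfo kControlTable[] = {HYPOTHETICAL_CONTROLS(DESCRIBE)};
#undef DESCRIBE

static_assert(kNumControls == 3, "three rows in the hypothetical list");
static_assert(!Controls{}.NoProfilingQueue, "new control defaults to false");
```

With this pattern, adding a control is a one-line change to the list, which is consistent with the single added line in this hunk.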