diff --git a/sycl/doc/EnvironmentVariables.md b/sycl/doc/EnvironmentVariables.md
index 6c4bd8eab45e0..503fae5050338 100644
--- a/sycl/doc/EnvironmentVariables.md
+++ b/sycl/doc/EnvironmentVariables.md
@@ -272,6 +272,7 @@ older hardware or when SYCL_UR_USE_LEVEL_ZERO_V2=0 is set.
 | Environment variable | Values | Description | Adapter Support |
 | -------------------- | ------ | ----------- | --------------- |
 | `UR_L0_V2_FORCE_DISABLE_COPY_OFFLOAD` | Integer | By default, copy operations submitted to any queue can be offloaded to dedicated copy engines. Setting this variable instructs the driver to keep all copy operations on the engine behind the original queue. The default value is 0. | V2 |
+| `UR_L0_V2_DISABLE_ZE_LAUNCH_KERNEL_WITH_ARGS` | Integer | By default, `ZeCommandListAppendLaunchKernelWithArguments()` will be called. Setting this variable instructs the adapter to not call `ZeCommandListAppendLaunchKernelWithArguments()` and use the old path using `ZeCommandListAppendLaunchKernel()`. The default value is 0. | V2 |
 | `SYCL_PI_LEVEL_ZERO_SINGLE_THREAD_MODE` | Integer | A single-threaded app has an opportunity to enable this mode to avoid overhead from mutex locking in the Level Zero adapter. A value greater than 0 enables single thread mode. A value of 0 disables single thread mode. The default is 0. | Legacy |
 | `SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR` | [EnableBuffers][;[MaxPoolSize][;[host\|device\|shared:][MaxPoolableSize][,[Capacity][,SlabMinSize]]]...] | EnableBuffers enables pooling for SYCL buffers, default 1, set to 0 to disable. MaxPoolSize is the maximum size of the pool, by default there is no size limit. MemType is host, device, shared or read_only_shared. Other parameters are values specified as positive integers with optional K, M or G suffix. MaxPoolableSize is the maximum allocation size that may be pooled, default 0 for shared, 2MB for host, 4MB for device and read_only_shared. Capacity is the number of allocations in each size range freed by the program but retained in the pool for reallocation, default 4. Size ranges follow this pattern: 64, 96, 128, 192, and so on, i.e., powers of 2, with one range in between. SlabMinSize is the minimum allocation size, 64KB for host and device, 2MB for shared and read_only_shared. Example: SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR=1;32M;host:1M,4,64K;device:1M,4,64K;shared:0,0,2M| Legacy and V2 |
 | `SYCL_PI_LEVEL_ZERO_BATCH_SIZE` | Integer | Sets a preferred number of compute commands to batch into a command list before executing the command list. A value of 0 causes the batch size to be adjusted dynamically. A value greater than 0 specifies fixed size batching, with the batch size set to the specified value. The default is 0. | Legacy |
diff --git a/unified-runtime/source/adapters/level_zero/platform.cpp b/unified-runtime/source/adapters/level_zero/platform.cpp
index ef7590bf9f42c..4fd04c5962843 100644
--- a/unified-runtime/source/adapters/level_zero/platform.cpp
+++ b/unified-runtime/source/adapters/level_zero/platform.cpp
@@ -538,6 +538,10 @@ ur_result_t ur_platform_handle_t_::initialize() {
       .DriverSupportsCooperativeKernelLaunchWithArgs =
       this->isDriverVersionNewerOrSimilar(1, 6, 35005);
 
+  ZeCommandListAppendLaunchKernelWithArgumentsExt
+      .DisableZeLaunchKernelWithArgs =
+      getenv_tobool("UR_L0_V2_DISABLE_ZE_LAUNCH_KERNEL_WITH_ARGS", false);
+
   return UR_RESULT_SUCCESS;
 }
 
diff --git a/unified-runtime/source/adapters/level_zero/platform.hpp b/unified-runtime/source/adapters/level_zero/platform.hpp
index 81d7528cfb250..fb9b9024abb8a 100644
--- a/unified-runtime/source/adapters/level_zero/platform.hpp
+++ b/unified-runtime/source/adapters/level_zero/platform.hpp
@@ -166,5 +166,6 @@ struct ur_platform_handle_t_ : ur::handle_base,
   struct ZeCommandListAppendLaunchKernelWithArgumentsExtension {
     bool Supported = false;
     bool DriverSupportsCooperativeKernelLaunchWithArgs = false;
+    bool DisableZeLaunchKernelWithArgs = false;
   } ZeCommandListAppendLaunchKernelWithArgumentsExt;
 };
diff --git a/unified-runtime/source/adapters/level_zero/v2/command_list_manager.cpp b/unified-runtime/source/adapters/level_zero/v2/command_list_manager.cpp
index b56c52bdda0c7..3cd3d6a74ea71 100644
--- a/unified-runtime/source/adapters/level_zero/v2/command_list_manager.cpp
+++ b/unified-runtime/source/adapters/level_zero/v2/command_list_manager.cpp
@@ -1256,8 +1256,11 @@ ur_result_t ur_command_list_manager::appendKernelLaunchWithArgsExp(
   bool CooperativeCompatible =
       hPlatform->ZeCommandListAppendLaunchKernelWithArgumentsExt
           .DriverSupportsCooperativeKernelLaunchWithArgs;
+  bool DisableZeLaunchKernelWithArgs =
+      hPlatform->ZeCommandListAppendLaunchKernelWithArgumentsExt
+          .DisableZeLaunchKernelWithArgs;
   bool RunNewPath =
-      KernelWithArgsSupported &&
+      !DisableZeLaunchKernelWithArgs && KernelWithArgsSupported &&
       (!cooperativeKernelLaunchRequested ||
        (cooperativeKernelLaunchRequested && CooperativeCompatible));
   if (RunNewPath) {