@akroviakov
Contributor

@akroviakov akroviakov commented Nov 26, 2024

Consider these devices visible to OpenCL (CPU and GPU):

Platform: Intel(R) OpenCL
  Device: Intel(R) Xeon(R) Gold 6438Y+
Platform: Intel(R) OpenCL Graphics
  Device: Intel(R) Data Center GPU Max 1100

Running the mlp.mlir example on current main prints an error (-1, i.e. CL_DEVICE_NOT_FOUND) for the CPU platform, although the test still runs. The error appears because every platform is queried for devices with CL_DEVICE_TYPE_GPU, and the CPU-only platform has none. Instead, we can first read all devices using CL_DEVICE_TYPE_ALL and then use the CL_DEVICE_TYPE property to select the GPUs; this way, no error is displayed.
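The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `DeviceRecord` and `selectGpus` are hypothetical names, and the constants are local stand-ins that mirror the bit values from `<CL/cl.h>` so the sketch builds without an OpenCL SDK. In real code the device list would come from `clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, ...)` and each type from `clGetDeviceInfo(device, CL_DEVICE_TYPE, ...)`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins mirroring <CL/cl.h> (assumption: used here only so the
// sketch compiles without the OpenCL headers).
using DeviceType = std::uint64_t;                    // cl_device_type is a 64-bit bitfield
constexpr DeviceType kDeviceTypeCpu = DeviceType{1} << 1; // CL_DEVICE_TYPE_CPU
constexpr DeviceType kDeviceTypeGpu = DeviceType{1} << 2; // CL_DEVICE_TYPE_GPU

// Hypothetical record for one enumerated device. In real code `type`
// would be filled by clGetDeviceInfo(device, CL_DEVICE_TYPE, ...).
struct DeviceRecord {
  std::string name;
  DeviceType type;
};

// Keep only the devices whose type bitfield has the GPU bit set.
// A platform that exposes no GPU simply contributes nothing, so no
// error needs to be reported for it.
std::vector<DeviceRecord> selectGpus(const std::vector<DeviceRecord> &all) {
  std::vector<DeviceRecord> gpus;
  for (const auto &dev : all) {
    if ((dev.type & kDeviceTypeGpu) != 0)
      gpus.push_back(dev);
  }
  return gpus;
}
```

With the two devices listed above, `selectGpus` would keep only the Data Center GPU Max 1100 entry and drop the Xeon entry.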

@akroviakov akroviakov force-pushed the akroviak/gc-gpu-rt-device-type branch from 80f8eaf to afc9141 Compare November 26, 2024 14:06
@AndreyPavlenko
Contributor

AndreyPavlenko commented Nov 26, 2024

Does that mean that clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 0, nullptr, &numDevices) returns an error? Could it be an environment configuration issue?
Btw, mlp.mlir is run by CI and it does not fail.

@AndreyPavlenko
Contributor

Sorry, I closed it accidentally.

@kurapov-peter
Contributor

What's the use case for running the CPU through OpenCL anyway? The runtime is not supposed to be called during CPU execution.

@akroviakov
Contributor Author

I build like this:

git clone ...
./scripts/compile.sh --dev --imex
cd build
cmake --build . --target gc-check

then I run mlp.mlir like this:

/home/.../graph-compiler/build/bin/gc-gpu-runner --shared-libs=/home/.../graph-compiler/externals/llvm-project/build/lib/libmlir_runner_utils.so /home/.../graph-compiler/test/mlir/test/gc/gpu-runner/mlp.mlir

Assuming I am using gpu-runner in a way I am not supposed to, what would be the correct command to run a single arbitrary .mlir test for GPU then?

@AndreyPavlenko
Contributor

Assuming I am using gpu-runner in a way I am not supposed to, what would be the correct command to run a single arbitrary .mlir test for GPU then?

You are using it in the right way and the same command works for me:

$ ./bin/gc-gpu-runner --shared-libs=../../../graph-compiler/externals/llvm-project/build/Release/lib/libmlir_runner_utils.so ../../../graph-compiler/test/mlir/test/gc/gpu-runner/mlp.mlir
Unranked Memref base@ = 0x55c191cd5180 rank = 2 offset = 0 sizes = [1, 10] strides = [10, 1] data =
[[0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1]]

Does this command fail for you? What's the failure?

@akroviakov
Contributor Author

akroviakov commented Nov 27, 2024

As I mentioned at the beginning, the test runs:

~/graph-compiler/build$ /home/.../graph-compiler/build/bin/gc-gpu-runner --shared-libs=/home/.../graph-compiler/externals/llvm-project/build/lib/libmlir_runner_utils.so /home/.../graph-compiler/test/mlir/test/gc/gpu-runner/mlp.mlir 
[ERROR] [/home/.../graph-compiler/lib/gc/ExecutionEngine/GPURuntime/ocl/GpuOclRuntime.cpp:357] Failed to get the number of devices on the platform.0x55bc84c46970 Error: -1
Unranked Memref base@ = 0x55bc85e94480 rank = 2 offset = 0 sizes = [1, 10] strides = [10, 1] data = 
[[0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1,   0.1]]

However, it also displays an error for one of the platforms. This error is displayed because we ask the following platform:

Platform: Intel(R) OpenCL
  Device: Intel(R) Xeon(R) Gold 6438Y+

for device IDs of type CL_DEVICE_TYPE_GPU, which triggers:

    if (err != CL_SUCCESS) {
      gcLogE("Failed to get the number of devices on the platform.", platform,
             " Error: ", err);
      continue;
    }

So is it ok for GpuOclRuntime to see platforms that are not OpenCL Graphics (e.g., by simply iterating over them)? If yes, does that mean that I can simply ignore the error?

@AndreyPavlenko
Contributor

does that mean that I can simply ignore the error?

Got it. The error message is redundant here. It should be either replaced with a debug message or removed.
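One way to read this suggestion, sketched under assumptions: treat CL_DEVICE_NOT_FOUND (-1) from the device-count query as an expected outcome worth at most a debug message, and keep the error level for any other failure code. The `LogLevel` enum and `levelForDeviceQuery` helper are hypothetical and not part of GpuOclRuntime; the constants are local stand-ins for the values in `<CL/cl.h>`. Keeping an error for other codes is a design choice of this sketch, not something decided in the thread.

```cpp
// Stand-ins mirroring <CL/cl.h> (assumption: defined locally so the
// sketch compiles without the OpenCL headers).
constexpr int kClSuccess = 0;         // CL_SUCCESS
constexpr int kClDeviceNotFound = -1; // CL_DEVICE_NOT_FOUND

// Hypothetical severity levels for the message printed after a
// device-count query on one platform.
enum class LogLevel { None, Debug, Error };

// Decide how loudly to report the result of asking a platform for
// GPU devices. A platform without GPUs (e.g. a CPU-only platform)
// is expected, so it is downgraded to a debug message.
LogLevel levelForDeviceQuery(int err) {
  if (err == kClSuccess)
    return LogLevel::None;
  if (err == kClDeviceNotFound)
    return LogLevel::Debug;
  return LogLevel::Error;
}
```

The caller would still `continue` to the next platform on any non-success code, as the existing loop does; only the logging severity changes.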

@akroviakov akroviakov closed this Jan 15, 2025