[XPU] Changed how XPU discovery works during setup.py (#720)

Egor-Krivov · web-flow · commit d71297f3ddd4 · 2025-05-21T10:25:58.000-07:00
## Summary

Right now we check `xpu-smi` during installation to find out whether the machine has an XPU device. But `xpu-smi` is often missing from user machines, so we end up detecting the wrong platform (`cpu`) and then try to install the wrong `triton` dependency. The user ends up with `triton-xpu` being overwritten by `triton`, and this combination doesn't work.

So I also check `sycl-ls`, which should be available, and look for a `level_zero:gpu` entry in its output.

The output I get on a PVC machine:

```
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Data Center GPU Max 1100 12.60.7 [1.6.32567+18]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6438Y+ OpenCL 3.0 (Build 0) [2024.18.12.0.05_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Data Center GPU Max 1100 OpenCL 3.0 NEO [25.05.32567]
```

The output I get on a B570 machine:

```
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B570 Graphics 20.1.0 [1.6.32567+19]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 265K OpenCL 3.0 (Build 0) [2025.19.4.0.18_160000.xmain-hotfix]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) B570 Graphics OpenCL 3.0 NEO [25.05.32567]
```

## Possible alternative

We could just import PyTorch and check `torch.xpu.is_available()`. It might even be better that way, since right now, if the user is missing torch, we will try to install CUDA torch even on XPU devices.

## Testing Done

I tested it by installing from source in a new conda environment with:

```
pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/xpu --no-cache-dir
pip install -e .
```

- Hardware Type: XPU
- [ ] run `make test` to ensure correctness
- [ ] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
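For reference, the torch-based alternative mentioned in the description could look roughly like the sketch below. It is not part of this PR: the helper name `is_xpu_available_via_torch` is hypothetical, and the sketch assumes that a missing torch install, or a torch build without the `xpu` module, should simply count as "no XPU".

```python
def is_xpu_available_via_torch() -> bool:
    """Hypothetical helper (not in this PR): ask PyTorch whether an XPU is usable."""
    try:
        import torch  # may be absent at install time, e.g. in an isolated build
    except ImportError:
        return False

    # Older or non-XPU builds may not expose torch.xpu at all, so look it up defensively.
    xpu = getattr(torch, "xpu", None)
    return bool(xpu is not None and xpu.is_available())
```

Because torch is often not importable while `setup.py` runs, a check like this would probably need to be combined with a torch-free probe such as the `sycl-ls` one rather than replace it.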
diff --git a/setup.py b/setup.py
@@ -47,6 +47,27 @@ def get_optional_dependencies():
     }
 
 
+def is_xpu_available():
+    """
+    Check if Intel XPU is available.
+    xpu-smi is often missing right now.
+    """
+    try:
+        subprocess.run(["xpu-smi"], check=True)
+        return True
+    except (subprocess.SubprocessError, FileNotFoundError):
+        pass
+
+    try:
+        result = subprocess.run("sycl-ls", check=True, capture_output=True, shell=True)
+        if 'level_zero:gpu' in result.stdout.decode():
+            return True
+    except (subprocess.SubprocessError, FileNotFoundError):
+        pass
+
+    return False
+
+
 def get_platform() -> Literal["cuda", "rocm", "cpu", "xpu"]:
     """
     Detect whether the system has NVIDIA or AMD GPU without torch dependency.
@@ -63,11 +84,10 @@ def get_platform() -> Literal["cuda", "rocm", "cpu", "xpu"]:
             print("ROCm GPU detected")
             return "rocm"
         except (subprocess.SubprocessError, FileNotFoundError):
-            try:
-                subprocess.run(["xpu-smi"], check=True)
+            if is_xpu_available():
                 print("Intel GPU detected")
                 return "xpu"
-            except (subprocess.SubprocessError, FileNotFoundError):
+            else:
                 print("No GPU detected")
                 return "cpu"
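The `sycl-ls` check in `is_xpu_available()` can be exercised without any hardware by separating the output parsing from the subprocess call. The sketch below is an illustration only: `has_level_zero_gpu` is a hypothetical helper, not code from this PR, and it matches the `[level_zero:gpu]` tag at the start of a line, which is slightly stricter than the substring test in the diff. The sample strings are copied from the outputs quoted in the description.

```python
def has_level_zero_gpu(sycl_ls_output: str) -> bool:
    """Hypothetical helper: True if any `sycl-ls` line lists a Level Zero GPU."""
    return any(
        line.lstrip().startswith("[level_zero:gpu]")
        for line in sycl_ls_output.splitlines()
    )


# Sample taken from the B570 output quoted in the PR description.
B570_OUTPUT = """\
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B570 Graphics 20.1.0 [1.6.32567+19]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 265K OpenCL 3.0 (Build 0) [2025.19.4.0.18_160000.xmain-hotfix]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) B570 Graphics OpenCL 3.0 NEO [25.05.32567]
"""

# The same output reduced to its CPU-only line, to simulate a machine without a GPU.
CPU_ONLY_OUTPUT = """\
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 265K OpenCL 3.0 (Build 0) [2025.19.4.0.18_160000.xmain-hotfix]
"""

print(has_level_zero_gpu(B570_OUTPUT))
print(has_level_zero_gpu(CPU_ONLY_OUTPUT))
```

Note that a machine exposing only an `opencl:gpu` entry would be reported as having no XPU by both this sketch and the check in the diff.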