38 changes: 35 additions & 3 deletions README.md
@@ -17,6 +17,7 @@ You can read about [kudos](https://github.com/Haidra-Org/haidra-assets/blob/main
- [Option 2: Without Git](#option-2-without-git)
- [Linux](#linux)
- [AMD GPUs](#amd-gpus)
- [Intel Arc / XPU](#intel-arc--xpu)
- [DirectML](#directml)
- [Configuration](#configuration)
- [Basic Settings](#basic-settings)
@@ -88,6 +89,15 @@ AMD support is experimental, and **Linux-only** for now:
- [WSL support](README_advanced.md#advanced-users-amd-rocm-inside-windows-wsl) is highly experimental.
- Join the [AMD discussion on Discord](https://discord.com/channels/781145214752129095/1076124012305993768) if you're interested in trying.

### Intel Arc / XPU

Intel Arc support is available on **Linux** through PyTorch XPU:

- Use `update-runtime-xpu.sh` and `horde-bridge-xpu.sh`.
- Install the Intel GPU driver and Level Zero runtime on the host OS before running the worker.
- If you have multiple Intel GPUs, set `ONEAPI_DEVICE_SELECTOR` before launching the worker.
- Safety checks currently stay on the CPU on XPU, so keep `safety_on_gpu: false`.
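
A quick way to confirm the XPU stack is wired up before starting the worker is a small check inside the worker's Python environment. This helper is hypothetical (not part of the worker); `torch.xpu` only exists on XPU-enabled PyTorch builds:

```python
# Hypothetical sanity check for the Intel XPU stack; not part of the worker.
def xpu_status() -> str:
    try:
        import torch
    except ImportError:
        return "torch not installed"
    # torch.xpu is only present on XPU-enabled PyTorch builds.
    if not hasattr(torch, "xpu") or not torch.xpu.is_available():
        return "xpu not available"
    return f"xpu ok: {torch.xpu.device_count()} device(s)"

if __name__ == "__main__":
    print(xpu_status())
```

If this prints anything other than `xpu ok: ...`, fix the driver/runtime install before troubleshooting the worker itself.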

### DirectML

**Experimental** support for DirectML has been added. See [Running on DirectML](README_advanced.md#advanced-users-running-on-directml) for more information and further instructions. You can now follow this guide using `update-runtime-directml.cmd` and `horde-bridge-directml.cmd` where appropriate. Please note that DirectML is several times slower than *any* other method of running the worker.
@@ -133,6 +143,18 @@ Tailor settings to your GPU, following these pointers:
- max_batch: 4 # Or higher
```

- **Intel Arc A770 (16GB, Linux/XPU)**:

```yaml
- queue_size: 1
- safety_on_gpu: false # XPU keeps the safety stack on CPU for now
- moderate_performance_mode: true
- unload_models_from_vram_often: false
- max_threads: 1
- max_power: 40
- max_batch: 4
```

- **8-10GB VRAM** (e.g. 2080, 3060, 4060, 4060 Ti):

```yaml
@@ -176,6 +198,7 @@ Tailor settings to your GPU, following these pointers:
1. Install the worker as described in the [Installation](#installation) section.
2. Run `horde-bridge.cmd` (Windows) or `horde-bridge.sh` (Linux).
- **AMD**: Use `horde-bridge-rocm` versions.
- **Intel Arc / XPU**: Use `horde-bridge-xpu.sh`.

### Stopping

@@ -208,13 +231,20 @@ CUDA_VISIBLE_DEVICES=0 ./horde-bridge.sh -n "Instance 1"
CUDA_VISIBLE_DEVICES=1 ./horde-bridge.sh -n "Instance 2"
```

For Intel XPU, select the visible GPU with `ONEAPI_DEVICE_SELECTOR`:

```bash
ONEAPI_DEVICE_SELECTOR=level_zero:gpu:0 ./horde-bridge-xpu.sh -n "Arc A770 #1"
ONEAPI_DEVICE_SELECTOR=level_zero:gpu:1 ./horde-bridge-xpu.sh -n "Arc A770 #2"
```

**Warning**: High RAM (32-64GB+) is needed for multiple workers. `queue_size` and `max_threads` greatly impact RAM per worker.

## Updating

The worker is constantly improving. Follow development and get update notifications in our [Discord](https://discord.gg/3DxrhksKzn).

Script names below assume Windows (`.cmd`) and NVIDIA. For Linux use `.sh`, for AMD use `-rocm` versions.
Script names below assume Windows (`.cmd`) and NVIDIA. For Linux use `.sh`, for AMD use `-rocm` versions, and for Intel Arc use `-xpu` versions.

### Updating the Worker

@@ -234,8 +264,9 @@ Script names below assume Windows (`.cmd`) and NVIDIA. For Linux use `.sh`, for
> **Warning**: Some antivirus software (e.g. Avast) may interfere with the update. If you get `CRYPT_E_NO_REVOCATION_CHECK` errors, disable antivirus, retry, then re-enable.

4. Run `update-runtime` for your OS to update dependencies.
- Not all updates require this, but run it if unsure
- **Advanced users**: see [README_advanced.md](README_advanced.md) for manual options
- **Intel Arc / XPU**: Use `update-runtime-xpu.sh`
- Not all updates require this, but run it if unsure
- **Advanced users**: see [README_advanced.md](README_advanced.md) for manual options
> **Review comment on lines +268 to +269** (Copilot AI, Apr 12, 2026): The nested bullets under "Run `update-runtime` for your OS…" are mis-indented, so the Markdown list renders inconsistently (the "Not all updates…" and "Advanced users…" bullets don't align under step 4). Align the indentation so these remain sub-bullets of step 4:
>
> ```
>     - Not all updates require this, but run it if unsure
>     - **Advanced users**: see [README_advanced.md](README_advanced.md) for manual options
> ```
5. [Start the worker](#starting) again

## Custom Models
@@ -293,6 +324,7 @@ Check the [#local-workers Discord channel](https://discord.com/channels/78114521
Common issues and fixes:

- **Download failures**: Check disk space and internet connection.
- **Intel Arc / XPU not detected**: Confirm the Intel GPU driver and Level Zero runtime are installed, then check that `torch.xpu.is_available()` is true inside the worker environment.
- **Job timeouts**:
- Remove large models (Flux, Cascade, SDXL)
- Lower `max_power`
12 changes: 10 additions & 2 deletions README_advanced.md
@@ -139,7 +139,7 @@ HSA Agents

### Prerequisites
* Install [git](https://git-scm.com/) on your system.
* Install CUDA/RoCM if you haven't already.
* Install CUDA/RoCM/Intel XPU drivers if you haven't already.
* Install Python 3.10 or 3.11.
* If using the official Python installer **and** you do not already regularly use Python, be sure to check the box that says `Add python.exe to PATH` on the first screen.
* We **strongly recommend** you configure at least 8GB (preferably 16GB+) of swap space. This recommendation applies to Linux too.
@@ -159,16 +159,24 @@ HSA Agents
- Install the requirements:
- CUDA: `pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu128`
- RoCM: `pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/rocm6.2`
- Intel XPU: `pip install -r requirements.txt --index-url https://download.pytorch.org/whl/xpu --extra-index-url https://pypi.org/simple`
- Intel XPU requires the Intel GPU driver and Level Zero runtime on the host OS.
- If you need to pin a specific Intel GPU, set `ONEAPI_DEVICE_SELECTOR` (for example `level_zero:gpu:0`) before running the commands below.

### Run worker
- Set your config now, copying `bridgeData_template.yaml` to `bridgeData.yaml`, being sure to set an API key and worker name at a minimum
- `python download_models.py` (**critical - must be run first every time**)
- `python run_worker.py` (to start working)
- Intel XPU manual invocation:
- `python download_models.py --xpu`
- `python run_worker.py --xpu`
- Keep `safety_on_gpu: false`, because the current safety stack uses CPU on XPU.

Pressing Ctrl+C will stop the worker, but it will first complete any jobs in progress before exiting. Please avoid hard-killing it unless you are seeing many major errors. You can force-kill by pressing Ctrl+C repeatedly or sending a SIGKILL.

### Important note if you manually manage your venvs
- You should be running `python -m pip install -r requirements.txt -U https://download.pytorch.org/whl/cu128` every time you `git pull`. (Use `/whl/rocm6.2` instead if applicable)
- You should be running `python -m pip install -r requirements.txt -U --extra-index-url https://download.pytorch.org/whl/cu128` every time you `git pull`.
- Use `--extra-index-url https://download.pytorch.org/whl/rocm6.2` for RoCM or `--index-url https://download.pytorch.org/whl/xpu --extra-index-url https://pypi.org/simple` for Intel XPU.


## Advanced users, running on directml
24 changes: 23 additions & 1 deletion download_models.py
@@ -1,6 +1,7 @@
import argparse

from horde_worker_regen.download_models import download_all_models
from horde_worker_regen.runtime_backend import HordeRuntimeBackend
from horde_worker_regen.version_meta import do_version_check

if __name__ == "__main__":
@@ -23,13 +24,34 @@
default=None,
help="Enable directml and specify device to use.",
)
parser.add_argument(
"--xpu",
action="store_true",
default=False,
help="Enable Intel XPU support for Arc and other Intel GPUs.",
)
parser.add_argument(
"--oneapi-device-selector",
type=str,
default=None,
help="Restrict Intel XPU visibility using ONEAPI_DEVICE_SELECTOR, e.g. level_zero:gpu:0.",
)

args = parser.parse_args()

try:
backend = HordeRuntimeBackend(
directml=args.directml,
xpu=args.xpu,
oneapi_device_selector=args.oneapi_device_selector,
)
except ValueError as e:
parser.error(str(e))

do_version_check()

download_all_models(
purge_unused_loras=args.purge_unused_loras,
load_config_from_env_vars=args.load_config_from_env_vars,
directml=args.directml,
backend=backend,
)
9 changes: 9 additions & 0 deletions environment.xpu.yaml
@@ -0,0 +1,9 @@
name: ldm
channels:
- conda-forge
- defaults
# Minimal environment for Intel XPU. PyTorch and the rest of the stack are installed with pip.
dependencies:
- git
- pip
- python==3.11
50 changes: 50 additions & 0 deletions horde-bridge-xpu.sh
@@ -0,0 +1,50 @@
#!/bin/bash
# Get the directory of the current script
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

# Build the absolute path to the Conda environment
CONDA_ENV_PATH="$SCRIPT_DIR/conda/envs/linux/lib"

# Add the Conda environment to LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CONDA_ENV_PATH:$LD_LIBRARY_PATH"

# List of directories to check
dirs=(
"/usr/lib"
"/usr/local/lib"
"/lib"
"/lib64"
"/usr/lib/x86_64-linux-gnu"
)

# Check each directory
for dir in "${dirs[@]}"; do
if [ -f "$dir/libjemalloc.so.2" ]; then
export LD_PRELOAD="$dir/libjemalloc.so.2"
printf "Using jemalloc from %s\n" "$dir"
break
fi
done

# If jemalloc was not found, print a warning
if [ -z "$LD_PRELOAD" ]; then
printf "WARNING: jemalloc not found. You may run into memory issues! We recommend running 'sudo apt install libjemalloc2'\n"
read -n 1 -s -r -p "Press q to quit or any other key to continue: " key
if [ "$key" = "q" ]; then
printf "\n"
exit 1
fi
fi

XPU_ARGS=(--xpu)
if [ -n "${ONEAPI_DEVICE_SELECTOR:-}" ]; then
XPU_ARGS+=("--oneapi-device-selector=${ONEAPI_DEVICE_SELECTOR}")
printf "Using ONEAPI_DEVICE_SELECTOR=%s\n" "$ONEAPI_DEVICE_SELECTOR"
fi

if "$SCRIPT_DIR/runtime-xpu.sh" python -s "$SCRIPT_DIR/download_models.py" "${XPU_ARGS[@]}"; then
echo "Model Download OK. Starting worker..."
"$SCRIPT_DIR/runtime-xpu.sh" python -s "$SCRIPT_DIR/run_worker.py" "${XPU_ARGS[@]}" "$@"
else
echo "download_models.py exited with error code. Aborting"
fi
10 changes: 7 additions & 3 deletions horde_worker_regen/download_models.py
@@ -1,15 +1,20 @@
"""Contains the code to download all models specified in the config file. Executable as a standalone script."""

from horde_worker_regen.runtime_backend import HordeRuntimeBackend


def download_all_models(
*,
load_config_from_env_vars: bool = False,
purge_unused_loras: bool = False,
directml: int | None = None,
backend: HordeRuntimeBackend | None = None,
) -> None:
"""Download all models specified in the config file."""
from horde_worker_regen.load_env_vars import load_env_vars_from_config

backend = backend or HordeRuntimeBackend()
backend.apply_environment()

if not load_config_from_env_vars:
load_env_vars_from_config()

@@ -57,8 +62,7 @@ def download_all_models(
del _

extra_comfyui_args = []
if directml is not None:
extra_comfyui_args.append(f"--directml={directml}")
backend.append_comfyui_args(extra_comfyui_args)

hordelib.initialise(extra_comfyui_args=extra_comfyui_args)
from hordelib.shared_model_manager import SharedModelManager
11 changes: 6 additions & 5 deletions horde_worker_regen/process_management/inference_process.py
@@ -40,6 +40,7 @@
HordeProcessState,
ModelLoadState,
)
from horde_worker_regen.runtime_backend import HordeRuntimeBackend, clear_torch_cache

if TYPE_CHECKING:
from hordelib.horde import HordeLib, ProgressReport, ResultingImageReturn
@@ -80,6 +81,7 @@ class HordeInferenceProcess(HordeProcess):
_active_model_name: str | None = None
"""The name of the currently active model. Note that other models may be loaded in RAM or VRAM."""
_aux_model_lock: Lock
_backend: HordeRuntimeBackend

def __init__(
self,
@@ -93,6 +95,7 @@ def __init__(
process_launch_identifier: int,
*,
high_memory_mode: bool = False,
backend: HordeRuntimeBackend | None = None,
) -> None:
"""Initialise the HordeInferenceProcess.

@@ -119,6 +122,7 @@
)

self._aux_model_lock = aux_model_lock
self._backend = backend or HordeRuntimeBackend()

# We import these here to guard against potentially importing them in the main process
# which would create shared objects, potentially causing issues
@@ -548,13 +552,10 @@ def start_inference(self, job_info: ImageGenerateJobPopResponse) -> list[Resulti
self._vae_decode_semaphore.release()
return results

@staticmethod
def clear_gc_and_torch_cache() -> None:
def clear_gc_and_torch_cache(self) -> None:
"""Clear the garbage collector and the PyTorch cache."""
gc.collect()
from torch.cuda import empty_cache

empty_cache()
clear_torch_cache(self._backend)

@logger.catch(reraise=True)
def unload_models_from_vram(self) -> None:
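For reference, a backend-aware cache clear along the lines of the `clear_torch_cache(self._backend)` call above might dispatch on the active device type. This is a sketch under that assumption; the real helper lives in `horde_worker_regen.runtime_backend`:

```python
import gc

# Sketch (assumption) of a device-agnostic replacement for the old
# CUDA-only `torch.cuda.empty_cache()` call.
def clear_torch_cache_sketch(device_type: str = "cuda") -> str:
    """Run the GC, then free cached allocator blocks on the active accelerator."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return "no torch"
    if device_type == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
        return "xpu cache cleared"
    if device_type == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()
        return "cuda cache cleared"
    return "no accelerator available"
```

Keeping the `gc.collect()` first matters: Python-side references must be dropped before the allocator can actually release VRAM back to the device.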
7 changes: 3 additions & 4 deletions horde_worker_regen/process_management/main_entry_point.py
@@ -4,23 +4,22 @@

from horde_worker_regen.bridge_data.data_model import reGenBridgeData
from horde_worker_regen.process_management.process_manager import HordeWorkerProcessManager
from horde_worker_regen.runtime_backend import HordeRuntimeBackend


def start_working(
ctx: BaseContext,
bridge_data: reGenBridgeData,
horde_model_reference_manager: ModelReferenceManager,
*,
amd_gpu: bool = False,
directml: int | None = None,
backend: HordeRuntimeBackend | None = None,
) -> None:
"""Create and start process manager."""
process_manager = HordeWorkerProcessManager(
ctx=ctx,
bridge_data=bridge_data,
horde_model_reference_manager=horde_model_reference_manager,
amd_gpu=amd_gpu,
directml=directml,
backend=backend,
)

process_manager.start()