179 commits
5ef91fc
Add ggml-openvino base files
YangleiZouIntel Oct 29, 2024
2e42b6c
add openvino as optional backend for Llama.cpp ggml
zhanmyz Nov 13, 2024
bfba5b9
* Configure the device(default CPU) that uses OpenVINO to compile th…
zhanmyz Nov 19, 2024
e3b1386
Solve the issue of abnormal model output caused by using OpenVINO ADD…
zhanmyz Nov 21, 2024
9ba111e
Add OpenVINO MUL operator to GGML of Llama.cpp.
zhanmyz Dec 2, 2024
543d929
Add compile options
zhanmyz Dec 2, 2024
684086c
add OpenVINO frontend convert process steps
zhanmyz Dec 4, 2024
e71e41a
add get openvino available ops function
zhanmyz Dec 5, 2024
311674e
Add PoC of integration of openvino frontend. Main changes: ggml-ov-fr…
yumengbo Nov 16, 2024
51ecdf4
Implement GgmlOvDecoder. Add dump functions.
yumengbo Nov 19, 2024
727246e
Convert subgraph with add, sub, mul, div op to ov model and do infer …
yumengbo Nov 22, 2024
0802220
Add GGML_OV_FRONTEND option. Add readme.
yumengbo Nov 22, 2024
d6c148b
Change output for infer request to set output tensor. Support scale, …
yumengbo Dec 5, 2024
8769d9e
add GET_ROWS operator of OpenVINO to GGML of llama.cpp
zhanmyz Dec 9, 2024
e4754ab
Update build.md and add operation mapping(GGML to OpenVINO)
zhanmyz Dec 10, 2024
76ee005
add the rms_norm operator implemented using OpenVINO to the GGML back…
zhanmyz Dec 16, 2024
2a86e7b
Fix issue for output memory copy of infer request
yumengbo Dec 12, 2024
0689ee3
Change to implementation following pytorch frontend
yumengbo Dec 12, 2024
1c301ce
Add support for UNARY SILU op . Fix pytorch impl bugs.
yumengbo Dec 17, 2024
6028316
Support Softmax op
yumengbo Dec 18, 2024
213761e
Support Softmax op
yumengbo Dec 18, 2024
b0406c2
Support ROPE op.
yumengbo Dec 21, 2024
9b4d445
Add support for RMS_NORM OP
zhanmyz Dec 19, 2024
60e899c
Add MUL_MAT,CPY,CONT as operators implemented in OpenVINO for GGML ba…
zhanmyz Jan 14, 2025
5749e82
Move CPY from GGML OV Backend to OV Frontend
zhanmyz Jan 22, 2025
ad57734
add implementation of MUL_MAT, CPY, CONT of GGML ops using OV ops
zhanmyz Feb 18, 2025
015f11e
add implementation of CPY when the output tensor is non-contiguous
zhanmyz Feb 19, 2025
cc3066b
add tmp source code files
zhanmyz Feb 25, 2025
81f8c75
Execute single CONT operator is OK
zhanmyz Feb 25, 2025
28acc0e
Execute CONT & VIEW operators in OV Frontend is OK
zhanmyz Mar 1, 2025
3b4f3ac
OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT graph conversion o…
zhanmyz Mar 3, 2025
dceeefa
OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT/ROPE/SCALE/SOFTMAX…
zhanmyz Mar 5, 2025
b794841
Change the input parameter shape of CONT operator
zhanmyz Mar 5, 2025
f4bb7d2
Change the input and output node shape of MUL_MAT operator
zhanmyz Mar 5, 2025
e077a7c
Change the input and output node shape of MUL_MAT operator
zhanmyz Mar 5, 2025
171deac
change CONT and MULMAT input node shape
zhanmyz Mar 6, 2025
a0672d3
All adjacent ops can conversion but calculation result is wrong and n…
zhanmyz Mar 6, 2025
f508c15
1. All operators implemented using OpenVINO can be successfully execu…
zhanmyz Mar 9, 2025
ee35e8c
1. Update the implementation of CPY node when it's non-contiguous
zhanmyz Mar 11, 2025
0ee0781
Minor Update
zhanmyz Mar 11, 2025
a9f6725
Try to add VIEW node to OV Frontend and have some issues that need to…
zhanmyz Mar 12, 2025
a6da47b
1. In the Prompt process and predict first token stage, the PERMUTE n…
zhanmyz Mar 15, 2025
952dbc4
add debug info
zhanmyz Mar 17, 2025
1b7ed3d
Process Prompt and predict first token is OK
zhanmyz Mar 26, 2025
cc21645
1. Solve the AC issue of Permute+VIEW and MULMAL issue in the phase o…
zhanmyz Mar 31, 2025
bdd0962
1. Delete some comments
zhanmyz Mar 31, 2025
3f40786
* Use find_package in CMake to configure OpenVINO
wine99 Apr 14, 2025
d75fee7
change op mappings to list in openvino_supports_op
wine99 Apr 15, 2025
c53e290
2nd+ token correct by fix CPY in OV, remove single op backend compute…
wine99 Apr 15, 2025
d424199
Arbitrary token len (>32) work; Fix bug in mulmat
wine99 Apr 17, 2025
87f691d
FEAT: do PERMUTE eagerly
wine99 Apr 21, 2025
dafb10e
FEAT: Add interleaved mode for ROPE
wine99 Apr 22, 2025
70c234a
REFACTOR: support weights as constant
wine99 Apr 28, 2025
216fdc2
STYLE: minor refactor
wine99 Apr 28, 2025
3314ef0
PERF: share const nodes for weights for diff infer
wine99 Apr 28, 2025
f27e526
BUILD: update build doc, add cmake preset, add CACHE_DIR env var
wine99 Apr 29, 2025
9d3ee0b
FEAT: improve debug capability
wine99 Apr 30, 2025
18be2ca
PERF: compile once (dynamic graph + cache)
wine99 May 8, 2025
4e1d196
Rebase - Bring up to date and fix build process
virajwad May 9, 2025
ce5df66
fix build error
wine99 May 13, 2025
0036a21
FIX: backend buffer type issue
wine99 May 13, 2025
79449d7
STYLE: clang-format
wine99 May 9, 2025
3e8e678
FEAT: Add all conversion code from ov side
wine99 May 9, 2025
3a3d776
PERF: favor low precision matmul
wine99 May 13, 2025
f881c58
STYLE and minor REFACTOR
wine99 May 13, 2025
a3be048
FIX: Re-add tensor names in cgraph, Add another case for RESHAPE
wine99 May 14, 2025
7ce1783
FIX: input shape of KQ_mask
wine99 May 14, 2025
ea520a3
PERF: add weight constant in parallel
wine99 May 14, 2025
264011b
FIX: set_max_token_len
wine99 May 16, 2025
f6de4c1
PERF: use Slice+Concat in writing cache_v
wine99 May 16, 2025
c632aed
Update build doc
wine99 May 20, 2025
3427daa
Add cgraph tensor output name to OV op name
wine99 May 22, 2025
c6d3e92
Update openvino build instructions
ravi9 May 29, 2025
aa2f495
Add initial NPU support
wine99 May 27, 2025
5984be4
draft NPU support version 2: prefill + kvcache
wine99 May 29, 2025
1a9411f
NPU support version 2: prefill + kvcache
wine99 Jun 3, 2025
ee36029
Change due to ggml cgraph changes, not correct yet
wine99 Jun 4, 2025
51f7698
Change due to ggml cgraph changes, llama-3.2 CPU work
wine99 Jun 16, 2025
f922d18
Add AMD64 to CMakeLists
wine99 Jun 16, 2025
0fc9477
Change due to ggml cgraph changes, all device work
wine99 Jun 16, 2025
43d57f3
Refactor: clean, fix warning
wine99 Jun 20, 2025
e8ce78f
Update clang-format
wine99 Jun 23, 2025
a63cfb2
Stateful transformation for CPU GPU
wine99 Jun 26, 2025
389d3c4
Add SwiGLU
wine99 Jul 3, 2025
0200596
Fuse to SDPA
wine99 Jul 3, 2025
25d5197
Replace Concat with Broadcast in MulMat for GQA
wine99 Jul 4, 2025
d30f6f7
Pull out indices creation for kv cache update
wine99 Jul 6, 2025
93ac991
Refactor: remove past_token_len from extra_inputs
wine99 Jul 9, 2025
5de7da5
Fix Phi3 SwiGLU and SoftMax
wine99 Jul 9, 2025
2df2e39
Pull out sin cos from rope
wine99 Jul 9, 2025
bc2bfaf
Reduce memory: free ov weights node after graph conversion
wine99 Jul 11, 2025
01b858a
Fix CPY due to cgraph change
wine99 Jul 17, 2025
c5313d3
Added OpenVINO CI/CD. Updated docs
ravi9 Jul 18, 2025
2a8d318
Fix llama-cli
wine99 Jul 23, 2025
e0c370c
Fix Phi3 ROPE; Add test-backend-ops
wine99 Jul 21, 2025
2e5ebb7
Fix NPU
wine99 Jul 23, 2025
d388d7e
Fix llama-bench; Clang-format
wine99 Jul 24, 2025
3a5eb95
Fix llama-perplexity
wine99 Jul 24, 2025
407114f
temp. changes for mark decomp
cavusmustafa Jul 29, 2025
5f47e95
matmul in fp32
wine99 Jul 29, 2025
9e34ea4
mulmat input conversion fix
cavusmustafa Jul 30, 2025
1ab7de3
mulmat type conversion update
cavusmustafa Jul 30, 2025
cc7c17b
add mark decomp pass
cavusmustafa Jul 30, 2025
e2cfd6e
Revert changes in fuse_to_sdpa
wine99 Jul 30, 2025
4dced3a
Update build.md
ravi9 Jul 31, 2025
d693fda
Fix test-backend-ops
wine99 Jul 31, 2025
164bfeb
Skip test-thread-safety; Run ctest only in ci/run.sh
wine99 Jul 31, 2025
42577f7
Use CiD for NPU
wine99 Aug 1, 2025
2197129
Optimize tensor conversion, improve TTFT
wine99 Aug 4, 2025
fb758ff
Support op SET_ROWS
wine99 Aug 13, 2025
2541b9d
Fix NPU
wine99 Aug 14, 2025
7424136
Remove CPY
wine99 Aug 14, 2025
49c75c2
Fix test-backend-ops
wine99 Aug 14, 2025
006f6e8
Minor updates for raising PR
wine99 Aug 14, 2025
c7f165a
Perf: RMS fused to OV internal RMS op
wine99 Aug 27, 2025
bcb7053
Fix after rebasing
wine99 Sep 4, 2025
04dba82
Change openvino device_type to GPU; Enable flash_attn
wine99 Sep 5, 2025
13c0d71
Update supports_buft and supports_op for quantized models
wine99 Aug 5, 2025
f7f9273
Add quant weight conversion functions from genai gguf reader
wine99 Aug 5, 2025
604adc3
Quant models run with accuracy issue
wine99 Aug 6, 2025
b35884a
Fix accuracy: disable cpu_repack
wine99 Aug 7, 2025
85247b6
Fix CI; Disable test-backend-ops
wine99 Aug 7, 2025
e1235b9
Fix Q4_1
wine99 Aug 8, 2025
63792a1
Fix test-thread-safety
wine99 Aug 8, 2025
e1f9aab
Fix test-backend-ops: Treat quantized tensors as weights
wine99 Aug 12, 2025
715fd26
Add NPU Q4_0 support
wine99 Aug 19, 2025
ca5ceb7
NPU perf: eliminate zp
wine99 Aug 22, 2025
9623246
Dequantize q4_1 q4_k q6_k for NPU
wine99 Aug 29, 2025
7a0b852
Add custom quant type: q8_1_c, q4_0_128
wine99 Sep 2, 2025
c02d362
Set m_is_static=false as default in decoder
wine99 Sep 2, 2025
e7a3ab9
Simplify translation of get_rows
wine99 Sep 2, 2025
404fac9
Fix after rebasing
wine99 Sep 8, 2025
dc2eeb4
Improve debug util; Eliminate nop ReshapeReshape
wine99 Sep 10, 2025
c3b8963
STYLE: make get_types_to_requant a function
wine99 Sep 10, 2025
3cd3def
Support BF16 model
wine99 Sep 11, 2025
a482f40
Fix NPU compile
wine99 Sep 12, 2025
bd862a0
WA for npu 1st token acc issue
wine99 Sep 12, 2025
4eb3819
Apply EliminateZP only for npu
wine99 Sep 12, 2025
7f69755
Add GeGLU
wine99 Sep 15, 2025
244ec02
Fix Hunyuan
wine99 Sep 15, 2025
29b4e72
Support iSWA
wine99 Sep 16, 2025
51f9bea
Fix NPU accuracy
wine99 Sep 17, 2025
dd416f7
Fix ROPE accuracy when freq_scale != 1
wine99 Sep 17, 2025
72833f2
Minor: not add attention_size_swa for non-swa model
wine99 Sep 17, 2025
0e50ed9
Minor refactor
wine99 Sep 19, 2025
cee3982
Add Q5_K to support phi-3-q4_k_m
wine99 Sep 23, 2025
8825c3d
Requantize Q6_K (gs16) to gs32 on GPU
wine99 Sep 26, 2025
3e18759
Fix after rebasing
wine99 Sep 28, 2025
47e253a
Always apply Eliminate_ZP to fix GPU compile issue on some platforms
wine99 Sep 28, 2025
3dc9a72
kvcachefusion support
cavusmustafa Oct 1, 2025
61d007d
env variable GGML_OPENVINO_DISABLE_SDPA_OPTIMIZATION added
cavusmustafa Oct 1, 2025
ba62f7b
Fix for Phi3
cavusmustafa Oct 2, 2025
de961a0
Fix llama-cli (need to run with --no-warmup)
wine99 Oct 9, 2025
fa18b7b
Fix add_sliced_mask; Revert mulmat, softmax; Remove input attention_s…
wine99 Oct 10, 2025
4c1f60f
fix after rebasing
wine99 Oct 11, 2025
8cc6cd0
Fix llama-3-8b and phi3-mini q4_0 NPU
wine99 Oct 14, 2025
8af46c4
Update to OV-2025.3 and CMakeLists.txt
ravi9 Oct 15, 2025
509c5f4
Add OV CI cache
wine99 Oct 15, 2025
cfd40a9
Apply CISC review and update CI to OV2025.3
ravi9 Oct 15, 2025
4c280cc
Update CI to run OV dep install before build
ravi9 Oct 15, 2025
3feac74
Update OV dockerfile to use OV2025.3 and update build docs
ravi9 Oct 15, 2025
7ac02a8
Style: use switch in supports_ops
wine99 Oct 21, 2025
7c8a4a5
Style: middle ptr and ref align, omit optional struct keyword
wine99 Oct 21, 2025
0f97715
NPU Unify PD (#14)
wine99 Nov 4, 2025
d5038aa
Clean placeholders in ggml-openvino.cpp
wine99 Oct 21, 2025
e866ed0
Update .github/workflows/docker.yml
wine99 Nov 4, 2025
75c720a
NPU unify PD (handled internally)
wine99 Nov 5, 2025
0981dec
Update ggml-decoder.cpp
I-N-T-E-L Nov 20, 2025
02eb109
Update ggml-decoder.cpp
I-N-T-E-L Nov 20, 2025
546cabd
Update ggml-decoder.cpp
I-N-T-E-L Nov 20, 2025
6b2153d
Update ggml-decoder.cpp
I-N-T-E-L Nov 20, 2025
51167ab
Update ggml-decoder.cpp
I-N-T-E-L Nov 20, 2025
b8d0e2a
Update ggml-decoder.cpp
I-N-T-E-L Nov 20, 2025
5070d2d
change graph to 4d, support multi sequences
wine99 Nov 20, 2025
6be0146
Fix llama-bench
wine99 Nov 20, 2025
bbecac0
Fix NPU
wine99 Nov 24, 2025
1c05c32
Merge pull request #17 from I-N-T-E-L/fix---unsetenv()
ynimmaga Nov 24, 2025
5d433c8
Remove the second decoder for node. Moving the function into the mode…
zhaixuejun1993 Nov 26, 2025
33a5b45
Fix error for naive
zhaixuejun1993 Nov 26, 2025
134 changes: 134 additions & 0 deletions .devops/openvino.Dockerfile
@@ -0,0 +1,134 @@
ARG OPENVINO_VERSION_MAJOR=2025.3
ARG OPENVINO_VERSION_FULL=2025.3.0.19807.44526285f24
ARG UBUNTU_VERSION=24.04

# Optional proxy build arguments - empty by default
ARG http_proxy=
ARG https_proxy=

## Build Image
FROM ubuntu:${UBUNTU_VERSION} AS build

# Pass proxy args to build stage
ARG http_proxy
ARG https_proxy

RUN apt-get update && \
apt-get install -y --no-install-recommends \
ca-certificates \
gnupg \
wget \
git \
cmake \
ninja-build \
build-essential \
libtbb12 \
libcurl4-openssl-dev && \
rm -rf /var/lib/apt/lists/*

# Install OpenVINO for Ubuntu 24.04
ARG OPENVINO_VERSION_MAJOR
ARG OPENVINO_VERSION_FULL
RUN mkdir -p /opt/intel && \
wget https://storage.openvinotoolkit.org/repositories/openvino/packages/${OPENVINO_VERSION_MAJOR}/linux/openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64.tgz && \
tar -xf openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64.tgz && \
mv openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64 /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} && \
cd /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} && \
echo "Y" | ./install_dependencies/install_openvino_dependencies.sh && \
cd - && \
ln -s /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} /opt/intel/openvino

ENV OpenVINO_DIR=/opt/intel/openvino

WORKDIR /app

COPY . .

# Build Stage
RUN bash -c "source ${OpenVINO_DIR}/setupvars.sh && \
cmake -B build/ReleaseOV -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_OPENVINO=ON && \
cmake --build build/ReleaseOV -j$(nproc)"

# Copy all necessary libraries
RUN mkdir -p /app/lib && \
find build/ReleaseOV -name '*.so*' -exec cp {} /app/lib \; && \
find ${OpenVINO_DIR}/runtime/lib/intel64 -name '*.so*' -exec cp -P {} /app/lib \; 2>/dev/null || \
find ${OpenVINO_DIR}/lib/intel64 -name '*.so*' -exec cp -P {} /app/lib \;

# Create runtime directories and copy binaries
RUN mkdir -p /app/full \
&& cp build/ReleaseOV/bin/* /app/full/ \
&& cp *.py /app/full \
&& cp -r gguf-py /app/full \
&& cp -r requirements /app/full \
&& cp requirements.txt /app/full \
&& cp .devops/tools.sh /app/full/tools.sh

## Base Runtime Image
FROM ubuntu:${UBUNTU_VERSION} AS base

# Pass proxy args to runtime stage
ARG http_proxy
ARG https_proxy

RUN apt-get update \
&& apt-get install -y libgomp1 libtbb12 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
&& find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete \
&& find /var/cache -type f -delete

COPY --from=build /app/lib/ /app/

### Full (all binaries)
FROM base AS full

ARG http_proxy
ARG https_proxy

COPY --from=build /app/full /app/

WORKDIR /app

RUN apt-get update && \
apt-get install -y --no-install-recommends \
git \
python3 \
python3-venv \
python3-pip && \
python3 -m venv /ov-venv && \
/ov-venv/bin/pip install --no-cache-dir --upgrade pip setuptools wheel && \
/ov-venv/bin/pip install --no-cache-dir -r requirements.txt && \
apt-get autoremove -y && \
apt-get clean && \
rm -rf /tmp/* /var/tmp/* && \
find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete && \
find /var/cache -type f -delete

ENTRYPOINT ["/bin/bash", "-c", "source /ov-venv/bin/activate && exec /app/tools.sh \"$@\"", "--"]


### Light, CLI only
FROM base AS light

COPY --from=build /app/full/llama-cli /app/

WORKDIR /app

ENTRYPOINT [ "/app/llama-cli" ]

### Server, Server only
FROM base AS server

ENV LLAMA_ARG_HOST=0.0.0.0

COPY --from=build /app/full/llama-server /app/

WORKDIR /app

HEALTHCHECK CMD [ "curl", "-f", "http://localhost:8080/health" ]

ENTRYPOINT [ "/app/llama-server" ]
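
The Dockerfile above defines three final stages (`full`, `light`, `server`) on top of a shared `base` stage. As a rough sketch of how one of them could be built locally, the helper below assembles a `docker build` invocation for a chosen stage. The function name, the image tag `llama-openvino`, and the `DRY_RUN` switch are assumptions made for illustration and are not part of this PR; with `DRY_RUN=1` (the default here) the command is only printed.

```shell
#!/bin/sh
# Sketch: build one stage of .devops/openvino.Dockerfile.
# "build_openvino_image", the tag "llama-openvino" and DRY_RUN are hypothetical.
build_openvino_image() {
    target="$1"   # one of the final stages defined in the Dockerfile
    case "$target" in
        full|light|server) ;;
        *) echo "unknown target: $target" >&2; return 1 ;;
    esac

    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-1}" = "1" ]; then runner="echo"; else runner=""; fi

    $runner docker build \
        --file .devops/openvino.Dockerfile \
        --target "$target" \
        --tag "llama-openvino:$target" \
        .
}

build_openvino_image server
```

Running with `DRY_RUN=0` from the repository root would execute the build. Proxy settings could be forwarded with `--build-arg http_proxy=...`, since the Dockerfile declares those build arguments.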
25 changes: 25 additions & 0 deletions .github/actions/linux-setup-openvino/action.yml
@@ -0,0 +1,25 @@
name: "Linux - Setup OpenVINO Toolkit"
description: "Setup OpenVINO Toolkit for Linux"
inputs:
path:
description: "Installation path"
required: true
version_major:
description: "OpenVINO major version (e.g., 2025.3)"
required: true
version_full:
description: "OpenVINO full version (e.g., 2025.3.0.19807.44526285f24)"
required: true

runs:
using: "composite"
steps:
- name: Setup OpenVINO Toolkit
id: setup
uses: ./.github/actions/unarchive-tar
with:
url: https://storage.openvinotoolkit.org/repositories/openvino/packages/${{ inputs.version_major }}/linux/openvino_toolkit_ubuntu24_${{ inputs.version_full }}_x86_64.tgz
path: ${{ inputs.path }}
type: z
strip: 1
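
Both the Dockerfile and this composite action derive the toolkit archive URL from the same two version strings. The sketch below reproduces that composition in shell so the resulting URL can be inspected before a version bump. The function name is hypothetical; the URL pattern is copied from the diff.

```shell
#!/bin/sh
# Sketch: compose the OpenVINO archive URL from the two version inputs.
# "openvino_archive_url" is a hypothetical helper name.
#   $1 = major version, e.g. 2025.3
#   $2 = full version,  e.g. 2025.3.0.19807.44526285f24
openvino_archive_url() {
    printf 'https://storage.openvinotoolkit.org/repositories/openvino/packages/%s/linux/openvino_toolkit_ubuntu24_%s_x86_64.tgz\n' \
        "$1" "$2"
}

openvino_archive_url "2025.3" "2025.3.0.19807.44526285f24"
```

Only the Ubuntu 24 x86_64 package is covered, matching what the diff uses.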

28 changes: 28 additions & 0 deletions .github/workflows/build-cache.yml
@@ -63,6 +63,34 @@ jobs:
path: ./spacemit_toolchain
version: ${{ env.SPACEMIT_IME_TOOLCHAIN_VERSION }}

ubuntu-24-openvino-cache:
runs-on: ubuntu-24.04

env:
# Make sure this is in sync with build.yml
OPENVINO_VERSION_MAJOR: "2025.3"
OPENVINO_VERSION_FULL: "2025.3.0.19807.44526285f24"

steps:
- name: Clone
id: checkout
uses: actions/checkout@v4

- name: Setup Cache
uses: actions/cache@v4
id: cache-openvino
with:
path: ./openvino_toolkit
key: openvino-toolkit-v${{ env.OPENVINO_VERSION_FULL }}-${{ runner.os }}

- name: Setup OpenVINO Toolkit
if: steps.cache-openvino.outputs.cache-hit != 'true'
uses: ./.github/actions/linux-setup-openvino
with:
path: ./openvino_toolkit
version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
version_full: ${{ env.OPENVINO_VERSION_FULL }}

windows-2022-rocm-cache:
runs-on: windows-2022

101 changes: 78 additions & 23 deletions .github/workflows/build.yml
@@ -700,6 +700,61 @@ jobs:
-DGGML_SYCL_F16=ON
cmake --build build --config Release -j $(nproc)

ubuntu-24-cmake-openvino:
runs-on: ubuntu-24.04

env:
# Make sure this is in sync with build-cache.yml
OPENVINO_VERSION_MAJOR: "2025.3"
OPENVINO_VERSION_FULL: "2025.3.0.19807.44526285f24"

steps:
- name: Clone
id: checkout
uses: actions/checkout@v4

- name: ccache
uses: ggml-org/ccache-action@v1.2.16
with:
key: ubuntu-24-cmake-openvino-no-preset-v1
evict-old-files: 1d

- name: Dependencies
id: depends
run: |
sudo apt-get update
sudo apt-get install -y build-essential libcurl4-openssl-dev libtbb12 cmake ninja-build python3-pip

- name: Use OpenVINO Toolkit Cache
uses: actions/cache@v4
id: cache-openvino
with:
path: ./openvino_toolkit
key: openvino-toolkit-v${{ env.OPENVINO_VERSION_FULL }}-${{ runner.os }}

- name: Setup OpenVINO Toolkit
if: steps.cache-openvino.outputs.cache-hit != 'true'
uses: ./.github/actions/linux-setup-openvino
with:
path: ./openvino_toolkit
version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
version_full: ${{ env.OPENVINO_VERSION_FULL }}

- name: Install OpenVINO dependencies
run: |
cd ./openvino_toolkit
chmod +x ./install_dependencies/install_openvino_dependencies.sh
echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh

- name: Build
id: cmake_build
run: |
source ./openvino_toolkit/setupvars.sh
cmake -B build/ReleaseOV -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_OPENVINO=ON
cmake --build build/ReleaseOV --config Release -j $(nproc)

build-linux-cross:
uses: ./.github/workflows/build-linux-cross.yml

@@ -1625,27 +1680,27 @@ jobs:
GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp

ggml-ci-arm64-cpu-kleidiai:
runs-on: ubuntu-22.04-arm

steps:
- name: Clone
id: checkout
uses: actions/checkout@v4

- name: ccache
uses: ggml-org/ccache-action@v1.2.16
with:
key: ggml-ci-arm64-cpu-kleidiai
evict-old-files: 1d

- name: Dependencies
id: depends
run: |
sudo apt-get update
sudo apt-get install -y build-essential libcurl4-openssl-dev

- name: Test
id: ggml-ci
run: |
GG_BUILD_KLEIDIAI=1 GG_BUILD_EXTRA_TESTS_0=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
runs-on: ubuntu-22.04-arm

steps:
- name: Clone
id: checkout
uses: actions/checkout@v4

- name: ccache
uses: ggml-org/ccache-action@v1.2.16
with:
key: ggml-ci-arm64-cpu-kleidiai
evict-old-files: 1d

- name: Dependencies
id: depends
run: |
sudo apt-get update
sudo apt-get install -y build-essential libcurl4-openssl-dev

- name: Test
id: ggml-ci
run: |
GG_BUILD_KLEIDIAI=1 GG_BUILD_EXTRA_TESTS_0=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
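
The OpenVINO jobs in `build-cache.yml` and `build.yml` each carry a "Make sure this is in sync" comment, because the cache key is derived from `OPENVINO_VERSION_FULL` and a mismatch would cause the build job to miss the cache. A small check along the following lines could catch such drift. It is only a sketch: the function names are hypothetical, and the parsing assumes the `OPENVINO_VERSION_FULL: "<version>"` form shown in the diff.

```shell
#!/bin/sh
# Sketch: verify that two workflow files pin the same OpenVINO version.
# Function names are hypothetical helpers, not part of this PR.

# Print the first OPENVINO_VERSION_FULL value declared in a workflow file.
extract_openvino_version() {
    sed -n 's/^[[:space:]]*OPENVINO_VERSION_FULL:[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Succeed only if both files declare the same non-empty version.
openvino_versions_in_sync() {
    a="$(extract_openvino_version "$1")"
    b="$(extract_openvino_version "$2")"
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example call, run from the repository root:
#   openvino_versions_in_sync .github/workflows/build.yml .github/workflows/build-cache.yml
```

The check reads the declarations only; lines that merely reference `${{ env.OPENVINO_VERSION_FULL }}` do not match the pattern and are ignored.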

13 changes: 7 additions & 6 deletions .github/workflows/docker.yml
@@ -39,12 +39,13 @@ jobs:
# Note: the arm64 images are failing, which prevents the amd64 images from being built
# https://github.com/ggml-org/llama.cpp/issues/11888
#- { tag: "cpu", dockerfile: ".devops/cpu.Dockerfile", platforms: "linux/amd64,linux/arm64", full: true, light: true, server: true, free_disk_space: false }
- { tag: "cpu", dockerfile: ".devops/cpu.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04" }
- { tag: "cuda", dockerfile: ".devops/cuda.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true, runs_on: "ubuntu-22.04" }
- { tag: "musa", dockerfile: ".devops/musa.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true, runs_on: "ubuntu-22.04" }
- { tag: "intel", dockerfile: ".devops/intel.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true, runs_on: "ubuntu-22.04" }
- { tag: "vulkan", dockerfile: ".devops/vulkan.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04" }
- { tag: "s390x", dockerfile: ".devops/s390x.Dockerfile", platforms: "linux/s390x", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04-s390x" }
- { tag: "cpu", dockerfile: ".devops/cpu.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04" }
- { tag: "cuda", dockerfile: ".devops/cuda.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04" }
- { tag: "musa", dockerfile: ".devops/musa.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true, runs_on: "ubuntu-22.04" }
- { tag: "intel", dockerfile: ".devops/intel.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true, runs_on: "ubuntu-22.04" }
- { tag: "vulkan", dockerfile: ".devops/vulkan.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04" }
- { tag: "s390x", dockerfile: ".devops/s390x.Dockerfile", platforms: "linux/s390x", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-22.04-s390x" }
- { tag: "openvino", dockerfile: ".devops/openvino.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false }
# Note: the rocm images are failing due to a compiler error and are disabled until this is fixed to allow the workflow to complete
#- {tag: "rocm", dockerfile: ".devops/rocm.Dockerfile", platforms: "linux/amd64,linux/arm64", full: true, light: true, server: true, free_disk_space: true }
steps: