
Conversation

alitariq4589
Contributor

@ggerganov This is a reference PR to #14439 for adding CI with RVV1.0 hardware

In the previously merged PR, the workflow did not have a pull_request trigger, so the builds were not being triggered. I have just added the pull_request trigger.

I am working on creating a file that mimics the functionality of the build.yml file and contains all the tests that can run on RISC-V. Until then, please consider this change so that the project can at least build on RISC-V vector hardware for now.
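
For reference, a minimal sketch of what adding the pull_request trigger to the workflow looks like (the branch filter and surrounding contents are assumptions, not the exact contents of this PR):

on:
  push:
    branches: [master]
  pull_request: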

@github-actions github-actions bot added the devops label (improvements to build systems and github actions) on Aug 18, 2025
@CISC
Collaborator

CISC commented Aug 18, 2025

Since it's building on native it would be useful if it ran tests like the other native builds:

- name: Test
id: cmake_test
run: |
cd build
ctest -L main --verbose --timeout 900

@CISC
Collaborator

CISC commented Aug 18, 2025

Adding ccache is probably a good idea too.

@alitariq4589
Contributor Author

Since it's building on native it would be useful if it ran tests like the other native builds:

- name: Test
id: cmake_test
run: |
cd build
ctest -L main --verbose --timeout 900

@CISC

Yes. You are right.

But since the setup currently runs the CI inside a podman container as a non-root user on RISC-V, the Docker execution is not yet successful. I am testing this, and it will take some time.

I created this PR because the last PR was merged with only the project-build functionality but without the pull_request trigger, so it was not triggering at all. Until I have the native build tests ready, merging this will prove useful (check this link for infrastructure details) and will provide insight into what is failing in the current CI, so I can add changes in the next PR or improve the infrastructure.

Also, RISC-V doesn't seem to have full Vulkan support right now, so not all of the tests in build.yml may run successfully, but I am working on this.

@CISC
Collaborator

CISC commented Aug 18, 2025

Also, RISC-V doesn't seem to have full Vulkan support right now, so not all of the tests in build.yml may run successfully, but I am working on this.

CPU-only is fine, but looking at how long the build takes I'd say ccache is a must.

@CISC
Collaborator

CISC commented Aug 18, 2025

Just installing ccache is probably not enough as the .ccache folder needs to be persisted across runs, see: https://github.com/hendrikmuhs/ccache-action?tab=readme-ov-file#how-it-works

@alitariq4589
Contributor Author

Just installing ccache is probably not enough as the .ccache folder needs to be persisted across runs, see: https://github.com/hendrikmuhs/ccache-action?tab=readme-ov-file#how-it-works

For GitHub-hosted runners, the ccache has to be persisted explicitly, but since this self-hosted runner is the same machine for every run, I suppose the ccache directory naturally persists across runs. I have added a step that sets up the ccache directory with a 5 GB cache capacity.
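
For context, a rough sketch of such a step is shown below; the cache directory path and the inline CCACHE_DIR export are assumptions rather than the literal step in the workflow:

- name: Setup ccache
  run: |
    mkdir -p "$HOME/.ccache"
    echo "CCACHE_DIR=$HOME/.ccache" >> "$GITHUB_ENV"   # make the dir visible to later steps
    CCACHE_DIR="$HOME/.ccache" ccache -M 5G            # cap the cache at 5 GB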

@alitariq4589
Contributor Author

The podman container is not ephemeral and is always up, so the folder should remain consistent across runs. But I have added a step for ccache in the workflow file anyway.

@CISC
Collaborator

CISC commented Aug 21, 2025

@CISC CISC merged commit 029bb39 into ggml-org:master Aug 21, 2025
3 checks passed
qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Aug 22, 2025
* Changed the CI file to hw

* Changed the CI file to hw

* Added to sudoers for apt

* Removed the clone command and used checkout

* Added libcurl

* Added gcc-14

* Checking gcc --version

* added gcc-14 symlink

* added CC and C++ variables

* Added the gguf weight

* Changed the weights path

* Added system specification

* Removed white spaces

* ci: Replace Jenkins riscv native build Cloud-V pipeline with GitHub Actions workflow

Removed the legacy .devops/cloud-v-pipeline Jenkins CI configuration and introduced .github/workflows/build-riscv-native.yml for native RISC-V builds using GitHub Actions.

* removed trailing whitespaces

* Added the trigger at PR creation

* Corrected OS name

* Added ccache as setup package

* Added ccache for self-hosted runner

* Added directory for ccache size storage

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Changed the build command and added ccache debug log

* Added the base dir for the ccache

* Re-trigger CI

* Cleanup and refactored ccache steps

* Cleanup and refactored ccache steps

---------

Co-authored-by: Akif Ejaz <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
sudo apt-get update || true
sudo apt-get install -y --no-install-recommends \
build-essential \
gcc-14-riscv64-linux-gnu \

Hi, since we are building directly on a RISC-V Linux machine, are we still required to use a cross-compilation toolchain?
It seems that native compilation should be sufficient in this case. Could you help clarify the motivation for using a cross toolchain here? Thank you.

Contributor Author

I think the cross-compilation was only introduced because we didn't have the hardware for checking RVV1.0 support on RISC-V. But I did not add this CI, so the author can better speak to the motivation.

Collaborator

@CISC CISC Aug 29, 2025

I think @ixgbe is referring to the gcc-14-riscv64-linux-gnu package (which comes from your #14439 PR), which usually installs a cross-compiler, but build-essential should already give you a native riscv64 compiler.

Contributor Author

Installing gcc-14-riscv64-linux-gnu doesn't install a cross compiler; in fact, it is the native toolchain when the command is executed on the RISC-V machine itself.

Also, build-essential installs GCC 13.2.0 by default, which has a vector intrinsics issue on RISC-V.

Check this issue and comment: #12693 (comment)
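
For illustration, a rough sketch of pointing the native build at GCC 14, assuming the gcc-14/g++-14 binaries are on the PATH (this is not the literal workflow, just the general idea behind the CC/CXX variables mentioned in the commit messages above):

export CC=gcc-14
export CXX=g++-14
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j"$(nproc)"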

@CISC
Collaborator

CISC commented Sep 20, 2025

@alitariq4589 The CI has been failing for a few days now with this error:

sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required

@alitariq4589
Contributor Author

@alitariq4589 The CI has been failing for a few days now with this error:

sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required

Thank you for letting me know. I will resolve this by today.

@alitariq4589
Contributor Author

@CISC I have checked the latest failing CI and this is not the runner that I integrated with my CI. If you have access to the actions settings, there should be a runner named 87a313462bbe in there. But this build is running on ggml-4-x86-cuda-v100 as shown in the following logs.

Current runner version: '2.328.0'
Runner name: 'ggml-4-x86-cuda-v100'
Runner group name: 'ggml-ci'
Machine name: 'ggml-4-x86-cuda-v100'

I don't know if you can assign a label to the runner from inside the repo settings. If you can, then I can change the CI to use that label.

The cause of the issue is that the CI workflow has runs-on: self-hosted, and I am assuming that ggml-4-x86-cuda-v100 is also self-hosted.

If you cannot change the label, can you check whether there is a unique label that is not shared between 87a313462bbe and ggml-4-x86-cuda-v100, e.g. RISC-V or something similar?

@alitariq4589
Contributor Author

Also, my version of the GitHub runner is 2.321.0. I have patched the GitHub Actions runner version 2.328.0 for RISC-V from upstream. I am working to update all the running instances to use that, but it will take some time.

@ggerganov
Member

I don't know if you can assign a label to the runner from inside the repo settings. If you can, then I can change the CI to use that label.

Yes, please use the label RISCV64:

[screenshot of the runner settings showing the RISCV64 label]
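
In workflow terms the change is roughly the following sketch (the job name is made up; only the runs-on value matters):

jobs:
  build-riscv-native:
    runs-on: RISCV64   # instead of the generic self-hosted label, so the job lands on the RISC-V runner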

@alitariq4589
Contributor Author

Changed the runner label in #16149. I think the workflow needs approval to run.

@ggerganov
Member

Thanks, I just opened a PR too: #16150. Will merge that one.

@alitariq4589
Contributor Author

Cool. I will close mine then.

@alitariq4589
Contributor Author

Since it's building on native it would be useful if it ran tests like the other native builds:

- name: Test
id: cmake_test
run: |
cd build
ctest -L main --verbose --timeout 900

@CISC, I am adding a complete CI for RISC-V, and I would like to know whether it is okay to add the tests to the build.yml file instead of creating a separate file for RISC-V. (I will delete the old CI file once done.)

@CISC
Collaborator

CISC commented Sep 26, 2025

@CISC, I am adding a complete CI for RISC-V, and I would like to know whether it is okay to add the tests to the build.yml file instead of creating a separate file for RISC-V. (I will delete the old CI file once done.)

Yes, please do.

@alitariq4589
Contributor Author

@CISC There seems to be some issue with the ggml-vocabs magic characters. Two of the CI tests are failing because of this.

Start 14: test-tokenizers-ggml-vocabs
14/35 Test #14: test-tokenizers-ggml-vocabs .......***Failed    1.71 sec
Cloning into '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs'...
main : reading vocab from: '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/PLaMo2/ggml-vocab-plamo2.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Spacemit(R) X60)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/PLaMo2/ggml-vocab-plamo2.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/PLaMo2/ggml-vocab-plamo2.gguf'
main : reading vocab from: '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/RWKV/ggml-vocab-rwkv-7-world.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Spacemit(R) X60)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/RWKV/ggml-vocab-rwkv-7-world.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/RWKV/ggml-vocab-rwkv-7-world.gguf'
main : reading vocab from: '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/SPM/ggml-vocab-gemma-3.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Spacemit(R) X60)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/SPM/ggml-vocab-gemma-3.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/SPM/ggml-vocab-gemma-3.gguf'
main : reading vocab from: '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/UGM/ggml-vocab-nomic-bert-moe.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Spacemit(R) X60)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/UGM/ggml-vocab-nomic-bert-moe.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/UGM/ggml-vocab-nomic-bert-moe.gguf'
main : reading vocab from: '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/WPM/ggml-vocab-jina-v2-en.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Spacemit(R) X60)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/WPM/ggml-vocab-jina-v2-en.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/runner/_work/llama.cpp/llama.cpp/models/ggml-vocabs/WPM/ggml-vocab-jina-v2-en.gguf'

Check this test.

Is this something expected? This seems to be the Hugging Face repo. Should I create an issue for this?

@CISC
Collaborator

CISC commented Oct 3, 2025

@CISC There seems to be some issue with the ggml-vocabs magic characters. Two of the CI tests are failing because of this.

Is this something expected? This seems to be the Hugging Face repo. Should I create an issue for this?

Not sure why it's vers for you? The original files are little endian, but automatically byteswapped for s390x, see #15925, should probably be made to work for all big endian platforms.

@alitariq4589
Contributor Author

@CISC There seems to be some issue with the ggml-vocabs magic characters. Two of the CI tests are failing because of this.
Is this something expected? This seems to be the hugging face repo. Should I create an issue for this?

Not sure why it's vers for you? The original files are little endian, but automatically byteswapped for s390x, see #15925, should probably be made to work for all big endian platforms.

Most of the RISC-V chips are little endian and I am assuming k1 (chip on the RISC-V board which I am using for builds) should also be little endian. Is this something directly related to llama.cpp or is this parsed in some other repository?

@CISC
Collaborator

CISC commented Oct 3, 2025

Not sure why it's vers for you? The original files are little endian, but automatically byteswapped for s390x, see #15925, should probably be made to work for all big endian platforms.

Most of the RISC-V chips are little endian and I am assuming k1 (chip on the RISC-V board which I am using for builds) should also be little endian. Is this something directly related to llama.cpp or is this parsed in some other repository?

I wonder, vers suggests maybe you are not getting the GGUFs at all, but some other content?

@alitariq4589
Contributor Author

Not sure why it's vers for you? The original files are little endian, but automatically byteswapped for s390x, see #15925, should probably be made to work for all big endian platforms.

Most of the RISC-V chips are little endian and I am assuming k1 (chip on the RISC-V board which I am using for builds) should also be little endian. Is this something directly related to llama.cpp or is this parsed in some other repository?

I wonder, vers suggests maybe you are not getting the GGUFs at all, but some other content?

I don't yet understand how this works. I will have to look closely at this CI script to determine which model/file is causing the incorrect magic header.

@CISC
Collaborator

CISC commented Oct 3, 2025

Not sure why it's vers for you? The original files are little endian, but automatically byteswapped for s390x, see #15925, should probably be made to work for all big endian platforms.

Most of the RISC-V chips are little endian and I am assuming k1 (chip on the RISC-V board which I am using for builds) should also be little endian. Is this something directly related to llama.cpp or is this parsed in some other repository?

I wonder, vers suggests maybe you are not getting the GGUFs at all, but some other content?

I don't yet understand how this works. I will have to look closely at this CI script to determine which model/file is causing the incorrect magic header.

It looks like all of them, downloaded from here: https://huggingface.co/ggml-org/vocabs/tree/main

@alitariq4589
Contributor Author

Not sure why it's vers for you? The original files are little endian, but automatically byteswapped for s390x, see #15925, should probably be made to work for all big endian platforms.

Most of the RISC-V chips are little endian and I am assuming k1 (chip on the RISC-V board which I am using for builds) should also be little endian. Is this something directly related to llama.cpp or is this parsed in some other repository?

I wonder, vers suggests maybe you are not getting the GGUFs at all, but some other content?

I don't yet understand how this works. I will have to look closely at this CI script to determine which model/file is causing the incorrect magic header.

It looks like all of them, downloaded from here: https://huggingface.co/ggml-org/vocabs/tree/main

An update on this bug...

There seems to be an issue with how the test runs on the fork. I ran the CI test on the master branch of my fork (with no added changes, so it is in sync with upstream) on my x86 work laptop and got the following error:

      Start 14: test-tokenizers-ggml-vocabs
14/35 Test #14: test-tokenizers-ggml-vocabs .......***Failed    0.56 sec
Already up to date.
main : reading vocab from: '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/PLaMo2/ggml-vocab-plamo2.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/PLaMo2/ggml-vocab-plamo2.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/PLaMo2/ggml-vocab-plamo2.gguf'
main : reading vocab from: '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/RWKV/ggml-vocab-rwkv-7-world.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/RWKV/ggml-vocab-rwkv-7-world.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/RWKV/ggml-vocab-rwkv-7-world.gguf'
main : reading vocab from: '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/SPM/ggml-vocab-gemma-3.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/SPM/ggml-vocab-gemma-3.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/SPM/ggml-vocab-gemma-3.gguf'
main : reading vocab from: '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/UGM/ggml-vocab-nomic-bert-moe.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/UGM/ggml-vocab-nomic-bert-moe.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/UGM/ggml-vocab-nomic-bert-moe.gguf'
main : reading vocab from: '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/WPM/ggml-vocab-jina-v2-en.gguf'
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz)
gguf_init_from_file_impl: invalid magic characters: 'vers', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/WPM/ggml-vocab-jina-v2-en.gguf
llama_model_load_from_file_impl: failed to load model
main: error: failed to load vocab '/home/user0/.WORKDIR/llama.cpp/models/ggml-vocabs/WPM/ggml-vocab-jina-v2-en.gguf'

I don't see this when running on the upstream repo.

@CISC
Collaborator

CISC commented Oct 9, 2025

It looks like all of them, downloaded from here: https://huggingface.co/ggml-org/vocabs/tree/main

An update on this bug...

There seems to be an issue with how the test runs on the fork. I ran the CI test on the master branch of my fork (with no added changes, so it is in sync with upstream) on my x86 work laptop and got the following error:

Ahhh, I know, you don't have git-lfs installed. :)

@alitariq4589
Contributor Author

I have checked it. It is installed.

@alitariq4589
Contributor Author

$ sudo apt install git-lfs -y
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git-lfs is already the newest version (3.0.2-1ubuntu0.3+esm1).
0 upgraded, 0 newly installed, 0 to remove and 27 not upgraded.

@CISC
Collaborator

CISC commented Oct 9, 2025

I have checked it. It is installed.

Ok, but it is definitely a git-lfs issue, because what you're seeing is the first 4 bytes of
version https://git-lfs.github.com/spec/v1

So, for some reason LFS files are not resolved.
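
For reference, an unresolved LFS pointer is just a small text file of this form, which is why the loader sees vers where it expects GGUF (the oid and size values below are placeholders):

version https://git-lfs.github.com/spec/v1
oid sha256:<hash of the real file>
size <size of the real file in bytes>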

@ggerganov
Member

Not sure if this is relevant, but git-lfs requires you to run the following command once after installing the package:

git lfs install
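
If the vocab repo was cloned before git lfs install was run, something along these lines should replace the pointer files with the real GGUFs (the models/ggml-vocabs path is taken from the logs above; the exact recovery steps are a guess, not part of the CI script):

git lfs install
git -C models/ggml-vocabs lfs pull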

@alitariq4589
Contributor Author

Yes, that was right; it was a git-lfs issue. All the tests are now passing except one, ggml-ci-riscv64-native-cpu-high-perf.

I think it is because of the missing Python transformers module. I am installing it. If it is not installable, I will have to add a condition in the test scripts to convert the weights to the appropriate format and then transfer them to the RISC-V machine for execution.
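
As a rough sketch of that fallback, assuming the conversion runs on an x86 host with transformers available and uses llama.cpp's convert_hf_to_gguf.py (the model path and destination are placeholders):

pip install -r requirements.txt
python3 convert_hf_to_gguf.py /path/to/hf-model --outfile model.gguf
scp model.gguf runner@riscv-board:/path/to/models/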
