ci : remove cuda 11.7 releases, switch runner to windows 2022 #13997

slaren · 2025-06-03T20:14:46Z

CUDA 11.7 does not work in the windows 2022 runners, and the windows 2019 runners are being removed. At this point I don't think the 11.7 releases are necessary, everybody should already have a driver compatible with CUDA 12.

henk717 · 2025-06-04T06:15:53Z

Some users have older GPUs that can't run CUDA 12 (they are stuck on 11.4). It's a minority, but I suspect CUDA 11 support as a whole will be at risk if it is no longer tested. That's the main reason we ship CUDA 11 downstream.

Is this still covered somewhere going forward? For example by a Linux based CI?

I do plan to look into this issue downstream, as we want to keep making CUDA 11 builds. The working theory is that Windows Server 2022 is fine but Visual Studio needs downgrading. If that's desirable and we have something, I can share it here.

slaren · 2025-06-04T13:35:07Z

The releases are already failing for several hours every day due to the scheduled brownouts, so we need to fix this now regardless. I am not sure that it would be worth re-adding CUDA 11 releases for two reasons: first, even if we can make a release that supports very old GPUs (over 10 years old), the performance is not likely to be good enough to be usable in practice. Second, we would still need a CUDA 12 release because some features depend on it. Most people know nothing about CUDA versions, and offering multiple versions is likely to create confusion while adding very little value.

@JohannesGaessler it should be possible to change the linux CI to a different CUDA version if you are interested in keeping build tests for CUDA 11.

henk717 · 2025-06-04T15:17:43Z

I am less concerned about the releases, as we can do those downstream. My worry is that regressions in CUDA 11 will get missed if there are no tests for it anymore. Do those remain?

slaren · 2025-06-04T15:23:34Z

Since it would be mainly @JohannesGaessler who would have to maintain support for CUDA 11, I will leave it up to him to decide whether to add CI tests for it. It would only test whether the build completes, though; there were never any actual tests being run on CUDA 11.

henk717 · 2025-06-04T19:02:10Z

I don't know how suitable this is for upstream, but for KoboldCpp we will continue building by reinstalling Visual Studio 2019.
The script is here:

echo Preparing setup
curl -fLO https://download.visualstudio.microsoft.com/download/pr/1fbe074b-8ae1-4e9b-8e83-d1ce4200c9d1/61098e228df7ba3a6a8b4e920a415ad8878d386de6dd0f23f194fe1a55db189a/vs_Enterprise.exe
vs_Enterprise.exe --quiet --add Microsoft.VisualStudio.Workload.VCTools --add Microsoft.VisualStudio.Component.VC.CLI.Support --add Microsoft.VisualStudio.Component.Windows10SDK.19041 --add Microsoft.VisualStudio.Workload.UniversalBuildTools --add Microsoft.VisualStudio.Component.VC.CMake.Project
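echo Waiting for setup
rem The install continues in setup.exe after the command above returns, so wait for that process.
set "ProcessName=setup.exe"

rem Poll until tasklist no longer reports setup.exe; ping serves as a short sleep between checks.
:CheckProcess
tasklist /FI "IMAGENAME eq %ProcessName%" | find /I "%ProcessName%" >nul
if %errorlevel%==0 (
    ping 127.0.0.1 /n 5 >nul
    goto CheckProcess
)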

echo Setup completed

I'm sharing it mostly because I promised I would. The one caveat is that this would be the trial version of VS2019, but considering GitHub preinstalls Enterprise in their images for CI usage, I suspect MS won't object. It might be possible to do it more cleanly with just the build tools, but this worked in my test run.
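For readers who would rather drive this step from a script than from batch, the polling loop above can be sketched in Python. This is only an illustration of the same idea, not what KoboldCpp uses: the names `tasklist_has` and `wait_until_gone` are hypothetical, and the timeout is an addition that the batch script does not have.

```python
import subprocess
import time
from typing import Callable


def tasklist_has(image_name: str) -> bool:
    """Windows only: True if tasklist lists a process with this image name."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return image_name.lower() in result.stdout.lower()


def wait_until_gone(
    is_running: Callable[[], bool],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,  # assumption: the batch loop waits forever
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until is_running() returns False.

    Returns True once the process is gone, or False if the timeout
    expires first.
    """
    waited = 0.0
    while is_running():
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

On a Windows runner this would be called as `wait_until_gone(lambda: tasklist_has("setup.exe"))`. Passing the check and the sleep in as callables keeps the loop testable without a real installer.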

We do have a few users who threw together cheap LLM rigs from scrap parts, so for us downstream it's worth it to keep support. That's separate from whether upstream release builds are desirable.

JohannesGaessler · 2025-06-05T11:03:03Z

I have CUDA 11 installed on one of my machines and will continue maintaining support for it for the foreseeable future. I will do so regardless of CI. The condition from my end is that I want to have a clear problem description and steps to reproduce using llama.cpp/ggml only.

henk717 · 2025-06-05T13:07:18Z

That's great to hear. Naturally, we try to reproduce all KoboldCpp-related issues on llama.cpp before reporting; that's a given.


whoreson · 2025-07-17T11:37:02Z

Erm, does anybody do Windows CUDA 11 builds?.. This is the last thing I wanna dick with on this damn OS... (virtualized windows, driver change/update fuckery is... an issue)

henk717 · 2025-07-17T11:56:46Z

KoboldCpp's OldPC build is 11.5 so we can have it work on Kepler.

whoreson · 2025-07-17T21:07:46Z

> KoboldCpp's OldPC build is 11.5 so we can have it work on Kepler.

Thx I'll try that - as soon as it has plamo-2.

henk717 · 2025-07-17T21:24:13Z

You are in luck then, it seems to work fine on the version we released today.

whoreson · 2025-07-18T08:30:02Z

> You are in luck then, it seems to work fine on the version we released today.

But this doesn't seem to use the llama.cpp streaming format?..

data: {"id": "koboldcpp", "object": "text_completion", "created": 1752826704, "model": "koboldcpp/Wayfarer-12B-Q8_0", "choices": [{"index": 0, "finish_reason": null, "text": " ,"}]}

And took ~5 minutes hanging here before continuing:

Initializing CUDA/HIP, please wait, the following step may take a few minutes (only for first launch)...
---
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes

henk717 · 2025-07-18T09:19:38Z

If the OpenAI-compatible API is used, the streaming is close to OpenAI's format. If there's an important difference, we'd have to know. If you are using our UI, then it's not using OpenAI's API and the format will be different.
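As a rough illustration of handling a chunk like the one quoted above on the client side, here is a minimal Python sketch. The helper names `parse_sse_chunk` and `chunk_text` are hypothetical. The `[DONE]` sentinel and the `delta.content` fallback are assumptions based on OpenAI-style streams; neither appears in the sample from this thread.

```python
import json
from typing import Optional


def parse_sse_chunk(line: str) -> Optional[dict]:
    """Decode one 'data: {...}' line; return None for anything else."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":  # assumption: OpenAI-style end-of-stream marker
        return None
    return json.loads(payload)


def chunk_text(chunk: dict) -> str:
    """Extract the generated text from a decoded chunk."""
    choices = chunk.get("choices") or []
    if not choices:
        return ""
    first = choices[0]
    if "text" in first:  # text_completion shape, as in the quoted sample
        return first["text"] or ""
    delta = first.get("delta") or {}  # assumption: chat-style chunk shape
    return delta.get("content") or ""


# The chunk quoted earlier in this thread.
sample = (
    'data: {"id": "koboldcpp", "object": "text_completion", '
    '"created": 1752826704, "model": "koboldcpp/Wayfarer-12B-Q8_0", '
    '"choices": [{"index": 0, "finish_reason": null, "text": " ,"}]}'
)
chunk = parse_sse_chunk(sample)
print(repr(chunk_text(chunk)))
```

A client that reads `choices[0].text` this way handles the quoted chunk without special-casing KoboldCpp, which may help narrow down where any real format difference lies.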

The hang during loading is normal the first time. We only ship PTX these days, so your driver had to compile the kernels, which on CUDA 11 can take a while. Now that they are compiled, you should be good until you update either KoboldCpp or the driver.

ci : remove cuda 11.7 releases, switch runner to windows 2022

24456c6

github-actions bot added the devops (improvements to build systems and github actions) label Jun 3, 2025

ggerganov approved these changes Jun 4, 2025


slaren merged commit 2589ad3 into master Jun 4, 2025
45 checks passed

slaren deleted the sl/cuda-ci-win-2022 branch June 4, 2025 13:37

slaren mentioned this pull request Jun 4, 2025

ci: Update windows-2019 to windows-2022 #13960

Closed

furyhawk pushed a commit to furyhawk/llama.cpp that referenced this pull request Jun 6, 2025

ci : remove cuda 11.7 releases, switch runner to windows 2022 (ggml-o…

3c83b32

…rg#13997)
