ci : remove cuda 11.7 releases, switch runner to windows 2022 #13997
|
There are older GPUs that some users have which can't run CUDA 12 (they are stuck on 11.4). It's a minority, but I do suspect CUDA 11 support as a whole will be at risk if it is no longer tested. It's the main reason we ship CUDA 11 downstream. Is this still covered somewhere going forward, for example by a Linux-based CI? I do plan to look into this issue downstream, as we want to keep making CUDA 11 builds. The working theory is that Windows Server 2022 is fine but Visual Studio needs downgrading. If that's desirable and we have something, I can share it here. |
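To make the "stuck on 11.4" constraint concrete, here is a minimal sketch of choosing a build from the highest CUDA version the installed driver reports (the "CUDA Version" field that `nvidia-smi` prints). The build labels `cuda12` and `cuda11` and the function names are hypothetical, and the check deliberately looks only at the major version, ignoring minor-version and forward-compatibility caveats.

```python
def parse_version(text: str) -> tuple[int, int]:
    """Turn a version string such as "11.4" into (11, 4)."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def pick_build(driver_cuda_version: str) -> str:
    """Pick a build label from the driver's maximum supported CUDA version.

    "cuda12" and "cuda11" are hypothetical labels, not real artifact names.
    Assumption: only the major version matters for this coarse check.
    """
    major, _minor = parse_version(driver_cuda_version)
    if major >= 12:
        return "cuda12"
    if major == 11:
        return "cuda11"
    return "unsupported"


if __name__ == "__main__":
    # A driver that tops out at CUDA 11.4 cannot load a CUDA 12 build.
    print(pick_build("11.4"))
    print(pick_build("12.2"))
```

A user whose driver cannot be upgraded past the 11.x series would land on the `cuda11` branch, which is the case the downstream CUDA 11 builds exist for.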
|
The releases are already failing for several hours every day due to the scheduled brownouts, so we need to fix this now regardless. I am not sure that it would be worth re-adding CUDA 11 releases for two reasons: first, even if we can make a release that supports very old GPUs (over 10 years old), the performance is not likely to be good enough to be usable in practice. Second, we would still need a CUDA 12 release because some features depend on it. Most people know nothing about CUDA versions, and offering multiple versions is likely to create confusion while adding very little value. @JohannesGaessler it should be possible to change the linux CI to a different CUDA version if you are interested in keeping build tests for CUDA 11. |
|
I am less concerned about the releases, as we can do those downstream. My worry is that regressions in CUDA 11 will get missed if there are no tests for it anymore. Do the tests remain? |
|
Since it would mainly be @JohannesGaessler who would have to maintain support for CUDA 11, I will leave it up to him to decide whether to add CI tests for it. It would only test whether the build completes, though; there were never any actual tests being run on CUDA 11. |
|
I don't know how suitable this is for upstream, but for KoboldCpp we will continue building by installing Visual Studio 2019 again. I'm sharing it mostly because I promised I would. The one caveat is that this would be the trial version of VS2019, but considering GitHub preinstalls Enterprise in their images for CI usage, I suspect MS won't object. It might be possible to do it more cleanly with just the build tools, but this worked in my test run. We do have a few users who threw together cheap LLM rigs from scrap parts, so for us downstream it's worth it to keep support; that's separate from whether upstream release builds are desirable. |
|
I have CUDA 11 installed on one of my machines and will continue maintaining support for it for the foreseeable future. I will do so regardless of CI. The condition from my end is that I want to have a clear problem description and steps to reproduce using llama.cpp/ggml only. |
|
That's great to hear. Naturally, we try to reproduce all KoboldCpp-related issues on llama.cpp before reporting; that's a given. |
|
Erm, does anybody do Windows CUDA 11 builds?.. This is the last thing I wanna dick with on this damn OS... (virtualized windows, driver change/update fuckery is... an issue) |
|
KoboldCpp's OldPC build is 11.5 so we can have it work on Kepler. |
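As a rough illustration of why a CUDA 11.x toolkit is needed for Kepler, the sketch below maps a GPU's compute capability to the CUDA major versions that can still target it. It assumes the commonly documented cutoffs (CUDA 11 dropped sm_30/sm_32 but kept sm_35 and newer; CUDA 12 removed Kepler entirely, leaving sm_50 as the minimum) and models only the lower bound, not which toolkit first added support for newer GPUs. The function name is hypothetical.

```python
def toolkit_majors_for(compute_capability: tuple[int, int]) -> list[int]:
    """Return the CUDA major versions (11 and/or 12) that can target a GPU.

    Assumption: only minimum-architecture cutoffs are modelled here.
    """
    majors = []
    if compute_capability >= (3, 5):
        # CUDA 11.x still supports Kepler sm_35 / sm_37 (marked deprecated).
        majors.append(11)
    if compute_capability >= (5, 0):
        # CUDA 12.x starts at Maxwell (sm_50); Kepler support was removed.
        majors.append(12)
    return majors


if __name__ == "__main__":
    print(toolkit_majors_for((3, 5)))  # Kepler
    print(toolkit_majors_for((5, 2)))  # Maxwell
```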
Thx I'll try that - as soon as it has plamo-2. |
|
You are in luck then, it seems to work fine on the version we released today. |
But this doesn't seem to use the llama.cpp streaming format?.. And took ~5 minutes hanging here before continuing: |
|
The streaming, if the OpenAI API is used, is close to OpenAI's format. If there's an important difference, we'd have to know. If you are using our UI, then it's not using OpenAI's API and it will be different. The hanging during loading is normal the first time: we only ship PTX these days, so your driver had to compile it, which on CUDA 11 can take a bit. Now that the compiled kernels are cached, you should be good until you update either KoboldCpp or the driver.
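The one-time delay comes from the driver JIT-compiling PTX and storing the result in its compute cache. As a hedged sketch, the helper below locates that cache and reports its size, which is one way to check whether the first-run compilation has already happened. It assumes the default locations described in NVIDIA's documentation (`%APPDATA%\NVIDIA\ComputeCache` on Windows, `~/.nv/ComputeCache` on Linux) and the `CUDA_CACHE_PATH` override; the function names are hypothetical, and macOS is not handled.

```python
import os
import sys
from pathlib import Path


def compute_cache_dir(env=None, platform=None) -> Path:
    """Return the directory where the driver is assumed to cache JIT output."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    override = env.get("CUDA_CACHE_PATH")
    if override:
        # An explicit override takes precedence over the defaults.
        return Path(override)
    if platform.startswith("win"):
        # Falls back to a relative path if APPDATA is unset.
        return Path(env.get("APPDATA", "")) / "NVIDIA" / "ComputeCache"
    return Path.home() / ".nv" / "ComputeCache"


def cache_size_bytes(cache_dir: Path) -> int:
    """Sum the sizes of all files under cache_dir (0 if it does not exist)."""
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    directory = compute_cache_dir()
    print(directory, cache_size_bytes(directory))
```

An empty or missing cache directory would suggest the next model load will pay the compilation cost again, as it does after a driver or application update.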
CUDA 11.7 does not work on the Windows 2022 runners, and the Windows 2019 runners are being removed. At this point I don't think the 11.7 releases are necessary; everybody should already have a driver compatible with CUDA 12.