A10G not using VRAM after generating training split in AutoTrain #925
Unanswered
LeivurGargiulo asked this question in Q&A
Hi everyone,
I recently purchased a Hugging Face AutoTrain Space with an NVIDIA A10G (24 GB, of which ~22.49 GiB is reported as usable) to fine-tune nothingiisreal/MN-12B-Celeste-V1.9 on josecannete/large_spanish_corpus.
When I start training, AutoTrain first generates the training split. During that step, VRAM usage is basically zero (around 2.88 MiB/22.49 GiB). After the split finishes, the process just stops — no training actually begins, and GPU usage never increases.
I expected VRAM usage to spike when training started, but it seems the job never reaches that stage.
Has anyone else experienced this with AutoTrain + A10G?
Could this be an issue with:
- Dataset size or format?
- The LoRA/PEFT + quantization setup I'm using?
- Some AutoTrain pipeline bug for large models?
Any help would be appreciated. I mainly want to confirm whether this is normal behavior for the split step, and to understand why the actual training never starts.
Thanks in advance!
I want to train Celeste V1.9 to learn Spanish first, then Spanish books with PleIAs/Spanish-PD-Books, and finally Argentine Spanish with ylacombe/google-argentinian-spanish. However, I'm not sure whether my current JSON config for the Spanish corpus is correct, or how to set up the JSON configs for the later steps.
I've already wasted a ton of money and time on this, so I need to know whether there's a solution. Thanks!
This is my JSON:
I also tried this YAML:
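Since my own files may not be right, here is roughly the shape I'm trying to follow. This is only a sketch adapted from the AutoTrain Advanced llm-sft example configs, not my exact file, and the field names and values below are my best guess for this setup:

```yaml
# Sketch of an AutoTrain Advanced llm-sft config for this job.
# Adapted from the public example configs; values are guesses, not my real file.
task: llm-sft
base_model: nothingiisreal/MN-12B-Celeste-V1.9
project_name: celeste-spanish-sft     # placeholder project name
log: tensorboard
backend: local                        # assuming training runs on the Space's own A10G

data:
  path: josecannete/large_spanish_corpus
  train_split: train
  valid_split: null
  chat_template: null                 # plain-text corpus, no chat formatting
  column_mapping:
    text_column: text                 # assuming the corpus's text field is named "text"

params:
  block_size: 1024
  model_max_length: 2048
  epochs: 1
  batch_size: 1
  lr: 2e-4
  peft: true                          # LoRA via PEFT
  quantization: int4                  # 4-bit quantization so a 12B model fits in ~22 GiB
  target_modules: all-linear
  padding: right
  optimizer: paged_adamw_8bit
  scheduler: cosine
  gradient_accumulation: 8
  mixed_precision: bf16

hub:
  username: ${HF_USERNAME}
  token: ${HF_TOKEN}
  push_to_hub: true
```

As far as I understand, a config like this can be launched with `autotrain --config config.yml`, and the later steps (PleIAs/Spanish-PD-Books, then ylacombe/google-argentinian-spanish) would reuse the same structure with a different `data.path`. Please correct me if any of these fields are wrong for an AutoTrain Space.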