fine tuning on custom dataset taking ages to progress #247
Closed · bijucyborg started this conversation in General · 1 comment
---
OK, it was the parent id matching that was causing the training to be slow. Since the oasst dataset also uses parent id matching, I would need to investigate why fine-tuning with my dataset is so much slower than with oasst. I made up the id and parent id columns based on some assumptions, so there is a high likelihood that something is wrong there.
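For anyone hitting the same thing, here is a minimal sketch of a sanity check for the id / parent_id columns (the file name and column names are placeholders, not the actual notebook's; adjust to the real dataset). It looks for dangling parent references, duplicate ids, and unexpectedly long or cyclic parent chains, any of which could make parent id matching slow:

```python
# Sanity-check the id / parent_id columns of a conversational dataset.
# File name and column names are placeholders -- adjust to the real dataset.
import pandas as pd

df = pd.read_csv("my_dataset.csv")

ids = set(df["id"])
parents = df["parent_id"].dropna()

# Every parent_id should point at an existing id.
dangling = parents[~parents.isin(ids)]
print(f"dangling parent ids: {len(dangling)}")

# Ids should be unique, otherwise each parent lookup can match many rows.
print(f"duplicate ids: {df['id'].duplicated().sum()}")

# Walk each row up its parent chain to spot cycles or very long chains.
parent_of = dict(zip(df["id"], df["parent_id"]))
max_depth = 0
for start in df["id"]:
    seen, node, depth = set(), start, 0
    while node in parent_of and pd.notna(parent_of[node]):
        if node in seen:
            print(f"cycle detected starting at id {start}")
            break
        seen.add(node)
        node = parent_of[node]
        depth += 1
    max_depth = max(max_depth, depth)
print(f"longest parent chain: {max_depth} hops")
```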
---
Hi,
So I powered up an Ubuntu VM with GPU access:
GPU #1 - current utilization: 0.0% - VRAM usage: 4.1 GB / 16.0 GB - NVIDIA A16-16Q
GPU #2 - current utilization: 0.0% - VRAM usage: 4.1 GB / 16.0 GB - NVIDIA A16-16Q
I then created a dataset according to the specifications:
https://www.kaggle.com/code/bijucyborg/amzn-top-cellphones-q-a-for-h2o-llm-studio
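Roughly the structure I was aiming for is sketched below (a simplified sketch only, not the actual notebook code; the column names `instruction` / `output` and the chaining convention for `parent_id` are my assumptions modelled on the oasst sample that ships with H2O LLM Studio):

```python
# Simplified sketch of building an oasst-style id / parent_id structure.
# Column names ("instruction", "output") and the chaining convention are
# assumptions modelled on the oasst sample; adjust to the real specification.
import uuid
import pandas as pd

conversations = [
    # Each inner list is one conversation, ordered by turn.
    [
        {"question": "Does this phone support 5G?", "answer": "Yes, sub-6 GHz 5G is supported."},
        {"question": "And does it have dual SIM?", "answer": "Yes, dual nano-SIM."},
    ],
    [
        {"question": "What is the battery capacity?", "answer": "4500 mAh."},
    ],
]

rows = []
for convo in conversations:
    parent_id = None                     # first turn of a conversation has no parent
    for turn in convo:
        turn_id = str(uuid.uuid4())
        rows.append({
            "id": turn_id,
            "parent_id": parent_id,      # link this turn to the previous one
            "instruction": turn["question"],
            "output": turn["answer"],
        })
        parent_id = turn_id              # next turn chains onto this one

pd.DataFrame(rows).to_csv("amzn_cellphones_qa.csv", index=False)
```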
If I start a fine-tuning job with 50 to 100 rows, the fine-tuning completes in about 24 seconds.
If I run fine-tuning on the oasst dataset, it completes in roughly 15 minutes.
I'm using the Pythia 1B parameter model to conduct these experiments.
But if I include 250 or 500 rows, the training starts but takes forever to even initialise:
```
2023-07-05 12:28:29,736 - INFO: Training Epoch: 1 / 1
2023-07-05 12:28:29,737 - INFO: train loss: 0%| | 0/123 [00:00<?, ?it/s]
```
If I look at the resource utilization, the CPU is maxed out at 100% while the GPU is not being utilised at all.
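To rule out the GPUs simply not being visible from inside the training environment, I used a quick, generic check (not specific to H2O LLM Studio):

```python
# Quick check that PyTorch inside the VM can actually see the GPUs.
# If this prints False or 0, training silently falls back to the CPU.
import torch

print("CUDA available:", torch.cuda.is_available())
print("Device count:", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
```

Both GPUs do show up with VRAM allocated, as listed above, so the question is why the run stays CPU-bound.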
What could I be doing wrong, and where should I be looking to find out why this is happening? The dataset is the prime suspect, but since reducing the number of rows makes the training super fast, I believe it has more to do with quantity than quality.
However, since oasst, which is 8000 rows, also works like a charm, I'm confused about what could be wrong.
I'd appreciate any clues to make this work. Thanks in advance.