Lora rank & alpha #2037
BigDataMLexplorer asked this question in Q&A · Unanswered · 1 comment, 2 replies
Hi, I'm training the Llama 3 8B model. I ran several trials with LoRA rank = 16 and different alphas (32, 16, and 8). In my case the best result came from alpha = 8. I did not use rsLoRA in these tests.
Even assuming I already have the best alpha value, could setting use_rslora=True still help in this configuration? If alpha is set to 8, what alpha is actually used with rsLoRA? I didn't quite get that from the Hugging Face article.
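For reference, a minimal sketch of the two settings for the rank 16 / alpha 8 setup described above. The r, lora_alpha, and use_rslora fields are real LoraConfig parameters in PEFT; the target_modules list and task_type here are only placeholder choices, and the scaling values are computed by hand from the alpha / r and alpha / sqrt(r) formulas rather than read out of the library:

```python
import math

from peft import LoraConfig

r, alpha = 16, 8

# Classic LoRA: the adapter output is scaled by alpha / r
classic = LoraConfig(
    r=r,
    lora_alpha=alpha,
    target_modules=["q_proj", "v_proj"],  # placeholder modules
    task_type="CAUSAL_LM",
)

# Rank-stabilized LoRA: the adapter output is scaled by alpha / sqrt(r)
rank_stabilized = LoraConfig(
    r=r,
    lora_alpha=alpha,
    use_rslora=True,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

print("classic scaling:", alpha / r)             # 8 / 16       = 0.5
print("rsLoRA scaling: ", alpha / math.sqrt(r))  # 8 / sqrt(16) = 2.0
```

So keeping lora_alpha = 8 and only switching on use_rslora is not a neutral change at rank 16: the effective scale goes from 0.5 to 2.0, the same effective scale as lora_alpha = 32 under the classic rule, so such a run would not be directly comparable to the tuned alpha = 8 runs.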
In general, I've read that a higher LoRA rank should capture more nuance, because more parameters are trained.
That's why I also tried increasing the LoRA rank to 256 and left alpha at half of that (128), with a properly tuned learning rate (otherwise the results would have been very bad). Here I did use use_rslora=True. The result was about one percentage point worse than rank 16, and it was still worse than rank 16 even when I didn't use rsLoRA.
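One thing worth noting about that comparison: the effective scale behaves very differently under the two conventions as the rank grows. The numbers below are plain arithmetic from the alpha / r and alpha / sqrt(r) formulas for the two setups described above, not measured results:

```python
import math

# (rank, lora_alpha) pairs from the experiments described above
for r, alpha in [(16, 8), (256, 128)]:
    print(
        f"r={r:<3} alpha={alpha:<3} "
        f"classic alpha/r = {alpha / r:.2f}   "
        f"rsLoRA alpha/sqrt(r) = {alpha / math.sqrt(r):.2f}"
    )

# r=16  alpha=8   classic alpha/r = 0.50   rsLoRA alpha/sqrt(r) = 2.00
# r=256 alpha=128 classic alpha/r = 0.50   rsLoRA alpha/sqrt(r) = 8.00
```

With use_rslora=True at rank 256 the adapter output is scaled 16 times more strongly than the classic 0.5, so the rank 16 vs. rank 256 comparison is not only about adapter capacity; the scaling and learning-rate choices differ as well.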
Do you think I may have already reached the optimum, or should I do something differently when using rsLoRA?
Thank you
Comment:
When using rsLoRA, the LoRA output will be scaled by a factor of lora_alpha / sqrt(r) instead of the default lora_alpha / r.
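Following directly from that formula (with a hypothetical helper of my own, not part of PEFT): the lora_alpha you pass is used as-is, only the divisor changes, so if you want use_rslora=True to reproduce an effective scale that was already tuned without it, you can solve alpha_rs / sqrt(r) = alpha / r for alpha_rs:

```python
import math

def rslora_alpha_matching_classic(classic_alpha: float, r: int) -> float:
    """lora_alpha to pass with use_rslora=True so that alpha / sqrt(r)
    matches the classic alpha / r scale tuned without rsLoRA."""
    return classic_alpha / math.sqrt(r)

print(rslora_alpha_matching_classic(8, 16))     # 2.0 -> keeps the 0.5 scale at r=16
print(rslora_alpha_matching_classic(128, 256))  # 8.0 -> keeps the 0.5 scale at r=256
```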