I started trying to implement SVD-LLMv2,
but I ran into trouble: neither the pseudocode, nor the math formulas, nor the written text gave me enough information to do it.
Here are my questions:
- 1/Log(LG): taking the logarithm of a loss doesn't make much sense to me. How should I understand this? If Lmin is close to 1, log(Lmin) is near zero and the value would go to near infinity.
- In the ratio formula, Lmin isn't used in the sum context. Shouldn't it use an individual loss per matrix?
- S⁻¹_s × U⁻¹_s: shouldn't this involve V from the second SVD and not the inverse of U_s?
- Vws: how can it be reconstructed? The pseudocode omits it.
- D ← W × Us × √Ss: how do these align dimensionally, especially Us and Ss (truncated?)
- In the pseudocode you use Lmin <- theoretical_loss(w,x,R), but later you set r <- Len(LG) * R * Lmin / Sum(LG), so I am missing something. Additionally, in the text you write the same:
"the compression ratio of each weight matrix within a group is determined as Len(LG) × R × Lmin / Sum(LG)"
I think r should then probably be computed in the inner loop. Is this correct?
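To make the last question concrete, here is a minimal sketch of how I currently read the allocation formula. It assumes that LG holds one score per weight matrix in the group and that Lmin in the formula stands for that matrix's own entry rather than one shared scalar. The function name `group_ratios` and the numbers are made up for illustration and are not taken from the paper.

```python
def group_ratios(scores, target_ratio):
    """Per-matrix ratios r_i = Len(LG) * R * LG[i] / Sum(LG).

    Assumption: `scores` plays the role of LG (one entry per weight
    matrix in the group) and `target_ratio` plays the role of R.
    """
    n = len(scores)
    total = sum(scores)
    return [n * target_ratio * s / total for s in scores]


if __name__ == "__main__":
    lg = [0.5, 1.0, 2.5]   # hypothetical per-matrix scores
    R = 0.4                # hypothetical target ratio for the group
    ratios = group_ratios(lg, R)
    print(ratios)
    # The mean of the per-matrix ratios equals R under this reading.
    print(sum(ratios) / len(ratios))
```

Under this reading r differs per matrix and the ratios average to R, which is why I think r belongs in the inner loop. If Lmin were a single scalar instead, every matrix in the group would get the same r.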
There are also some other things I don't quite get and would need help understanding. What is the approximate release date of the code for SVD-LLMv2?
I completely failed to write it myself from the information in the paper.