Few questions on the SVD-LLM2 paper #46

I was starting to implement SVD-LLMv2, but I ran into trouble: neither the pseudocode, the math formulas, nor the written text gave me enough information to do it.

Here are my questions:
- 1/Log(LG): taking the logarithm of a loss doesn't make much sense to me. How should I understand this? If Lmin is small, Log(Lmin) becomes hugely negative, and if it is close to 1 the reciprocal goes toward infinity.
- The ratio formula uses Lmin outside of the sum. Shouldn't it use an individual loss per matrix?
- S⁻¹_s × U⁻¹_s <- shouldn't this involve V from the second SVD rather than the inverse of U_s?
- Vws: how can it be reconstructed? The pseudocode omits it.
- D ← W × Us × √Ss: how do these align dimensionally, especially Us and Ss (are they truncated)?
- In the pseudocode you use Lmin <- theoretical_loss(w,x,R), but later you set r <- Len(LG) * R * Lmin / Sum(LG), so I am missing something. The text says the same:
"the compression ratio of each weight matrix within a group is determined as Len(LG) × R × Lmin / Sum(LG)"
I think r should then probably be computed in the inner loop. Is this correct?
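
To make the last two points concrete, here is a minimal sketch of the two readings I can see for the ratio formula. This is only my interpretation, not the paper's implementation: the function names, the example losses, the natural logarithm, and taking Lmin as the minimum over the group are all my own assumptions.

```python
import math


def reciprocal_log(losses):
    """Apply 1 / log(L) to each loss (natural log is an assumption)."""
    return [1.0 / math.log(loss) for loss in losses]


def ratio_shared(weights, target_ratio):
    """Reading A (assumed): Lmin is one value per group, here min(weights).

    Every matrix in the group then receives the same ratio r.
    """
    r = len(weights) * target_ratio * min(weights) / sum(weights)
    return [r] * len(weights)


def ratio_per_matrix(weights, target_ratio):
    """Reading B (assumed): the numerator is the matrix's own value.

    r is computed inside the loop, once per matrix.
    """
    total = sum(weights)
    return [len(weights) * target_ratio * w / total for w in weights]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical per-matrix losses for one group; not taken from the paper.
    group_losses = [2.0, 4.0, 10.0]
    target_ratio = 0.2  # R

    weights = reciprocal_log(group_losses)
    print("weights 1/log(L):", weights)
    print("reading A:", ratio_shared(weights, target_ratio))
    print("reading B:", ratio_per_matrix(weights, target_ratio))

    # 1/log(L) is negative for L < 1 and grows without bound as L -> 1.
    print("1/log(0.5)   :", reciprocal_log([0.5])[0])
    print("1/log(1.0001):", reciprocal_log([1.0001])[0])
```

With reading B the ratios average exactly to R, because the per-matrix values sum to Sum(LG). With reading A every matrix gets the same r, and the group average can only be at or below R, which is why I suspect r belongs in the inner loop.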

There are also some other things I don't quite get and would need help understanding. What is the approximate release date of the code for SVD-LLMv2?
I was unable to implement it myself from the information in the paper.


