Conversation
@BlinkDL Do I have to quantize all the weight matrices?
@3outeille Yes, do it for all weight matrices (ignore the time_xxx parameters).
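A minimal sketch of what that selection rule could look like; the helper name and the `ndim == 2` heuristic are my own illustrative assumptions, not code from this PR:

```python
import torch

def quantizable_params(model: torch.nn.Module):
    """Yield (name, tensor) for 2-D weight matrices, skipping RWKV's
    time_* parameters (time_mix, time_decay, time_first, ...)."""
    for name, param in model.named_parameters():
        if "time_" in name:   # per the advice above: ignore time_xxx
            continue
        if param.ndim == 2:   # only matrix weights get quantized
            yield name, param
```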
@BlinkDL Do you happen to have a reference perplexity measure (or any other metric) I can use as a baseline?
Use the LAMBADA ppl in https://github.com/BlinkDL/ChatRWKV/blob/main/v2/benchmark.py
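For reference, a hedged sketch of how a LAMBADA perplexity baseline could be computed with the HuggingFace stack. Note that BlinkDL's benchmark.py scores prediction of the final word, while this simplification measures whole-sequence perplexity; the model id is just an example checkpoint:

```python
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("RWKV/rwkv-4-169m-pile")
model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-4-169m-pile").eval()

nll, count = 0.0, 0
for ex in load_dataset("lambada", split="test").select(range(200)):
    ids = tok(ex["text"], return_tensors="pt").input_ids
    with torch.no_grad():
        # Shifted cross-entropy over the sequence = average NLL per token.
        loss = model(ids, labels=ids).loss
    nll += loss.item() * (ids.shape[1] - 1)
    count += ids.shape[1] - 1

print("ppl:", math.exp(nll / count))
```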
Question: would we expect a huge improvement in perplexity if we did quantization-aware training?
@meditans QAT would probably yield a big improvement, but it implies re-training your model, whereas GPTQ uses a post-training quantization strategy (no re-training involved).
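To make the distinction concrete, here is a toy sketch (my own, not from this PR) of round-to-nearest 4-bit quantization used two ways: applied once after training (the PTQ setting GPTQ operates in, though GPTQ itself is smarter and minimizes layer output error), and fake-quantized inside a training step with a straight-through estimator, as QAT would do:

```python
import torch

def quantize_4bit(w: torch.Tensor) -> torch.Tensor:
    """Round-to-nearest symmetric 4-bit quantization with a per-tensor scale."""
    scale = w.abs().max() / 7                      # int4 range [-8, 7]
    return (w / scale).round().clamp(-8, 7) * scale

# Post-training quantization: quantize the trained weights, no gradients flow.
w = torch.randn(512, 512)
w_ptq = quantize_4bit(w)

# Quantization-aware training: fake-quantize in the forward pass; the
# straight-through estimator lets gradients update the fp weights underneath.
w = torch.randn(512, 512, requires_grad=True)
w_fake = w + (quantize_4bit(w) - w).detach()       # forward quantized, backward identity
loss = (w_fake @ torch.randn(512, 1)).pow(2).mean()
loss.backward()                                     # w.grad exists: training adapts to quantization
```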
(force-pushed from f4584b4 to 76d937b, Tue May 2 18:17:57 2023 +0000)
How's it going :) Are you on Discord?
Yep, I sent a message in the quantization channel on Discord.
Hi. Is it available now? |
@Evilran Hi, making it work with ChatRWKV is too much of a hassle because it requires changing the RWKV class too much, so the PR will not be accepted. However, I made it work with the HuggingFace version of RWKV if you want: https://github.com/3outeille/GPTQ-for-RWKV
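I have not verified GPTQ-for-RWKV's exact entry point, but the reason the HuggingFace port is easier to patch can be seen directly: its attention/FFN projections are plain `nn.Linear` modules, which is exactly what a per-layer GPTQ pass operates on. The model id below is just an example checkpoint:

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-4-169m-pile")

# Collect the nn.Linear modules a GPTQ pass would quantize layer by layer.
linear_layers = {name: m for name, m in model.named_modules()
                 if isinstance(m, nn.Linear)}
print(f"{len(linear_layers)} Linear layers eligible for GPTQ, e.g.:")
print(list(linear_layers)[:5])
```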
This is a work in progress and serves as the main thread for any questions related to this topic.