7B HF models take more than 12 GB of memory, so is there a way to use GPTQ ones?

Replies: 4 comments 1 reply
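For context on the memory figure in the question, a rough weight-only estimate (a sketch; real usage also includes activations, the KV cache, and per-layer quantization overhead):

```python
# Back-of-the-envelope weight sizes for a 7B-parameter model.
params = 7e9

fp16_gib = params * 2 / 1024**3          # 2 bytes per weight in float16
gptq_4bit_gib = params * 0.5 / 1024**3   # ~0.5 bytes per weight at 4-bit

print(f"fp16 weights:  ~{fp16_gib:.1f} GiB")        # ~13 GiB, over a 12 GB card
print(f"4-bit weights: ~{gptq_4bit_gib:.1f} GiB")   # ~3.3 GiB
```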
- It's a good idea. I am testing them and also looking at adding support for different embeddings.
- Just saw the new accelerate update. I think it will now be possible.
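Presumably this refers to Accelerate's big-model loading, where `device_map="auto"` can spill layers to CPU RAM or disk when the GPU is too small. A minimal sketch, assuming recent transformers and accelerate; the model id and memory limits are placeholders, not values from this thread:

```python
# Minimal sketch: dispatch a large checkpoint across GPU, CPU RAM, and disk.
from transformers import AutoModelForCausalLM

model_id = "huggyllama/llama-7b"  # placeholder checkpoint

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                        # Accelerate decides layer placement
    max_memory={0: "10GiB", "cpu": "24GiB"},  # per-device caps (placeholder values)
    offload_folder="offload",                 # spill-over weights written here
    torch_dtype="auto",
)
```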
- Support for GPTQ models would be awesome.
- You can use GPTQ now.
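A minimal, generic sketch of loading a pre-quantized GPTQ checkpoint through transformers, assuming the optimum and auto-gptq backends are installed. The repo id below is an illustrative example, not one named in this thread, and this is not necessarily how the project itself wires it up:

```python
# Minimal sketch: run a 4-bit GPTQ checkpoint that fits on a 12 GB GPU.
# Requires: pip install transformers optimum auto-gptq  (and a CUDA GPU)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Llama-2-7B-Chat-GPTQ"  # example pre-quantized repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain GPTQ quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```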