Unable to inference on Quantized 70B Model using llama.cpp #2575
vatsarishabh22 asked this question in Q&A
I got an error when trying to run inference. I am using Ubuntu Linux.
Answered by klosax, Aug 10, 2023
Use the parameter `-gqa 8` for 70B models to work.
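For context, `-gqa` sets the grouped-query attention factor that llama.cpp's pre-GGUF (GGML) loaders could not read from 70B model files, so it had to be passed explicitly. A minimal invocation sketch, assuming a local quantized GGML model file (the path and prompt below are hypothetical examples, not from the original thread):

```shell
# Hypothetical model path; adjust to your own download.
# -gqa 8 tells llama.cpp the grouped-query attention factor
# required by LLaMA-2 70B models in the old GGML format.
./main -m ./models/llama-2-70b.ggmlv3.q4_0.bin -gqa 8 -p "Hello"
```

Note that later llama.cpp releases moved to the GGUF format, which stores this value in the model file itself, so the flag is only needed with older GGML files.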