Where can I learn about the different models, types of models and fine-tuning? #423
dportabella started this conversation in General · 2 comments · 1 reply
- I have an NVIDIA GeForce RTX 4070 with 12 GB of VRAM. The best models I can use are the 7B GPTQ 8-bit models, right?
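A back-of-the-envelope sketch (not from the thread) for reasoning about what fits in 12 GB; the 20% overhead factor for activations and KV cache is an assumption:

```python
# Rough VRAM estimate for a quantized model: weights at the given bit
# width, plus an assumed ~20% overhead for activations and KV cache.
def approx_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    weight_gb = params_billion * bits / 8  # 7B params at 8-bit ~= 7 GB of weights
    return weight_gb * overhead

for params, bits in [(7, 8), (7, 4), (13, 8), (13, 4)]:
    print(f"{params}B @ {bits}-bit: ~{approx_vram_gb(params, bits):.1f} GB")
```

By this estimate a 7B model at 8-bit (~8.4 GB) fits comfortably in 12 GB, and a 13B model at 4-bit (~7.8 GB) would too, while a 13B model at 8-bit (~15.6 GB) would not.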
- @dportabella Hi, what is the response time of the best model you have tried so far on your 12 GB GeForce?
- I have some books in PDF and I want to query them using localGPT.
I have a machine running Proxmox (server virtualization). Inside Proxmox, I have a VM with 24 GB of RAM, 20 processors (1 socket, 20 cores), and an NVIDIA GeForce RTX 4070 with 12 GB of VRAM (passed through directly from the host to the VM). Is this a good setup for fast, high-quality replies?
Which is the most powerful model available for this task?
Where can I learn about the different models and fine-tuning?
"TheBloke/Llama-2-7B-Chat-GGML"
"TheBloke/vicuna-7B-1.1-HF"
"TheBloke/Wizard-Vicuna-7B-Uncensored-HF"
"TheBloke/guanaco-7B-HF"
"NousResearch/Nous-Hermes-13b"
I see that GPTQ models are for small devices such as phones, so I don't need those.
Should I use HF models or GGML (quantized CPU+GPU+MPS) models?
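For reference (not from the thread): GPTQ is GPU-oriented quantization rather than a phone format, while GGML is the CPU-first format whose loaders can offload part or all of the layers to the GPU. A minimal sketch of the GGML path on a 12 GB card, assuming a file from the TheBloke/Llama-2-7B-Chat-GGML repo listed above and an older llama-cpp-python release (later versions dropped GGML in favor of GGUF); the exact file name and layer count are assumptions:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama  # requires a llama-cpp-python version with GGML support

# Fetch one quantized variant from the repo (file name is an assumption;
# check the repo's file list: e.g. q4_0 vs q5_1 trade size for quality).
model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGML",
    filename="llama-2-7b-chat.ggmlv3.q4_0.bin",
)

# n_gpu_layers sets the CPU/GPU split: 0 keeps everything on the CPU,
# a value >= the model's layer count offloads all layers to the GPU.
llm = Llama(model_path=model_path, n_ctx=2048, n_gpu_layers=35)

out = llm("Q: What does GGML quantization trade off? A:", max_tokens=128)
print(out["choices"][0]["text"])
```

In localGPT itself this choice is made by editing MODEL_ID / MODEL_BASENAME in constants.py, then running python ingest.py to index the PDFs and python run_localGPT.py --device_type cuda to query them; those names come from the PromtEngineer/localGPT repo's README rather than this thread, so verify them against the version you run.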