I have a 6 GB graphics card. I am using:

The first query is answered without a problem, but on the second query I get errors similar to the following:

Enter a query: what is the power of the congress

How can I prevent this error?
Replies: 1 comment 3 replies
One recommendation I have is to change the embedding model. For starters, try the

The issue you are facing comes from llamacpp and seems to be a common one (here, and here).
The LLM will still use the GPU for generation; this simply changes the embedding model. This will improve your GPU utilization.