-
Nice breakdown; this kind of tuning makes a difference if the underlying logic path is already stable. But from experience with RAG systems, a lot of the issues show up after the LLM receives the chunk, not before, and the problems I've run into when debugging similar setups tend to sit upstream of the model call.
So parameter tuning is useful, but sometimes the real culprit is upstream: the retrieval format, memory boundaries, or how the prompt bridges semantic layers. Happy to compare notes if you're exploring chunk attention or logic fallback strategies.
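To make the retrieval-format / prompt-bridging point concrete, here is a minimal sketch of the formatting step I mean; the chunk fields and the character budget are hypothetical, not something taken from this thread:

```python
# Minimal sketch of a retrieval-to-prompt bridge. The chunk fields
# ("text", "source") and the character budget are hypothetical.

def build_prompt(question: str, chunks: list[dict], max_chars: int = 6000) -> str:
    """Stitch retrieved chunks into one prompt, trimming to a budget so the
    context never silently overruns the model's window."""
    parts, used = [], 0
    for chunk in chunks:
        block = f"[{chunk.get('source', 'unknown')}]\n{chunk['text']}"
        if used + len(block) > max_chars:  # crude memory-boundary guard
            break
        parts.append(block)
        used += len(block)

    context = "\n\n---\n\n".join(parts)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The separator, the per-chunk labels, and the budget each change model behaviour independently of any sampling parameters, which is why this layer is worth checking first.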
-
Hello,
I am building a RAG system with llama-cpp-python and LangChain's LlamaCpp wrapper over a few hundred PDFs of scientific information, running on a few GPUs.
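For reference, the ingestion side of the pipeline looks roughly like the sketch below; the directory path, splitter settings, and embedding model shown here are illustrative placeholders rather than the exact setup (import paths also shift a bit between LangChain versions):

```python
# Ingestion sketch for a directory of PDFs. The path, splitter settings,
# and embedding model are illustrative placeholders, not the exact setup.
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFDirectoryLoader("pdfs/").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=150
).split_documents(docs)

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
```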
I have tried to optimize the LLM's parameters to the best of my knowledge, based on information I found online.
Would these parameters seem appropriate for the intended purpose of interrogating a large set of data?
I load the model through LangChain's LlamaCpp wrapper with the parameters I arrived at.
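A representative sketch of that kind of LlamaCpp initialization is below; the model path and every numeric value are placeholders to sanity-check against, not the exact settings referred to above, since the right n_ctx, n_batch, and n_gpu_layers depend on the model size and the VRAM available:

```python
# Illustrative LlamaCpp setup for GPU offload; the path and every value
# here are placeholders, not a recommended or exact configuration.
from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="models/model.gguf",  # placeholder path to a GGUF model
    n_ctx=4096,        # context window: must hold the question plus retrieved chunks
    n_gpu_layers=-1,   # offload all layers to GPU (needs a CUDA-enabled llama.cpp build)
    n_batch=512,       # prompt-processing batch size
    temperature=0.1,   # low temperature for grounded, factual answers
    max_tokens=512,    # cap on tokens generated per answer
    top_p=0.9,
    verbose=False,
)
```

For splitting a single model across several GPUs, llama.cpp itself supports a tensor_split option; depending on the LangChain version it may be exposed directly on the wrapper or need to go through its extra model kwargs pass-through.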