Is RAG fundamentally mismatched with how LLMs want to think? #32257
Replies: 2 comments
-
Thank you for articulating this so well. It's a frustration I think many of us have felt, and it's reassuring to know we're not alone in feeling like we're fighting the architecture sometimes. The perspective I've found helpful personally is the one you hinted at: seeing RAG less as a patch and more as scaffolding for the LLM's reasoning. To deal with that "duct tape" feeling in my own work, I've been experimenting in a couple of areas, and I'm sure you've navigated these waters as well:

- Content-aware chunking: splitting on semantic boundaries instead of at a fixed token count, so the chunks stop fighting the document's own structure (rough sketch below).
- Re-ranking retrieved chunks before they reach the prompt, so retrieval quality isn't entirely hostage to chunking decisions made weeks earlier.
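Here's the rough shape of the chunking idea, as a minimal sketch: start a new chunk wherever two adjacent sentences drift apart semantically. The model name and the 0.5 threshold are placeholders, not recommendations.

```python
# Minimal content-aware chunking sketch: start a new chunk wherever two
# adjacent sentences drift apart semantically, instead of cutting at a
# fixed token count. Model name and threshold are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_chunks(sentences: list[str], threshold: float = 0.5) -> list[str]:
    embs = model.encode(sentences, normalize_embeddings=True)
    chunks, current = [], [sentences[0]]
    for prev, cur, sent in zip(embs, embs[1:], sentences[1:]):
        if float(np.dot(prev, cur)) < threshold:  # cosine sim of unit vectors
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks
```

A common refinement is to compare each new sentence against the running centroid of the current chunk rather than just its immediate neighbor, but the neighbor version is easier to read.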
I don't believe there's a single "master plan" out there, but my personal take is that the field is slowly moving towards Adaptive RAG, where how much (and whether) you retrieve becomes a per-query decision instead of a fixed pipeline setting; there's a rough sketch of that below. Your question is a perfect example of the critical thinking that pushes us all in that more elegant direction.
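A toy sketch of the adaptive idea, with the query classifier stubbed out since that part is the open design choice (the `index.search(query, k)` interface is assumed for illustration):

```python
# Toy sketch of adaptive retrieval: choose a retrieval strategy per query
# instead of one fixed top-k for everything. The three-way split and the
# classify_complexity heuristic are illustrative assumptions, not a spec.
def classify_complexity(query: str) -> str:
    """Placeholder: in practice a small trained classifier or an LLM call."""
    if len(query.split()) < 6:
        return "simple"
    if " and " in query or "compare" in query.lower():
        return "multi_hop"
    return "standard"

def adaptive_retrieve(query: str, index) -> list[str]:
    kind = classify_complexity(query)
    if kind == "simple":
        return []                        # let the model answer unaided
    if kind == "standard":
        return index.search(query, k=4)  # one retrieval pass
    # Multi-hop: retrieve, then retrieve again conditioned on the first hit.
    first = index.search(query, k=4)
    follow_up = query + " " + first[0] if first else query
    return first + index.search(follow_up, k=4)
```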
-
Thanks a lot for such a thoughtful reply; I really appreciate the clarity in how you laid out the adaptive mindset around chunking and retrieval. You're absolutely right: treating RAG as a scaffold rather than a fix-all patch seems to be the most sustainable approach so far.

Totally agree that standard fixed-size chunking feels brittle. I've tried content-aware chunking too, but even then I kept running into what felt like an architectural dissonance between how the model reasons and how the chunks are presented. That led me to wonder: is there a deeper layer we're missing, beneath chunking and re-ranking, that might align better with how LLMs actually form semantic continuity?

In my experiments, I ended up building a small reasoning engine on top of vanilla LLMs, not to replace RAG but to intercept and stabilize semantic context before it collapses. I know that sounds a bit abstract, but it solved some of the deeper retrieval failures that chunk-aware indexing couldn't fix.

Anyway, I really appreciate your response. This kind of back-and-forth is what I hoped for when I wrote the post. If you're ever interested in exploring some of the math or failure modes I've been documenting, I'd be happy to share what I've got. Thanks again!
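P.S. To make "intercept and stabilize semantic context" a little less abstract, here's a toy illustration of one small piece of the idea: regrouping retrieved chunks by source document and restoring their original order, so the model sees contiguous passages instead of similarity-shuffled fragments. The `Chunk` shape is just an assumed structure for the sketch, and the real engine does considerably more than this.

```python
# Toy illustration of "stabilizing" retrieved context: regroup hits by
# source document and restore document order, so the prompt shows
# contiguous passages instead of similarity-shuffled fragments.
# The Chunk dataclass is an assumed shape for the sketch.
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Chunk:
    doc_id: str
    position: int  # chunk index within its source document
    text: str

def stabilize(hits: list[Chunk]) -> str:
    ordered = sorted(hits, key=lambda c: (c.doc_id, c.position))
    blocks = []
    for doc_id, group in groupby(ordered, key=lambda c: c.doc_id):
        passage = "\n".join(c.text for c in group)
        blocks.append(f"[source: {doc_id}]\n{passage}")
    return "\n\n".join(blocks)
```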
-
Hey folks — not trying to start a war here, but I’m genuinely confused.
Been trying to ship something with RAG for a few weeks now, and the more I do it, the more I feel like I’m… fighting the architecture itself.
- Chunks aren’t natural to LLMs.
- Indexing sounds great but dies on context coherence.
- Retrieval is only as good as the chunking you hacked up 3 weeks ago.
- Post-processing is duct tape over duct tape.
Is this normal??
Like, are we all just pretending this is fine, or am I missing some master plan?
Not trying to sell anything — just honestly wondering if others feel the same.
If anyone has cracked this in a way that feels elegant (not just “it works if you babysit it”), I’d love to hear it.
Take your time, no rush.