Description
Add an official tutorial or guide that shows how to debug RAG / LLM pipelines built with Kedro using a structured 16-problem failure map (the WFGY ProblemMap). The guide would help users pinpoint whether a failing RAG system is caused by chunking, embeddings, the vector store, retrieval, routing, or post-processing, instead of only tuning the LLM prompt.
The change is documentation-only and does not require modifications to Kedro core.
Context
Kedro is increasingly used as the structural backbone for AI and RAG projects:
- nodes for ingestion, cleaning and chunking,
- pipelines for embeddings and vector-store updates,
- pipelines for retrieval + LLM calls + evaluation.
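In such projects, the stages above typically hand data between nodes through the Data Catalog. As a purely illustrative sketch (dataset names and filepaths are hypothetical, and the entries assume a recent kedro-datasets release):

```yaml
# conf/base/catalog.yml -- illustrative entries only
raw_docs:
  type: partitions.PartitionedDataset
  path: data/01_raw/docs
  dataset: text.TextDataset

chunked_docs:
  type: json.JSONDataset
  filepath: data/03_primary/chunked_docs.json

retrieval_results:
  type: json.JSONDataset
  filepath: data/07_model_output/retrieval_results.json
```

Persisting intermediates like `chunked_docs` and `retrieval_results` is what makes the debugging workflow below possible: each suspect stage leaves an inspectable artifact.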
When something goes wrong, users often have a working pipeline from Kedro’s point of view (no failing nodes), but the RAG behaviour is poor: hallucinations, missing context, unstable answers between runs.
Right now, there is no single Kedro guide that:
- names the typical failure modes of a RAG pipeline end-to-end, and
- explains where in a Kedro pipeline to add logging / tests / diagnostic nodes for each failure mode.
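As an illustration of the second point, a failure mode like "chunks too large or too uneven" can be caught by dropping a small diagnostic node into the pipeline right after chunking. A minimal sketch in plain Python (the function name and the chunk format, a list of strings, are assumptions; it would be wired in as an ordinary Kedro node):

```python
import logging
import statistics

logger = logging.getLogger(__name__)


def profile_chunks(chunks: list[str]) -> dict:
    """Diagnostic node: log chunk-size statistics after the chunking step.

    Returns the stats as a dict so they can also be saved via the Data
    Catalog and compared across runs.
    """
    lengths = [len(c) for c in chunks]
    stats = {
        "n_chunks": len(lengths),
        "min_len": min(lengths),
        "max_len": max(lengths),
        "mean_len": statistics.mean(lengths),
        "stdev_len": statistics.pstdev(lengths),
    }
    logger.info("chunk stats: %s", stats)
    return stats
```

Wired in as, e.g., `node(profile_chunks, inputs="chunked_docs", outputs="chunk_stats")`, this makes chunking regressions visible between runs without touching the main data flow.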
I maintain an MIT-licensed project called WFGY (~1.5k GitHub stars). One of its components is the WFGY 16-problem ProblemMap, which categorises common RAG / LLM pipeline failures (retriever behaviour, chunking, vector stores, routing, hallucinations, evaluation, etc.) and is already referenced by several curated lists and research projects. I would like to adapt this map specifically for Kedro.
Possible Implementation
- A new guide under the documentation section that covers “Debugging RAG / LLM pipelines with Kedro”.
- A simple example project with a RAG pipeline, for example:
  load_raw_docs → clean_text → chunk_docs → embed → write_to_vector_store → retrieve → call_llm → postprocess → evaluate
- A table that maps each of the 16 failure modes to:
- which Kedro nodes / datasets are relevant,
- what to log or visualise (e.g. chunk statistics, retrieval coverage, distribution of similarity scores),
- small experiments users can run (change chunking, retriever settings, evaluation dataset).
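To make the "what to log or visualise" column concrete, here is one sketch of what logging the distribution of similarity scores after the retrieve step could look like. The function and field names are hypothetical, and the retrieval result is assumed to be a list of (doc_id, score) pairs:

```python
import logging

logger = logging.getLogger(__name__)


def profile_retrieval(results: list[tuple[str, float]],
                      score_floor: float = 0.2) -> dict:
    """Diagnostic node: summarise similarity scores returned by the retriever.

    A flat or uniformly low score distribution often points at embedding or
    vector-store problems rather than at the LLM prompt.
    """
    scores = sorted(score for _, score in results)
    summary = {
        "n_results": len(scores),
        "top_score": scores[-1],
        "median_score": scores[len(scores) // 2],
        "n_below_floor": sum(1 for s in scores if s < score_floor),
    }
    logger.info("retrieval stats: %s", summary)
    return summary
```

In the proposed table, a summary like this would sit in the row for retrieval-quality failures, with the suggested experiment being to re-run after changing the retriever's top-k or the chunking settings and compare the score distributions.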
I am happy to open a PR that:
- adds the tutorial page,
- wires it into the docs navigation, and
- includes a minimal example project if that is helpful.
Possible Alternatives
- Keep this entirely as a community blog post or a separate example repo.
  This would work, but an official guide in the Kedro docs would give new users a much clearer starting point and would standardise the vocabulary for RAG failure modes across the ecosystem.
- Wait for more first-party RAG tooling in Kedro before adding such a guide.
  Even today, Kedro already orchestrates many RAG-like pipelines; the proposal is to document how to debug them using patterns that many users are already re-discovering ad hoc.