
[Feature Request] Add a small "Multimodal RAG failure modes" checklist (docs only) #207

@onestardao

Do you need to file a feature request?

  • I have searched the existing feature requests and this feature request is not already filed.
  • I believe this is a legitimate feature request, not just a question or bug.

Feature Request Description

Hi RAG-Anything team, thanks for releasing this all-in-one multimodal document RAG system. It is very helpful for real-world use cases that mix text, tables, and images.

I have been working on RAG reliability and I often see the same kinds of problems in multimodal pipelines, for example:

  • OCR or layout parsing errors that silently corrupt the corpus
  • table structure lost during preprocessing
  • misalignment between image regions and textual descriptions
  • retrieval that focuses only on text while the answer actually depends on images or tables
  • evaluation that only looks at text and ignores multimodal context and grounding

I would like to propose a very small, documentation-only feature:

Feature

Add a short doc page called something like multimodal_rag_failure_modes.md and link it from the README or a Troubleshooting / Best Practices section.

The page would contain:

  1. A list of common multimodal failure modes (OCR/layout issues, table structure loss, image-text misalignment, retrieval biased toward one modality).
  2. Simple sanity checks for each one, for example:
    • visually inspect a few processed documents
    • check index size and embeddings per modality
    • run a few probe queries that should clearly depend on images or tables.
  3. A short checklist for bug reports:
    • which modality failed
    • what preprocessing pipeline was used
    • a minimal example document and query
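
To make the second and third sanity checks concrete, here is a minimal sketch of what the page could include. It assumes each indexed chunk can be exported as a dict with a `modality` field and that retrieval can be wrapped in a `retrieve(query)` callable returning such dicts. Both are hypothetical stand-ins for illustration, not RAG-Anything's actual API.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence

# Hypothetical chunk layout: {"modality": "text" | "table" | "image", ...}
Chunk = dict


def modality_counts(chunks: Iterable[Chunk]) -> Counter:
    """Count indexed chunks per modality; chunks without the field count as 'unknown'."""
    return Counter(c.get("modality", "unknown") for c in chunks)


def missing_modalities(
    chunks: Iterable[Chunk],
    expected: Sequence[str] = ("text", "table", "image"),
) -> list:
    """Return the expected modalities that have zero chunks in the index."""
    counts = modality_counts(chunks)
    return [m for m in expected if counts[m] == 0]


def probe_hits(
    retrieve: Callable[[str], Sequence[Chunk]],
    probes: Iterable[tuple],
    top_k: int = 5,
) -> dict:
    """For each (query, expected_modality) probe, report whether any of the
    top_k retrieved chunks has the expected modality."""
    results = {}
    for query, expected in probes:
        hits = retrieve(query)[:top_k]
        results[query] = any(h.get("modality") == expected for h in hits)
    return results


# Toy data: an index where image chunks never made it through preprocessing.
demo_chunks = [
    {"id": 1, "modality": "text", "content": "Revenue grew in 2023."},
    {"id": 2, "modality": "text", "content": "The chart shows quarterly revenue."},
    {"id": 3, "modality": "table", "content": "Q1 | 10\nQ2 | 12"},
]


def text_only_retrieve(query: str) -> list:
    """Toy retriever that only ever returns text chunks (a biased retriever)."""
    return [c for c in demo_chunks if c["modality"] == "text"]


print(modality_counts(demo_chunks))
print(missing_modalities(demo_chunks))
print(probe_hits(text_only_retrieve, [("What is the Q2 value in the table?", "table")]))
```

A missing modality or a probe that never surfaces its expected modality does not prove a bug, but it points to which stage (parsing, indexing, or retrieval) to inspect first.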

Why it helps

  • Many users will run RAG-Anything on noisy, real-world PDFs and mixed documents.
  • A shared failure mode checklist can save time when debugging and reduce repeated “it does not work” questions.
  • This is documentation-only, so it is low risk and easy to review or adjust.

If you think this is useful and in scope, I am happy to draft a PR with a concise version that follows your documentation style.

Additional Context

I recently contributed a small robustness-related entry to Harvard MIMS Lab’s ToolUniverse project, so I am used to keeping these kinds of checklists focused and easy to maintain. Happy to do the same here if you think it fits the roadmap.

Labels: enhancement
