refactor: Implement RAG Pipeline for AI Chatbot to Replace Full-Context Injection #22

@priyansh-narang2308

Description

@priyansh-narang2308

Overview

The chatbot currently injects the full documentation into the system prompt via the INCLUDE_DOCS_CONTEXT flag. This works for testing, but it is inefficient: every request carries the entire doc set, which is slow and wastes tokens.

Goal

Replace full-doc injection with a RAG (retrieval-augmented generation) setup so each request only carries the chunks relevant to the query, cutting token usage and improving response time.

Plan

  • Parse all .md files from user_docs
  • Split content into logical chunks
  • Generate embeddings per chunk
  • Store them in a simple local vector index
  • On each query:
    • Run semantic search
    • Fetch top 3–5 relevant chunks
    • Inject only those into the prompt
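The steps above can be sketched end to end. This is a minimal, dependency-free illustration, not the proposed implementation: the `embed` function here is a hashed bag-of-words placeholder standing in for a real embedding model, and the function names (`chunk_markdown`, `build_index`, `retrieve`) are hypothetical, not existing code in this repo.

```python
import hashlib
import math
import re
from pathlib import Path

DIM = 256  # toy embedding dimensionality; a real embedding model would replace this


def chunk_markdown(text, max_chars=800):
    """Split a markdown document into logical chunks on heading boundaries."""
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for part in parts:
        part = part.strip()
        # further split oversized sections on blank-line boundaries
        while len(part) > max_chars:
            cut = part.rfind("\n\n", 0, max_chars)
            if cut <= 0:
                cut = max_chars
            chunks.append(part[:cut].strip())
            part = part[cut:].strip()
        if part:
            chunks.append(part)
    return chunks


def embed(text):
    """Placeholder embedding: hashed bag-of-words, L2-normalized.
    Swap in a real embedding model for production."""
    vec = [0.0] * DIM
    for tok in re.findall(r"\w+", text.lower()):
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(doc_dir):
    """Parse every .md file under doc_dir into (chunk, embedding) pairs."""
    index = []
    for path in Path(doc_dir).rglob("*.md"):
        for chunk in chunk_markdown(path.read_text(encoding="utf-8")):
            index.append((chunk, embed(chunk)))
    return index


def retrieve(index, query, k=4):
    """Return the top-k chunks by cosine similarity to the query."""
    q = embed(query)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[1])))
    return [chunk for chunk, _ in scored[:k]]
```

At query time, only the `retrieve(...)` output gets joined and injected into the system prompt in place of the full docs. A plain in-memory list is enough for a "simple local vector index" at documentation scale; a proper vector store only becomes worthwhile if the corpus grows large.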

Acceptance

  • INCLUDE_DOCS_CONTEXT is removed or rerouted to RAG
  • Token usage drops noticeably
  • Project-specific answers still work
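To make "drops noticeably" checkable, a rough back-of-envelope comparison works before wiring up exact tokenizer counts. The ~4-characters-per-token heuristic and the sample sizes below are assumptions for illustration only:

```python
def approx_tokens(text):
    # rough heuristic: ~4 characters per token for English prose
    return len(text) // 4

full_docs = "word " * 4000  # stand-in for the concatenated user_docs
retrieved = "word " * 200   # stand-in for 3-5 retrieved chunks

before = approx_tokens(full_docs)
after = approx_tokens(retrieved)
print(f"prompt context: {before} -> {after} tokens "
      f"({100 * (1 - after / before):.0f}% reduction)")
```

Replacing `approx_tokens` with the real tokenizer for whichever model the chatbot uses would turn this into a concrete pass/fail number for the acceptance check.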
