Project-level shared knowledge / mini-RAG for conversations #2193
tessaherself started this conversation in Ideas
Problem
ChatGPT and Claude both let you upload files to a project that become available as context in every conversation within that project. chat-ui currently has no equivalent — each conversation is isolated.
With the Projects feature in #2192, we now have named containers for conversations with shared custom instructions. The natural next question is: how should projects share knowledge/files?
How competitors do it
Proposed approach for chat-ui
#2192 includes an implementation of a two-tier hybrid system:
Tier 1 — Context stuffing (zero extra infra)
Upload small files (< 50k chars) to a project → store in GridFS → extract text at upload time → prepend full text to system prompt on each turn.
Pros: Works with any model, no extra infrastructure, immediate availability.
Cons: Token-expensive for larger knowledge bases.
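The Tier 1 flow above can be sketched roughly as follows. This is an illustrative sketch, not the actual #2192 implementation; `ProjectFile`, `buildSystemPrompt`, and `CONTEXT_STUFFING_LIMIT` are hypothetical names, and the GridFS storage / text-extraction steps are assumed to have already run at upload time.

```typescript
// Hypothetical sketch of Tier 1 context stuffing: prepend extracted file
// text to the system prompt on each turn. Names are illustrative only.

interface ProjectFile {
  name: string;
  text: string; // extracted at upload time
}

const CONTEXT_STUFFING_LIMIT = 50_000; // chars, per the proposal

function buildSystemPrompt(basePrompt: string, files: ProjectFile[]): string {
  const total = files.reduce((n, f) => n + f.text.length, 0);
  if (total > CONTEXT_STUFFING_LIMIT) {
    // Beyond this size the proposal switches to Tier 2 retrieval.
    throw new Error("Knowledge base too large for context stuffing");
  }
  const knowledge = files.map((f) => `## ${f.name}\n${f.text}`).join("\n\n");
  return `${basePrompt}\n\n# Project knowledge\n${knowledge}`;
}
```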
Tier 2 — Chunk + retrieve (needs TEI)
For larger knowledge bases: chunk the extracted text, embed the chunks via TEI (async pipeline), and retrieve the most similar chunks per query.
Pros: Scales to large knowledge bases, token-efficient.
Cons: Requires running TEI, async embedding pipeline.
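A minimal sketch of the Tier 2 retrieval side, under stated assumptions: character-based chunking on paragraph boundaries (as described in the "Chunk strategy" question below) and manual cosine similarity (as in the "Vector search" question). The TEI embedding call is omitted; `chunkText`, `cosineSimilarity`, and `topK` are illustrative names, not the actual chat-ui API.

```typescript
// Illustrative Tier 2 sketch: paragraph-aware character chunking plus
// manual cosine-similarity ranking over precomputed chunk embeddings.

function chunkText(text: string, maxLen = 1000): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const para of text.split(/\n\s*\n/)) {
    if (current && current.length + para.length + 2 > maxLen) {
      chunks.push(current); // current chunk is full; start a new one
      current = para;
    } else {
      current = current ? `${current}\n\n${para}` : para;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored chunks against a query embedding and keep the top k.
function topK(
  query: number[],
  chunks: { text: string; embedding: number[] }[],
  k = 5,
): { text: string; score: number }[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```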
Automatic tier selection
The system auto-selects based on total project knowledge size vs a configurable threshold. Falls back to Tier 1 if no TEI endpoint is configured.
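The selection logic described above amounts to something like the following sketch. `selectTier` and its parameters are hypothetical names; the actual threshold and config key in #2192 may differ.

```typescript
// Hypothetical tier selection: fall back to context stuffing when no TEI
// endpoint is configured, otherwise pick by total knowledge size.

type Tier = "context-stuffing" | "retrieval";

function selectTier(
  totalChars: number,
  threshold: number,
  teiEndpoint?: string,
): Tier {
  if (!teiEndpoint) return "context-stuffing"; // no embedding service available
  return totalChars > threshold ? "retrieval" : "context-stuffing";
}
```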
Questions for the community
TEI integration — Should this use HuggingFace's hosted inference infrastructure, or require self-hosted TEI? What embedding models should be default?
Storage limits — What's a reasonable limit per project? (Currently: 20 files, 10MB each)
Supported file types — Currently: PDF, TXT, MD, CSV, JSON, XML, HTML, YAML. Should we add DOCX, PPTX, or other formats?
Interaction with web search — chat-ui previously had web search / RAG infrastructure that was removed. How does project knowledge interact with (or replace) that functionality?
Chunk strategy — Simple character-based chunking with paragraph/sentence boundary awareness. Is this sufficient, or should we support semantic chunking?
Vector search — Currently using manual cosine similarity (works with MongoMemoryServer in dev). Should we use MongoDB Atlas Vector Search for production deployments?
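For concreteness, the current limits mentioned in the questions above (20 files, 10 MB each, the listed extensions) could be enforced with a validation helper along these lines. `validateUpload` and the constant names are illustrative, not the actual chat-ui code.

```typescript
// Sketch of upload validation for the limits discussed above.

const MAX_FILES = 20;
const MAX_FILE_BYTES = 10 * 1024 * 1024; // 10 MB
const ALLOWED_EXTENSIONS = new Set([
  "pdf", "txt", "md", "csv", "json", "xml", "html", "yaml",
]);

// Returns an error message, or null if the upload is acceptable.
function validateUpload(
  existingCount: number,
  name: string,
  sizeBytes: number,
): string | null {
  if (existingCount >= MAX_FILES) return "Project file limit reached";
  if (sizeBytes > MAX_FILE_BYTES) return "File exceeds 10 MB limit";
  const ext = name.split(".").pop()?.toLowerCase() ?? "";
  if (!ALLOWED_EXTENSIONS.has(ext)) return `Unsupported file type: .${ext}`;
  return null;
}
```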
Related
The `Project` data model is designed to be extensible, so adding more knowledge features is straightforward.
Would love to hear thoughts from maintainers and the community on the right approach here!