Vedana is a (multi)agentic AI chatbot system built with semantic RAG and a knowledge graph as its main tools.
It is a complete framework for building conversational AI systems. Key features include:
- Thread-based conversation management with persistent event storage
- Semantic RAG using Memgraph (knowledge graph) + pgvector (vector search)
- Business-defined data model managed through Grist spreadsheets
- Multiple interfaces: Telegram bot, Terminal UI, Web backoffice
- Incremental ETL built with Datapipe
Fill in the `.env` based on the `.env.example` here:

```shell
docker-compose -f apps/vedana/docker-compose.yml up --build -d
```

By the way, this repository is a uv workspace:

```shell
uv sync
```

| Script | Package | Description |
|---|---|---|
| `vedana-backoffice-with-caddy` | `vedana-backoffice` | Caddy reverse proxy + Reflex backend (production entry point for Docker) |
| `jims-telegram` | `jims-telegram` | Run the Telegram bot for JIMS |
| `jims-tui` | `jims-tui` | Run the terminal UI for JIMS |
| `jims-backoffice` | `jims-backoffice` | Minimal FastAPI backoffice for JIMS |
```
ai-assistants-oss/
├── libs/                  # Reusable libraries
│   ├── jims-core/         # Core JIMS framework - thread management for user sessions
│   ├── jims-backoffice/   # Minimal FastAPI backoffice for JIMS
│   ├── jims-telegram/     # Telegram bot adapter
│   ├── jims-tui/          # Terminal UI for testing
│   ├── vedana-core/       # Core Vedana framework
│   ├── vedana-backoffice/ # Reflex-based admin UI
│   └── vedana-etl/        # ETL pipeline (Datapipe-based)
├── apps/
│   ├── vedana/            # Main Vedana deployment
│   └── jims-demo/         # JIMS demo project
└── pyproject.toml         # uv workspace configuration
```
JIMS is a framework for building conversational AI systems with persistent thread management.
- Thread: A conversation between user(s) and the agentic system
- Event: Something that happens in a thread (messages, actions, state changes)
- Pipeline: An async function that processes a `ThreadContext` and produces events
- ThreadContext: Spawned by `ThreadController`; an object that stores and handles all thread-related data during a `Pipeline` execution
- ThreadController: Manages thread lifecycle, event storage, and `Pipeline` execution
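To make the concepts concrete, here is a minimal sketch of a pipeline in the spirit described above. All class and method names here are illustrative stand-ins, not the actual `jims-core` API.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_type: str
    event_data: dict


@dataclass
class ThreadContext:
    # Stores all thread-related data during a pipeline execution.
    thread_id: str
    events: list[Event] = field(default_factory=list)

    def add_event(self, event: Event) -> None:
        self.events.append(event)


async def echo_pipeline(ctx: ThreadContext) -> None:
    # A pipeline is just an async function over a ThreadContext:
    # read the latest user message and produce a response event.
    last = ctx.events[-1]
    reply = f"You said: {last.event_data['content']}"
    ctx.add_event(Event("comm.assistant_message", {"role": "assistant", "content": reply}))
```

In the real framework, a `ThreadController` would spawn the context, persist the produced events, and drive the pipeline; the sketch only shows the data flow.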
```json
{
  "event_id": "...",
  "event_type": "comm.user_message",
  "event_data": {
    "role": "user",
    "content": "Hello!"
  }
}
```

Vedana is an agentic AI system built on JIMS that provides Graph RAG capabilities.
- Graph-based knowledge retrieval using Cypher queries on Memgraph
- Semantic vector search using pgvector embeddings
- Dynamic data model filtering to optimize token usage
- Configurable prompts and query templates via Grist
- Conversation lifecycle management (custom /start responses, etc.)
- User sends a message
- (Optional) Data model filtering selects relevant anchors/links
- LLM generates Cypher and/or vector search queries using tools
- Results are retrieved from Graph + Vector stores
- LLM synthesizes final answer from retrieved context
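The flow above can be sketched end to end with stubbed stores and LLM calls. Every function name here is a hypothetical placeholder, not the actual `vedana-core` API:

```python
def filter_data_model(question: str) -> list[str]:
    # Stand-in for the data model filtering preprocessing step.
    return ["Product"]

def plan_queries(question: str, anchors: list[str]) -> list[str]:
    # Stand-in for the LLM emitting retrieval tool calls:
    # one Cypher query for the graph, one vector search.
    return [f"MATCH (n:{anchors[0]}) RETURN n LIMIT 5", f"vector:{question}"]

def run_query(query: str) -> str:
    # Stand-in for retrieval from Memgraph / pgvector.
    return f"results for {query!r}"

def synthesize(question: str, context: list[str]) -> str:
    # Stand-in for the main model producing the final answer.
    return f"Answer to {question!r} using {len(context)} context chunks"

def answer(question: str) -> str:
    anchors = filter_data_model(question)         # optional filtering step
    tool_calls = plan_queries(question, anchors)  # LLM generates queries
    context = [run_query(q) for q in tool_calls]  # retrieve from both stores
    return synthesize(question, context)          # LLM synthesizes the answer
```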
The ETL pipeline ingests data from Grist into the graph and vector databases.
- Extract: Load the data model and data. In the most basic form, data is loaded from Grist, but the pipeline can easily be extended to incorporate other sources
- Transform: Process data into nodes and edges, generate embeddings
- Load: Update knowledge graph and store pgvector embeddings
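As a toy illustration of the Transform step, the following turns tabular rows (as synced from Grist) into graph nodes, edges, and texts to embed. The field names are illustrative, not the actual `vedana-etl` schema:

```python
def transform(rows: list[dict]) -> tuple[list[dict], list[tuple], list[str]]:
    nodes, edges, to_embed = [], [], []
    for row in rows:
        # Each row becomes a node of its anchor type.
        nodes.append({"label": row["anchor"], "id": row["id"], "name": row["name"]})
        # Rows with a parent reference produce an edge.
        if row.get("parent_id"):
            edges.append((row["id"], "BELONGS_TO", row["parent_id"]))
        # Embeddable attributes feed the pgvector store in the Load step.
        to_embed.append(row["name"])
    return nodes, edges, to_embed
```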
- Python 3.12
- PostgreSQL with pgvector extension
- Memgraph
- Grist (for data model and data source)
- OpenAI API key (or compatible LLM provider)
Note on pgvector:
Migration `2dfad73e5cce_move_emb_to_pgvector` requires pgvector. Some cloud providers (Supabase, Neon, etc.) manage extensions on their own; that's why you can set `CREATE_PGVECTOR_EXTENSION=false` in the environment to avoid conflicts. If your configuration requires manually enabling pgvector, set `CREATE_PGVECTOR_EXTENSION=true`.
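For example, a `.env` fragment for the two cases might look like this (the variable name is from the note above; the comments describe the assumed setups):

```shell
# Managed Postgres (Supabase, Neon, ...) where the provider owns extensions:
CREATE_PGVECTOR_EXTENSION=false

# Self-hosted Postgres where the app should enable pgvector itself:
# CREATE_PGVECTOR_EXTENSION=true
```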
JIMS manages conversations as threads containing events (messages, actions, state changes). A pipeline, provided by Vedana in this case, processes user input and produces response events.
Vedana provides a RAG pipeline that:
- Receives user query
- LLM generates Cypher / vector search queries as tool calls
- Retrieves context from graph + vector stores
- LLM synthesizes the answer
The data model (node types, relationships, attributes) is defined in Grist spreadsheets and synced via ETL.
The data model is configured via tables in the Grist workspace:

| Table | Purpose |
|---|---|
| `Anchors` | Node types (entities) in the graph |
| `Anchor_attributes` | Properties of node types, including embeddable fields |
| `Links` | Relationship types between nodes |
| `Link_attributes` | Properties of relationships |
| `Queries` | Example query scenarios for the LLM |
| `Prompts` | Customizable prompt templates |
| `ConversationLifecycle` | Responses for lifecycle events (e.g., `/start`) |
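A toy illustration of how these tables map onto a graph schema. The dataclass and field names mirror the tables above but are not the actual `vedana-core` model:

```python
from dataclasses import dataclass


@dataclass
class Anchor:            # a row of Anchors -> a node type
    name: str


@dataclass
class AnchorAttribute:   # a row of Anchor_attributes -> a node property
    anchor: str
    name: str
    embeddable: bool = False  # embeddable fields also go to pgvector


@dataclass
class Link:              # a row of Links -> a relationship type
    name: str
    source: str
    target: str


# A tiny two-anchor schema, as the ETL might assemble it:
schema = {
    "anchors": [Anchor("Product"), Anchor("Category")],
    "attributes": [AnchorAttribute("Product", "description", embeddable=True)],
    "links": [Link("BELONGS_TO", "Product", "Category")],
}
```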
Models are handled via LiteLLM (with built-in OpenRouter support for easier usage and access management). They are configurable via environment variables in production and via the backoffice UI for testing:
| Variable | Purpose |
|---|---|
| `MODEL` | Main question-answering model |
| `FILTER_MODEL` | Data model filtering (a smaller, faster model for a preprocessing step) |
| `EMBEDDINGS_MODEL` | Text embeddings generation |
| `EMBEDDINGS_DIM` | Embedding dimensions |
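A `.env` fragment using the variables above might look like this (the model names and dimension are illustrative examples, not project defaults):

```shell
MODEL=gpt-4o
FILTER_MODEL=gpt-4o-mini
EMBEDDINGS_MODEL=text-embedding-3-small
EMBEDDINGS_DIM=1536
```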
| Variable | Purpose |
|---|---|
| `JIMS_DB_CONN_URI` | PostgreSQL connection URI |
| `JIMS_DB_USE_NULL_POOL` | Disable connection pooling |
| `JIMS_DB_POOL_SIZE` | Max connections kept in the pool |
| `JIMS_DB_POOL_MAX_OVERFLOW` | Max extra connections above `pool_size` |
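A sketch of how these variables might be parsed from the environment. The variable names come from the table above; the defaults here are illustrative, not the actual `jims-core` ones:

```python
def db_settings(env: dict[str, str]) -> dict:
    # Parse DB settings from an environment mapping (e.g. os.environ).
    return {
        "conn_uri": env.get("JIMS_DB_CONN_URI", "postgresql://localhost/jims"),
        "use_null_pool": env.get("JIMS_DB_USE_NULL_POOL", "false").lower() == "true",
        "pool_size": int(env.get("JIMS_DB_POOL_SIZE", "5")),
        "max_overflow": int(env.get("JIMS_DB_POOL_MAX_OVERFLOW", "10")),
    }
```

Setting `JIMS_DB_USE_NULL_POOL` corresponds to SQLAlchemy's `NullPool` behavior (no connections are kept open between requests), while `pool_size` and `max_overflow` map onto its standard pooling knobs.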
- OpenTelemetry tracing for pipeline execution
- Prometheus metrics for LLM usage, pipeline duration
- Sentry integration for error tracking (optional)
This repository uses automated workflow generation for libraries. Workflows are generated based on configuration in each library's pyproject.toml.
For details on configuring and using the CI/CD code generation tool, see uv-workspace-codegen.
TODO