
Commit cb4d1bd

PostgreSQL database integration for conversation persistence (#181)
* chore: update deps
* feat: add session id to input object
* feat: implement database models and CRUD operations
* feat: add conversations API with DB integration
* feat: add api docs
* chore: move api docs
* feat: delete unused chains api
* feat: add database configuration to .env.example
* chore: update deps
* feat: update dockerfile
* feat: Add postgresql and pgadmin services
* feat: add response models for conversations endpoints
* chore: update deps
* feat: include conversations router
* feat: initialize database module
* fix: Expose core components in package init
* feat: load environment variables and configure CUDA settings
* chore: update deps
* fix: update embeddings type
* refactor: move CUDA environment variable setup to main entry point
* fix: update healthcheck command to include database name
* refactor: replace session_id with conversation_id in db models
* refactor: update type hints
* refactor: formatting changes
* fix: handle incomplete message pairs
* fix: remove hardcoded PostgreSQL environment variables
* feat: update psycopg2 dependency to psycopg2-binary
* fix: update import from graphs to conversations
* chore: update google-cloud-storage dependency
* fix: use retriever graph
* fix: revert google-cloud-storage dependency
* fix: add return type hints
* fix: add return type hint to lifespan function
* docs: add docstrings for endpoints
* fix: untrack data folder
* docs: update README with PostgreSQL setup instructions
* remove api_docs
* remove unused response_models
* db models:
  - use uuid
  - standardise var names
* specific docker-compose commands
* improve history string robustness
* fix LLM response extraction (docker compose)
  - async bug `get_tools`
  - pass chat history in prompt
* add unit tests
* reduce docstring verbosity
* feat: save streamed conversation messages to the database
* fix streaming for more messages
* add streaming unit tests
* fix lint
* fix test

---------

Signed-off-by: Palaniappan R <[email protected]>
Signed-off-by: Jack Luar <[email protected]>
Co-authored-by: Jack Luar <[email protected]>
1 parent 0b2032b commit cb4d1bd

29 files changed: +4149 additions, -1889 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 .env
 *.ipynb
 __pycache__/
-backend/data/*
+backend/data/
 backend/src/*.json
 *.pyc
 *.egg-info/

Makefile

Lines changed: 6 additions & 2 deletions
@@ -23,11 +23,15 @@ check:
 
 .PHONY: docker-up
 docker-up:
-	@docker compose up --build --wait
+	@docker compose -f docker-compose.yml up --build --wait
 
 .PHONY: docker-down
 docker-down:
-	@docker compose down --remove-orphans
+	@docker compose -f docker-compose.yml down --remove-orphans
+
+.PHONY: docker-dev
+docker-dev:
+	@docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build --wait
 
 # --- Development Commands ---
 .PHONY: seed-credentials

README.md

Lines changed: 25 additions & 0 deletions
@@ -27,6 +27,31 @@ This setup involves the setting of both the frontend and backend components. We
 
 ### Backend Setup
 
+#### Database Schema
+
+The database automatically creates the following tables:
+- `conversations` - Stores conversation metadata (id, user_id, title, timestamps)
+- `messages` - Stores individual messages within conversations
+
+#### Setting Up PostgreSQL Database Variables
+
+The backend uses PostgreSQL for conversation persistence. Configure these database variables in your `.env` file:
+
+- `POSTGRES_USER` - Database username (default: `orassistant`)
+- `POSTGRES_PASSWORD` - Database password (default: `password`)
+- `POSTGRES_DB` - Database name (default: `orassistant_db`)
+- `POSTGRES_HOST` - Database host (default: `postgres` for Docker, `localhost` for local)
+- `POSTGRES_PORT` - Database port (default: `5432`)
+
+**For local development without Docker:**
+1. Install PostgreSQL on your system
+2. Create a database: `createdb orassistant_db`
+3. Set `POSTGRES_HOST=localhost` in your `.env` file
+
+**For Docker deployment:**
+- The database is automatically configured via docker-compose
+- Data persists in a Docker volume named `postgres_data`
+
 #### Option 1 - Docker
 
 Ensure you have `docker` and `docker-compose` installed in your system.
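The `POSTGRES_*` variables documented in the README can be assembled into a SQLAlchemy-style connection URL for the asyncpg driver (both `sqlalchemy` and `asyncpg` are in the backend's dependencies). A minimal sketch with the defaults from the README; the helper name `database_url` is illustrative and not part of the backend's actual API:

```python
import os


def database_url() -> str:
    """Build an asyncpg connection URL from the POSTGRES_* variables.

    Defaults mirror backend/.env.example; override via the environment.
    Note the host default: "postgres" inside Docker, "localhost" otherwise.
    """
    user = os.getenv("POSTGRES_USER", "orassistant")
    password = os.getenv("POSTGRES_PASSWORD", "password")
    host = os.getenv("POSTGRES_HOST", "localhost")
    port = os.getenv("POSTGRES_PORT", "5432")
    db = os.getenv("POSTGRES_DB", "orassistant_db")
    return f"postgresql+asyncpg://{user}:{password}@{host}:{port}/{db}"


print(database_url())
```

Such a URL would be passed to SQLAlchemy's `create_async_engine`; the exact wiring in the backend's database module may differ.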

backend/.env.example

Lines changed: 8 additions & 1 deletion
@@ -22,7 +22,7 @@ GOOGLE_GEMINI=2.0_flash
 
 LLM_TEMP=1
 
-EMBEDDINGS_TYPE=GOOGLE_VERTEXAI
+EMBEDDINGS_TYPE=GOOGLE_GENAI
 GOOGLE_EMBEDDINGS=text-embedding-004
 HF_EMBEDDINGS=thenlper/gte-large
 HF_RERANKER=BAAI/bge-reranker-base
@@ -70,3 +70,10 @@ HEALTHCHECK_INTERVAL=30s
 HEALTHCHECK_TIMEOUT=10s
 HEALTHCHECK_RETRIES=5
 HEALTHCHECK_START_PERIOD=1200s
+
+# PostgreSQL Database Configuration
+POSTGRES_USER=orassistant
+POSTGRES_PASSWORD=password
+POSTGRES_DB=orassistant_db
+POSTGRES_HOST=postgres
+POSTGRES_PORT=5432

backend/Dockerfile

Lines changed: 12 additions & 1 deletion
@@ -2,7 +2,18 @@ FROM ghcr.io/astral-sh/uv:python3.13-bookworm-slim
 
 WORKDIR /ORAssistant-backend
 
-RUN apt-get update && apt-get install -y pandoc git wget curl git-lfs && git lfs install
+RUN apt-get update && apt-get install -y \
+    build-essential \
+    curl \
+    gcc \
+    git \
+    git-lfs \
+    libpq-dev \
+    pandoc \
+    postgresql-client \
+    wget && \
+    git lfs install && \
+    rm -rf /var/lib/apt/lists/*
 
 RUN pip install uv
 
backend/Dockerfile_slim

Lines changed: 11 additions & 3 deletions
@@ -3,10 +3,18 @@ FROM ghcr.io/astral-sh/uv:python3.13-bookworm-slim
 WORKDIR /ORAssistant-backend
 
 RUN apt-get update && apt-get install -y \
-    pandoc git wget curl \
-    git-lfs
+    build-essential \
+    curl \
+    gcc \
+    git \
+    git-lfs \
+    libpq-dev \
+    pandoc \
+    postgresql-client \
+    wget && \
+    git lfs install && \
+    rm -rf /var/lib/apt/lists/*
 
-RUN git lfs install
 RUN pip install uv
 
 COPY ./pyproject.toml /ORAssistant-backend/pyproject.toml

backend/chatbot.py

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 import os
 import logging
-from src.api.routers import graphs
+from src.api.routers import conversations
 
 
 def get_history_str(chat_history: list[dict[str, str]]) -> str:
@@ -13,7 +13,7 @@ def get_history_str(chat_history: list[dict[str, str]]) -> str:
 chat_history: list[dict[str, str]] = []
 
 if __name__ == "__main__":
-    rg = graphs.rg
+    rg = conversations.rg
     os.system("clear")
 
     while True:

backend/main.py

Lines changed: 5 additions & 2 deletions
@@ -2,10 +2,13 @@
 import uvicorn
 from dotenv import load_dotenv
 
-from src.api.main import app
-
 load_dotenv()
 
+if os.getenv("USE_CUDA", "false").lower() != "true":
+    os.environ["CUDA_VISIBLE_DEVICES"] = ""
+
+from src.api.main import app  # noqa: E402
+
 
 def main() -> None:
     uvicorn.run(
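The reordering in `main.py` matters because CUDA-aware libraries typically read `CUDA_VISIBLE_DEVICES` once at import time, so the variable must be set before `src.api.main` (and everything it transitively imports) is loaded; the `# noqa: E402` suppresses the lint warning for the now-late import. A standalone sketch of the gating pattern:

```python
import os

# Hide all GPUs unless the user explicitly opts in with USE_CUDA=true.
# This must execute before importing any CUDA-aware library (e.g. torch),
# which is why main.py moves the app import below this block.
if os.getenv("USE_CUDA", "false").lower() != "true":
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

print(repr(os.environ.get("CUDA_VISIBLE_DEVICES")))
```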

backend/pyproject.toml

Lines changed: 13 additions & 1 deletion
@@ -5,9 +5,12 @@ description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.13"
 dependencies = [
+    "asyncpg>=0.30.0",
     "faiss-cpu==1.12.0",
     "fastapi==0.116.1",
     "fastmcp>=2.12.2",
+    "google-cloud-storage>=2.19.0",
+    "httpx>=0.28.1",
     "huggingface-hub[cli]==0.34.4",
     "langchain==0.3.27",
     "langchain-community==0.3.27",
@@ -19,17 +22,22 @@ dependencies = [
     "langgraph==0.6.6",
     "markdown==3.8.2",
     "myst-parser==4.0.1",
+    "nest-asyncio>=1.6.0",
     "nltk==3.9.1",
     "openai==1.100.2",
-    "pypdf==6.1.3",
+    "psycopg2-binary>=2.9.11",
+    "pydantic>=2.11.7",
+    "pypdf==6.0.0",
     "rank-bm25==0.2.2",
+    "rich>=13.7.0",
     "sentence-transformers>=5.1.0",
     "sphinx==8.1.3",
     "sphinx-autobuild==2024.10.3",
     "sphinx-book-theme==1.1.4",
     "sphinx-copybutton==0.5.2",
     "sphinx-external-toc==1.0.1",
     "sphinxcontrib-mermaid==1.0.0",
+    "sqlalchemy>=2.0.43",
     "unstructured==0.18.13",
 ]
 
@@ -58,6 +66,10 @@ markers = [
 ]
 asyncio_mode = "auto"
 
+[[tool.mypy.overrides]]
+module = "nest_asyncio"
+ignore_missing_imports = true
+
 [tool.mypy]
 exclude = [
     "^tests/",

backend/src/agents/retriever_graph.py

Lines changed: 15 additions & 6 deletions
@@ -106,12 +106,21 @@ def classify(self, state: AgentState) -> dict[str, list[str]]:
         return {"agent_type": [response.content]}  # type: ignore
 
     def fork_route(self, state: AgentState) -> str:
-        # TODO: if more than one agent add handler
-        if not self.enable_mcp:
-            tmp = "rag_agent"
-        else:
-            tmp = "mcp_agent"
-        return tmp
+        """Route to the appropriate agent based on classification"""
+        agent_type = state.get("agent_type", [])
+
+        if not agent_type:
+            # Default to RAG agent if no classification
+            return "rag_agent"
+
+        # Return the classified agent type
+        classified = agent_type[0]
+
+        # If MCP is disabled but classifier chose MCP, fall back to RAG
+        if classified == "mcp_agent" and not self.enable_mcp:
+            return "rag_agent"
+
+        return classified
 
     def initialize(self) -> None:
         self.workflow = StateGraph(AgentState)
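The new `fork_route` replaces a routing decision that ignored the classifier entirely with one that honors the classified agent type and only falls back to RAG when classification is missing or MCP is disabled. The rules can be exercised in isolation; a minimal sketch that approximates `AgentState` with a plain dict and takes `enable_mcp` as a parameter instead of reading it from the instance:

```python
def fork_route(state: dict, enable_mcp: bool) -> str:
    """Illustrative mirror of the routing rules in retriever_graph.py."""
    agent_type = state.get("agent_type", [])
    if not agent_type:
        return "rag_agent"  # no classification: default to RAG
    classified = agent_type[0]
    if classified == "mcp_agent" and not enable_mcp:
        return "rag_agent"  # MCP disabled: fall back to RAG
    return classified


# The three interesting cases:
print(fork_route({}, enable_mcp=True))                              # rag_agent
print(fork_route({"agent_type": ["mcp_agent"]}, enable_mcp=False))  # rag_agent
print(fork_route({"agent_type": ["mcp_agent"]}, enable_mcp=True))   # mcp_agent
```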
