Cortex

This is the backend for the Chatterbox project. It uses FastAPI as the web framework and SQLAlchemy as the ORM.

Development

Set up environment variables

We use direnv to manage environment variables. Install it before continuing.

cp .envrc.sample .envrc
direnv allow .

Fill in the environment variables in the .envrc file, then run direnv allow . again so that direnv reloads the edited file.
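
Once direnv has loaded .envrc, the variables are visible to any process started from that shell. As a quick sanity check, a small script along these lines can report which variables are still unset. This is only a sketch: COGNITO_CLIENT_ID is the one variable named in this README, and the REQUIRED_VARS list and missing_vars helper are illustrative assumptions, so adjust them to match your .envrc.

```python
import os

# Assumption: COGNITO_CLIENT_ID is the only variable named in this README;
# extend this list with whatever your .envrc actually defines.
REQUIRED_VARS = ["COGNITO_CLIENT_ID"]


def missing_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```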

Installation

Prerequisites

Install uv for dependency management
Install Python 3.12 if you don't have it
- uv python install 3.12.9
Create and activate the virtual environment
- uv venv to create the venv
- source .venv/bin/activate to use the venv
Run uv sync within the virtual environment to install the dependencies from the uv.lock file

Running the development server

Always work inside the virtual environment when developing. Your terminal prompt should show (cortex) $ when it is active.
Activate the virtual environment if you are not already in it
- source .venv/bin/activate
Load the environment variables using direnv allow .

Starting the database

docker compose up -d

Start the web server

fastapi dev app/main.py

Running the temporal server

Open another terminal and run the following command to start the temporal server

temporal server start-dev

Running the temporal worker

chmod +x app/temporal/run_worker.sh
app/temporal/run_worker.sh

Getting a Cognito token

Use the following command to get a Cognito access token. It simulates a user login so that you can call authenticated endpoints.

aws cognito-idp initiate-auth \
  --auth-flow USER_PASSWORD_AUTH \
  --client-id "${COGNITO_CLIENT_ID}" \
  --auth-parameters "USERNAME=${username},PASSWORD=${password}" \
  --query 'AuthenticationResult.AccessToken' \
  --output text
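
The command prints the access token to stdout. A minimal sketch of using it from Python follows. It assumes the API expects the token as a Bearer token in the Authorization header and that the server is running at the fastapi dev default address. The /users/me path and the helper name are hypothetical placeholders, so substitute a real authenticated route.

```python
import urllib.request

# Assumptions: BASE_URL is the `fastapi dev` default address, and
# EXAMPLE_PATH is a hypothetical stand-in for any authenticated endpoint.
BASE_URL = "http://127.0.0.1:8000"
EXAMPLE_PATH = "/users/me"


def build_authenticated_request(token, path=EXAMPLE_PATH, base_url=BASE_URL):
    """Build a GET request that carries the access token as a Bearer token."""
    request = urllib.request.Request(base_url + path, method="GET")
    # strip() removes the trailing newline left by shell command substitution
    request.add_header("Authorization", f"Bearer {token.strip()}")
    return request


def call_api(token):
    """Send the request and return (status, body); needs the dev server running."""
    with urllib.request.urlopen(build_authenticated_request(token)) as response:
        return response.status, response.read().decode()
```

With the development server running, call_api(token) returns the status code and response body of the chosen endpoint.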

Running the test cases

Activate the virtual environment if you are not already in it
- source .venv/bin/activate
Run uv pip install -e ".[dev]"
Run pytest
The tests that currently pass are test get user profile and test create chatbot (valid and invalid)

Scenario walkthrough

Successful sync

Success flow video

1. The user uploads a file.
2. The file is uploaded to S3 and a success response is returned to the client.
3. A Temporal workflow starts that:
- Generates a presigned S3 URL to pass to the Mistral OCR API, which parses the PDF
- Converts the parsed PDF text into chunks
- Generates embeddings from each chunk
- Stores the embeddings in the vector store
- Updates the sync status of the document to Synced

If an exception occurs (e.g. a Mistral API rate limit), the Temporal server carries out the retry policy.

Failed retry video
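
The chunk, embed, and store steps and the retry behaviour can be sketched in plain Python. This is an illustration only and uses neither the Temporal SDK nor LlamaIndex: chunk_text, fake_embed, run_with_retry, and sync_document are hypothetical names, the list standing in for the vector store is an assumption, and the retry numbers are arbitrary. In the real project, retries are configured as a Temporal retry policy rather than written by hand.

```python
import time


def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def run_with_retry(activity, max_attempts=3, initial_delay=1.0,
                   backoff=2.0, sleep=time.sleep):
    """Call activity(); on failure, wait and retry with exponential backoff."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return activity()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= backoff


def fake_embed(chunk):
    """Placeholder embedding; a real pipeline would call an embedding model."""
    return [float(len(chunk))]


def sync_document(parsed_text, vector_store):
    """Chunk, embed, and store the text, then return the final sync status."""
    for chunk in chunk_text(parsed_text):
        embedding = run_with_retry(lambda: fake_embed(chunk))
        vector_store.append((chunk, embedding))
    return "Synced"
```

Passing sleep as a parameter keeps the retry helper easy to exercise without real waiting.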

Chat walkthrough

1. The user submits a question.
2. The question is passed to the agent workflow.
3. The agent workflow has its tools (search_info_from_documents), and the agent (a function-calling LLM):
- Decomposes complex queries into multiple single queries
- Routes each query to the right tool to answer the question
- Invokes the tools with the question, based on the function arguments
- Repeats this (invoke the tool with arguments, then decide which tool to invoke next based on the answer and the question) until the LLM judges that the tool responses can answer the question
4. The answer and tool call responses are stored in the chat store for the conversation.
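
The loop described above can be sketched as follows. This is a toy model, not the project's agent code. scripted_llm is a hypothetical stand-in for the function-calling LLM that "decomposes" a question by splitting on " and ", the search_info_from_documents body is a stub that returns canned text, and the dictionary shapes are assumptions made for illustration.

```python
def search_info_from_documents(query):
    """Stub for the document search tool; returns a canned snippet."""
    return f"snippet about {query}"


TOOLS = {"search_info_from_documents": search_info_from_documents}


def scripted_llm(question, tool_results):
    """Stand-in for a function-calling LLM.

    Splits the question on ' and ' to mimic query decomposition, requests one
    tool call per sub-query, then answers once every sub-query has a result.
    """
    sub_queries = [part.strip() for part in question.split(" and ")]
    if len(tool_results) < len(sub_queries):
        next_query = sub_queries[len(tool_results)]
        return {"tool": "search_info_from_documents",
                "arguments": {"query": next_query}}
    return {"answer": " | ".join(result for _, result in tool_results)}


def run_agent(question, llm=scripted_llm, tools=TOOLS, max_steps=10):
    """Loop: ask the LLM, run the requested tool, feed the result back."""
    tool_results = []
    for _ in range(max_steps):
        decision = llm(question, tool_results)
        if "answer" in decision:
            return decision["answer"], tool_results
        tool = tools[decision["tool"]]
        result = tool(**decision["arguments"])
        tool_results.append((decision["arguments"], result))
    raise RuntimeError("agent did not finish within max_steps")
```

The max_steps cap is a defensive choice in this sketch so that a model that never produces an answer cannot loop forever.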

Rate limiting

The chat API endpoint is rate limited to 2 requests per minute, handled by SlowAPI.
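
To illustrate what a limit of 2 requests per minute means in practice, here is a small fixed-window counter in plain Python. It is not SlowAPI's implementation and is not how the project configures the limit. The class name, the string key (for example a client address or user ID), and the injectable clock are assumptions made for this sketch.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(self, limit=2, window_seconds=60, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}  # key -> (window_start, request_count)

    def allow(self, key):
        """Return True if the request is within the limit, else False."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window has expired
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True
```

With the defaults, a third call to allow for the same key within 60 seconds returns False, which corresponds to the request being rejected by the API.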

Technologies

Frontend

TypeScript
Next.js

Backend

Python
FastAPI (Web server)
SQLAlchemy (ORM)
LlamaIndex (RAG framework)
Temporal (workflow orchestration): handles syncing of uploaded documents into the vector store and makes it easy to configure retry policies for activity failures, e.g. a Mistral OCR API rate limit or an unavailable vector store

Database

Postgres + pgvector (for storing vector embeddings)

Infrastructure

AWS Cognito
AWS S3
Terraform (for provisioning resources)

Third party

OpenAI (for the ReAct agent that powers the chat; it performs the query decomposition and invokes the tool that searches the vector store)
Mistral OCR API
