
Allycat RAG Remote

This setup runs Allycat RAG with models hosted on cloud services.

Prerequisites

  • API keys for the service we will be using. For example, to use Nebius AI, we need NEBIUS_API_KEY.

Tech Stack

| Component       | Functionality | Runtime                            |
|-----------------|---------------|------------------------------------|
| Milvus embedded | Vector DB     | Locally or remotely                |
| Models          | LLM runtime   | Remotely (Nebius, Replicate, etc.) |

Step-1: Get the code

# Substitute appropriate repo URL
git clone https://github.com/The-AI-Alliance/allycat/
cd allycat/rag-remote

Step-2: Setup

Follow the Python dev env setup guide.

Then activate your Python environment as below.

## if using uv
uv sync

## if using python venv
source .venv/bin/activate
pip install -r requirements.txt

## If using conda
conda activate allycat-1  # or whatever your env is named
pip install -r requirements.txt

Step-3: Setup .env file

A sample env.sample.txt is provided. Copy this file to .env:

cp env.sample.txt .env

Then edit the .env file to make your changes.

1) To use Nebius AI

  • Get NEBIUS_API_KEY from Nebius
  • Add the NEBIUS_API_KEY to the .env file:
NEBIUS_API_KEY = "your key goes here" 

The default models used will be:

  • LLM: nebius/Qwen/Qwen3-30B-A3B-Instruct-2507
  • Embedding: nebius/Qwen/Qwen3-Embedding-8B

Optionally, you can configure which models to use in the .env file.

Find the available models at Nebius Token Factory.

EMBEDDING_MODEL = Qwen/Qwen3-Embedding-8B
EMBEDDING_LENGTH = 384

LLM_MODEL = nebius/Qwen/Qwen3-30B-A3B-Instruct-2507
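The scripts fall back to the defaults above when these variables are unset. A minimal sketch of that lookup pattern (the helper `read_model_config` is hypothetical, for illustration only; the actual variable handling lives in the Allycat code):

```python
import os

# Hypothetical helper showing how settings like these are typically read:
# use the .env-provided value if present, otherwise fall back to a default.
def read_model_config(env: dict) -> dict:
    return {
        "llm_model": env.get("LLM_MODEL", "nebius/Qwen/Qwen3-30B-A3B-Instruct-2507"),
        "embedding_model": env.get("EMBEDDING_MODEL", "Qwen/Qwen3-Embedding-8B"),
        "embedding_length": int(env.get("EMBEDDING_LENGTH", "384")),
    }

config = read_model_config(os.environ)
print(config)
```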

Allycat Workflow

Step-4: Crawl the website

This step crawls a site and downloads the website content into the workspace/crawled directory.

code: 1_crawl_site.py

# default settings
## if using uv
uv run python 1_crawl_site.py --url https://thealliance.ai
# or
python 1_crawl_site.py --url https://thealliance.ai


# or specify parameters
uv run python 1_crawl_site.py --url https://thealliance.ai --max-downloads 100 --depth 5
# or
python 1_crawl_site.py --url https://thealliance.ai --max-downloads 100 --depth 5
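The core idea of this step is to fetch pages, extract their links, and queue them up to the --depth and --max-downloads limits. A minimal, illustrative sketch of the link-extraction part (this is not the actual 1_crawl_site.py implementation), using only the standard library:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Sketch only: collect anchor hrefs from a page, resolved against the
# page's base URL, so a crawler could queue them for download.
class LinkExtractor(HTMLParser):
    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

page = '<a href="/blog">Blog</a> <a href="https://example.com/about">About</a>'
extractor = LinkExtractor("https://thealliance.ai")
extractor.feed(page)
print(extractor.links)
# → ['https://thealliance.ai/blog', 'https://example.com/about']
```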

Step-5: Process Downloaded files

We will process the downloaded files (HTML / PDF) and extract the text as markdown. The output will be saved in the workspace/processed directory in markdown format.

We use Docling to process downloaded files. It will convert the files into markdown format for easy digestion.

uv run python 2_process_files.py
# or
python 2_process_files.py

Step-6: Save data into Milvus Vector DB

In this step we:

  • create chunks from cleaned documents
  • create embeddings (embedding models may be downloaded at runtime)
  • save the chunks + embeddings into a vector database

We currently use Milvus as the vector database. We use the embedded version, so there is no setup required!

uv run python 3_save_to_vector_db.py
# or
python 3_save_to_vector_db.py
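The chunking step above can be sketched in a few lines. This toy version splits text into overlapping character windows; the sizes are illustrative, and 3_save_to_vector_db.py may chunk differently (e.g. by tokens or markdown sections):

```python
# Sketch only: overlapping fixed-size character chunks. The overlap
# keeps context that would otherwise be cut at a chunk boundary.
def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5) -> list[str]:
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "AllyCat answers questions about a website using RAG."
for c in chunk_text(doc):
    print(repr(c))
```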

Step-7: Query documents

uv run python 4_query.py
# or
python 4_query.py
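Under the hood, a RAG query step like this one embeds the question, ranks the stored chunk vectors by similarity, and hands the top matches to the LLM as context. A toy sketch of that ranking (the 2-dimensional vectors are made up; real embeddings come from the configured embedding model, and 4_query.py may differ in detail):

```python
import math

# Sketch only: cosine similarity between a query vector and stored
# chunk vectors, picking the best-matching chunk as retrieval context.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

chunks = {
    "AllyCat crawls websites.": [1.0, 0.1],
    "Milvus stores embeddings.": [0.1, 1.0],
}
query_vec = [0.9, 0.2]  # pretend embedding of the user's question
best = max(chunks, key=lambda c: cosine(query_vec, chunks[c]))
print(best)
# → AllyCat crawls websites.
```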

Step-8: Run Web UI

Option 1: Flask UI

python app_flask.py

Go to http://localhost:8080 and start chatting!

Option 2: Chainlit UI

uv run chainlit run app_chainlit.py --port 8090
# or
chainlit run app_chainlit.py --port 8090

Go to http://localhost:8090 and start chatting!


Step-9: Let's turn Allycat RAG into an MCP server!

MCP server code
MCP client code

See mcp.md for more.


Step-10: Packaging the app to deploy

We will create a Docker image of the app. It will package up the code + data.

Note: Be sure to run the docker command from the root of the project.

docker build -t allycat-remote .

Step-11: Run the AllyCat Docker

Let's start the docker in 'dev' mode

docker run -it --rm -p 8090:8090 -p 8080:8080 allycat-remote deploy

The deploy option starts the web UI.


Dev Notes

Creating requirements.txt using uv

When uv dependencies are updated, run this command to regenerate requirements.txt:

uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt