Project setup guide for ceo-chatbot, an LLM chatbot for CEO
ceo-chatbot/
├── .venv/ # uv python virtual environment directory (not tracked; created on first `uv sync` or `uv run`)
├── conf/ # Configuration files for script arguments (YAML, JSON)
│ ├── base/ # Global/shared configuration files tracked in repo
│ │ ├── prompts.yml # Prompt configuration
│ │ └── rag_config.yml # RAG configuration: models, vectorstore, prompts...
│ └── local/ # Personal/local configs (excluded from version control)
│
├── data/ # Project data directory (not tracked in repo)
│
├── demo/ # Try ceo-chatbot in a Streamlit app
│
├── notebooks/ # Jupyter notebooks for development, prototyping, demos
│
├── src/ # Source code (modules and utilities imported by scripts) and scripts
│ └── ceo_chatbot/
│ ├── __init__.py
│ ├── config.py # Utilities for loading configs from conf/base/
│ │
│ ├── ingest/ # Data ingestion pipeline: loaders.py, chunking.py, embeddings.py -> index_builder.py
│ │ ├── __init__.py
│ │ ├── chunking.py # Utilities for splitting documents into chunks
│ │ ├── embeddings.py # Utilities for defining embedding model
│ │ ├── index_builder.py # Define data ingestion pipeline
│ │ └── loaders.py # Utilities for loading dataset
│ │
│ └── rag/ # RAG pipeline: llm.py, retriever.py -> pipeline.py
│ ├── __init__.py
│ ├── llm.py # Utilities for defining reader model
│ ├── pipeline.py # Define RAG pipeline
│ └── retriever.py # Utilities for doc retrieval and similarity search
│
├── scripts/
│ ├── build_index.py # Data ingestion pipeline runner (offline); chunk, embed, build vector store
│ └── ...
│
├── tests/ # Unit and integration tests
│
├── .gitignore
├── .python-version # Python version used by uv environment manager
├── pyproject.toml # Metadata about the project (for uv)
├── uv.lock # Locked dependency versions
└── README.md
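The files in conf/base/ drive the scripts. As an illustration of how a script might consume them, here is a minimal loader sketch; the function name and shape are assumptions for illustration, and the real logic lives in src/ceo_chatbot/config.py and may differ.

```python
# Minimal sketch of loading a YAML config from conf/base/ (hypothetical;
# see src/ceo_chatbot/config.py for the project's actual loader).
from pathlib import Path

import yaml  # PyYAML


def load_config(name: str) -> dict:
    """Read conf/base/<name>.yml into a plain dict."""
    path = Path("conf/base") / f"{name}.yml"
    with path.open() as f:
        return yaml.safe_load(f)


rag_config = load_config("rag_config")  # models, vectorstore, prompts...
prompts = load_config("prompts")        # prompt templates
```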
SSH:
git clone git@github.com:sig-gis/ceo-chatbot.git
HTTPS:
git clone https://github.com/sig-gis/ceo-chatbot.git
This app uses uv for dependency management. Read more about uv in the docs.
macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
See the uv installation docs for instructions on Windows.
Also see CONTRIBUTING.md for more detail on developing with uv in this project.
Install the ceo-chatbot package locally in editable mode.
From the project root:
uv pip install -e .
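To confirm the editable install resolves to the local source tree, an optional quick check (run inside the uv environment, e.g. via `uv run python`):

```python
# Optional sanity check for the editable install.
import ceo_chatbot

print(ceo_chatbot.__file__)  # should resolve into src/ceo_chatbot/
```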
For local development, the knowledge corpus must be set up once.
If using a gated model such as google/embeddinggemma-300m:
- Request access to the model on its Hugging Face model page (access is granted instantly)
- Generate an access token: Profile > Settings > Access Tokens > + Create new token > Token type Read > Create token
- Run the following `hf` CLI command in your terminal and paste your HF access token when prompted:
uv run hf auth login
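Once logged in, downloads of the gated model should succeed. A quick hedged check, assuming the project loads the model through sentence-transformers (the repo's embeddings.py may do this differently):

```python
# Sanity check: load the gated embedding model after `hf auth login`.
# Assumes sentence-transformers is installed; embeddings.py may differ.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("google/embeddinggemma-300m")  # gated: requires HF access
print(model.encode(["hello world"]).shape)  # e.g. (1, 768)
```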
Authenticate with gcloud for access to the GCS bucket specified in conf/base/extract_docs_config.yml:
gcloud auth login
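If the Python client libraries complain about credentials, application-default credentials (`gcloud auth application-default login`) may also be needed. A quick hedged check that the bucket is reachable; "ceo-docs-bucket" below is a placeholder for the bucket named in conf/base/extract_docs_config.yml:

```python
# Verify authenticated access to the GCS corpus bucket.
# "ceo-docs-bucket" is a placeholder; use the bucket from
# conf/base/extract_docs_config.yml.
from google.cloud import storage

client = storage.Client()
for blob in client.list_blobs("ceo-docs-bucket", max_results=5):
    print(blob.name)
```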
To initially set up the CEO docs corpus, or to manually re-upload updates to the GCS bucket specified in conf/base/extract_docs_config.yml, run the CEO docs extraction pipeline:
NOTE: this step is not required once the corpus exists in GCS; in that case, skip to Build the vector DB below.
uv run scripts/extract_docs.py
Build the vector DB:
uv run scripts/build_index.py
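Conceptually, build_index.py runs the ingest pipeline: load documents, chunk them, embed the chunks, and persist a vector store. The sketch below shows that shape only; the chunking strategy, model, and vector store library are assumptions, not the repo's actual implementation:

```python
# Illustrative shape of the ingestion pipeline (loaders -> chunking ->
# embeddings -> index). Library choices here are assumptions.
import chromadb
from sentence_transformers import SentenceTransformer


def build_index(docs: list[str], persist_dir: str = "data/index") -> None:
    # 1. Chunk: naive fixed-size splitting stands in for chunking.py.
    chunks = [d[i:i + 1000] for d in docs for i in range(0, len(d), 1000)]
    # 2. Embed: stands in for embeddings.py.
    model = SentenceTransformer("google/embeddinggemma-300m")
    vectors = model.encode(chunks)
    # 3. Index: persist chunks + vectors in a local vector store.
    client = chromadb.PersistentClient(path=persist_dir)
    collection = client.get_or_create_collection("ceo_docs")
    collection.add(
        ids=[str(i) for i in range(len(chunks))],
        documents=chunks,
        embeddings=vectors.tolist(),
    )
```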
Stay tuned! A planned future version will automate uploading the corpus to GCS and building the vector DB.
Launch a basic Streamlit application to demo ceo-chatbot in a chat UI.
uv run streamlit run demo/chat_app.py
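For reference, the core of a Streamlit chat UI looks roughly like this. This is a minimal sketch, not the repo's demo/chat_app.py, and `answer()` is a hypothetical stand-in for the RAG pipeline in src/ceo_chatbot/rag/pipeline.py:

```python
# Minimal Streamlit chat loop (illustrative; demo/chat_app.py may differ).
import streamlit as st


def answer(question: str) -> str:
    # Hypothetical stand-in for the RAG pipeline (rag/pipeline.py).
    return f"(stub) You asked: {question}"


st.title("ceo-chatbot demo")
if "history" not in st.session_state:
    st.session_state.history = []

# Replay prior turns so the conversation persists across reruns.
for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if question := st.chat_input("Ask about CEO..."):
    st.chat_message("user").write(question)
    reply = answer(question)
    st.chat_message("assistant").write(reply)
    st.session_state.history += [("user", question), ("assistant", reply)]
```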