A Retrieval-Augmented Generation (RAG) chatbot connected to the RSQKit. Built with Streamlit for the web interface, ChromaDB for vector storage, and Ollama for local model serving, with support for multiple AI providers via OpenAI-compatible APIs.
- **Retrieval-Augmented Generation (RAG) Chatbot**: leverages document retrieval and generative AI to provide accurate, context-aware answers.
- **Multi-Provider AI Support**: compatible with Ollama and any OpenAI-style API for both language models and embedding services. Set default models and API credentials without modifying the code via the `provider_config.yaml` file.
- **Hybrid Search Engine**: integrates BM25 keyword search with vector-based similarity to ensure precise and relevant results.
- **Streamlit-Powered Interface**: interactive web app featuring chat history, source traceability, and real-time sidebar controls.
- **Modular Architecture**: clean separation of ingestion, configuration, and application logic for easy extension and maintenance.
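To illustrate the hybrid search idea, here is a minimal sketch (not the project's actual retrieval code; the document IDs and input rankings are hypothetical) that fuses a BM25-style keyword ranking with a vector-similarity ranking using reciprocal rank fusion:

```python
# Minimal illustration of hybrid ranking via Reciprocal Rank Fusion (RRF).
# This is a sketch, not the project's implementation: the doc IDs and the
# two input rankings below are made up for demonstration.

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant commonly used in the RRF literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: one list from BM25 keyword search, one from
# vector similarity search. "doc2" ranks well in both lists, so it wins.
bm25_hits = ["doc1", "doc2", "doc3"]
vector_hits = ["doc2", "doc4", "doc1"]
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# → ['doc2', 'doc1', 'doc4', 'doc3']
```

Documents that appear high in both rankings are rewarded, which is what lets keyword precision and semantic recall complement each other.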
Before you begin, please make sure you’ve installed all of the following. Ollama is only needed if you plan to run models locally. If that’s the case and you haven’t installed it yet, head over to the official docs and follow the platform-specific instructions: https://ollama.com/
- Python 3.10 or higher
- Ollama. After installation, confirm it's working by running: `ollama --help`
- API key from an AI provider that supports the OpenAI API protocol (required for using remote models)
- Embedding model for semantic search (e.g., `bge-m3` with Ollama). Install it locally with: `ollama pull bge-m3`
- LLM for text generation (e.g., `deepseek-r1:14b` with Ollama). Install it locally with: `ollama pull deepseek-r1:14b`
- Reranker (optional): improves ranking accuracy of search results.
To integrate an AI provider, update the `provider_config.yaml` file with the provider's details. Below is an example configuration format:

```yaml
new_provider:
  name: "Example Provider"
  base_url: "https://api.exampleprovider.com/v1"
  base_url_vision: "https://api.exampleprovider.com/v1/chat/completions"
  rerank_url: "https://api.exampleprovider.com/v1/rerank"
  api_env_var: "API_KEY_EXAMPLE_PROVIDER"
  fall_back_provider: "ollama"
  supports_embedding: true
  supports_reranker: true
  models:
    default_embedding: "example-embedding-model"
    default_reranker: "example-reranker-model"
    default_llm: "example-llm-model"
    default_vision: "example-vision-model"
```
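The `api_env_var` and `fall_back_provider` fields suggest how provider resolution can work at load time. The sketch below is a hypothetical helper, not the project's code (`resolve_provider` and the `PROVIDERS` dict are assumptions); the parsed YAML is shown as a plain dict to stay self-contained:

```python
import os

# Parsed form of the provider_config.yaml example above. The real app
# would load this with a YAML parser; a dict keeps the sketch runnable.
PROVIDERS = {
    "new_provider": {
        "name": "Example Provider",
        "api_env_var": "API_KEY_EXAMPLE_PROVIDER",
        "fall_back_provider": "ollama",
    },
    # Local provider: no API key required.
    "ollama": {"name": "Ollama", "api_env_var": None},
}

def resolve_provider(name):
    """Return (provider_name, api_key); fall back if the key env var is unset."""
    cfg = PROVIDERS[name]
    env_var = cfg.get("api_env_var")
    if env_var is None:
        return name, None  # local provider, no key needed
    key = os.environ.get(env_var)
    if key:
        return name, key
    fallback = cfg.get("fall_back_provider")
    if fallback:
        return resolve_provider(fallback)
    raise RuntimeError(f"No API key in {env_var} and no fallback provider")

# If API_KEY_EXAMPLE_PROVIDER is unset, resolution falls back to ollama.
print(resolve_provider("new_provider"))
```

The fallback chain means a missing remote API key degrades gracefully to the local Ollama provider instead of failing outright.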
- **Clone the repository**

  Open your terminal and run:

  ```bash
  git clone https://github.com/EVERSE-ResearchSoftware/RSQKit-chatbot.git
  cd RSQKit-chatbot
  ```

- **Set up a virtual environment**

  Create and activate a virtual environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- **Install dependencies**

  With the virtual environment activated, install the required packages:

  ```bash
  pip install -e .
  ```
- **⚠️ Configure environment variables**

  Create a `.env` file and populate it with your API keys and other required settings as referenced in `provider_config.yaml`.
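A `.env` file for the example provider above might look like this (the variable name comes from the `api_env_var` field in `provider_config.yaml`; the key value is a placeholder):

```shell
# Variable name matches api_env_var in provider_config.yaml
API_KEY_EXAMPLE_PROVIDER=your-api-key-here
```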
Make the helper scripts executable, then download and ingest the RSQKit files:

```bash
chmod +x download_rsqkit_files.sh ingest_rsqkit_files.sh run_app.sh
./download_rsqkit_files.sh  # Takes a moment. Skip if you already have the files in `rsqkit_markdown`.
./ingest_rsqkit_files.sh    # Also takes a moment. Ensure you have an embedding model available.
```

This will download RSQKit files and ingest them into ChromaDB for retrieval.
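Ingestion essentially means splitting the downloaded markdown into chunks, embedding each chunk, and storing the results in ChromaDB. The chunking stage can be sketched as follows (a simplification; `chroma_data_ingestor.py` may use different chunk sizes and splitting rules, and it also computes embeddings):

```python
# Simplified sketch of the chunking stage of ingestion. This is an
# assumption about the pipeline shape, not the project's actual code.

def chunk_markdown(text, max_chars=1000):
    """Split markdown into chunks of at most max_chars, breaking on blank lines.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = (current + "\n\n" + paragraph).strip() if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks

doc = "# RSQKit\n\nFirst paragraph about research software quality.\n\nSecond paragraph."
print(chunk_markdown(doc, max_chars=60))
```

Each resulting chunk would then be embedded (e.g., with `bge-m3`) and added to a ChromaDB collection for later retrieval.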
Start the app:

```bash
./run_app.sh
```

This opens a Streamlit web interface at http://localhost:8501.
```text
rsqkit-chatbot/
├── app.py                   # Main Streamlit app
├── chroma_data_ingestor.py
├── core_utils               # Utility functions (data processing, retrieval)
├── directories.yaml
├── download_rsqkit_files.sh
├── ingest_rsqkit_files.sh
├── llm_provider_tools.py
├── llms                     # LLM chat functions for the OpenAI protocol
├── pages                    # Other pages of the app (Streamlit pages convention)
├── prompt_templates
├── provider_config.yaml     # ⚠️ Where you set up the AI provider, API key variable, LLM, embedding, etc.
├── pyproject.toml
├── pytest.ini
├── README.md
├── rsqkit_scrap.py
├── rsqkit_markdown          # Markdown input files (not tracked)
├── run_app.sh
├── settings.py
├── static
├── task_modules
├── templates
├── tests
├── ui
├── uv.lock
└── vision                   # Functions for Vision Language Models
```
We welcome contributions of all kinds!
- Bug Reports: If you encounter an issue, please open a GitHub issue with a clear and detailed description.
- Feature Requests: Have an idea for an enhancement? Open a discussion or submit a pull request with your proposal.
- Code Contributions: Fork the repository, make your changes, and submit a pull request. Make sure to follow any existing coding conventions and include relevant documentation or tests where applicable.
This project is licensed under the Apache-2.0 License.
See the LICENSE file for full details.
- This project is financially supported by EVERSE, IJCLAB, IN2P3, and CNRS.
- We extend our gratitude to the teams at AI4EOSC, Albert API, and RAGaRenn for generously providing their AI resources during the development of this application.
- The project draws inspiration from Retrieval-Augmented Generation (RAG) and hybrid search strategies in modern AI engineering.
- The application is built using tools like Streamlit, ChromaDB, and Ollama.