Binary file added .DS_Store
Binary file not shown.
Binary file added examples/.DS_Store
Reviewer comment: Avoid committing `.DS_Store` files to the git repo.

Binary file not shown.
6 changes: 0 additions & 6 deletions examples/rag/.env.example

This file was deleted.

93 changes: 35 additions & 58 deletions examples/rag/README.md
Original file line number Diff line number Diff line change
@@ -1,89 +1,66 @@
# RAG Example
# Intramind
Suggested change
# Intramind
# Chatbot with RAG

Using a general name would be better.


This example demonstrates how to use PyTiDB to build a minimal RAG application.

* Use Ollama to deploy local embedding model and LLM model
* Use Streamlit to build a Web UI for the RAG application
* Use PyTiDB to build a minimal RAG application

<p align="center">
<img src="https://github.com/user-attachments/assets/dfd85672-65ce-4a46-8dd2-9f77d826363e" alt="RAG application built with PyTiDB" width="600" />
<p align="center"><i>RAG application built with PyTiDB</i></p>
</p>
* A RAG-driven AI chatbot that lets users manage a private knowledge base by feeding in private files, yielding more accurate, reliable, and secure answers about private domains that a plain LLM cannot answer
* Use `pytidb` to connect to TiDB
* Use `openai` for embeddings and response generation
* Use Streamlit as the web UI
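The retrieval step at the heart of this chatbot — embed the user's question, then rank stored document chunks by vector similarity — can be sketched in plain Python. This is an illustration only (TiDB performs the vector search server-side in the real app; the toy 3-dimensional "embeddings" and helper names below are hypothetical):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k stored chunks closest to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional "embeddings" for illustration only.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.05, 0.0]
print(top_k(query, docs, k=2))  # → [0, 2]
```

The retrieved chunks are then prepended to the LLM prompt as context, which is what lets the model answer questions about private files.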

## Prerequisites

- **Python 3.10+**
- **A TiDB Cloud Starter cluster**: Create a free cluster here: [tidbcloud.com ↗️](https://tidbcloud.com/?utm_source=github&utm_medium=referral&utm_campaign=pytidb_readme)
- **Ollama**: You can install it from [Ollama ↗️](https://ollama.com/download)
* Python 3.10+
* A TiDB Cloud Serverless cluster: create a free cluster at [tidbcloud.com](https://tidbcloud.com)
* OpenAI API key: get your own API key from OpenAI
* Google OAuth credentials: create a web application client in the Google Cloud Console (see https://docs.streamlit.io/develop/tutorials/authentication/google)

## How to run

**Step 1**: Prepare the inference API

Pull the embedding and LLM model via ollama CLI:

```bash
ollama pull mxbai-embed-large
ollama pull gemma3:4b
ollama run gemma3:4b
```

Test the `/embed` and `/generate` endpoints to make sure they are running:

```bash
curl http://localhost:11434/api/embed -d '{
"model": "mxbai-embed-large",
"input": "Llamas are members of the camelid family"
}'
```
**Step 1**: Clone the repo

```bash
curl http://localhost:11434/api/generate -d '{
"model": "gemma3:4b",
"prompt": "Hello, Who are you?"
}'
git clone https://github.com/cindywxw1/Intramind.git
Suggested change
git clone https://github.com/cindywxw1/Intramind.git
git clone https://github.com/pingcap/pytidb.git
cd examples/chatbot_with_rag

```

**Step 2**: Clone the repository to local

```bash
git clone https://github.com/pingcap/pytidb.git
cd pytidb/examples/rag/;
```

**Step 3**: Install the required packages and setup environment
**Step 2**: Install the required packages and set up the environment

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r reqs.txt
pip install -r requirements.txt
```

**Step 4**: Set up environment to connect to database
**Step 3**: Set up the environment to connect to the database

Go to [TiDB Cloud console](https://tidbcloud.com/clusters) and get the connection parameters, then set up the environment variable like this:
As you are using a local TiDB server, you can set up the environment variables like this
(you can also reference `env.example`):

```bash
cat > .env <<EOF
TIDB_HOST={gateway-region}.prod.aws.tidbcloud.com
OPENAI_API_KEY=
TIDB_HOST=localhost
TIDB_PORT=4000
TIDB_USERNAME={prefix}.root
TIDB_PASSWORD={password}
TIDB_DATABASE=rag_example
TIDB_USERNAME=root
TIDB_PASSWORD=
TIDB_DATABASE=test
EOF
```
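The app can pick these variables up at startup. A minimal sketch of reading them with `os.getenv`, falling back to the local defaults from the `.env` above (the helper name `load_tidb_config` is hypothetical, not part of the repo):

```python
import os


def load_tidb_config() -> dict:
    """Collect TiDB connection settings from the environment.

    Defaults match the local-server values written to .env above.
    (Hypothetical helper, for illustration only.)
    """
    return {
        "host": os.getenv("TIDB_HOST", "localhost"),
        "port": int(os.getenv("TIDB_PORT", "4000")),
        "username": os.getenv("TIDB_USERNAME", "root"),
        "password": os.getenv("TIDB_PASSWORD", ""),
        "database": os.getenv("TIDB_DATABASE", "test"),
    }


config = load_tidb_config()
print(config["host"], config["port"])
```

In the actual app, `python-dotenv` (listed in `requirements.txt`) loads the `.env` file into the environment before these lookups run.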

**Step 5**: Run the Streamlit app
**Step 4**: Set up Google Auth Platform info

```bash
streamlit run main.py
cat > .streamlit/secrets.toml <<EOF
[auth]
redirect_uri = "http://localhost:8501/oauth2callback"
cookie_secret =
client_id =
client_secret =
server_metadata_url = "https://accounts.google.com/.well-known/openid-configuration"
EOF
```

**Step 6**: Open the browser and visit `http://localhost:8501`
**Step 5**: Run the Streamlit app

## Troubleshooting
```bash
streamlit run src/app.py
```

### `502 Bad Gateway` Error
**Step 6**: Open the browser and visit `http://localhost:8501`

Try to disable the global proxy settings.
7 changes: 7 additions & 0 deletions examples/rag/env.example
@@ -0,0 +1,7 @@
OPENAI_API_KEY=sk-xxxxxx

TIDB_HOST=xxxxxx
TIDB_PORT=xxxxxx
TIDB_USERNAME=xxxxxx
TIDB_PASSWORD=xxxxxx
TIDB_DATABASE=test
157 changes: 0 additions & 157 deletions examples/rag/main.py

This file was deleted.

4 changes: 0 additions & 4 deletions examples/rag/reqs.txt

This file was deleted.

11 changes: 11 additions & 0 deletions examples/rag/requirements.txt
@@ -0,0 +1,11 @@
streamlit
openai
pytidb

python-dotenv
streamlit-authenticator

sqlalchemy
litellm
PyPDF2
langchain_text_splitters
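`PyPDF2` and `langchain_text_splitters` in the list above handle extracting text from uploaded PDFs and splitting it into chunks before embedding. A stdlib-only sketch of the fixed-size, overlapping split such splitters perform (the chunk sizes here are illustrative; real splitters also respect separators like paragraph breaks):

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into fixed-size chunks with overlap, a simplified
    version of what langchain_text_splitters provides."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # advance by less than chunk_size to overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks


sample = "a" * 120
print(len(chunk_text(sample)))  # → 3 chunks of 50, 50, and 40 chars
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of some duplicated storage.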
17 changes: 17 additions & 0 deletions examples/rag/src/app.py
@@ -0,0 +1,17 @@
import streamlit as st

doc_page = st.Page("page_files/doc_page.py", title="Manage Uploaded Files")
main_page = st.Page("page_files/main_page.py", title="Chats")
login_page = st.Page("page_files/login_page.py")


def main():
    if not st.user.is_logged_in:
        pg = st.navigation([login_page])
    else:
        pg = st.navigation([main_page, doc_page])
    pg.run()


if __name__ == "__main__":
    main()
