V-Chat is an AI-powered conversational agent built for integration into a sustainability-focused web platform. It supports user interaction in the agriculture and environmental sustainability domain, providing contextual assistance, information retrieval, and guidance for platform users.
This project uses LangChain, Pinecone, and Gemini 2.0 Flash to deliver relevant answers based only on the platform's verified data.
The chatbot:
- Takes a user's question.
- Searches the vector database (Pinecone) for relevant information from the `villam_hub_knowledge_base.md` document.
- Uses an AI model (Google's Gemini 2.0 Flash) to interpret the question together with the retrieved information.
- Generates a concise and helpful answer.
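The steps above can be sketched end-to-end as a toy retrieve-then-generate loop. This is purely illustrative, not the project's actual code: it fakes embeddings with word counts and stubs the Gemini call, just to show the shape of the pipeline.

```python
from collections import Counter
from math import sqrt

# Toy "knowledge base" standing in for chunks of villam_hub_knowledge_base.md.
KNOWLEDGE = [
    "Composting turns farm waste into nutrient-rich soil.",
    "Drip irrigation reduces water use on small farms.",
]

def embed(text: str) -> Counter:
    """Stand-in for GoogleGenerativeAIEmbeddings: a bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str) -> str:
    """Stand-in for the Pinecone similarity search: return the closest chunk."""
    q = embed(question)
    return max(KNOWLEDGE, key=lambda doc: cosine(q, embed(doc)))

def generate(question: str, context: str) -> str:
    """Stand-in for the Gemini 2.0 Flash generation step."""
    return f"Based on our records: {context}"

question = "How can I save water on my farm?"
answer = generate(question, retrieve(question))
print(answer)
```

In the real pipeline, `embed` and `generate` are API calls and `retrieve` queries a Pinecone index, but the control flow is the same.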
GitHub repository: https://github.com/DataxEnv/v-chat-sustainability-chatbot
- AI-Powered Responses: Uses Google's Gemini 2.0 Flash model for intelligent answers.
- Context-Aware: Retrieves relevant information from `villam_hub_knowledge_base.md` using Pinecone vector search.
- Conversational Memory: Remembers previous parts of the conversation to provide more relevant follow-up answers.
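The conversational-memory feature can be pictured as a rolling history that is prepended to each prompt. A minimal sketch follows; the class and method names are illustrative, not taken from `vchat.py`:

```python
class ChatMemory:
    """Keeps the last few turns so follow-up questions have context."""

    def __init__(self, max_turns: int = 5):
        self.turns: list[tuple[str, str]] = []
        self.max_turns = max_turns

    def add(self, user: str, bot: str) -> None:
        self.turns.append((user, bot))
        self.turns = self.turns[-self.max_turns:]  # drop the oldest turns

    def as_prompt_prefix(self) -> str:
        """Render the history as text to prepend to the next LLM prompt."""
        return "\n".join(f"User: {u}\nBot: {b}" for u, b in self.turns)

memory = ChatMemory()
memory.add("What is composting?", "Turning organic waste into fertiliser.")
prompt = memory.as_prompt_prefix() + "\nUser: How long does it take?"
print(prompt)
```

Capping the history (`max_turns`) keeps the prompt within the model's context window and the free-tier token budget.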
- Python: The primary programming language.
- LangChain: A framework to simplify building applications with Large Language Models (LLMs).
- Google Generative AI (Gemini 2.0 Flash): The LLM used for understanding and generating text.
- GoogleGenerativeAIEmbeddings: Converts text into numerical representations (embeddings) used for similarity search.
- Pinecone: A vector database used to store and efficiently search through the embeddings of our knowledge base.
- Dotenv: Manages sensitive information like API keys via a `.env` file.
This repo contains the backend logic and testing environment for V-Chat. It is intended for internal team members.
| File/Folder | Description |
|---|---|
| `.gitignore` | Excludes the `.env` file containing API keys from version control |
| `.env.example` | Template for API keys |
| `requirements.txt` | Lists all Python libraries needed to run the chatbot |
| `villam_hub_knowledge_base.md` | Cleaned demo dataset containing Villam Hub knowledge |
| `vchat_pipeline.ipynb` | Main chatbot logic for exploration and debugging: embeds queries, retrieves data, and generates responses |
| `vchat.py` | Main chatbot logic for integration |
| `vchat_pinecone.ipynb` | Notebook to load and upsert the dataset into Pinecone |
| `test_chatbot.py` | Streamlit script to test chatbot responses interactively |
```bash
git clone https://github.com/olamide-analyst/Villam-chatbot v-chat
cd v-chat
```
Add your API keys to the `.env.example` file and rename it to `.env`:

```
GOOGLE_API_KEY=google_generative_ai_key
PINECONE_API_KEY=pinecone_key
```

`.env` is already excluded from version control via `.gitignore`.
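The scripts load these keys at startup (Dotenv is listed in the tech stack). A small stdlib-only sketch of a fail-fast startup check one might add; `missing_keys` is illustrative and not part of the repo:

```python
import os

REQUIRED_KEYS = ("GOOGLE_API_KEY", "PINECONE_API_KEY")

def missing_keys(env=os.environ) -> list[str]:
    """Return which required API keys are absent, so startup can fail early
    with a clear message instead of a cryptic API error later."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# Example with a fake environment dict instead of the real os.environ:
fake_env = {"GOOGLE_API_KEY": "abc"}
print(missing_keys(fake_env))  # → ['PINECONE_API_KEY']
```

In the real scripts, `load_dotenv()` would populate `os.environ` from `.env` before this check runs.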
macOS/Linux (zsh/bash):

```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
```

Windows (PowerShell):

```powershell
py -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
```

Note:
- Your prompt should show `(.venv)` when activated.
- Run `deactivate` to exit the environment.
Install all the Python packages listed in `requirements.txt` inside the activated virtual environment:

```bash
pip install -r requirements.txt
```

If you plan to use the notebooks in this repo with this environment, add the kernel to Jupyter:

```bash
python -m ipykernel install --user --name villam-chatbot-venv --display-name "Python (.venv)"
```

The project has two main Python scripts:
- `vchat_pinecone.ipynb`: Processes the dataset `villam_hub_knowledge_base.md`, converts its content into embeddings (numerical vector representations), and stores (upserts) them in the Pinecone vector database. You only need to run this once, or whenever the dataset document changes.
- `vchat_pipeline.ipynb` or `vchat.py`: Runs the chatbot engine logic (retrieval + Gemini 2.0 Flash generation).
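Conceptually, the upsert step splits the markdown document into chunks before embedding and storing them. A hedged sketch of heading-based chunking; the actual notebook may split the text differently:

```python
def chunk_markdown(text: str) -> list[str]:
    """Split a markdown document into one chunk per '## ' section,
    so each Pinecone vector maps to a coherent topic."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.startswith("## ") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "## Soil\nCompost basics.\n## Water\nDrip irrigation tips."
for chunk in chunk_markdown(doc):
    print(repr(chunk))
```

Each chunk would then be embedded (e.g. with GoogleGenerativeAIEmbeddings) and upserted into the Pinecone index with an ID and the chunk text as metadata.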
Before running the examples below, make sure your virtual environment is active (`source .venv/bin/activate`).

Run `vchat_pinecone.ipynb` to upload the contents of `villam_hub_knowledge_base.md` to Pinecone.

Use either:
- `test_chatbot.py` for interactive testing:

  ```bash
  streamlit run test_chatbot.py
  ```

- Or `vchat_pipeline.ipynb` to debug the full flow end-to-end.
For developers:
- All chatbot logic lives in `generate_response()` inside `vchat.py`. It is ready to be integrated with a Flask, FastAPI, or Streamlit frontend.
- The chatbot uses Retrieval-Augmented Generation (RAG).
- Gemini 2.0 Flash is rate-limited on the free tier:
  - 15 requests/minute
  - 200 requests/day
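Given those free-tier limits, integrators may want a client-side throttle in front of `generate_response()`. The sliding-window limiter below is an illustrative sketch, not part of `vchat.py`:

```python
from collections import deque

class RateLimiter:
    """Sliding-window limiter, e.g. 15 requests per 60 seconds."""

    def __init__(self, max_calls: int = 15, period: float = 60.0):
        self.max_calls = max_calls
        self.period = period
        self.calls: deque = deque()  # timestamps of recent requests

    def acquire(self, now: float) -> float:
        """Record a request if a slot is free and return 0.0;
        otherwise return how long the caller should wait and retry."""
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return 0.0
        return self.period - (now - self.calls[0])

limiter = RateLimiter(max_calls=2, period=60.0)
print(limiter.acquire(now=0.0))  # → 0.0 (slot free)
print(limiter.acquire(now=1.0))  # → 0.0 (slot free)
print(limiter.acquire(now=2.0))  # → 58.0 (window full; wait 58 s)
```

In production the caller would pass `time.monotonic()` as `now` and sleep for the returned duration before retrying; a second limiter with `max_calls=200, period=86400` could enforce the daily cap.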
This project is maintained by the Villam Hub team.
- YouTube Gemini + LangChain Tutorial: https://youtu.be/DFjuV2YBoe4?si=2ND-frk2_Wjfv9FF