This project is an open-source sample chatbot application built with HonoJS, utilizing Retrieval-Augmented Generation (RAG) powered by Google Gemini and Pinecone. It features real-time streaming responses and persists chat history in MongoDB.
- Framework: Built on HonoJS for a lightweight and fast web standard-based server.
- LLM: Uses Google's Gemini 2.5 Flash for fast and efficient text generation.
- Embeddings: Uses Gemini Text Embedding 004 for high-quality vector embeddings.
- Vector Database: Integrates with Pinecone for efficient similarity search and context retrieval.
- Database: Stores chat sessions and history in MongoDB.
- Streaming: Supports streaming responses for a better user experience.
- RAG: Implements a complete RAG pipeline:
  - Document ingestion from text files.
  - Chunking and embedding.
  - Context-aware response generation.
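Streaming works by writing model tokens into the response body as they are produced, rather than waiting for the full completion. The sketch below is framework-agnostic and illustrative only: it uses the standard Fetch `Response`/`ReadableStream` APIs available in Node 18+, and the token generator is a stub standing in for the Gemini stream. In the real app, the equivalent logic would live inside a Hono route handler; none of these names come from the codebase.

```typescript
// Stub token source standing in for the Gemini streaming API.
async function* fakeTokens(): AsyncGenerator<string> {
  for (const t of ["Hello", ", ", "world"]) yield t;
}

// Wrap an async token stream in a plain-text streaming Response.
// Each token is encoded and enqueued as soon as it arrives, so the
// client can render partial output before generation finishes.
function streamingResponse(tokens: AsyncIterable<string>): Response {
  const encoder = new TextEncoder();
  const body = new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const t of tokens) controller.enqueue(encoder.encode(t));
      controller.close();
    },
  });
  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```

Hono also ships streaming helpers that wrap this pattern; the raw `ReadableStream` version is shown here only because it is self-contained.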
Before you begin, ensure you have the following:
- Node.js (v18 or higher)
- MongoDB: A running MongoDB instance (local or Atlas).
- Pinecone Account: An API key and an Index created in Pinecone.
- Google Gemini API Key: Access to Google's Generative AI models.
- Clone the repository:

  ```sh
  git clone <repository-url>
  cd hono-chatbot-rag
  ```

- Install dependencies:

  ```sh
  npm install
  ```

- Create a `.env` file in the root directory:

  ```sh
  touch .env
  ```

  Refer to `.env.example` for the required environment variables.
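Given the services used (MongoDB, Pinecone, Gemini), a filled-in `.env` will look roughly like the sketch below. The variable names here are hypothetical placeholders; `.env.example` is the authoritative list for this project.

```env
# Hypothetical variable names — check .env.example for the real ones.
PORT=3000
MONGODB_URI=mongodb://localhost:27017/hono-chatbot-rag
GEMINI_API_KEY=your-gemini-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_INDEX=your-pinecone-index
```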
To use the RAG capabilities, you need to ingest documents into your Pinecone vector store.
- Place your text documents (`.txt` files) in the `.bin/docs` directory.
  - The script looks for files in `.bin/docs` relative to the project root.
  - Create the directory if it doesn't exist:

    ```sh
    mkdir -p .bin/docs
    ```

- Run the ingestion script:

  ```sh
  npm run ingest:embeddings
  ```

  This script will:

  - Load text files from `.bin/docs`.
  - Split them into chunks (1000 chars, 200 overlap).
  - Generate embeddings using Gemini.
  - Upload the vectors to your Pinecone index.
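The chunking step above can be sketched as a simple sliding window. This is an illustrative stand-alone function, not the project's actual implementation (a text-splitter utility is more likely in practice); `chunkText` and its defaults simply mirror the 1000-char / 200-overlap figures quoted above.

```typescript
// Sliding-window chunker: each chunk is at most `chunkSize` characters
// and shares `overlap` characters with the previous chunk, matching the
// 1000/200 settings used by the ingestion script.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be < chunkSize");
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
    start += chunkSize - overlap; // step forward, keeping the overlap
  }
  return chunks;
}
```

With a 4-char window and 2-char overlap, `chunkText("abcdefghij", 4, 2)` yields `["abcd", "cdef", "efgh", "ghij"]` — the overlap gives the embedding model shared context across chunk boundaries.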
Run the server with hot-reloading:

```sh
npm run dev
```

Build and start the production server:

```sh
npm run build
npm start
```

The server will start on http://localhost:3000 (or your configured `PORT`).
Initialize a new chat session.

- Endpoint: `POST /chats`
- Response:

  ```json
  { "id": "65f..." }
  ```

  The returned `id` is the chat ID.

Send a user message and receive a streaming response.

- Endpoint: `PUT /chats/:id`
- Body:

  ```json
  { "content": "What is the name of chapter one?" }
  ```

- Response: A text stream of the assistant's response.
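Assuming the server is running locally on the default port, the two endpoints can be exercised with `curl` (the chat ID below is a placeholder to replace with a real one):

```sh
# Create a chat session; the response body contains the chat ID.
curl -X POST http://localhost:3000/chats

# Send a message to an existing chat and stream the reply.
# Replace <chat-id> with the id returned above.
curl -N -X PUT http://localhost:3000/chats/<chat-id> \
  -H "Content-Type: application/json" \
  -d '{"content": "What is the name of chapter one?"}'
```

The `-N` flag disables curl's output buffering so streamed tokens print as they arrive.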
- `src/app.ts`: Main application entry point and server setup.
- `src/api/`: Route handlers (chat creation, message handling).
- `src/services/`: External service integrations (Gemini, Pinecone).
- `src/models/`: Mongoose data models.
- `.bin/ingest.ts`: Script for processing and ingesting documents.
- `.bin/docs`: Directory for source documents for RAG.