ContextIQ is a professional "Chat with PDF" Retrieval-Augmented Generation (RAG) app. Upload a PDF, index it in Pinecone using Gemini embeddings, and ask questions with answers grounded in retrieved context.
- Upload a PDF in the Streamlit UI
- Extract text and split it into chunks (`RecursiveCharacterTextSplitter`)
- Generate embeddings with Gemini
- Store vectors in Pinecone
- Embed the query and retrieve relevant chunks
- Generate an answer strictly from the retrieved context
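
The steps above can be sketched end to end with stand-ins. What follows is a minimal, standard-library-only illustration and not the app's actual code: `split_text` is a simplified fixed-size splitter with overlap (the app uses LangChain's `RecursiveCharacterTextSplitter`), `embed` is a toy word-hashing vector standing in for Gemini embeddings, and a plain Python list stands in for the Pinecone index. All function names and parameter values here are hypothetical.

```python
import math
import re
import zlib

DIM = 64  # toy embedding size (assumption); real embedding models use far more dimensions


def split_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character chunks that overlap (simplified stand-in)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> list[float]:
    """Toy embedding: hash each word into one of DIM buckets, then L2-normalize."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(text: str, chunk_size: int = 200, overlap: int = 40) -> list[tuple[list[float], str]]:
    """Chunk the document and store (vector, chunk) pairs; a list stands in for the vector DB."""
    return [(embed(c), c) for c in split_text(text, chunk_size, overlap)]


def retrieve(query: str, index: list[tuple[list[float], str]], top_k: int = 2) -> list[tuple[float, str]]:
    """Embed the query and return the top_k chunks by cosine similarity (vectors are unit length)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for vec, chunk in index]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]
```

In a setup like the one described, the chunks returned by `retrieve` would be placed in the prompt so that the model answers only from them.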
- Streamlit (UI)
- Google Gemini (LLM + embeddings)
- Pinecone (vector database)
- LangChain (chunking)
- PyPDF (PDF parsing)
- Install dependencies: `pip install -r requirements.txt`
- Create an environment file: `cp .env.example .env`
- Fill in the required values in `.env`:
  - `GOOGLE_API_KEY`
  - `PINECONE_API_KEY`
  - `PINECONE_INDEX_NAME`
- Run the app: `streamlit run app.py`

- The app keeps only the latest uploaded PDF (vectors are stored in a single Pinecone namespace and overwritten on each upload).
- If no relevant context is found, the app will say it does not know.
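
The two behaviors above can be illustrated with a small in-memory model. This is a hedged sketch rather than the app's code: `SingleDocStore`, `MIN_SCORE`, and the word-overlap score are hypothetical stand-ins for the single Pinecone namespace, whatever relevance rule the app applies, and vector similarity, respectively. The final LLM call is replaced by simply returning the retrieved context.

```python
import re

FALLBACK = "I don't know."
MIN_SCORE = 0.3  # hypothetical relevance cutoff; the app's actual rule is not documented here


def _words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def _score(question: str, chunk: str) -> float:
    """Fraction of question words found in the chunk (toy stand-in for vector similarity)."""
    q = _words(question)
    return len(q & _words(chunk)) / len(q) if q else 0.0


class SingleDocStore:
    """Holds the chunks of one document only, like a single namespace that gets overwritten."""

    def __init__(self) -> None:
        self.chunks: list[str] = []

    def upload(self, chunks: list[str]) -> None:
        # Replace rather than extend: a new upload discards the previous document.
        self.chunks = list(chunks)

    def retrieve(self, question: str, top_k: int = 3) -> list[str]:
        # Rank chunks by score, keep the best top_k, then drop those below the cutoff.
        scored = sorted(((_score(question, c), c) for c in self.chunks), reverse=True)
        return [c for s, c in scored[:top_k] if s >= MIN_SCORE]


def answer(store: SingleDocStore, question: str) -> str:
    context = store.retrieve(question)
    if not context:
        return FALLBACK  # nothing relevant was retrieved, so do not guess
    # A real app would pass `context` and `question` to the LLM at this point.
    return "Based on the document: " + " ".join(context)
```

Under these assumptions, uploading a second document and repeating a question about the first one returns the fallback, because the earlier chunks are no longer stored.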