
🤖 PuppetGPT - Chat With Your Documents

PuppetGPT is a Retrieval-Augmented Generation (RAG) powered document assistant that allows users to upload a PDF and ask questions about its contents.

Instead of relying only on the language model's internal knowledge, PuppetGPT retrieves relevant sections from the document using semantic search and feeds them to the LLM to generate accurate, grounded responses.

Built with LangChain, Groq LLaMA models, Chroma vector database, and Streamlit, this project demonstrates how modern AI applications combine retrieval systems with large language models to build reliable document assistants.


🚀 Live Demo

You can try PuppetGPT directly in your browser:

https://puppetgpt.streamlit.app/

Upload a PDF and start asking questions about the document.


🎭 Why the Name PuppetGPT?

Most LLMs generate responses freely based on their training data, which can sometimes result in hallucinations or incorrect answers.

PuppetGPT takes a different approach.

Instead of letting the model respond freely, the system guides the LLM using retrieved document context. The retrieved chunks act like strings controlling the model's responses, ensuring answers remain grounded in the document.

In simple terms:

Document Context (strings)
        ↓
Retriever pulls relevant chunks
        ↓
LLM generates grounded response
        ↓
Accurate Answer

Just like a puppet moves according to the strings controlling it, the language model generates answers based on the document context provided.

This design significantly improves accuracy, reliability, and transparency.


🚀 Features

📄 Upload Any PDF - Upload any document and instantly start querying it.

🧠 Retrieval-Augmented Generation (RAG) - Responses are generated using retrieved document context.

⚡ Fast LLM Responses - Powered by Groq's ultra-fast LLaMA models.

🔍 Semantic Document Search - Embeddings + vector similarity search retrieve relevant document chunks.

📚 Source Transparency - Shows which document sections were used to generate the answer.

🖥 Interactive Streamlit Interface - Clean and simple UI for chatting with documents.


🧠 System Architecture

User Uploads PDF
        ↓
Document Loader
        ↓
Text Chunking
        ↓
Embedding Generation
        ↓
Chroma Vector Database
        ↓
Semantic Retrieval (Top-K)
        ↓
Prompt Construction
        ↓
Groq LLaMA Model
        ↓
Answer + Sources

This architecture improves accuracy, contextual relevance, and trustworthiness compared to traditional LLM responses.
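The pipeline above can be sketched end to end in plain Python. This is a toy illustration only: the real app uses sentence-transformer embeddings, Chroma similarity search, and a Groq-hosted LLM, whereas here naive character chunking and word-overlap scoring stand in for embedding similarity, and the "generation" step stops at prompt construction.

```python
# Toy sketch of the chunk -> retrieve -> prompt flow (illustrative only;
# the actual app uses LangChain, HuggingFace embeddings, and Chroma).

def chunk_text(text, size=50):
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(chunk, query):
    """Crude stand-in for embedding similarity: count shared words."""
    return len(set(chunk.lower().split()) & set(query.lower().split()))

def retrieve(chunks, query, top_k=2):
    """Return the top-k chunks most similar to the query."""
    return sorted(chunks, key=lambda c: score(c, query), reverse=True)[:top_k]

def build_prompt(context_chunks, query):
    """Ground the LLM by placing retrieved context ahead of the question."""
    context = "\n---\n".join(context_chunks)
    return f"Answer only from this context:\n{context}\n\nQuestion: {query}"

doc = "PuppetGPT answers questions about uploaded PDFs. It retrieves relevant chunks with semantic search."
chunks = chunk_text(doc)
query = "How does PuppetGPT answer questions?"
prompt = build_prompt(retrieve(chunks, query), query)
```

The prompt would then be sent to the LLM; because the context travels with the question, the model's answer stays tied to the document.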


🛠 Tech Stack

Component         Technology
Frontend          Streamlit
Framework         LangChain
LLM               Groq LLaMA
Embeddings        HuggingFace Sentence Transformers
Vector Database   ChromaDB
Language          Python

⚙️ Installation

Clone the repository:

git clone https://github.com/aawhan0/PuppetGPT.git
cd PuppetGPT

Create a virtual environment:

python -m venv venv

Activate the environment.

Windows:

venv\Scripts\activate

Mac/Linux:

source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

🔑 Setup API Key

Create a .env file in the project root and add your Groq API key:

GROQ_API_KEY=your_api_key_here

You can obtain a key from:

https://console.groq.com
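To show what loading the key involves, here is a minimal hand-rolled .env parser. This is a sketch: the project would more typically call python-dotenv's load_dotenv(), which does the same thing with proper handling of quoting and comments.

```python
# Minimal sketch of turning a .env line into an environment variable.
# A real app would use python-dotenv's load_dotenv() instead.
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    os.environ.update(values)
    return values

# Demo: write a throwaway .env and load it.
with open(".env", "w") as f:
    f.write("GROQ_API_KEY=your_api_key_here\n")
load_env()
key = os.environ["GROQ_API_KEY"]
```

Once loaded, the key is read from the environment at startup rather than being hard-coded in source.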


▶️ Run the Application

streamlit run app.py

Then open:

http://localhost:8501

Upload a PDF and start chatting with your document.


📂 Project Structure

PuppetGPT
│
├── app.py                # Streamlit interface
├── ingest.py             # Document ingestion pipeline
├── rag_pipeline.py       # Retrieval + LLM logic
├── requirements.txt
├── README.md
│
├── uploaded_docs/        # Uploaded PDFs
└── vectorstore/          # Chroma vector database

📊 Key Concepts Demonstrated

This project demonstrates important AI engineering concepts:

  • Retrieval-Augmented Generation (RAG)
  • Semantic search using embeddings
  • Vector databases
  • Prompt grounding
  • LLM integration with external knowledge
  • Document-based AI assistants

🌟 Example Use Cases

  • Chat with research papers
  • Extract insights from reports
  • Query technical documentation
  • Summarize books or PDFs
  • Build internal knowledge assistants

🧩 Engineering Challenges & Fixes

While building PuppetGPT, several practical engineering issues came up around environment setup, dependency management, RAG architecture, and LLM behavior. The key challenges and fixes are summarized below.


1. Python Compatibility

Issue

Streamlit Cloud defaulted to Python 3.14, which caused compatibility issues with AI libraries such as pydantic and LangChain.

Fix

The deployment environment was pinned to Python 3.11, currently the most stable version for LangChain-based applications.


2. LangChain Package Fragmentation

Issue

LangChain recently split into multiple packages, which caused import errors in the original implementation.

Fix

Imports were updated to the new modular architecture:

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma

3. Dependency Conflicts

Issue

Version conflicts between LangChain-related packages caused installation failures.

Fix

Rebuilt the Python virtual environment and reinstalled dependencies to ensure clean resolution.


4. Groq Model Deprecation

Issue

The original model llama3-8b-8192 was deprecated.

Fix

Updated to:

model_name="llama-3.1-8b-instant"

5. RetrievalQA Memory Conflict

Issue

LangChain memory conflicted with RetrievalQA because the chain returns multiple outputs.

Fix

Chat history was managed using Streamlit session state:

st.session_state.chat_history
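The pattern is simple enough to sketch without Streamlit. In the real app, st.session_state persists values across script reruns; here a plain dict stands in for it so the logic is runnable anywhere, and the key name chat_history mirrors the fix above (the helper function and turn format are assumptions).

```python
# Sketch of the session-state pattern; a dict stands in for st.session_state.
session_state = {}

def add_turn(state, question, answer):
    """Append one Q&A turn, initializing the history on first use."""
    state.setdefault("chat_history", [])
    state["chat_history"].append({"question": question, "answer": answer})

add_turn(session_state, "What is PuppetGPT?", "A RAG document assistant.")
add_turn(session_state, "What powers it?", "Groq LLaMA via LangChain.")
history = session_state["chat_history"]
```

Keeping history in session state sidesteps the RetrievalQA memory conflict entirely: the chain stays stateless, and the app owns the conversation record.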

6. Vectorstore Rebuild Errors

Issue

Rebuilding the Chroma vector database on every query caused runtime errors.

Fix

Vectorstore creation was cached so embeddings are generated once per document upload.
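The effect of the caching fix can be demonstrated without Streamlit. In the app, @st.cache_resource would memoize the builder keyed on its arguments; functools.lru_cache is a dependency-free stand-in here, and the function body is a placeholder for the real embed-and-index work.

```python
# Sketch of the caching fix: build the vectorstore once per document,
# not once per query. @st.cache_resource plays this role in Streamlit;
# functools.lru_cache is a stand-in so the sketch runs without it.
from functools import lru_cache

BUILD_COUNT = {"n": 0}

@lru_cache(maxsize=None)
def get_vectorstore(doc_path):
    """Pretend to embed and index a document; expensive, so cached."""
    BUILD_COUNT["n"] += 1
    return f"vectorstore for {doc_path}"

# Three queries against the same upload trigger only one build.
for _ in range(3):
    store = get_vectorstore("uploaded_docs/report.pdf")
builds = BUILD_COUNT["n"]
```

A new document path produces a cache miss, so each upload is embedded exactly once.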


7. Code Architecture Bug

Issue

Some logic was placed after a return statement, making it unreachable.

Fix

Refactored the architecture into two functions:

get_vectorstore()
get_qa_chain()
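A minimal sketch of the bug and the refactor, with placeholder bodies (the real functions wire up Chroma and the Groq LLM; only the two function names come from the fix above):

```python
# Before: everything in one function, with logic stranded after `return`.
def build_everything_buggy():
    vectorstore = "index"   # built fine...
    return vectorstore
    qa_chain = "chain"      # ...but this line is unreachable

# After: two small functions, each with a single responsibility.
def get_vectorstore():
    """Build (or fetch) the document index."""
    return "index"

def get_qa_chain():
    """Wire the retriever and LLM on top of the vectorstore."""
    return f"chain over {get_vectorstore()}"

chain = get_qa_chain()
```

Splitting the steps also pairs naturally with the caching fix: get_vectorstore() is the expensive call worth memoizing.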

8. Hallucination Mitigation

Issue

The model occasionally generated answers not present in the document.

Fix

Added a strict prompt rule requiring the model to respond:

"I cannot find this information in the document."

when the answer is not in the retrieved context.
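The refusal sentence is the one quoted above; the surrounding template structure is an assumption about how such a rule is typically phrased in a grounded prompt.

```python
# Sketch of the grounding rule: the prompt instructs the model to refuse
# rather than guess when the retrieved context lacks the answer.
REFUSAL = "I cannot find this information in the document."

def build_grounded_prompt(context, question):
    return (
        "Answer strictly from the context below. If the answer is not "
        f'in the context, reply exactly: "{REFUSAL}"\n\n'
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt("PuppetGPT uses Chroma.", "Who founded Groq?")
```

Because the instruction and the context arrive together, an off-document question should trigger the refusal sentence instead of a hallucinated answer.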


9. Output Formatting Issues

Issue

The model sometimes produced compressed bullet lists.

Fix

Added a formatting step before displaying answers:

answer = answer.replace("• ", "\n• ").strip()

🔮 Future Improvements

  • Multi-document retrieval
  • Hybrid search (BM25 + embeddings)
  • Conversation memory
  • Streaming responses
  • Evaluation metrics dashboard

📜 License

This project is licensed under the MIT License.


🙌 Acknowledgements

  • LangChain
  • Groq
  • HuggingFace
  • Chroma
  • Streamlit

โญ If you found this project useful, consider giving it a star!
