This project provides a Retrieval-Augmented Generation (RAG) solution for processing PDF documents and answering user queries through a web interface. It combines AI-powered responses, PDF management, and session handling: users can upload PDFs, ask questions, and interact with the application directly in the browser.

**Data Source**: The application works with user-uploaded PDF documents and leverages OpenAI's API for intelligent query responses.
## Table of Contents

- Project Features
- Project Architecture
- Technologies Used
- Running the Application
- Project Structure
- Usage Examples
- Secure Deployment on AWS Lightsail
- Contact
## Project Features

- PDF Upload and Management: Users can upload and switch between multiple PDF documents.
- AI-Powered Querying: Ask questions about the content of uploaded PDFs and receive intelligent answers.
- Session and Cache Handling: Uses Redis to manage user sessions and cache query results.
- Interactive Web Interface: Built with Flask and JavaScript for a responsive user experience.
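The session-and-cache feature above can be sketched in a few lines. This is an illustrative sketch, not the project's actual code: the key scheme and helper names are invented, and a plain dict stands in for the Redis client (the equivalent redis-py `get`/`setex` calls are noted in comments) so the example runs without a server.

```python
import hashlib
import json

def cache_key(doc_id: str, question: str) -> str:
    """Deterministic cache key for a (document, question) pair."""
    payload = json.dumps([doc_id, question], sort_keys=True)
    return "rag:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_answer(cache, doc_id, question, compute):
    """Return a cached answer when present; otherwise compute and store it.

    `cache` only needs dict-style get/set here; a real redis.Redis client
    would use cache.get(key) and cache.setex(key, ttl, answer) instead.
    """
    key = cache_key(doc_id, question)
    hit = cache.get(key)
    if hit is not None:
        return hit, True   # cache hit: skip the expensive AI call
    answer = compute(doc_id, question)
    cache[key] = answer    # with Redis: cache.setex(key, 3600, answer)
    return answer, False   # cache miss: answer was just computed
```

Repeated questions against the same document then hit the cache instead of triggering another OpenAI call.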
## Project Architecture

Below is the high-level architecture, covering document upload, processing, querying, and visualization.
- Document Upload: Users upload PDF files via the web interface.
- Backend Processing: The backend extracts text, manages documents, and handles AI queries.
- Query Handling: Integrates OpenAI API for RAG-based answers.
- Session Management: Redis stores session data and caches results.
- Frontend Visualization: Flask and JavaScript display results and manage user interaction.
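The query-handling flow above can be illustrated with a minimal, self-contained sketch. Note the assumptions: the real backend uses a vector store and the OpenAI API for retrieval and answering, whereas this toy version ranks chunks by word overlap and only assembles the prompt; the function names are invented for illustration.

```python
def top_k_chunks(chunks, question, k=2):
    """Rank text chunks by naive word overlap with the question.

    A deliberately simple stand-in for the vector-store similarity
    search the backend would actually run.
    """
    q_words = set(question.lower().split())
    return sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(chunks, question):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n---\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The resulting prompt string is what would be sent to the OpenAI API, grounding the answer in the uploaded PDF's text.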
## Technologies Used

- Python: Core programming language.
- Flask: Web application framework for the frontend.
- Custom Backend: Handles PDF processing and AI queries.
- Redis: For session and cache management.
- OpenAI API: For AI-powered query responses.
- JavaScript: Enhances frontend interactivity.
- pip: Dependency management.
## Running the Application

Clone the repository:

```bash
git clone https://github.com/Carlos93U/ai_query_pdf.git
cd ai_query_pdf
```
## Project Structure

```
ai_query_pdf/
├── assets
│   ├── demo.gif
│   └── img.png
├── backend              # PDF processing and AI query logic
│   ├── app.py           # Backend entry point
│   ├── Dockerfile       # Dockerfile for backend
│   ├── requirements.txt
│   ├── uploads
│   │   └── key_word.pdf
│   └── vectorstore
├── compose.yml          # Docker Compose file for the project
├── .env                 # Environment configuration
├── frontend             # Flask web interface
│   ├── app.py           # Frontend entry point
│   ├── Dockerfile       # Dockerfile for frontend
│   ├── requirements.txt
│   ├── static
│   │   ├── script.js
│   │   └── style.css
│   └── templates
│       └── index.html
├── LICENSE
└── README.md            # Project documentation
```
## Usage Examples

Users can upload PDF documents, ask questions, and view AI-generated answers directly in the web interface.
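As a rough illustration of how a client could drive such a backend programmatically: the `/query` route, the field names, and the helper below are assumptions for the sketch, not the project's documented API, and the HTTP transport is injected so the example runs without a live server.

```python
def ask_backend(post, base_url, question, session_id):
    """Send a question to an assumed /query endpoint and return the answer.

    `post(url, payload_dict) -> response_dict` is injected; in production
    it could wrap urllib or requests. Route and field names are guesses.
    """
    payload = {"question": question, "session_id": session_id}
    resp = post(f"{base_url}/query", payload)
    return resp.get("answer", "")
```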
## Secure Deployment on AWS Lightsail

### 1. Create a Lightsail Instance

- Go to AWS Lightsail.
- Click Create instance.
- Configure:
- Region: the closest to you.
- Platform: Linux/Unix.
- Blueprint: Ubuntu 22.04 LTS.
- Plan: according to your resource needs.
- Name: `app-chat`.
- Click Create instance and wait for it to start.
- Note the Public IP of the instance.
### 2. Connect to the Instance via SSH

- Download the Lightsail private key for your region (e.g., `LightsailDefaultKey-us-east-1.pem`) and move it to the `~/.ssh/` directory.
- Set read-only permissions:

```bash
chmod 400 ~/.ssh/LightsailDefaultKey-us-east-1.pem
```

- Connect to the instance (replace `<PUBLIC_IP>` with your instance's IP):

```bash
ssh -i ~/.ssh/LightsailDefaultKey-us-east-1.pem ubuntu@<PUBLIC_IP>
```

### 3. Install Docker and Docker Compose

Inside the VM via SSH:
```bash
# Update and prepare
sudo apt update -y
sudo apt upgrade -y

# Install Docker and Docker Compose
sudo apt install -y docker.io docker-compose

# Enable and start Docker
sudo systemctl enable --now docker

# Add the 'ubuntu' user to the docker group
sudo usermod -aG docker ubuntu

# Verify the installation
docker --version
docker compose version || docker-compose --version
docker run --rm hello-world
```

**Important note:** remember to log out and back in for the docker group change to take effect.
### 4. Prepare the Application Directory

```bash
mkdir ~/app-chat
cd ~/app-chat
mkdir frontend backend
```

### 5. Configure Environment Variables

- Go to OpenAI and generate your API key.
- On your local machine, create a `.env` file with:

```bash
OPENAI_API_KEY="YOUR_REAL_KEY"
BACKEND_URL=http://backend:8000
REDIS_URL=redis://redis:6379
VECTORSTORE_PATH=/app/vectorstore
UPLOAD_FOLDER=/app/uploads
SECRET_KEY=ultra_secret
```

- Upload the `.env` file to the VM:

```bash
scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem .env ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/
```

Alternatively, edit it directly on the VM:

```bash
nano ~/app-chat/.env
```

### 6. Upload the Application Files

From your local machine:

```bash
scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem -r ./frontend ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/
scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem -r ./backend ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/
scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem docker-compose.yml ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/
```

If you don't have the files yet, you can create them directly on the VM using `nano` or `cat`.
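For illustration, here is roughly what reading a `.env` file like the one above amounts to. In practice Docker Compose's `env_file` option or the python-dotenv package handles this; `parse_dotenv` is a hypothetical helper, not part of the project.

```python
def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines; skip blanks and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env
```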
### 7. Open Firewall Ports

- Go to your `app-chat` instance in Lightsail.
- Open the Networking tab → Firewall → Add another.
- Add TCP rules:

| Port | Usage    |
|------|----------|
| 5000 | Frontend |
| 8000 | Backend  |

- Save the changes.
### 8. Start the Application

From the VM:

```bash
cd ~/app-chat
docker compose up -d
```

- This starts the containers in detached mode.
- Verify they are running:

```bash
docker ps
```

Expected output:

```
CONTAINER ID   IMAGE               PORTS
a1b2c3d4e5f6   app-chat_frontend   0.0.0.0:5000->5000/tcp
b2c3d4e5f6a7   app-chat_backend    0.0.0.0:8000->8000/tcp
...            ...                 ...
```
### 9. Verify the Deployment

- Open in your browser:

```
http://<PUBLIC_IP>:5000
```

- Verify that the frontend loads and the backend responds correctly.
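Beyond checking in the browser, a quick scripted health check can probe both ports using only the Python standard library. The `service_ok` helper is illustrative, not part of the project; substitute your instance's public IP for the URLs.

```python
from urllib.request import urlopen
from urllib.error import URLError

def service_ok(url: str, timeout: float = 3.0) -> bool:
    """True if the URL answers with an HTTP 2xx/3xx status, else False.

    HTTP errors (4xx/5xx) and connection failures both return False.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, OSError, ValueError):
        return False

# Example (replace <PUBLIC_IP> with your instance's IP):
# service_ok("http://<PUBLIC_IP>:5000")  # frontend
# service_ok("http://<PUBLIC_IP>:8000")  # backend
```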
### 10. Clean Up Resources

- Stop the running containers:

```bash
docker compose down
```

- Optional: remove images if they are no longer needed:

```bash
docker system prune -a
```

- Delete the Lightsail instance:
  - Go to AWS Lightsail.
  - Select your `app-chat` instance.
  - Click Delete and confirm.

This ensures you don't incur ongoing costs for unused resources.
## Contact

For questions or feedback, please contact:

- Email: huillcas.juan3@gmail.com
- GitHub: Carlos93U

