RAG PDF Query Application

Overview

This project provides a Retrieval-Augmented Generation (RAG) solution for processing PDF documents and answering user queries through a web interface. It combines AI-powered responses, PDF management, and session handling. Users can upload PDFs, ask questions, and interact with the application via a modern web interface.

![RAG PDF Query Demo](assets/demo.gif)

Data Source: The application works with user-uploaded PDF documents and leverages OpenAI's API for intelligent query responses.

Table of Contents

  • Project Features
  • Project Architecture
  • Technologies Used
  • Running the Application
  • Project Structure
  • Usage Examples
  • Secure Deployment of app-chat on AWS Lightsail via SSH
  • Contact

Project Features

  • PDF Upload and Management: Users can upload and switch between multiple PDF documents.
  • AI-Powered Querying: Ask questions about the content of uploaded PDFs and receive intelligent answers.
  • Session and Cache Handling: Uses Redis to manage user sessions and cache query results.
  • Interactive Web Interface: Built with Flask and JavaScript for a responsive user experience.
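
The session-and-cache feature can be sketched in Python. This is a minimal illustration only, not the project's actual code: a plain dict stands in for Redis so the sketch runs without a server, and `cache_key`/`answer_query` are hypothetical names. With redis-py the same pattern would use `r.get`/`r.setex`.

```python
# Illustration of the query-result caching idea (Redis in the real app).
# A dict stands in for Redis here so the sketch runs without a server.
import hashlib

_cache: dict[str, str] = {}

def cache_key(session_id: str, query: str) -> str:
    """Derive a stable cache key from the session and the question."""
    return hashlib.sha256(f"{session_id}:{query}".encode()).hexdigest()

def answer_query(session_id: str, query: str, generate) -> str:
    """Return a cached answer if present, else generate and store it."""
    key = cache_key(session_id, query)
    if key not in _cache:
        _cache[key] = generate(query)  # the expensive LLM call
    return _cache[key]

if __name__ == "__main__":
    calls = []
    def fake_llm(q):  # stands in for the OpenAI API call
        calls.append(q)
        return f"answer to: {q}"
    print(answer_query("s1", "What is RAG?", fake_llm))
    print(answer_query("s1", "What is RAG?", fake_llm))  # served from cache
    print(len(calls))  # the model was only called once
```

Keying on both the session ID and the query lets two users ask the same question without sharing state; dropping the session ID from the key would instead give a global cache.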

Project Architecture

Below is the high-level architecture, including document upload, processing, querying, and visualization.

![Project Architecture Diagram](assets/img.png)

  1. Document Upload: Users upload PDF files via the web interface.
  2. Backend Processing: The backend extracts text, manages documents, and handles AI queries.
  3. Query Handling: Integrates OpenAI API for RAG-based answers.
  4. Session Management: Redis stores session data and caches results.
  5. Frontend Visualization: Flask and JavaScript display results and manage user interaction.
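
The processing and query-handling steps above can be sketched in Python. This is a simplified illustration, not the repository's backend: word-overlap scoring stands in for the embedding-based vector search a real RAG pipeline would use, and `chunk_text`/`retrieve`/`build_prompt` are hypothetical names.

```python
# Minimal sketch of the retrieval half of a RAG pipeline.
# Word-overlap scoring stands in for real embedding similarity.

def chunk_text(text: str, chunk_size: int = 40) -> list[str]:
    """Split extracted PDF text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the prompt sent to the language model."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

if __name__ == "__main__":
    doc = ("Redis stores session data. Flask serves the frontend. "
           "The backend extracts text from uploaded PDF files.")
    chunks = chunk_text(doc, chunk_size=5)
    top = retrieve("How are PDF files processed?", chunks)
    print(build_prompt("How are PDF files processed?", top))
```

The assembled prompt, rather than the raw PDF, is what gets sent to the OpenAI API, which is what keeps answers grounded in the uploaded document.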

Technologies Used

  • Python: Core programming language.
  • Flask: Web application framework for the frontend.
  • Custom Backend: Handles PDF processing and AI queries.
  • Redis: For session and cache management.
  • OpenAI API: For AI-powered query responses.
  • JavaScript: Enhances frontend interactivity.
  • pip: Dependency management.

Running the Application

  1. Clone the Repository:

    git clone https://github.com/Carlos93U/ai_query_pdf.git
    cd ai_query_pdf

Project Structure

ai_query_pdf/
├── assets
│     ├── demo.gif
│     └── img.png
├── backend                         # PDF processing and AI query logic
│     ├── app.py                    # Backend entry point
│     ├── Dockerfile                # Dockerfile for backend
│     ├── requirements.txt
│     ├── uploads
│     │     └── key_word.pdf
│     └── vectorstore
├── compose.yml                     # Docker compose for project
├── .env                            # Environment configuration
│
├── frontend                        # Flask web interface
│     ├── app.py                    # Frontend entry point
│     ├── Dockerfile                # Dockerfile for frontend
│     ├── requirements.txt
│     ├── static
│     │     ├── script.js
│     │     └── style.css
│     └── templates
│         └── index.html
├── LICENSE
└── README.md                       # Project documentation

Usage Examples

Users can upload PDF documents, ask questions, and view AI-generated answers directly in the web interface.

RAG PDF Query Example

Secure Deployment of app-chat on AWS Lightsail via SSH

🏗 Step 0 — Create the Instance in Lightsail

  1. Go to AWS Lightsail.
  2. Click Create instance.
  3. Configure:
    • Region: the closest to you.
    • Platform: Linux/Unix.
    • Blueprint: Ubuntu 22.04 LTS.
    • Plan: according to your resource needs.
    • Name: app-chat.
  4. Click Create instance and wait for it to start.
  5. Note the Public IP of the instance.

🧩 Step 1 — Connect via SSH

  1. Download the Lightsail private key for your region (e.g., LightsailDefaultKey-us-east-1.pem) and move it to the ~/.ssh/ directory.
  2. Set read-only permissions:

     chmod 400 ~/.ssh/LightsailDefaultKey-us-east-1.pem

  3. Connect to the instance (replace <PUBLIC_IP> with your instance's IP):

     ssh -i ~/.ssh/LightsailDefaultKey-us-east-1.pem ubuntu@<PUBLIC_IP>

🧩 Step 2 — Prepare the VM and Install Docker

Inside the VM via SSH:

# Update and prepare
sudo apt update -y
sudo apt upgrade -y

# Install Docker and Docker Compose
sudo apt install -y docker.io docker-compose

# Enable and start Docker
sudo systemctl enable --now docker

# Add 'ubuntu' user to docker group
sudo usermod -aG docker ubuntu

# Verify installation
docker --version
docker compose version || docker-compose --version
docker run --rm hello-world

Important note: log out and back in for the docker group change to take effect.


🧩 Step 3 — Create the Project on the VM

mkdir ~/app-chat
cd ~/app-chat
mkdir frontend backend

🧩 Step 4 — Generate OpenAI Key and Configure .env

  1. Go to OpenAI and generate your API key.
  2. On your local machine, create a .env file with:

     OPENAI_API_KEY="YOUR_REAL_KEY"
     BACKEND_URL=http://backend:8000
     REDIS_URL=redis://redis:6379
     VECTORSTORE_PATH=/app/vectorstore
     UPLOAD_FOLDER=/app/uploads
     SECRET_KEY=ultra_secret

  3. Upload .env to the VM:

     scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem .env ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/

Alternatively, edit it directly on the VM:

nano ~/app-chat/.env
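
Before uploading, it can help to sanity-check that .env defines every variable the services expect. The following helper is purely illustrative and not part of the repository; the required-key list mirrors the .env shown above, and `parse_env`/`missing_keys` are hypothetical names.

```python
# Check that a .env file defines all keys the app expects.
# Illustrative helper only; the key list mirrors the .env above.
REQUIRED = ["OPENAI_API_KEY", "BACKEND_URL", "REDIS_URL",
            "VECTORSTORE_PATH", "UPLOAD_FOLDER", "SECRET_KEY"]

def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"')
    return env

def missing_keys(text: str) -> list[str]:
    """Return required keys that the .env text does not define."""
    env = parse_env(text)
    return [k for k in REQUIRED if k not in env]

if __name__ == "__main__":
    sample = 'OPENAI_API_KEY="sk-..."\nBACKEND_URL=http://backend:8000\n'
    print(missing_keys(sample))  # lists the keys still to be added
```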

🧩 Step 5 — Upload App Files

From your local machine:

scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem -r ./frontend ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/
scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem -r ./backend ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/
scp -i ~/.ssh/LightsailDefaultKey-us-east-1.pem compose.yml ubuntu@<PUBLIC_IP>:/home/ubuntu/app-chat/

If you don't have the files yet, you can create them directly on the VM using nano or cat.


🌍 Step 6 — Open Ports in Lightsail

  1. Go to your app-chat instance in Lightsail.
  2. Open the Networking tab → Firewall → Add another.
  3. Add TCP rules:

     Port   Usage
     5000   Frontend
     8000   Backend

  4. Save the changes.


🚀 Step 7 — Run the Project

From the VM:

cd ~/app-chat
docker compose up -d

This starts the containers in detached mode. Verify that they are running:

docker ps

Expected output:

CONTAINER ID   IMAGE                PORTS
a1b2c3d4e5f6   app-chat_frontend    0.0.0.0:5000->5000/tcp
b2c3d4e5f6a7   app-chat_backend     0.0.0.0:8000->8000/tcp
...            ...                  ...

🧭 Step 8 — Test in the Browser

  1. Open in your browser:

     http://<PUBLIC_IP>:5000

  2. Verify that the frontend works and the backend responds correctly.

🗑 Step 9 — Clean Up Resources to Avoid Unnecessary Costs

  1. Stop the running containers:

     docker compose down

  2. Optional: remove images if they are no longer needed:

     docker system prune -a

  3. Delete the Lightsail instance:
     • Go to AWS Lightsail.
     • Select your app-chat instance.
     • Click Delete and confirm.

This ensures you don't incur ongoing costs for unused resources.

Contact

For questions or feedback, please contact:

Email: huillcas.juan3@gmail.com
GitHub: Carlos93U

