# Document Analyzer

An AI-powered application that analyzes PDFs using Retrieval-Augmented Generation (RAG) to deliver accurate, context-aware answers from your documents.

Using it is simple: upload a PDF, then ask the LLM questions about the document.

Built with:
- [Laravel 12](https://laravel.com/docs/12.x)
- [Vue.js](https://vuejs.org/guide/introduction.html)
- [Inertia.js](https://inertiajs.com/)
- [PrismPHP](https://prismphp.com/)

## How It Works (Client Side)
- The user uploads a PDF document.
- The system processes the document in the background.
- Once processing is complete, the user can ask the LLM questions about the document.
- Users can start new conversations or continue previous sessions.

## How It Works (Server Side)
- The user uploads a PDF.
- Text is extracted from the PDF using the ... package.
- The extracted text is segmented into smaller chunks so each chunk fits the embedding model's context window.
- Each chunk is sent to an embedding model to generate a vector representation.
- The resulting vectors are stored in a vector database for retrieval.
- When the user submits a query, the system embeds it and performs a similarity search in the vector database to find the most relevant chunks (the retrieval step of RAG).
- The retrieved context is passed to the LLM, which generates a context-aware response.
- The LLM's response is returned and displayed to the user through a chatbot interface.
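
The server-side flow above can be sketched in plain PHP. `chunkText()` and the `$embedder` / `$vectorStore` / `$llm` collaborators and their method names are illustrative assumptions, not this project's actual API:

```php
<?php
// Sketch of the ingestion and retrieval flow. The $embedder, $vectorStore,
// and $llm objects stand in for the real embedding model, Qdrant client,
// and Gemini client; their method names are illustrative only.

// 1. Split extracted text into overlapping chunks so each one fits the
//    embedding model's context window.
function chunkText(string $text, int $size = 1000, int $overlap = 100): array
{
    $chunks = [];
    for ($i = 0; $i < strlen($text); $i += $size - $overlap) {
        $chunks[] = substr($text, $i, $size);
    }
    return $chunks;
}

// 2. Embed every chunk and upsert it into the vector database.
function ingest(string $text, object $embedder, object $vectorStore): void
{
    foreach (chunkText($text) as $i => $chunk) {
        $vectorStore->upsert($i, $embedder->embed($chunk), $chunk);
    }
}

// 3. Embed the question, retrieve the nearest chunks, and let the LLM
//    answer with that context prepended to the prompt.
function answer(string $question, object $embedder, object $vectorStore, object $llm): string
{
    $context = implode("\n", $vectorStore->search($embedder->embed($question), 5));

    return $llm->generate("Context:\n{$context}\n\nQuestion: {$question}");
}
```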

## Notable Features
- `Manager Design Pattern` - The system uses Laravel's Manager pattern, making it easy to configure and switch between different vector databases and LLMs.
  - Vector Database: [Qdrant](https://qdrant.tech/)
  - LLM: [Gemini](https://gemini.google.com/app)

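A minimal sketch of such a manager, built on Laravel's real `Illuminate\Support\Manager` base class; the `VectorStoreManager` / `QdrantStore` class names and the `vectorstore` config keys are illustrative, not this project's actual ones:

```php
<?php
// Laravel's Manager resolves a driver by convention: driver('qdrant')
// calls createQdrantDriver(). Swapping vector databases then becomes a
// one-line config change rather than a code change.

use Illuminate\Support\Manager;

class VectorStoreManager extends Manager
{
    // Picks the driver from config, e.g. 'qdrant' by default.
    public function getDefaultDriver(): string
    {
        return $this->config->get('vectorstore.driver', 'qdrant');
    }

    // Factory method for the Qdrant driver (QdrantStore is hypothetical).
    protected function createQdrantDriver(): QdrantStore
    {
        return new QdrantStore($this->config->get('vectorstore.qdrant'));
    }
}

// Usage anywhere in the app:
//     app(VectorStoreManager::class)->driver()->search($vector, 5);
```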
- `Queues` - Each text chunk is enqueued as an individual job. Jobs are grouped into a batch and processed in parallel to optimize throughput.
- `Job Middleware` - To comply with the LLM's rate limits, a job middleware caps execution at 9 requests per minute (RPM).
- `Reverb` - A real-time communication layer that enables live tracking of job progress and interactive chatbot updates. It uses an event-driven architecture with WebSockets to broadcast job state changes, logs, and chatbot messages as they occur.
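
The queue and rate-limiting features described above could look roughly like this. `EmbedChunk` and the `gemini` limiter name are hypothetical; `Batchable`, `Bus::batch()`, and the `RateLimited` job middleware are Laravel's real APIs:

```php
<?php
// One batchable job per text chunk, throttled before it may run so the
// batch as a whole stays under the LLM's rate limit.

use Illuminate\Bus\Batchable;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\Middleware\RateLimited;

class EmbedChunk implements ShouldQueue
{
    use Batchable, Queueable;

    public function __construct(public string $chunk) {}

    // Job middleware: jobs over the per-minute budget are released back
    // onto the queue instead of hitting the LLM API.
    public function middleware(): array
    {
        return [new RateLimited('gemini')];
    }

    public function handle(): void
    {
        // Embed $this->chunk and store the resulting vector...
    }
}

// The limiter itself is registered once, e.g. in a service provider:
//     RateLimiter::for('gemini', fn () => Limit::perMinute(9));
//
// Dispatching one job per chunk as a parallel batch:
//     Bus::batch(array_map(fn ($c) => new EmbedChunk($c), $chunks))->dispatch();
```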
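A sketch of a broadcast event for the live progress updates; `ChunkProcessed` and the channel name are made up for illustration, while `ShouldBroadcast` and `Channel` are Laravel's real broadcasting APIs (Reverb is the WebSocket server that delivers them):

```php
<?php
// Dispatching this event pushes a job-progress update to every browser
// subscribed to the document's channel, with no polling required.

use Illuminate\Broadcasting\Channel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;

class ChunkProcessed implements ShouldBroadcast
{
    use Dispatchable;

    public function __construct(
        public int $documentId,
        public int $processed,
        public int $total,
    ) {}

    // Browsers subscribed to this channel receive the event over a
    // WebSocket connection as soon as it is dispatched.
    public function broadcastOn(): array
    {
        return [new Channel("documents.{$this->documentId}")];
    }
}

// Fired from a job's handle() method after each chunk completes:
//     ChunkProcessed::dispatch($documentId, $done, $total);
```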

## Installation

- Clone this repository
- Copy `.env.example` to `.env`
- Set up the database connection
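
  The connection settings live in `.env`. A minimal example using Laravel's standard `DB_*` keys (the values shown are placeholders, not this project's required ones):

  ```dotenv
  DB_CONNECTION=mysql
  DB_HOST=127.0.0.1
  DB_PORT=3306
  DB_DATABASE=document_analyzer
  DB_USERNAME=root
  DB_PASSWORD=
  ```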
- Install the Composer dependencies

  ```bash
  composer install
  ```
- Generate the application key

  ```bash
  php artisan key:generate
  ```
- Migrate and seed the database

  ```bash
  php artisan migrate:fresh --seed
  ```
- Run the project

  ```bash
  php artisan serve
  ```