SummARai (Summarizing Arabic with AI) is an innovative web-based platform designed to automatically generate high-quality summaries for Arabic books and documents using advanced AI technologies. As a graduation project from Cairo University's Faculty of Computers and Artificial Intelligence, SummARai addresses the critical gap in Arabic Natural Language Processing (NLP) tools by providing a specialized solution for Arabic content summarization.
- Hybrid Summarization Pipeline: Combines extractive(Graph-Based) and abstractive(AraT5) techniques for optimal results
- Arabic Language Specialization: Specifically designed for Arabic's complex morphology and rich syntax
- User-Friendly Interface: Intuitive web platform with Arabic UI
- Advanced AI Models: Leverages AraT5 transformer and LLaMA 4 for superior summarization
- Comprehensive Library: Searchable database of book summaries
- Reading Progress Tracking: Personal statistics and reading history
- OCR Support: Processes both digital and scanned PDFs
- React.js
- Tailwind CSS
- Axios for API communication
- Java Spring Boot
- MySQL Database
- Redis Caching
- JWT Authentication
- Python FastAPI
- AraT5v2-base-1024 (fine-tuned)
- LLaMA-4-Scout-17B-16E-Instruct
- Mistral AI OCR
- Sentence Transformers for semantic chunking
- AWS Cloud Services
- Theta Labs GPU Resources (NVIDIA H100/A100)
Our hybrid approach outperforms traditional methods:
| Metric | Our Model | Baseline (mT5) |
|---|---|---|
| BERTScore Precision | 77.55% | 70.76% |
| BERTScore Recall | 66.54% | 63.98% |
| BERTScore F1 | 71.73% | 61.72% |
- Node.js (v16+)
- Java JDK (v17+)
- Python (v3.9+)
- MySQL (v8.0+)
- Redis
cd Frontend
npm install
npm run devcd Backend
./mvnw spring-boot:runcd Model/ModelPreprocessig
pip install -r requirements.txt
uvicorn main:app --reloadWe created SummARai v1.0 - a comprehensive dataset of 4,328 Arabic text chunks with aligned summaries, covering various genres. Available at:
SummARai Dataset GitHub
- Fatma Elzahraa Ashraf Samy
- Rowan Madeeh Mohamed
- Abdelrahman Mohamed Ezzat
- Abdelrahman Gomaa
- Momen Mostafa Mabrouk
Supervised by:
Dr. Mohammed ElRamly
TA. Amany Hesham
Explore SummARai: Demo Link
SummARai - Bridging the gap between Arabic literature and AI-powered accessibility.