An AI-powered legal chatbot that leverages Retrieval-Augmented Generation (RAG) with DeepSeek R1 for advanced legal reasoning and document analysis.
- Overview
- Features
- Demo
- Architecture
- Installation & Setup
- Usage
- How It Works
- API Configuration
- Deployment
- Contributing
- Future Improvements
- License
AI Lawyer is a sophisticated legal assistant that combines the power of DeepSeek R1's reasoning capabilities with Retrieval-Augmented Generation (RAG) to provide accurate, context-aware legal insights.
- Document Intelligence: Process and analyze complex legal documents
- Contextual Retrieval: Find relevant legal information using advanced vector search
- Reasoning-Based Responses: Leverage DeepSeek R1's advanced reasoning for nuanced legal analysis
- Hallucination Reduction: Ground responses in actual legal texts for enhanced reliability
- Report Generation: Create comprehensive, downloadable legal analysis reports
| Feature | Description |
|---|---|
| 📂 Document Upload | Support for PDF legal documents with intelligent text extraction |
| 🔍 Smart Retrieval | FAISS-powered vector database for precise information retrieval |
| 🤖 AI Reasoning | DeepSeek R1 integration via Groq API for advanced legal reasoning |
| 📜 Document Summarization | Generate concise summaries of complex legal documents |
| 📄 Report Generation | Create and download AI-generated legal analysis reports |
| 💬 Interactive Chat | Conversational interface for legal Q&A |
| 🔒 Secure Processing | Local document processing with secure API integration |
    ┌─────────────────┐    ┌──────────────────────┐    ┌─────────────────┐
    │  Streamlit UI   │────│     RAG Pipeline     │────│   DeepSeek R1   │
    │  (frontend.py)  │    │  (rag_pipeline.py)   │    │    via Groq     │
    └─────────────────┘    └──────────────────────┘    └─────────────────┘
             │                        │                         │
             │             ┌──────────────────────┐             │
             └─────────────│   Vector Database    │─────────────┘
                           │ (vector_database.py) │
                           │     FAISS Index      │
                           └──────────────────────┘
    AI-Lawyer---RAG-with-DeepSeek-R1/
    ├── 📄 frontend.py           # Streamlit UI application
    ├── 🔧 rag_pipeline.py       # RAG implementation with DeepSeek R1
    ├── 🗄️ vector_database.py    # FAISS vector database management
    ├── 📋 requirements.txt      # Python dependencies
    ├── 📖 README.md             # Project documentation
    ├── 🖼️ utils/                # Screenshots and utilities
    │   ├── photo1.png
    │   ├── photo2.png
    │   ├── photo3.png
    │   └── photo4.png
    └── 📁 .streamlit/           # Streamlit configuration (if exists)
        └── config.toml
| Technology | Purpose | Version |
|---|---|---|
| DeepSeek R1 | Advanced AI reasoning model | Latest |
| Groq API | High-speed LLM inference | - |
| LangChain | LLM application framework | 0.1+ |
| Streamlit | Web application framework | 1.28+ |
| FAISS | Vector similarity search | Latest |
| pdfplumber | PDF text extraction | Latest |
| Sentence Transformers | Text embeddings | Latest |
- Python 3.8 or higher
- Groq API key
- Git
Clone the repository:

    git clone https://github.com/danieladdisonorg/AI-Lawyer---RAG-with-DeepSeek-R1.git
    cd AI-Lawyer---RAG-with-DeepSeek-R1

Create and activate a virtual environment. On macOS/Linux:

    python -m venv venv
    source venv/bin/activate

On Windows:

    python -m venv venv
    venv\Scripts\activate

Install the dependencies:

    pip install -r requirements.txt

Create a .env file in the project root:

    echo "GROQ_API_KEY=your_groq_api_key_here" > .env

Or set it as an environment variable:

    export GROQ_API_KEY="your_groq_api_key_here"

- Start the application:

      streamlit run frontend.py

- Open your browser and navigate to http://localhost:8501
- Upload a legal document (PDF format)
- Ask questions about the document using natural language
- Download reports generated by the AI analysis
- "What are the key terms and conditions in this contract?"
- "Summarize the main legal obligations for each party"
- "What are the potential risks mentioned in this document?"
- "Explain the termination clauses in simple terms"
- Upload: User uploads PDF legal documents
- Extraction: Text is extracted using pdfplumber
- Chunking: Documents are split into manageable sections
- Embedding: Text chunks are converted to vector embeddings
- Indexing: FAISS creates searchable vector index
- Storage: Vectors are stored for efficient retrieval
- User Input: Legal questions are received via Streamlit interface
- Retrieval: Relevant document sections are found using vector similarity
- Context: Retrieved information provides context for AI response
- DeepSeek R1: Advanced reasoning model processes query and context
- Groq API: High-speed inference for real-time responses
- Structured Output: Responses are formatted for legal clarity
- Analysis: AI generates comprehensive document analysis
- Formatting: Results are structured in professional format
- Download: Users can download PDF reports
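
The chunking, embedding, retrieval, and context steps listed above can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not the project's code: `chunk_text`, `embed`, `top_k`, and `build_prompt` are hypothetical names, the hashed bag-of-words vector stands in for a Sentence Transformers embedding, and the brute-force cosine search stands in for a FAISS index.

```python
import math
import re
import zlib


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def embed(text, dim=64):
    """Toy embedding (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is used instead of hash() so buckets are stable across runs.
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def top_k(query, chunks, k=2):
    """Return the k chunks with the highest cosine similarity to the query."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(question, context_chunks):
    """Assemble a prompt that asks the model to answer from the context only."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Example run on a made-up lease excerpt.
document = (
    "Either party may terminate this agreement with thirty days written notice. "
    "The tenant shall pay rent on the first day of each month. "
    "The landlord is responsible for structural repairs to the property."
)
question = "How can the agreement be terminated?"
chunks = chunk_text(document, chunk_size=80, overlap=20)
prompt = build_prompt(question, top_k(question, chunks, k=1))
```

The resulting `prompt` string is what would be sent to the reasoning model. Grounding the question in retrieved text this way is what the hallucination-reduction feature relies on.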
- Get API Key: Visit Groq Console and create an account
- Generate Key: Create a new API key in your dashboard
- Configure: Add the key to your environment variables or .env file

Available models:
- deepseek-r1-distill-llama-70b (Recommended)
- deepseek-r1-distill-qwen-32b
- Other DeepSeek R1 variants available via Groq
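
As an illustration of the configuration step, the helper below checks the process environment for `GROQ_API_KEY` first and falls back to a simple `KEY=value` parse of a `.env` file. `load_groq_api_key` is a hypothetical name and the parsing is deliberately minimal. The project itself may load the key differently, for example with a dotenv library.

```python
import os
from pathlib import Path


def load_groq_api_key(env_path=".env", environ=None):
    """Return GROQ_API_KEY from the environment, else from a .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("GROQ_API_KEY")
    if key:
        return key

    path = Path(env_path)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not KEY=value pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GROQ_API_KEY":
                return value.strip().strip("'\"")

    raise RuntimeError("GROQ_API_KEY is not set; add it to the environment or .env")
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication failure on the first API call.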
- Push to GitHub:

      git add .
      git commit -m "Deploy AI Lawyer application"
      git push origin main

- Deploy on Streamlit Cloud:
  - Visit Streamlit Cloud
  - Connect your GitHub repository
  - Set GROQ_API_KEY in Streamlit Secrets
  - Click Deploy!

      # .streamlit/secrets.toml
      GROQ_API_KEY = "your_groq_api_key_here"

- Docker: Containerize the application
- Heroku: Deploy with Procfile
- AWS/GCP: Cloud platform deployment
- Local Server: Run on dedicated hardware
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8 style guidelines
- Add docstrings to functions
- Include unit tests for new features
- Update documentation as needed
- Multi-format Support: Add DOCX, TXT, and HTML document support
- Batch Processing: Handle multiple documents simultaneously
- Advanced Search: Implement semantic search with filters
- User Authentication: Add user accounts and document history
- Legal Database Integration: Connect to legal precedent databases
- Citation Tracking: Automatic legal citation generation
- Multi-language Support: Support for non-English legal documents
- API Endpoints: RESTful API for programmatic access
- Real-time Collaboration: Multi-user document analysis
- Legal Workflow Integration: Connect with legal practice management tools
- Advanced Analytics: Document comparison and trend analysis
- Mobile Application: Native mobile app development
- Response Time: < 3 seconds for typical queries
- Accuracy: 90%+ for factual legal information retrieval
- Document Size: Supports PDFs up to 50MB
- Concurrent Users: Optimized for 10+ simultaneous users
- Data Privacy: Documents are processed locally and not stored permanently
- API Security: Secure API key management
- No Data Retention: User documents are not retained after session
- Encryption: All API communications are encrypted
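
One way to process an upload locally without retaining it is to write it to a temporary directory that is deleted as soon as the work finishes. The sketch below is an assumption about how this could be done, not a description of the project's code; `process_upload` and the `handler` callback are hypothetical names.

```python
import tempfile
from pathlib import Path


def process_upload(file_bytes, filename, handler):
    """Write an upload to a temp dir, run handler(path), then delete the file."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep only the base name so a crafted filename cannot escape tmp_dir.
        path = Path(tmp_dir) / Path(filename).name
        path.write_bytes(file_bytes)
        # The directory and its contents are removed when this block exits.
        return handler(path)
```

Here `handler` would be whatever extracts and indexes the text. Anything it needs to keep must be returned, since the file on disk no longer exists once the call returns.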
This project is licensed under the MIT License - see the LICENSE file for details.
- DeepSeek for the advanced reasoning model
- Groq for high-speed inference infrastructure
- Streamlit for the excellent web framework
- LangChain for LLM application tools
- FAISS for efficient vector search
⚖️ AI Lawyer - Making Legal Analysis Accessible Through AI
Made with ❤️ by Daniel Addison



