This is a robust FastAPI-based backend designed to power the Multilingual Meeting Assistant. It streamlines the lifecycle of meeting documentation—from raw audio ingestion and high-accuracy transcription to intelligent summarization and professional PDF report generation.
The system now supports a hybrid architecture, offering both cloud-based processing via Alibaba Cloud and a fully local, open-source pipeline (Whisper + HuggingFace) for users, such as those in India, who cannot access the Alibaba Cloud console.
- Dual Transcription Engine: Supports Alibaba Cloud ASR for enterprise-grade speed or a local Whisper-FFmpeg implementation for privacy and cost-efficiency.
- Intelligent Summarization: Leverages LangChain integrated with Qwen/DeepSeek models via HuggingFace pipelines to extract action items and key takeaways.
- Professional Documentation: Automated generation of meeting notes in PDF format.
- Comprehensive API: RESTful endpoints for meeting management (CRUD), advanced search within transcripts, and file exports.
| Component | Technology |
|---|---|
| Framework | FastAPI (Python 3.8+) |
| ORM/Database | SQLAlchemy, SQLite/PostgreSQL |
| AI/ML (Cloud) | Alibaba Cloud ASR, Qwen LLM |
| AI/ML (Local) | OpenAI Whisper, HuggingFace Transformers |
| Orchestration | LangChain |
| Processing | FFmpeg (Audio), Pydantic (Validation) |
- Python 3.8 or higher.
- Optional: Alibaba Cloud account with ASR enabled (if using the cloud method).
- System Level: FFmpeg installed on your machine.
- Clone the Repository
git clone https://github.com/your-username/meeting-assistant-backend.git
cd meeting-assistant-backend
- Environment Setup: Create a virtual environment and activate it:
python -m venv venv
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
- Install Dependencies
pip install -r requirements.txt
- Configuration
Create a `.env` file in the root directory:
ALIBABA_ACCESS_KEY_ID=your_id
ALIBABA_ACCESS_KEY_SECRET=your_secret
ALIBABA_APP_KEY=your_app_key
- Run the Application
uvicorn main:app --reload
Access the interactive API docs at: http://localhost:8000/docs
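The `.env` values above need to reach the application at startup. A minimal sketch of how the backend might validate them, assuming the variables are already in the process environment (e.g. loaded by `python-dotenv` or exported by the shell) — the helper name is illustrative, not from the codebase:

```python
import os

def load_alibaba_config() -> dict:
    """Collect the Alibaba Cloud ASR credentials, failing fast if any are missing."""
    keys = ("ALIBABA_ACCESS_KEY_ID", "ALIBABA_ACCESS_KEY_SECRET", "ALIBABA_APP_KEY")
    config = {k: os.getenv(k) for k in keys}
    missing = [k for k, v in config.items() if not v]
    if missing:
        # Raising at startup is clearer than a cryptic ASR auth error later
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return config
```

Failing fast here surfaces configuration mistakes at boot rather than on the first transcription request.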
For users unable to access the Alibaba Cloud Console (for example, users in India), follow this setup to run the local Whisper + FFmpeg transcription and HuggingFace summarization pipeline. This method runs entirely on your local hardware.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install git+https://github.com/openai/whisper.git
pip install transformers accelerate langchain huggingface_hub ffmpeg-python einops==0.7.0
By default, the system is configured to use Qwen1.5-0.5B (~1GB) for summarization, which offers a good balance between output quality and resource usage. You may also opt for other Qwen model sizes if your hardware allows.
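The local pipeline boils down to: transcribe with Whisper, split the transcript into pieces small enough for a 0.5B model's context, and summarize each piece with a HuggingFace pipeline. A hedged sketch of that flow — the function names, model size, prompt, and chunking strategy are illustrative assumptions, not the repository's actual `services/` code:

```python
from __future__ import annotations

def chunk_transcript(text: str, max_chars: int = 1500) -> list[str]:
    """Split a long transcript into roughly sentence-aligned chunks so each
    piece fits comfortably in a small model's context window."""
    chunks, current = [], ""
    for sentence in text.replace("\n", " ").split(". "):
        candidate = f"{current}. {sentence}" if current else sentence
        if len(candidate) > max_chars and current:
            chunks.append(current.strip())
            current = sentence
        else:
            current = candidate
    if current.strip():
        chunks.append(current.strip())
    return chunks

def transcribe_and_summarize(audio_path: str) -> str:
    # Heavy imports kept local so importing this module stays cheap.
    import whisper
    from transformers import pipeline

    model = whisper.load_model("base")            # FFmpeg must be on PATH
    transcript = model.transcribe(audio_path)["text"]

    summarizer = pipeline("text-generation", model="Qwen/Qwen1.5-0.5B-Chat")
    summaries = []
    for chunk in chunk_transcript(transcript):
        prompt = f"Summarize this meeting excerpt:\n{chunk}\nSummary:"
        out = summarizer(prompt, max_new_tokens=128)[0]["generated_text"]
        summaries.append(out[len(prompt):].strip())  # drop the echoed prompt
    return "\n".join(summaries)

if __name__ == "__main__":
    print(transcribe_and_summarize("uploads/standup.wav"))
```

Chunking before summarization matters with a 0.5B model: feeding an entire hour-long transcript in one prompt would overflow its context and degrade output quality.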
├── main.py # Application entry point
├── database.py # Database connection logic
├── models.py # SQLAlchemy database models
├── schemas.py # Pydantic data schemas
├── services/ # Core business logic
│ ├── transcription.py # Alibaba & Whisper integration
│ ├── summarization.py # LangChain & HuggingFace logic
│ └── pdf_service.py # Report generation logic
├── uploads/ # Raw audio storage
└── pdfs/ # Generated meeting notes
| Method | Endpoint | Description |
|---|---|---|
| POST | /meetings/ | Upload and process a new recording |
| GET | /meetings/ | List all processed meetings |
| GET | /meetings/{id} | Retrieve details of a specific meeting |
| DELETE | /meetings/{id} | Remove a meeting record and its files |
| POST | /meetings/{id}/export | Generate and download PDF notes |
| GET | /meetings/{id}/search | Query the transcript for specific terms |
When deploying to platforms like Render, Railway, or a VPS:
- Dockerize the App: Use a Dockerfile to ensure a consistent environment containing Python, FFmpeg, and (for GPU inference) the required CUDA drivers.
- System Binaries: Ensure `ffmpeg` is installed at the system level (`apt install ffmpeg`).
- Dependency Management: Ensure `openai-whisper` and `ffmpeg-python` are explicitly listed in your `requirements.txt`.
- Local Testing: Always test your containerized build locally to verify that Whisper can find FFmpeg on the system path.
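The checklist above can be sketched as a minimal CPU-only Dockerfile. This is an illustrative starting point, not a tested production image — a GPU deployment would start from an `nvidia/cuda` base image instead, and the port/command should match your hosting platform:

```dockerfile
FROM python:3.11-slim

# System-level FFmpeg so Whisper can shell out to it
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Installing FFmpeg in the image (rather than assuming it on the host) is what makes the "Local Testing" step above pass inside the container.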