A web application that processes audio and video files to extract and classify vocabulary words by CEFR language levels (A1-C2), providing transcription and vocabulary analysis for language learners.
- Manual Step-by-Step Workflow: Full control over each processing step
  - Upload audio/video files
  - Manually trigger transcription with Whisper AI
  - Extract and review words
  - Classify with Groq AI (see raw API responses)
  - Save and analyze results
- CEFR Classification: Words are automatically classified by difficulty level (A1-C2)
- Word Context: See each word in context with timestamps
- Full Transcription: View complete transcription of your audio
- Vocabulary Statistics: Get insights about word frequency and difficulty distribution
- Transparency: See exactly what's happening at each step
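As a rough illustration of the word-context and statistics features listed above, the sketch below walks Whisper-style segments (dicts with `start`, `end`, and `text`, the shape openai-whisper returns under `segments`), records each word occurrence with its segment text and start time, and then tallies word frequency and CEFR distribution. The function names, the `levels` mapping, and the sample data are assumptions made for illustration; they are not taken from the project's code.

```python
import re
from collections import Counter, defaultdict

WORD_RE = re.compile(r"[a-z']+")


def words_with_context(segments):
    """Map each lowercase word to one context entry per occurrence."""
    contexts = defaultdict(list)
    for seg in segments:
        for word in WORD_RE.findall(seg["text"].lower()):
            contexts[word].append(
                {"start": seg["start"], "text": seg["text"].strip()}
            )
    return dict(contexts)


def vocabulary_stats(contexts, levels):
    """Return (occurrences per word, distinct words per CEFR level)."""
    frequency = Counter({word: len(hits) for word, hits in contexts.items()})
    distribution = Counter(levels.get(word, "unknown") for word in contexts)
    return frequency, distribution


# Hypothetical sample data standing in for a real transcription.
segments = [
    {"start": 0.0, "end": 2.5, "text": " The cat sat on the mat."},
    {"start": 2.5, "end": 5.0, "text": " The cat was meticulous."},
]
# Hypothetical classification result (word -> CEFR level).
levels = {
    "the": "A1", "cat": "A1", "sat": "A1", "on": "A1",
    "mat": "A2", "was": "A1", "meticulous": "C1",
}

contexts = words_with_context(segments)
frequency, distribution = vocabulary_stats(contexts, levels)
print(frequency.most_common(2))
print(dict(distribution))
```

Keeping the context entries per occurrence means the frequency count falls out of the same structure that the UI would use to show a word in its sentence.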
New in v2.0: We've redesigned the workflow to give you complete control! Instead of automatic processing, you now manually trigger each step and can review intermediate results. See MANUAL_WORKFLOW.md for details.
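To make the idea of manually triggered steps concrete, here is a minimal sketch of a tracker that accepts the five workflow steps only in order and keeps each intermediate result for review. The `Step` and `ManualWorkflow` names are hypothetical and chosen for illustration; the actual API is described in MANUAL_WORKFLOW.md.

```python
from enum import Enum


class Step(Enum):
    UPLOADED = 1
    TRANSCRIBED = 2
    WORDS_EXTRACTED = 3
    CLASSIFIED = 4
    SAVED = 5


class ManualWorkflow:
    """Tracks completed steps and rejects out-of-order triggers."""

    def __init__(self):
        self.completed = []  # steps finished so far, in order
        self.results = {}    # intermediate result per step, kept for review

    def trigger(self, step, result):
        if len(self.completed) == len(Step):
            raise ValueError("Workflow already finished")
        expected = Step(len(self.completed) + 1)
        if step is not expected:
            raise ValueError(f"Expected {expected.name}, got {step.name}")
        self.completed.append(step)
        self.results[step] = result
        return result


wf = ManualWorkflow()
wf.trigger(Step.UPLOADED, "lecture.mp3")
wf.trigger(Step.TRANSCRIBED, "The cat was meticulous.")

# Skipping word extraction is rejected, so every step stays reviewable.
try:
    wf.trigger(Step.CLASSIFIED, {})
except ValueError as exc:
    error = str(exc)
    print(error)
```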
- Django 4.2+
- Django REST Framework
- Celery + Redis for async processing
- PostgreSQL database
- Whisper AI for transcription
- Groq API for LLM processing
- React 18+ with TypeScript
- Material-UI (MUI)
- Axios for API calls
- React Router
- Docker & Docker Compose
- Gunicorn + Nginx
- Let's Encrypt SSL (optional)
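The Groq entry in the stack above is what handles CEFR classification. The sketch below shows one way a prompt could be built and a raw model reply validated before it is stored. It makes no network call, and the prompt wording, the JSON reply format, and the function names are all assumptions for illustration rather than the project's actual implementation.

```python
import json

CEFR_LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")


def build_prompt(words):
    """Ask the model for a JSON object mapping each word to a CEFR level."""
    return (
        "Classify each word by CEFR level (" + ", ".join(CEFR_LEVELS) + "). "
        "Reply with a JSON object mapping word to level.\n"
        "Words: " + ", ".join(words)
    )


def parse_classification(raw_response, words):
    """Parse a raw reply; a missing or invalid level becomes None."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    result = {}
    for word in words:
        level = str(data.get(word, "")).upper()
        result[word] = level if level in CEFR_LEVELS else None
    return result


words = ["cat", "meticulous", "ubiquitous"]
# Hypothetical raw reply, including one lowercase and one invalid level.
raw = '{"cat": "A1", "meticulous": "c1", "ubiquitous": "hard"}'
parsed = parse_classification(raw, words)
print(parsed)
```

Mapping unusable levels to `None` instead of raising keeps the raw response reviewable while still flagging words that need another pass.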
Prerequisites:
- Docker and Docker Compose installed
- Groq API key (get one at groq.com)
3 Simple Steps:
1. Clone and configure
   git clone https://github.com/yourusername/HardWordExtractor.git
   cd HardWordExtractor
   # Set your GROQ_API_KEY in docker-compose.dev.yml
2. Start all services
   docker compose -f docker-compose.dev.yml up --build
3. Access the application
   - Frontend: http://localhost:3000
   - Backend API: http://localhost:8000/api/
   - Admin Panel: http://localhost:8000/admin/
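Docker Compose passes variables set under a service's `environment` key into the container, so the backend can read `GROQ_API_KEY` from its environment at startup. The following is a minimal sketch of a fail-fast check; `get_groq_api_key` is a hypothetical helper, and how the project actually loads the key is not shown in this README.

```python
import os


def get_groq_api_key(env=os.environ):
    """Return the Groq API key from the environment, failing early if unset."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; add it to docker-compose.dev.yml"
        )
    return key


# A plain dict stands in for the container environment in this example.
key = get_groq_api_key({"GROQ_API_KEY": " example-key "})
print(key)
```

Failing at startup gives a clearer error than a rejected request later, when the classification step is first triggered.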
See docs/DOCKER-QUICKSTART.md for a detailed Docker deployment guide.
For development without Docker, see QUICKSTART.md for running services manually.
- Docker Quick Start - Start here! 3 steps to run (173 lines)
- Docker Reference - Detailed Docker configuration reference (600+ lines)
- Manual Setup - Run services manually (without Docker)
- Setup Guide - Detailed development setup
- API Documentation - Complete API endpoints and usage
- Architecture Guide - System architecture and design patterns
- Manual Workflow - Step-by-step API workflow guide
- Groq Setup - How to get your Groq API key
backend/transcription/
├── models/              # Data models (6 files)
│   ├── audio.py, transcription.py, word.py
│   └── statistics.py, processing.py
├── serializers/         # API serialization (6 files)
├── views/               # API endpoints (6 files)
├── services/            # Business logic (organized by domain)
│   ├── audio/           # Audio processing
│   ├── transcription/   # Whisper & processing
│   ├── words/           # Extraction & context
│   └── ai/              # Groq & classification
├── utils/               # Shared utilities (6 files)
│   ├── constants.py, exceptions.py
│   └── validators.py, responses.py, pagination.py
└── tests/               # Comprehensive test suite
frontend/src/
├── components/              # UI components (organized by feature)
│   ├── audio/, transcription/, words/
│   └── layout/, common/
├── features/                # Feature modules with hooks
│   ├── audio/hooks/         # useAudioUpload, useAudioStatus
│   ├── transcription/hooks/ # useTranscription
│   └── words/hooks/         # useWords
├── hooks/                   # Global hooks
│   ├── useDebounce, useLocalStorage
│   └── useCache, useCachedApi
├── services/                # API communication
└── pages/                   # Route components (lazy-loaded)
Performance Features:
- ✅ Code splitting (44% bundle size reduction)
- ✅ React.memo on expensive components
- ✅ API caching with custom hooks
- ✅ Debounced search inputs
See docs/ARCHITECTURE.md for detailed design patterns and data flow.
See SETUP.md for detailed instructions.
# Backend tests
cd backend
python manage.py test
# Frontend tests
cd frontend
npm test
# Test coverage
cd backend && pytest --cov
cd frontend && npm test -- --coverage
HardWordExtractor/
├── backend/
│   ├── config/              # Django settings & Celery
│   └── transcription/       # Main app (refactored)
│       ├── models/          # 6 model files
│       ├── views/           # 6 view files
│       ├── serializers/     # 6 serializer files
│       ├── services/        # Business logic (4 domains)
│       ├── utils/           # Shared utilities (6 files)
│       └── tests/           # Test suite (organized by layer)
├── frontend/
│   └── src/
│       ├── components/      # UI components (5 domains)
│       ├── features/        # Feature hooks (3 domains)
│       ├── hooks/           # Global hooks (4 files)
│       ├── services/        # API services
│       ├── pages/           # Route components
│       └── types/           # TypeScript types
├── docker/                  # Docker configurations
├── docs/                    # Documentation
│   ├── ARCHITECTURE.md      # System architecture
│   ├── API.md               # API documentation
│   ├── SETUP.md             # Setup guide
│   └── GROQ_SETUP.md        # Groq API guide
├── scripts/                 # Utility scripts
├── docker-compose.yml       # Docker orchestration
└── README.md
- Phase 1: MVP (✅ 100% Complete)
  - Audio transcription with Whisper
  - CEFR word classification with Groq API
  - Manual step-by-step workflow
  - React frontend with TypeScript
  - Docker deployment
  - Complete documentation
- Phase 2: Video support and user authentication
- Phase 3: Full local LLM processing
- Phase 4: Production features and scaling
Current Status: Phase 1 MVP is complete and production-ready! The Docker deployment has been tested and verified, and the documentation is comprehensive.
See PROJECT_OUTLINE.md and PROJECT_STATUS.md for detailed progress tracking.
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For questions or support, please open an issue on GitHub.
- OpenAI Whisper - Speech recognition
- Groq - Fast LLM inference
- Django - Web framework
- React - Frontend framework