A web application that processes audio and video files to extract and classify vocabulary words by CEFR language levels (A1-C2), providing transcription and vocabulary analysis for language learners.
- Manual Step-by-Step Workflow: Full control over each processing step
- Upload audio/video files
- Manually trigger transcription with Whisper AI
- Extract and review words
- Classify with Groq AI (see raw API responses)
- Save and analyze results
- CEFR Classification: Words are automatically classified by difficulty level (A1-C2)
- Word Context: See each word in context with timestamps
- Full Transcription: View complete transcription of your audio
- Vocabulary Statistics: Get insights about word frequency and difficulty distribution
- Transparency: See exactly what's happening at each step
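
The word-context and vocabulary-statistics features above can be pictured with a small sketch. It assumes each classified word occurrence is available as a record holding the word, its CEFR level, a timestamp, and the surrounding sentence. `ClassifiedWord` and `summarize` are hypothetical names used for illustration only, not the project's actual models or API.

```python
from collections import Counter
from dataclasses import dataclass

# CEFR levels from easiest to hardest.
CEFR_LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")


@dataclass(frozen=True)
class ClassifiedWord:
    """Hypothetical record for one word occurrence (not the project's model)."""
    text: str
    level: str        # one of CEFR_LEVELS
    timestamp: float  # seconds from the start of the audio
    context: str      # the sentence the word appeared in


def summarize(words):
    """Return (frequency per word, count of distinct words per CEFR level)."""
    frequency = Counter(w.text.lower() for w in words)
    # Count each distinct word once per level so repeats do not skew the distribution.
    distinct = {(w.text.lower(), w.level) for w in words}
    per_level = Counter(level for _, level in distinct)
    distribution = {level: per_level.get(level, 0) for level in CEFR_LEVELS}
    return frequency, distribution


sample = [
    ClassifiedWord("house", "A1", 1.2, "The house is old."),
    ClassifiedWord("House", "A1", 7.9, "House prices rose."),
    ClassifiedWord("ubiquitous", "C1", 4.5, "Phones are ubiquitous."),
]
frequency, distribution = summarize(sample)
```

Keeping frequency (every occurrence) separate from the level distribution (distinct words only) is one possible design choice; the real statistics may be computed differently.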
New in v2.0: We've redesigned the workflow to give you complete control! Instead of automatic processing, you now manually trigger each step and can review intermediate results. See MANUAL_WORKFLOW.md for details.
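
To illustrate what "manually trigger each step" means in practice, here is a minimal sketch of a workflow that never advances on its own and rejects steps requested out of order. The `Step` names and the `ManualWorkflow` class are assumptions made for this example; see MANUAL_WORKFLOW.md for the actual steps and endpoints.

```python
from enum import Enum


class Step(Enum):
    """Hypothetical step names mirroring the workflow described above."""
    UPLOAD = 1
    TRANSCRIBE = 2
    EXTRACT_WORDS = 3
    CLASSIFY = 4
    SAVE = 5


class ManualWorkflow:
    """Tracks which steps have run; nothing advances automatically."""

    def __init__(self):
        self.completed = []

    def next_step(self):
        """Return the next step to trigger, or None when all are done."""
        remaining = [s for s in Step if s not in self.completed]
        return remaining[0] if remaining else None

    def trigger(self, step):
        """Run one step explicitly; reject steps triggered out of order."""
        expected = self.next_step()
        if step is not expected:
            raise ValueError(f"expected {expected}, got {step}")
        self.completed.append(step)
        return step


workflow = ManualWorkflow()
workflow.trigger(Step.UPLOAD)
workflow.trigger(Step.TRANSCRIBE)
```

After these two calls, the intermediate result (the transcription) could be reviewed before `Step.EXTRACT_WORDS` is triggered.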
- Django 4.2+
- Django REST Framework
- Celery + Redis for async processing
- PostgreSQL database
- Whisper AI for transcription
- Groq API for LLM processing
- React 18+ with TypeScript
- Material-UI (MUI)
- Axios for API calls
- React Router
- Docker & Docker Compose
- Gunicorn + Nginx
- Let's Encrypt SSL (optional)
Prerequisites:
- Docker and Docker Compose installed
- Groq API key (get one at groq.com)
3 Simple Steps:
1. Clone and configure

       git clone https://github.com/yourusername/HardWordExtractor.git
       cd HardWordExtractor
       # Set your GROQ_API_KEY in docker-compose.dev.yml

2. Start all services

       docker compose -f docker-compose.dev.yml up --build

3. Access the application
   - Frontend: http://localhost:3000
   - Backend API: http://localhost:8000/api/
   - Admin Panel: http://localhost:8000/admin/
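
Because classification depends on `GROQ_API_KEY` being set, a fail-fast check at startup can save debugging time. The following is a minimal sketch, assuming the key reaches the backend as an environment variable; `require_env` is a hypothetical helper, not part of the project.

```python
import os


def require_env(name, environ=None):
    """Return the value of a required environment variable or raise."""
    # Allow a mapping to be passed in so the check is easy to test.
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your compose file or shell")
    return value
```

Calling `require_env("GROQ_API_KEY")` once during startup would surface a missing key immediately rather than at the first classification request.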
See docs/DOCKER-QUICKSTART.md for a detailed Docker deployment guide.
For development without Docker, see QUICKSTART.md for running services manually.
- Docker Quick Start - 🚀 Start here! 3 steps to run (173 lines)
- Docker Reference - Detailed Docker configuration reference (600+ lines)
- Manual Setup - Run services manually (without Docker)
- Setup Guide - Detailed development setup
- API Documentation - Complete API endpoints and usage
- Architecture Guide - System architecture and design patterns
- Manual Workflow - Step-by-step API workflow guide
- Groq Setup - How to get your Groq API key
backend/transcription/
├── models/                      # Data models (6 files)
│   ├── audio.py, transcription.py, word.py
│   └── statistics.py, processing.py
├── serializers/                 # API serialization (6 files)
├── views/                       # API endpoints (6 files)
├── services/                    # Business logic (organized by domain)
│   ├── audio/                   # Audio processing
│   ├── transcription/           # Whisper & processing
│   ├── words/                   # Extraction & context
│   └── ai/                      # Groq & classification
├── utils/                       # Shared utilities (6 files)
│   ├── constants.py, exceptions.py
│   └── validators.py, responses.py, pagination.py
└── tests/                       # Comprehensive test suite

frontend/src/
├── components/                  # UI components (organized by feature)
│   ├── audio/, transcription/, words/
│   └── layout/, common/
├── features/                    # Feature modules with hooks
│   ├── audio/hooks/             # useAudioUpload, useAudioStatus
│   ├── transcription/hooks/     # useTranscription
│   └── words/hooks/             # useWords
├── hooks/                       # Global hooks
│   ├── useDebounce, useLocalStorage
│   └── useCache, useCachedApi
├── services/                    # API communication
└── pages/                       # Route components (lazy-loaded)
Performance Features:
- ✅ Code splitting (44% bundle size reduction)
- ✅ React.memo on expensive components
- ✅ API caching with custom hooks
- ✅ Debounced search inputs
See docs/ARCHITECTURE.md for detailed design patterns and data flow.
See SETUP.md for detailed instructions.
# Backend tests
cd backend
python manage.py test
# Frontend tests
cd frontend
npm test
# Test coverage
cd backend && pytest --cov
cd frontend && npm test -- --coverage
HardWordExtractor/
├── backend/
│   ├── config/              # Django settings & Celery
│   └── transcription/       # Main app (refactored)
│       ├── models/          # 6 model files
│       ├── views/           # 6 view files
│       ├── serializers/     # 6 serializer files
│       ├── services/        # Business logic (4 domains)
│       ├── utils/           # Shared utilities (6 files)
│       └── tests/           # Test suite (organized by layer)
├── frontend/
│   └── src/
│       ├── components/      # UI components (5 domains)
│       ├── features/        # Feature hooks (3 domains)
│       ├── hooks/           # Global hooks (4 files)
│       ├── services/        # API services
│       ├── pages/           # Route components
│       └── types/           # TypeScript types
├── docker/                  # Docker configurations
├── docs/                    # Documentation
│   ├── ARCHITECTURE.md      # System architecture
│   ├── API.md               # API documentation
│   ├── SETUP.md             # Setup guide
│   └── GROQ_SETUP.md        # Groq API guide
├── scripts/                 # Utility scripts
├── docker-compose.yml       # Docker orchestration
└── README.md
- Phase 1: MVP (✅ 100% Complete)
  - Audio transcription with Whisper
  - CEFR word classification with Groq API
  - Manual step-by-step workflow
  - React frontend with TypeScript
  - Docker deployment
  - Complete documentation
- Phase 2: Video support and user authentication
- Phase 3: Full local LLM processing
- Phase 4: Production features and scaling
Current Status: Phase 1 MVP is complete and production-ready! The Docker deployment has been tested and verified, and the documentation covers setup, API, and architecture.
See PROJECT_OUTLINE.md and PROJECT_STATUS.md for detailed progress tracking.
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For questions or support, please open an issue on GitHub.
- OpenAI Whisper - Speech recognition
- Groq - Fast LLM inference
- Django - Web framework
- React - Frontend framework