Hard Word Extractor follows a modern client-server architecture with async task processing.
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ React │────▶│ Django │────▶│ PostgreSQL │
│ Frontend │◀────│ Backend │◀────│ Database │
└─────────────┘ └──────────────┘ └─────────────┘
│ │
│ └──────────▶ ┌────────────┐
│ │ Redis │
└─────────────▶│ Cache │
└────────────┘
│
▼
┌────────────┐
│ Celery │
│ Worker │
└────────────┘
│
┌──────────────────┼────────────────────┐
▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌─────────┐
│ Whisper │ │ Groq API │ │ spaCy │
│ AI │ │ LLM │ │ NLP │
└─────────┘ └──────────┘ └─────────┘
The backend follows a clean architecture pattern with clear separation of concerns:
backend/
├── config/ # Django settings & configuration
│ ├── settings.py
│ ├── urls.py
│ └── celery.py
└── transcription/ # Main Django app
├── models/ # Data models (Domain Layer)
│ ├── audio.py
│ ├── transcription.py
│ ├── word.py
│ ├── statistics.py
│ └── processing.py
├── serializers/ # API serialization (API Layer)
│ ├── audio.py
│ ├── transcription.py
│ ├── word.py
│ ├── statistics.py
│ └── processing.py
├── views/ # API endpoints (API Layer)
│ ├── audio.py
│ ├── transcription.py
│ ├── word.py
│ ├── processing.py
│ └── status.py
├── services/ # Business logic (Service Layer)
│ ├── audio/ # Audio processing services
│ │ ├── storage_service.py
│ │ ├── validation_service.py
│ │ └── transcription_service.py
│ ├── transcription/ # Transcription services
│ │ ├── whisper_service.py
│ │ └── processing_service.py
│ ├── words/ # Word extraction & analysis
│ │ ├── extraction_service.py
│ │ ├── context_service.py
│ │ └── statistics_service.py
│ └── ai/ # AI/LLM services
│ ├── groq_service.py
│ └── classification_service.py
├── utils/ # Shared utilities
│ ├── constants.py # Constants & configuration
│ ├── exceptions.py # Custom exceptions
│ ├── validators.py # Validation helpers
│ ├── responses.py # Response formatters
│ └── pagination.py # Pagination utilities
└── tests/ # Test suite
├── test_models/
├── test_serializers/
├── test_views/
└── test_services/
Business logic is separated into service classes in services/:
- Audio Services: Handle file storage, validation, transcription initiation
- Transcription Services: Process Whisper API calls and results
- Word Services: Extract words, contexts, and generate statistics
- AI Services: Interface with Groq LLM for classification
Models in models/ act as repositories, providing data access:
- AudioFile: Uploaded audio files
- Transcription: Transcription results with timestamps
- Word: Extracted vocabulary words
- Statistics: Aggregated word statistics
- ProcessingTask: Async task tracking
- Serializers: Handle data validation and transformation
- Views: HTTP request handling and routing
- Utils: Cross-cutting concerns (pagination, responses)
Services are injected into views, making testing easier and reducing coupling.
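A minimal sketch of what injecting a service into a view-like handler could look like, written in plain Python so it runs without Django. The names `Transcriber`, `FakeTranscriber`, and `TranscribeHandler` are hypothetical and are not taken from the codebase; the point is only that the handler receives its service rather than constructing it, so a test can pass in a stand-in.

```python
from dataclasses import dataclass
from typing import Protocol


class Transcriber(Protocol):
    """Anything that can turn an audio path into text (hypothetical interface)."""

    def transcribe(self, audio_path: str) -> str: ...


class FakeTranscriber:
    """Test double standing in for a real Whisper-backed service."""

    def transcribe(self, audio_path: str) -> str:
        return f"transcript of {audio_path}"


@dataclass
class TranscribeHandler:
    """Hypothetical view-like handler: the service is passed in, not created here."""

    transcriber: Transcriber

    def post(self, audio_path: str) -> dict:
        text = self.transcriber.transcribe(audio_path)
        return {"status": "ok", "text": text}


# In a test, the fake replaces the real service without touching the handler.
handler = TranscribeHandler(transcriber=FakeTranscriber())
result = handler.post("lecture.mp3")
```

Because the handler depends only on the `transcribe` method, swapping the fake for a real implementation requires no change to the handler itself.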
The API follows REST principles:
- /api/audio/ - Audio file management
- /api/transcriptions/ - Transcription results
- /api/words/ - Word queries with filtering
- /api/statistics/ - Aggregated statistics
- /api/processing/ - Task status and control
Features:
- Pagination on all list endpoints
- Filtering and ordering on word lists
- Consistent APIResponse wrapper format
- Comprehensive error handling
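To illustrate how a consistent response wrapper and pagination might fit together, here is a small sketch in plain Python. The field names (`success`, `data`, `error`, `meta`) and the `paginate` helper are assumptions made for illustration; the actual APIResponse format is not specified in this document.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class APIResponse:
    """Assumed envelope shape; the real wrapper may use different fields."""

    success: bool
    data: Any = None
    error: Optional[str] = None
    meta: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)


def paginate(items: list, page: int, page_size: int) -> APIResponse:
    """Hypothetical helper: slice a list and wrap it in the envelope."""
    if page < 1 or page_size < 1:
        return APIResponse(success=False, error="page and page_size must be >= 1")
    start = (page - 1) * page_size
    total_pages = (len(items) + page_size - 1) // page_size  # ceiling division
    return APIResponse(
        success=True,
        data=items[start:start + page_size],
        meta={
            "page": page,
            "page_size": page_size,
            "total": len(items),
            "total_pages": total_pages,
        },
    )


words = ["ubiquitous", "ephemeral", "laconic", "obfuscate", "perfunctory"]
page_two = paginate(words, page=2, page_size=2).to_dict()
```

Success and failure share one shape here, so a client can always check `success` first and then read either `data` or `error`.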
The frontend follows a feature-based architecture:
frontend/src/
├── components/ # Reusable UI components
│ ├── audio/ # Audio-related components
│ │ └── AudioUpload.tsx
│ ├── transcription/ # Transcription display
│ │ └── TranscriptionView.tsx
│ ├── words/ # Word display & stats
│ │ ├── WordList.tsx
│ │ └── Statistics.tsx
│ ├── layout/ # Layout components
│ │ ├── Header.tsx
│ │ ├── Footer.tsx
│ │ └── Layout.tsx
│ └── common/ # Shared components
│ └── StatusIndicator.tsx
├── features/ # Feature modules (hooks + logic)
│ ├── audio/
│ │ └── hooks/
│ │ ├── useAudioUpload.ts
│ │ └── useAudioStatus.ts
│ ├── transcription/
│ │ └── hooks/
│ │ └── useTranscription.ts
│ └── words/
│ └── hooks/
│ └── useWords.ts
├── hooks/ # Global custom hooks
│ ├── useDebounce.ts
│ ├── useLocalStorage.ts
│ ├── useCache.ts
│ └── useCachedApi.ts
├── services/ # API communication
│ └── audioService.ts
├── pages/ # Page components (routes)
│ ├── Home.tsx
│ └── Results.tsx
├── types/ # TypeScript types
│ └── api.ts
├── theme/ # MUI theme configuration
│ └── theme.ts
└── utils/ # Utility functions
└── formatters.ts
Related components, hooks, and logic are grouped by feature:
- Audio: Upload, status tracking
- Transcription: Display and search
- Words: Filtering, sorting, statistics
Business logic is extracted into reusable hooks:
- useAudioUpload: File upload with progress tracking
- useAudioStatus: Polling for processing status
- useTranscription: Fetch and manage transcription data
- useWords: Word filtering and pagination
- useDebounce: Debounced search inputs
- useCache: API response caching
API calls are centralized in services/audioService.ts:
- Consistent error handling
- Response formatting
- Token management (future)
- Code Splitting: React.lazy() for route-based splitting
- Memoization: React.memo() on expensive components
- Caching: useCachedApi hook for repeated requests
- Debouncing: useDebounce for search inputs
1. User uploads file
├─▶ Frontend: AudioUpload component
└─▶ Backend: POST /api/audio/
2. File stored & validated
├─▶ AudioStorageService.save()
└─▶ AudioValidationService.validate()
3. User triggers transcription
├─▶ Frontend: POST /api/audio/{id}/transcribe/
└─▶ Backend: Celery task created
4. Celery worker processes
├─▶ WhisperService.transcribe()
└─▶ TranscriptionProcessingService.process()
5. User triggers word extraction
├─▶ POST /api/transcription/{id}/extract-words/
└─▶ WordExtractionService.extract()
6. User triggers classification
├─▶ POST /api/words/classify/
├─▶ GroqService.classify_batch()
└─▶ ClassificationService.apply()
7. User views results
├─▶ GET /api/words/?filters
└─▶ GET /api/statistics/{id}/
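The flow above implies an ordering: transcription must finish before word extraction, and extraction before classification. One way to enforce that ordering is a small stage tracker, sketched below. The `Stage` names and the `AudioPipeline` class are hypothetical; the document does not say how ProcessingTask actually records progress.

```python
from enum import Enum


class Stage(Enum):
    """Assumed stage names mirroring the numbered steps above."""

    UPLOADED = 1
    TRANSCRIBED = 2
    WORDS_EXTRACTED = 3
    CLASSIFIED = 4


class PipelineError(Exception):
    """Raised when a step is requested out of order."""


class AudioPipeline:
    """Hypothetical tracker that only allows moving to the next stage."""

    def __init__(self) -> None:
        self.stage = Stage.UPLOADED
        self.history = [Stage.UPLOADED]

    def advance(self, target: Stage) -> None:
        if target.value != self.stage.value + 1:
            raise PipelineError(
                f"cannot go from {self.stage.name} to {target.name}"
            )
        self.stage = target
        self.history.append(target)


pipeline = AudioPipeline()
pipeline.advance(Stage.TRANSCRIBED)
pipeline.advance(Stage.WORDS_EXTRACTED)

# Requesting an earlier stage again is rejected.
try:
    pipeline.advance(Stage.TRANSCRIBED)
    rejected = None
except PipelineError as exc:
    rejected = str(exc)
```

A check like this would let an endpoint such as the classify step refuse to run until extraction has completed, rather than failing later inside a worker.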
Frontend State:
- React hooks for local component state
- useLocalStorage for persistent UI preferences
- useCache for API response caching
- No global state management (Redux/MobX) needed yet
Backend State:
- PostgreSQL for persistent data
- Redis for:
  - Celery task queue
  - API caching (future)
  - Session storage (future)
| Technology | Purpose | Why? |
|---|---|---|
| Django 5.2 | Web framework | Robust, batteries included, excellent ORM |
| DRF | REST API | Best-in-class REST framework for Django |
| Celery | Async tasks | Industry standard for background processing |
| Redis | Message broker & cache | Fast, reliable, simple |
| PostgreSQL | Database | ACID compliant, excellent for structured data |
| Whisper | Transcription | Best open-source speech-to-text |
| Groq | LLM inference | Ultra-fast, cost-effective |
| Technology | Purpose | Why? |
|---|---|---|
| React 18 | UI framework | Component-based, large ecosystem |
| TypeScript | Type safety | Catches errors early, better DX |
| MUI | UI components | Professional look, comprehensive |
| React Router | Routing | Standard routing solution |
| Axios | HTTP client | Simple API, good error handling |
┌─────────────────────────────────────────────────────┐
│ Nginx (Reverse Proxy) │
│ SSL Termination (Let's Encrypt) │
└────────────┬──────────────────────────┬─────────────┘
│ │
┌──────▼──────┐ ┌──────▼──────┐
│ Static │ │ Gunicorn │
│ Files │ │ (Django) │
│ (React) │ │ Workers │
└─────────────┘ └──────┬──────┘
│
┌──────────────┼──────────────┐
│ │ │
┌─────▼────┐ ┌────▼────┐ ┌────▼────┐
│PostgreSQL│ │ Redis │ │ Celery │
│ │ │ │ │ Workers │
└──────────┘ └─────────┘ └─────────┘
Deployment Options:
- Docker Compose (recommended for simple deployments)
- Kubernetes (for scaling)
- VPS (DigitalOcean, Linode, etc.)
- File size limits (100MB)
- File type validation
- CORS configuration
- SQL injection prevention (Django ORM)
- XSS prevention (React escaping)
- JWT authentication
- Rate limiting
- API key management
- User data isolation
- File encryption at rest
- HTTPS only
- CSP headers
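The file size limit and file type validation listed above could be sketched as follows. The 100MB figure comes from this document; the allowed extensions, the interpretation of 100MB as 100 × 1024 × 1024 bytes, and the names `validate_upload` and `AudioValidationError` are assumptions for illustration only.

```python
from pathlib import Path

# 100MB limit from the list above; the binary interpretation is an assumption.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024
# Assumption: the accepted audio formats are not specified in this document.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a"}


class AudioValidationError(ValueError):
    """Raised when an upload fails a size or type check."""


def validate_upload(filename: str, size_bytes: int) -> None:
    """Hypothetical check: raise if the upload should be rejected."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise AudioValidationError(f"unsupported file type: {extension or 'none'}")
    if size_bytes <= 0:
        raise AudioValidationError("file is empty")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        raise AudioValidationError("file exceeds the 100MB limit")


# A 5MB MP3 passes silently; the extension check is case-insensitive.
validate_upload("lecture.MP3", 5 * 1024 * 1024)
```

An extension check alone is easy to bypass, so a production validator would probably also inspect the file contents.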
- Whisper Processing: CPU-intensive, blocks Celery worker
- File Storage: Local filesystem (not scalable)
- Database: Single PostgreSQL instance
- Processing:
  - GPU-enabled Whisper processing
  - Multiple Celery workers
  - Task priority queues
- Storage:
  - S3/MinIO for file storage
  - CDN for static files
- Database:
  - Read replicas for queries
  - Connection pooling
  - Query optimization
- Caching:
  - Redis for API responses
  - Browser caching for static content
- Unit Tests: Models, serializers, services
- Integration Tests: API endpoints
- Task Tests: Celery tasks
- Coverage Target: 80%+
- Unit Tests: Hooks, utilities
- Component Tests: React Testing Library
- Integration Tests: User flows
- Coverage Target: 70%+
- Backend: pytest, pytest-django, factory_boy
- Frontend: Jest, React Testing Library, MSW
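As a rough illustration of a backend service unit test, the sketch below replaces the Groq-backed service with a fake so the test never touches the network. `classify_batch` is named in the data flow, but its signature here is assumed, and `FakeGroqClient`, `classify_words`, and the "hard"/"easy" labels are hypothetical. The test uses plain `assert` statements, which pytest collects from any `test_*` function.

```python
class FakeGroqClient:
    """Stand-in for the Groq-backed service; records every batch it receives."""

    def __init__(self, labels):
        self.labels = labels
        self.calls = []

    def classify_batch(self, words):
        # Assumed signature: a list of words in, a list of labels out.
        self.calls.append(list(words))
        return [self.labels.get(word, "unknown") for word in words]


def classify_words(client, words, batch_size=2):
    """Hypothetical helper: classify words in fixed-size batches."""
    results = {}
    for start in range(0, len(words), batch_size):
        batch = words[start:start + batch_size]
        for word, label in zip(batch, client.classify_batch(batch)):
            results[word] = label
    return results


def test_classify_words_batches_requests():
    client = FakeGroqClient({"laconic": "hard", "cat": "easy"})
    result = classify_words(client, ["laconic", "cat", "zyzzyva"])
    # Every word gets a label, and unknown words fall back to "unknown".
    assert result == {"laconic": "hard", "cat": "easy", "zyzzyva": "unknown"}
    # Three words with batch_size=2 should produce exactly two calls.
    assert client.calls == [["laconic", "cat"], ["zyzzyva"]]
```

Recording the calls on the fake lets the test check the batching behaviour as well as the final result.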
- Redis running in Docker
- Celery worker in terminal
- Django dev server
- React dev server (hot reload)
- Linting: ESLint (frontend), pylint/flake8 (backend)
- Formatting: Prettier (frontend), black (backend)
- Type Checking: TypeScript, mypy (future)
- Pre-commit Hooks: lint-staged (future)
- Docstrings for all Python functions/classes
- JSDoc comments for complex TypeScript functions
- README in each major directory
- OpenAPI/Swagger (future)
- Postman collection (future)
- docs/API.md (current)
- Setup guides
- API documentation
- User guides
- Architecture overview (this file)
- JWT token authentication
- User models and permissions
- User-specific data isolation
- Usage quotas and limits
- Replace Groq with local LLM (Ollama/LLaMA)
- GPU acceleration
- Model caching and optimization
- Fallback to Groq if local fails
- Video processing support
- Real-time collaboration
- Export functionality (CSV, PDF)
- Study mode and flashcards
- Progress tracking and analytics