Architecture Overview

System Architecture

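Hard Word Extractor uses a client-server architecture with asynchronous task processing: a React frontend calls a Django backend, which persists data in PostgreSQL and hands long-running work to a Celery worker through Redis. The worker in turn calls Whisper, the Groq API, and spaCy.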

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   React     │────▶│   Django     │────▶│ PostgreSQL  │
│   Frontend  │◀────│   Backend    │◀────│  Database   │
└─────────────┘     └──────────────┘     └─────────────┘
                          │ │
                          │ └──────────▶ ┌────────────┐
                          │              │   Redis    │
                          └─────────────▶│   Cache    │
                                         └────────────┘
                                              │
                                              ▼
                                        ┌────────────┐
                                        │   Celery   │
                                        │   Worker   │
                                        └────────────┘
                                              │
                          ┌──────────────────┼────────────────────┐
                          ▼                  ▼                    ▼
                    ┌─────────┐        ┌──────────┐        ┌─────────┐
                    │ Whisper │        │ Groq API │        │  spaCy  │
                    │   AI    │        │   LLM    │        │   NLP   │
                    └─────────┘        └──────────┘        └─────────┘

Backend Architecture

Layer Organization

The backend follows a clean architecture pattern with clear separation of concerns:

backend/
├── config/                 # Django settings & configuration
│   ├── settings.py
│   ├── urls.py
│   └── celery.py
└── transcription/         # Main Django app
    ├── models/            # Data models (Domain Layer)
    │   ├── audio.py
    │   ├── transcription.py
    │   ├── word.py
    │   ├── statistics.py
    │   └── processing.py
    ├── serializers/       # API serialization (API Layer)
    │   ├── audio.py
    │   ├── transcription.py
    │   ├── word.py
    │   ├── statistics.py
    │   └── processing.py
    ├── views/             # API endpoints (API Layer)
    │   ├── audio.py
    │   ├── transcription.py
    │   ├── word.py
    │   ├── processing.py
    │   └── status.py
    ├── services/          # Business logic (Service Layer)
    │   ├── audio/         # Audio processing services
    │   │   ├── storage_service.py
    │   │   ├── validation_service.py
    │   │   └── transcription_service.py
    │   ├── transcription/ # Transcription services
    │   │   ├── whisper_service.py
    │   │   └── processing_service.py
    │   ├── words/         # Word extraction & analysis
    │   │   ├── extraction_service.py
    │   │   ├── context_service.py
    │   │   └── statistics_service.py
    │   └── ai/            # AI/LLM services
    │       ├── groq_service.py
    │       └── classification_service.py
    ├── utils/             # Shared utilities
    │   ├── constants.py   # Constants & configuration
    │   ├── exceptions.py  # Custom exceptions
    │   ├── validators.py  # Validation helpers
    │   ├── responses.py   # Response formatters
    │   └── pagination.py  # Pagination utilities
    └── tests/             # Test suite
        ├── test_models/
        ├── test_serializers/
        ├── test_views/
        └── test_services/

Key Design Patterns

1. Service Layer Pattern

Business logic is separated into service classes in services/:

Audio Services: Handle file storage, validation, transcription initiation
Transcription Services: Run Whisper transcription and post-process its results
Word Services: Extract words, contexts, and generate statistics
AI Services: Interface with Groq LLM for classification
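As an illustration of the pattern, here is a minimal sketch of what a validation service could look like. It is not the project's actual code: the `UploadedAudio` stand-in, the `AudioValidationError` name, and the extension set are assumptions made for the example; only the 100 MB cap comes from the limits listed under Security Considerations.

```python
from dataclasses import dataclass
from pathlib import Path

# The 100 MB cap mirrors the limit listed under Security Considerations.
# The extension set is hypothetical and only serves the example.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024
ALLOWED_EXTENSIONS = frozenset({".mp3", ".wav", ".m4a", ".ogg", ".flac"})


class AudioValidationError(Exception):
    """Raised when an uploaded file fails validation (hypothetical name)."""


@dataclass(frozen=True)
class UploadedAudio:
    """Minimal stand-in for an uploaded file: only name and size."""
    name: str
    size_bytes: int


class AudioValidationService:
    """Keeps validation rules in one place so views and tasks share them."""

    def __init__(self, max_size=MAX_FILE_SIZE_BYTES, allowed=ALLOWED_EXTENSIONS):
        self.max_size = max_size
        self.allowed = allowed

    def validate(self, upload: UploadedAudio) -> None:
        """Return None if the upload is acceptable, otherwise raise."""
        ext = Path(upload.name).suffix.lower()
        if ext not in self.allowed:
            raise AudioValidationError(f"Unsupported file type: {ext or '(none)'}")
        if upload.size_bytes <= 0:
            raise AudioValidationError("File is empty")
        if upload.size_bytes > self.max_size:
            raise AudioValidationError("File exceeds the size limit")
```

Because the rules live in a plain class with no HTTP or ORM dependencies, they can be unit-tested directly and reused from both a view and a background task.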

2. Repository Pattern

Django models in models/ play the repository role: all data access goes through their ORM managers rather than through a separate repository layer:

AudioFile: Uploaded audio files
Transcription: Transcription results with timestamps
Word: Extracted vocabulary words
Statistics: Aggregated word statistics
ProcessingTask: Async task tracking

3. API Layer Separation

Serializers: Handle data validation and transformation
Views: HTTP request handling and routing
Utils: Cross-cutting concerns (pagination, responses)

4. Dependency Injection

Views receive their services as dependencies instead of constructing them, so tests can substitute fakes and views stay decoupled from concrete implementations.
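A framework-free sketch of the idea, under stated assumptions: `Transcriber`, `TranscribeAudioHandler`, and `FakeTranscriber` are hypothetical names, and the handler is a plain class standing in for a view rather than an actual DRF view.

```python
from typing import Protocol


class Transcriber(Protocol):
    """Anything that can turn an audio path into text."""

    def transcribe(self, audio_path: str) -> str: ...


class TranscribeAudioHandler:
    """View-like handler; the transcriber is passed in, not built here."""

    def __init__(self, transcriber: Transcriber) -> None:
        self._transcriber = transcriber

    def handle(self, audio_path: str) -> dict:
        text = self._transcriber.transcribe(audio_path)
        return {"audio_path": audio_path, "text": text}


class FakeTranscriber:
    """Test double that avoids loading a real speech model."""

    def __init__(self, canned_text: str) -> None:
        self.canned_text = canned_text
        self.calls = []  # records every path it was asked to transcribe

    def transcribe(self, audio_path: str) -> str:
        self.calls.append(audio_path)
        return self.canned_text


# In a test, the fake replaces the real service without any patching.
handler = TranscribeAudioHandler(FakeTranscriber("hello world"))
result = handler.handle("media/a.wav")
```

The handler depends only on the `Transcriber` protocol, so swapping the real Whisper-backed service for a fake requires no monkey-patching.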

API Design

The API follows REST principles:

/api/audio/ - Audio file management
/api/transcriptions/ - Transcription results
/api/words/ - Word queries with filtering
/api/statistics/ - Aggregated statistics
/api/processing/ - Task status and control

Features:

Pagination on all list endpoints
Filtering and ordering on word lists
Consistent APIResponse wrapper format
Comprehensive error handling
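To make the wrapper and pagination features concrete, the sketch below shows one possible envelope shape. The field names (`success`, `data`, `error`, `meta`) and the default page size are assumptions for illustration; the actual APIResponse format is defined in the project's utils/responses.py and may differ.

```python
from math import ceil
from typing import Any, Optional, Sequence


def api_response(
    data: Any = None,
    error: Optional[str] = None,
    meta: Optional[dict] = None,
) -> dict:
    """Wrap a payload in one envelope shape for both success and failure."""
    return {
        "success": error is None,
        "data": data,
        "error": error,
        "meta": meta or {},
    }


def paginate(items: Sequence[Any], page: int = 1, page_size: int = 20) -> dict:
    """Return one page of items in the envelope, with pagination metadata."""
    if page < 1 or page_size < 1:
        return api_response(error="page and page_size must be positive")
    total = len(items)
    start = (page - 1) * page_size
    return api_response(
        data=list(items[start:start + page_size]),
        meta={
            "page": page,
            "page_size": page_size,
            "total": total,
            "total_pages": ceil(total / page_size),
        },
    )
```

With a single envelope, the frontend can check one flag for every endpoint and read pagination details from the same place on every list response.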

Frontend Architecture

Component Organization

The frontend follows a feature-based architecture:

frontend/src/
├── components/          # Reusable UI components
│   ├── audio/          # Audio-related components
│   │   └── AudioUpload.tsx
│   ├── transcription/  # Transcription display
│   │   └── TranscriptionView.tsx
│   ├── words/          # Word display & stats
│   │   ├── WordList.tsx
│   │   └── Statistics.tsx
│   ├── layout/         # Layout components
│   │   ├── Header.tsx
│   │   ├── Footer.tsx
│   │   └── Layout.tsx
│   └── common/         # Shared components
│       └── StatusIndicator.tsx
├── features/           # Feature modules (hooks + logic)
│   ├── audio/
│   │   └── hooks/
│   │       ├── useAudioUpload.ts
│   │       └── useAudioStatus.ts
│   ├── transcription/
│   │   └── hooks/
│   │       └── useTranscription.ts
│   └── words/
│       └── hooks/
│           └── useWords.ts
├── hooks/              # Global custom hooks
│   ├── useDebounce.ts
│   ├── useLocalStorage.ts
│   ├── useCache.ts
│   └── useCachedApi.ts
├── services/           # API communication
│   └── audioService.ts
├── pages/              # Page components (routes)
│   ├── Home.tsx
│   └── Results.tsx
├── types/              # TypeScript types
│   └── api.ts
├── theme/              # MUI theme configuration
│   └── theme.ts
└── utils/              # Utility functions
    └── formatters.ts

Key Design Patterns

1. Feature-Based Structure

Related components, hooks, and logic are grouped by feature:

Audio: Upload, status tracking
Transcription: Display and search
Words: Filtering, sorting, statistics

2. Custom Hooks Pattern

Business logic is extracted into reusable hooks:

useAudioUpload: File upload with progress tracking
useAudioStatus: Polling for processing status
useTranscription: Fetch and manage transcription data
useWords: Word filtering and pagination
useDebounce: Debounced search inputs
useCache: API response caching

3. Service Layer

API calls are centralized in services/audioService.ts:

Consistent error handling
Response formatting
Token management (future)

4. Performance Optimizations

Code Splitting: React.lazy() for route-based splitting
Memoization: React.memo() on expensive components
Caching: useCachedApi hook for repeated requests
Debouncing: useDebounce for search inputs

Data Flow

Upload & Processing Flow

1. User uploads file
   ├─▶ Frontend: AudioUpload component
   └─▶ Backend: POST /api/audio/

2. File stored & validated
   ├─▶ AudioStorageService.save()
   └─▶ AudioValidationService.validate()

3. User triggers transcription
   ├─▶ Frontend: POST /api/audio/{id}/transcribe/
   └─▶ Backend: Celery task created

4. Celery worker processes
   ├─▶ WhisperService.transcribe()
   └─▶ TranscriptionProcessingService.process()

5. User triggers word extraction
   ├─▶ POST /api/transcriptions/{id}/extract-words/
   └─▶ WordExtractionService.extract()

6. User triggers classification
   ├─▶ POST /api/words/classify/
   ├─▶ GroqService.classify_batch()
   └─▶ ClassificationService.apply()

7. User views results
   ├─▶ GET /api/words/?filters
   └─▶ GET /api/statistics/{id}/
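Steps 3 to 6 above are each triggered by the user and each depend on the output of the previous one. One way to keep them in order is a small stage machine, sketched below. The stage names and the idea that the backend rejects out-of-order requests are assumptions for illustration, not documented behavior.

```python
from enum import Enum


class Stage(str, Enum):
    """Hypothetical processing stages for one audio file."""
    UPLOADED = "uploaded"
    TRANSCRIBED = "transcribed"
    WORDS_EXTRACTED = "words_extracted"
    CLASSIFIED = "classified"


# Each user-triggered step is only valid from the stage directly before it.
NEXT_STAGE = {
    Stage.UPLOADED: Stage.TRANSCRIBED,
    Stage.TRANSCRIBED: Stage.WORDS_EXTRACTED,
    Stage.WORDS_EXTRACTED: Stage.CLASSIFIED,
}


def advance(current: Stage, requested: Stage) -> Stage:
    """Return the new stage, or raise if the step is requested out of order."""
    if NEXT_STAGE.get(current) != requested:
        raise ValueError(f"Cannot go from {current.value} to {requested.value}")
    return requested
```

A view could call `advance()` before enqueueing the Celery task and translate the `ValueError` into a 4xx response, so that, for example, classification cannot start before words have been extracted.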

State Management

Frontend State:

React hooks for local component state
useLocalStorage for persistent UI preferences
useCache for API response caching
No global state management (Redux/MobX) needed yet

Backend State:

PostgreSQL for persistent data
Redis for:
- Celery task queue
- API caching (future)
- Session storage (future)
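For the Celery task queue role, a settings fragment along these lines is typical. It is a sketch that assumes the Celery app in config/celery.py loads Django settings with `namespace="CELERY"`; the `REDIS_URL` environment variable name and the default URL are also assumptions.

```python
import os

# Assumed environment variable and default; adjust to the deployment.
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

# Picked up by Celery when the app is configured with
# app.config_from_object("django.conf:settings", namespace="CELERY").
CELERY_BROKER_URL = REDIS_URL
CELERY_RESULT_BACKEND = REDIS_URL
```

Reading the URL from the environment keeps local development (Redis in Docker) and a future deployment on the same settings file.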

Technology Choices & Rationale

Backend

Technology	Purpose	Why?
Django 5.2	Web framework	Robust, batteries included, excellent ORM
DRF	REST API	Best-in-class REST framework for Django
Celery	Async tasks	Industry standard for background processing
Redis	Message broker & cache	Fast, reliable, simple
PostgreSQL	Database	ACID compliant, excellent for structured data
Whisper	Transcription	Best open-source speech-to-text
Groq	LLM inference	Ultra-fast, cost-effective

Frontend

Technology	Purpose	Why?
React 18	UI framework	Component-based, large ecosystem
TypeScript	Type safety	Catches errors early, better DX
MUI	UI components	Professional look, comprehensive
React Router	Routing	Standard routing solution
Axios	HTTP client	Simple API, good error handling

Deployment Architecture (Future)

┌─────────────────────────────────────────────────────┐
│                  Nginx (Reverse Proxy)               │
│              SSL Termination (Let's Encrypt)         │
└────────────┬──────────────────────────┬─────────────┘
             │                          │
      ┌──────▼──────┐           ┌──────▼──────┐
      │   Static    │           │  Gunicorn   │
      │   Files     │           │  (Django)   │
      │  (React)    │           │  Workers    │
      └─────────────┘           └──────┬──────┘
                                       │
                        ┌──────────────┼──────────────┐
                        │              │              │
                  ┌─────▼────┐   ┌────▼────┐   ┌────▼────┐
                  │PostgreSQL│   │  Redis  │   │ Celery  │
                  │          │   │         │   │ Workers │
                  └──────────┘   └─────────┘   └─────────┘

Deployment Options:

Docker Compose (recommended for simple deployments)
Kubernetes (for scaling)
VPS (DigitalOcean, Linode, etc.)

Security Considerations

Current (Phase 1 - MVP)

File size limits (100MB)
File type validation
CORS configuration
SQL injection prevention (Django ORM)
XSS prevention (React escaping)

Future (Phase 2+)

JWT authentication
Rate limiting
API key management
User data isolation
File encryption at rest
HTTPS only
CSP headers

Scalability Considerations

Current Bottlenecks

Whisper Processing: CPU-intensive; each transcription occupies a Celery worker until it finishes
File Storage: Local filesystem (not scalable)
Database: Single PostgreSQL instance

Future Improvements

Processing:
- GPU-enabled Whisper processing
- Multiple Celery workers
- Task priority queues
Storage:
- S3/MinIO for file storage
- CDN for static files
Database:
- Read replicas for queries
- Connection pooling
- Query optimization
Caching:
- Redis for API responses
- Browser caching for static content

Testing Strategy

Backend Tests

Unit Tests: Models, serializers, services
Integration Tests: API endpoints
Task Tests: Celery tasks
Coverage Target: 80%+

Frontend Tests

Unit Tests: Hooks, utilities
Component Tests: React Testing Library
Integration Tests: User flows
Coverage Target: 70%+

Testing Tools

Backend: pytest, pytest-django, factory_boy
Frontend: Jest, React Testing Library, MSW

Development Workflow

Local Development

Redis running in Docker
Celery worker in terminal
Django dev server
React dev server (hot reload)

Code Quality

Linting: ESLint (frontend), pylint/flake8 (backend)
Formatting: Prettier (frontend), black (backend)
Type Checking: TypeScript, mypy (future)
Pre-commit Hooks: lint-staged (future)

Documentation Standards

Code Documentation

Docstrings for all Python functions/classes
JSDoc comments for complex TypeScript functions
README in each major directory

API Documentation

OpenAPI/Swagger (future)
Postman collection (future)
docs/API.md (current)

User Documentation

Setup guides
API documentation
User guides
Architecture overview (this file)

Future Architecture Plans

Phase 2: Authentication & Multi-user

JWT token authentication
User models and permissions
User-specific data isolation
Usage quotas and limits

Phase 3: Local LLM

Replace Groq with local LLM (Ollama/LLaMA)
GPU acceleration
Model caching and optimization
Fall back to Groq if the local model fails

Phase 4: Advanced Features

Video processing support
Real-time collaboration
Export functionality (CSV, PDF)
Study mode and flashcards
Progress tracking and analytics
