Base URL: http://localhost:8000/api/
Phase 1 (MVP): No authentication required. All endpoints are publicly accessible.
Phase 2+: JWT authentication will be added.
Upload an audio file for transcription and word extraction.
Endpoint: POST /api/upload/
Content-Type: multipart/form-data
Request Body:
file: <audio file>
Supported Formats:
- MP3 (audio/mpeg)
- WAV (audio/wav)
- M4A (audio/x-m4a, audio/mp4)
File Size Limit: 100MB
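Checking a file on the client before uploading can avoid a round trip that would end in a 400 response. The sketch below is only an illustration: the helper name `validateAudioFile` is hypothetical, it assumes "100MB" means 100 * 1024 * 1024 bytes, and the server-side validation remains authoritative.

```javascript
// Hypothetical client-side pre-check; the server decides what is actually accepted.
const ALLOWED_MIME_TYPES = ['audio/mpeg', 'audio/wav', 'audio/x-m4a', 'audio/mp4'];
// Assumption: "100MB" is interpreted as 100 * 1024 * 1024 bytes.
const MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024;

// Accepts any object with `type` and `size` (a browser File has both).
// Returns a list of problems; an empty list means the file looks acceptable.
function validateAudioFile(file) {
  const problems = [];
  if (!ALLOWED_MIME_TYPES.includes(file.type)) {
    problems.push(`Unsupported format: ${file.type || 'unknown'}`);
  }
  if (file.size >= MAX_FILE_SIZE_BYTES) {
    problems.push('File size must be less than 100MB');
  }
  return problems;
}
```

Calling `validateAudioFile(audioFile)` before building the `FormData` lets the UI report problems without contacting the server.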
Example Request (curl):
curl -X POST http://localhost:8000/api/upload/ \
  -F "file=@/path/to/audio.mp3"
Example Request (JavaScript):
const formData = new FormData();
formData.append('file', audioFile);
const response = await fetch('http://localhost:8000/api/upload/', {
method: 'POST',
body: formData
});
const data = await response.json();
Success Response (201 Created):
{
"id": 1,
"file": "/media/audio/2025/10/08/sample.mp3",
"original_filename": "sample.mp3",
"file_size": 5242880,
"file_size_mb": 5.0,
"duration": null,
"status": "pending",
"error_message": null,
"uploaded_at": "2025-10-08T10:30:00Z",
"processing_started_at": null,
"processing_completed_at": null,
"processing_time": null
}
Status Values:
- pending - Queued for processing
- processing - Processing started
- transcribing - Running Whisper transcription
- analyzing - Extracting and classifying words
- completed - Processing finished successfully
- failed - Processing failed (check error_message)
Error Response (400 Bad Request):
{
"file": ["File size must be less than 100MB"]
}
Check the current status of an audio file's processing.
Endpoint: GET /api/status/<audio_id>/
Example Request:
curl http://localhost:8000/api/status/1/
Success Response (200 OK):
{
"id": 1,
"status": "analyzing",
"progress": 70,
"error_message": null,
"original_filename": "sample.mp3",
"uploaded_at": "2025-10-08T10:30:00Z",
"processing_started_at": "2025-10-08T10:30:05Z",
"processing_completed_at": null,
"has_transcription": true,
"transcription_id": 1
}
Progress Values:
- 0% - Pending or failed
- 20% - Processing started
- 40% - Transcribing
- 70% - Analyzing words
- 100% - Completed
Polling Recommendation:
Poll this endpoint every 2-5 seconds until status is completed or failed.
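A small helper can decide whether polling should continue and which progress value to show. The sketch below is illustrative only: `summarizeStatus` is a hypothetical name, the lookup table simply mirrors the Progress Values listed above, and the server-reported `progress` field is preferred whenever it is present.

```javascript
// Statuses after which polling can stop.
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

// Fallback mapping that mirrors the documented Progress Values.
const PROGRESS_BY_STATUS = {
  pending: 0,
  processing: 20,
  transcribing: 40,
  analyzing: 70,
  completed: 100,
  failed: 0,
};

// Takes a parsed status response and returns what a polling loop needs to know.
function summarizeStatus(payload) {
  const progress = typeof payload.progress === 'number'
    ? payload.progress
    : (PROGRESS_BY_STATUS[payload.status] ?? 0);
  return {
    done: TERMINAL_STATUSES.has(payload.status),
    progress,
    error: payload.status === 'failed' ? payload.error_message : null,
  };
}
```

A polling loop would stop as soon as `done` is true and surface `error` when it is not null.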
Get a list of all transcriptions.
Endpoint: GET /api/transcriptions/
Example Response:
{
"count": 10,
"next": null,
"previous": null,
"results": [
{
"id": 1,
"audio_file": {
"id": 1,
"original_filename": "sample.mp3",
"status": "completed"
},
"text": "Full transcription text here...",
"language": "en",
"word_count": 150,
"unique_word_count": 85,
"statistics": {
"id": 1,
"a1_count": 45,
"a2_count": 20,
"b1_count": 10,
"b2_count": 5,
"c1_count": 3,
"c2_count": 2,
"unknown_count": 0,
"total_words": 85,
"level_distribution": {
"A1": 52.9,
"A2": 23.5,
"B1": 11.8,
"B2": 5.9,
"C1": 3.5,
"C2": 2.4,
"Unknown": 0.0
}
},
"created_at": "2025-10-08T10:31:00Z",
"updated_at": "2025-10-08T10:31:00Z"
}
]
}
Get detailed information about a specific transcription, including all extracted words.
Endpoint: GET /api/transcriptions/<id>/
Example Request:
curl http://localhost:8000/api/transcriptions/1/
Example Response:
{
"id": 1,
"audio_file": {
"id": 1,
"original_filename": "sample.mp3",
"file_size_mb": 5.0,
"duration": 180.5,
"status": "completed"
},
"text": "Hello everyone, this is a sample transcription. We will learn about difficult paradigms and simple concepts.",
"language": "en",
"word_count": 17,
"unique_word_count": 15,
"statistics": {
"id": 1,
"a1_count": 8,
"a2_count": 3,
"b1_count": 2,
"b2_count": 1,
"c1_count": 1,
"c2_count": 0,
"unknown_count": 0,
"total_words": 15,
"level_distribution": {
"A1": 53.3,
"A2": 20.0,
"B1": 13.3,
"B2": 6.7,
"C1": 6.7,
"C2": 0.0,
"Unknown": 0.0
}
},
"extracted_words": [
{
"id": 1,
"word": {
"id": 1,
"text": "hello",
"lemma": "hello",
"cefr_level": "A1",
"cefr_level_display": "A1 - Beginner",
"global_frequency": 1
},
"context": "Hello everyone, this is a sample transcription.",
"timestamp": 0.5,
"position": 0,
"frequency": 1
},
{
"id": 2,
"word": {
"id": 2,
"text": "paradigm",
"lemma": "paradigm",
"cefr_level": "C1",
"cefr_level_display": "C1 - Advanced",
"global_frequency": 1
},
"context": "We will learn about difficult paradigms and simple concepts.",
"timestamp": 8.2,
"position": 12,
"frequency": 1
}
]
}
Get words from a transcription, filtered by CEFR level.
Endpoint: GET /api/transcriptions/<id>/words/?level=<levels>
Query Parameters:
- level (optional) - Comma-separated CEFR levels (A1, A2, B1, B2, C1, C2)
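Building the filtered URL in code helps catch mistyped levels before a request is sent. This is a minimal sketch: `buildWordsUrl` is a hypothetical helper, and upper-casing the input is a client-side convenience, not documented server behavior.

```javascript
// CEFR levels accepted by the `level` query parameter.
const CEFR_LEVELS = ['A1', 'A2', 'B1', 'B2', 'C1', 'C2'];

// Builds the words URL for a transcription, optionally filtered by level.
// Throws if any requested level is not a known CEFR level.
function buildWordsUrl(baseUrl, transcriptionId, levels = []) {
  const normalized = levels.map((level) => String(level).trim().toUpperCase());
  const invalid = normalized.filter((level) => !CEFR_LEVELS.includes(level));
  if (invalid.length > 0) {
    throw new Error(`Unknown CEFR level(s): ${invalid.join(', ')}`);
  }
  // Tolerate a base URL with or without a trailing slash.
  const root = baseUrl.replace(/\/+$/, '');
  const path = `${root}/transcriptions/${transcriptionId}/words/`;
  return normalized.length > 0 ? `${path}?level=${normalized.join(',')}` : path;
}
```

For example, `buildWordsUrl('http://localhost:8000/api/', 1, ['B1', 'B2'])` yields the same URL as the B1 and B2 curl example.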
Example Requests:
# Get all words
curl http://localhost:8000/api/transcriptions/1/words/
# Get only A1 words
curl "http://localhost:8000/api/transcriptions/1/words/?level=A1"
# Get B1 and B2 words
curl "http://localhost:8000/api/transcriptions/1/words/?level=B1,B2"
# Get advanced words (C1 and C2)
curl "http://localhost:8000/api/transcriptions/1/words/?level=C1,C2"
Example Response:
[
{
"id": 5,
"word": {
"id": 5,
"text": "difficult",
"lemma": "difficult",
"cefr_level": "B1",
"cefr_level_display": "B1 - Intermediate",
"global_frequency": 1
},
"context": "We will learn about difficult paradigms.",
"timestamp": 7.8,
"position": 11,
"frequency": 1
},
{
"id": 6,
"word": {
"id": 6,
"text": "concept",
"lemma": "concept",
"cefr_level": "B2",
"cefr_level_display": "B2 - Upper Intermediate",
"global_frequency": 1
},
"context": "difficult paradigms and simple concepts.",
"timestamp": 9.5,
"position": 14,
"frequency": 1
}
]
Get word statistics for a transcription.
Endpoint: GET /api/transcriptions/<id>/statistics/
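The `level_distribution` percentages in the example responses are consistent with each count divided by `total_words`, times 100, rounded to one decimal place. The sketch below reproduces that arithmetic; it is inferred from the sample values only and is not confirmed server logic, and `levelDistribution` is a hypothetical name.

```javascript
// Recomputes level_distribution from the per-level counts of a statistics object.
// Assumption: percentage = count / total_words * 100, rounded to one decimal.
function levelDistribution(stats) {
  const counts = {
    A1: stats.a1_count,
    A2: stats.a2_count,
    B1: stats.b1_count,
    B2: stats.b2_count,
    C1: stats.c1_count,
    C2: stats.c2_count,
    Unknown: stats.unknown_count,
  };
  const distribution = {};
  for (const [level, count] of Object.entries(counts)) {
    distribution[level] = stats.total_words > 0
      ? Math.round((count / stats.total_words) * 1000) / 10
      : 0;
  }
  return distribution;
}
```

With the counts 8, 3, 2, 1, 1, 0, 0 and `total_words` of 15, this gives 53.3, 20, 13.3, 6.7, 6.7, 0 and 0.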
Example Response:
{
"id": 1,
"a1_count": 8,
"a2_count": 3,
"b1_count": 2,
"b2_count": 1,
"c1_count": 1,
"c2_count": 0,
"unknown_count": 0,
"total_words": 15,
"level_distribution": {
"A1": 53.3,
"A2": 20.0,
"B1": 13.3,
"B2": 6.7,
"C1": 6.7,
"C2": 0.0,
"Unknown": 0.0
},
"created_at": "2025-10-08T10:31:00Z"
}
Get a list of all words in the database.
Endpoint: GET /api/words/
Query Parameters:
- level (optional) - Filter by CEFR level (A1, A2, B1, B2, C1, C2)
- search (optional) - Search by word text or lemma
Example Requests:
# Get all words
curl http://localhost:8000/api/words/
# Get only C1 words
curl "http://localhost:8000/api/words/?level=C1"
# Search for words
curl "http://localhost:8000/api/words/?search=paradigm"
Example Response:
{
"count": 100,
"next": "http://localhost:8000/api/words/?page=2",
"previous": null,
"results": [
{
"id": 1,
"text": "hello",
"lemma": "hello",
"cefr_level": "A1",
"cefr_level_display": "A1 - Beginner",
"global_frequency": 5,
"created_at": "2025-10-08T10:31:00Z"
},
{
"id": 2,
"text": "paradigm",
"lemma": "paradigm",
"cefr_level": "C1",
"cefr_level_display": "C1 - Advanced",
"global_frequency": 2,
"created_at": "2025-10-08T10:31:15Z"
}
]
}
Get detailed information about a specific word.
Endpoint: GET /api/words/<id>/
Example Response:
{
"id": 2,
"text": "paradigm",
"lemma": "paradigm",
"cefr_level": "C1",
"cefr_level_display": "C1 - Advanced",
"global_frequency": 2,
"created_at": "2025-10-08T10:31:15Z"
}
All errors follow a consistent format:
{
"detail": "Error message here"
}
Or for validation errors:
{
"field_name": ["Error message for this field"]
}
HTTP Status Codes:
- 200 OK - Successful GET request
- 201 Created - Successful POST request (resource created)
- 400 Bad Request - Validation error or malformed request
- 404 Not Found - Resource not found
- 500 Internal Server Error - Server error
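Because error bodies come in two shapes (a single `detail` string, or a map of field names to message lists), client code may want to normalize them into one flat list. The following is a sketch under that assumption; `extractErrorMessages` is a hypothetical helper name and the "Unknown error" fallback is not part of the API.

```javascript
// Flattens either documented error shape into an array of display strings.
function extractErrorMessages(body) {
  if (body === null || typeof body !== 'object') {
    return ['Unknown error'];
  }
  // Shape 1: { "detail": "Error message here" }
  if (typeof body.detail === 'string') {
    return [body.detail];
  }
  // Shape 2: { "field_name": ["Error message for this field"] }
  const messages = [];
  for (const [field, errors] of Object.entries(body)) {
    const list = Array.isArray(errors) ? errors : [errors];
    for (const message of list) {
      messages.push(`${field}: ${message}`);
    }
  }
  return messages.length > 0 ? messages : ['Unknown error'];
}
```

A caller would typically invoke this only when `response.ok` is false, after parsing the body with `response.json()`.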
File Too Large:
{
"file": ["File size must be less than 100MB"]
}
Invalid File Format:
{
"file": ["File format not supported. Allowed formats: audio/mpeg, audio/wav, audio/x-m4a, audio/mp4"]
}
Resource Not Found:
{
"detail": "Not found."
}
Processing Failed:
Check the error_message field in the audio file status:
{
"id": 1,
"status": "failed",
"error_message": "Failed to transcribe audio: Invalid audio format"
}
Phase 1 (MVP): No rate limiting
Phase 2+: Rate limiting will be implemented:
- 100 requests per hour per IP
- 10 file uploads per hour per IP
CORS is configured to allow requests from:
- http://localhost:3000 (React development server)
- http://localhost:80 (Frontend in production)
// 1. Upload file
const formData = new FormData();
formData.append('file', audioFile);
const uploadResponse = await fetch('http://localhost:8000/api/upload/', {
method: 'POST',
body: formData
});
// Use a new name here: redeclaring `audioFile` would clash with the file appended above
const uploadedAudio = await uploadResponse.json();
const audioId = uploadedAudio.id;
// 2. Poll for status
const pollStatus = async () => {
const statusResponse = await fetch(`http://localhost:8000/api/status/${audioId}/`);
const status = await statusResponse.json();
if (status.status === 'completed') {
return status.transcription_id;
} else if (status.status === 'failed') {
throw new Error(status.error_message);
} else {
// Still processing, poll again
await new Promise(resolve => setTimeout(resolve, 3000));
return pollStatus();
}
};
const transcriptionId = await pollStatus();
// 3. Get transcription and words
const transcriptionResponse = await fetch(
`http://localhost:8000/api/transcriptions/${transcriptionId}/`
);
const transcription = await transcriptionResponse.json();
// 4. Get words by level (e.g., B1 and above)
const wordsResponse = await fetch(
`http://localhost:8000/api/transcriptions/${transcriptionId}/words/?level=B1,B2,C1,C2`
);
const words = await wordsResponse.json();
console.log('Transcription:', transcription.text);
console.log('Advanced words:', words);
List endpoints use pagination with a default page size of 50 items.
Response Format:
{
"count": 150,
"next": "http://localhost:8000/api/words/?page=2",
"previous": null,
"results": [...]
}
Query Parameters:
- page - Page number (default: 1)
- page_size - Items per page (max: 100)
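The pagination parameters can be assembled with the standard `URL` API. The sketch below is an illustration only: `pageUrl` and `pageCount` are hypothetical helpers, and clamping `page_size` to 100 on the client is a convenience that mirrors the documented maximum.

```javascript
// Documented upper bound for page_size.
const MAX_PAGE_SIZE = 100;

// Builds a list-endpoint URL with page and (optionally) page_size set.
function pageUrl(endpoint, { page = 1, pageSize } = {}) {
  const url = new URL(endpoint);
  url.searchParams.set('page', String(Math.max(1, Math.floor(page))));
  if (pageSize !== undefined) {
    const clamped = Math.min(MAX_PAGE_SIZE, Math.max(1, Math.floor(pageSize)));
    url.searchParams.set('page_size', String(clamped));
  }
  return url.toString();
}

// Number of pages needed for `totalItems` (the `count` field of a response).
function pageCount(totalItems, pageSize = 50) {
  return Math.ceil(totalItems / pageSize);
}
```

With the sample response above (`count` of 150 and the default page size of 50), `pageCount(150)` gives 3 pages. In practice, following the `next` link until it is null avoids computing page numbers at all.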
- All timestamps are in UTC
- All durations are in seconds
- File paths are relative to MEDIA_ROOT
- Audio files are stored in /media/audio/YYYY/MM/DD/ format
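Since timestamps are UTC ISO 8601 strings and durations are plain seconds, a client can derive elapsed times and display strings with built-in functions. The helpers below are a hypothetical sketch; the `m:ss` display format is an arbitrary choice, not something the API prescribes.

```javascript
// Seconds elapsed between two ISO 8601 timestamps such as uploaded_at
// and processing_started_at.
function secondsBetween(startIso, endIso) {
  return (Date.parse(endIso) - Date.parse(startIso)) / 1000;
}

// Formats a duration in seconds as m:ss, dropping the fractional part.
function formatDuration(seconds) {
  const whole = Math.floor(seconds);
  const minutes = Math.floor(whole / 60);
  const secs = String(whole % 60).padStart(2, '0');
  return `${minutes}:${secs}`;
}
```

Using the sample values from the status and transcription responses, the wait between upload and processing start is 5 seconds, and a `duration` of 180.5 is displayed as `3:00`.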
API Version: 1.0
Last Updated: October 8, 2025