GSoC 2026 Candidate Submission: End-to-End Narrative Audio Pipeline #39

Open
meganho456 wants to merge 5 commits into humanai-foundation:master from meganho456:gsoc2026-narrative-audio
@meganho456

This PR contains my GSoC 2026 test submission for a complete narrative-audio workflow, including all required tasks and a bonus storytelling analysis component.

What’s included
Task 1: Audio Processing Pipeline
- Loads .wav recordings, normalizes audio, segments clips when needed, and extracts ML-ready features.
- Features include MFCCs, pitch, spectral centroid, RMS energy, and duration.
- Produces a structured feature dataset and normalized audio outputs.
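
As a rough illustration of the normalize, segment, and extract steps in Task 1, here is a minimal standard-library sketch. It is not the code from this PR: the function names, the 0.95 target peak, and the 1.5-second segment length are all assumptions chosen for the example, and a synthetic tone stands in for a loaded .wav file. Only duration and RMS energy are computed; MFCCs, pitch, and spectral centroid would normally come from an audio library, and the PR description does not say which one is used.

```python
import math

def peak_normalize(samples, target_peak=0.95):
    """Scale samples so the largest absolute value equals target_peak (assumed value)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent clip: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

def segment(samples, sample_rate, max_seconds):
    """Split a clip into consecutive chunks of at most max_seconds."""
    chunk = int(sample_rate * max_seconds)
    return [samples[i:i + chunk] for i in range(0, len(samples), chunk)]

def basic_features(samples, sample_rate):
    """Return duration in seconds and RMS energy for one segment."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n) if n else 0.0
    return {"duration_s": n / sample_rate, "rms": rms}

# A synthetic 2-second, 440 Hz tone at 8 kHz stands in for a real recording.
sr = 8000
tone = [0.25 * math.sin(2 * math.pi * 440 * t / sr) for t in range(2 * sr)]

normalized = peak_normalize(tone)
rows = [basic_features(seg, sr) for seg in segment(normalized, sr, max_seconds=1.5)]
```

Each entry in `rows` is one row of a feature table, so writing the list out with `csv.DictWriter` would give a feature CSV of the kind listed under the deliverables.
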
Task 2: Narrative Tone Classification
- Trains a neural-network classifier using labeled emotional-tone data.
- Uses a train/test split and reports evaluation metrics (accuracy, weighted F1, per-class report).
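
To make the reported metrics concrete, the sketch below computes accuracy, per-class F1, and support-weighted F1 from scratch. The tone labels and predictions are invented for illustration and are not results from this submission; the actual code may well rely on a metrics library instead.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred):
    """F1 per label, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def weighted_f1(y_true, y_pred):
    """Average of per-class F1, weighted by each class's count in y_true."""
    support = Counter(y_true)
    f1 = per_class_f1(y_true, y_pred)
    return sum(f1[label] * n for label, n in support.items()) / len(y_true)

# Hypothetical held-out labels and predictions (not from the PR).
y_true = ["calm", "calm", "tense", "joyful", "tense", "calm"]
y_pred = ["calm", "tense", "tense", "joyful", "tense", "calm"]

acc = accuracy(y_true, y_pred)
f1_by_class = per_class_f1(y_true, y_pred)
f1_weighted = weighted_f1(y_true, y_pred)
```

Weighting by support keeps a rare tone class from dominating the headline number, which is why weighted F1 is usually reported alongside the per-class breakdown rather than instead of it.
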
Task 3: AI-Based Transcription
- Implements batch transcription with Whisper.
- Exports transcripts to text format.
- Measures transcription quality on a subset using WER.
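
For readers unfamiliar with WER, it is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is one plain implementation under simple assumptions (lowercasing and whitespace tokenization only); the PR may normalize text differently or use a dedicated package, and the example sentences are made up.

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical reference and transcription output.
reference = "the old house stood on the hill"
hypothesis = "the old horse stood on hill"
error_rate = wer(reference, hypothesis)  # one substitution + one deletion over 7 words
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.
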
Task 4: Narrative Audio Retrieval
- Implements a retrieval prototype for narrative-style queries (e.g., calm narration, high-energy speech, dramatic dialogue).
- Combines structured filtering and semantic ranking to return relevant recordings.
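
One way to combine structured filtering with semantic ranking is to narrow the candidates by metadata first and then sort the survivors by embedding similarity. The sketch below assumes that design; the field names (`tone`, `rms`, `embedding`), the toy three-dimensional vectors, and the `retrieve` function are all hypothetical, and the PR description does not specify how its ranking is actually computed.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(clips, query_vec, tone=None, min_rms=0.0, top_k=3):
    """Filter clips on metadata, then rank the rest by similarity to the query."""
    candidates = [
        c for c in clips
        if (tone is None or c["tone"] == tone) and c["rms"] >= min_rms
    ]
    ranked = sorted(candidates,
                    key=lambda c: cosine(query_vec, c["embedding"]),
                    reverse=True)
    return [c["id"] for c in ranked[:top_k]]

# Toy index: in practice the embeddings would come from a text or audio encoder.
clips = [
    {"id": "a.wav", "tone": "calm",     "rms": 0.05, "embedding": [0.9, 0.1, 0.0]},
    {"id": "b.wav", "tone": "calm",     "rms": 0.07, "embedding": [0.6, 0.4, 0.1]},
    {"id": "c.wav", "tone": "excited",  "rms": 0.30, "embedding": [0.1, 0.9, 0.2]},
    {"id": "d.wav", "tone": "dramatic", "rms": 0.22, "embedding": [0.2, 0.3, 0.9]},
]

calm_results = retrieve(clips, [1.0, 0.0, 0.0], tone="calm")
energetic_results = retrieve(clips, [0.0, 1.0, 0.0], min_rms=0.2)
```

Filtering before ranking keeps the similarity computation cheap and guarantees that hard constraints, such as a minimum energy level, are never overridden by a high similarity score.
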
Bonus: Storytelling Audio Analysis
- Analyzes storytelling-oriented cues: pacing/pauses, pitch variation, energy dynamics, and sentence-length characteristics.
- Adds a heuristic storytelling score and ranks clips by storytelling-like expressiveness.
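
A heuristic score of this kind could be built by mapping each cue to the range 0 to 1 and averaging. The sketch below is only one possible formulation: the equal weighting, the 0.3 scaling constants, the silence threshold, and the sample clips are all assumptions made for illustration, not the formula used in this submission.

```python
from statistics import mean, pstdev

def clamp01(x):
    """Clip a value into the range [0, 1]."""
    return max(0.0, min(1.0, x))

def coeff_of_variation(values):
    """Standard deviation relative to the mean; 0.0 when the mean is not positive."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else 0.0

def storytelling_score(pitch_hz, frame_rms, sentence_lengths, silence_rms=0.02):
    """Average four cues: pauses, pitch variation, energy dynamics, sentence variety."""
    pause_ratio = sum(r < silence_rms for r in frame_rms) / len(frame_rms)
    cues = [
        clamp01(pause_ratio / 0.3),                    # share of near-silent frames
        clamp01(coeff_of_variation(pitch_hz) / 0.3),   # pitch variation
        clamp01(coeff_of_variation(frame_rms)),        # energy dynamics
        clamp01(coeff_of_variation(sentence_lengths)), # sentence-length variety
    ]
    return sum(cues) / len(cues)

# Hypothetical per-clip inputs: voiced-frame pitch (Hz), frame RMS, words per sentence.
clips = {
    "flat.wav": ([120.0] * 6, [0.10] * 6, [10, 10, 10]),
    "expressive.wav": ([110.0, 180.0, 95.0, 210.0, 130.0, 160.0],
                       [0.01, 0.20, 0.05, 0.30, 0.01, 0.15],
                       [4, 18, 9]),
}

scores = {name: storytelling_score(*inputs) for name, inputs in clips.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A monotone clip with no pauses and uniform sentences scores zero under this formulation, while a clip that varies on every cue approaches one, which gives a simple ordering for ranking clips by expressiveness.
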
Deliverables in this submission
- Full source code for Tasks 1–4 and the bonus task, plus run_pipeline, which chains all the tasks together
- Technical report PDF
- README with setup and run instructions
- Example output artifacts (feature CSVs, transcripts, analysis outputs)
