[#13] Add audio preprocessing for uploaded recordings by julien731 · Pull Request #33 · nimblehq/audio-transcriber

julien731 · 2026-03-14T09:26:44Z

Story: #13

Summary

Add an audio preprocessing pipeline that automatically cleans uploaded audio before transcription. The pipeline applies three stages: an 80 Hz high-pass filter (removes rumble), conservative noise reduction via noisereduce (prop_decrease=0.75), and loudness normalization to -23 LUFS via pyloudnorm. Preprocessing is enabled by default and can be toggled off per upload via a new checkbox in the upload form.
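For readers who want a concrete picture of the three stages, here is a minimal sketch. It is not the code in backend/services/audio_preprocessor.py: the function names, the filter order of 4, and the zero-phase filtering via sosfiltfilt are assumptions made for illustration. The 80 Hz cutoff, prop_decrease=0.75, stationary=False, and the -23 LUFS target are the values stated in this PR.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

HIGHPASS_CUTOFF_HZ = 80.0    # from the PR summary
NOISE_PROP_DECREASE = 0.75   # from the PR summary
TARGET_LUFS = -23.0          # from the PR summary


def to_mono(audio: np.ndarray) -> np.ndarray:
    """Average channels of a (frames, channels) array; pass mono through."""
    return audio.mean(axis=1) if audio.ndim == 2 else audio


def high_pass(audio: np.ndarray, sr: int, order: int = 4) -> np.ndarray:
    """Butterworth high-pass. The order and zero-phase filtering are assumptions."""
    sos = butter(order, HIGHPASS_CUTOFF_HZ, btype="highpass", fs=sr, output="sos")
    return sosfiltfilt(sos, audio)


def reduce_noise(audio: np.ndarray, sr: int) -> np.ndarray:
    """Conservative, non-stationary noise reduction via noisereduce."""
    import noisereduce as nr  # imported lazily so the filter stage works without it

    return nr.reduce_noise(
        y=audio, sr=sr, prop_decrease=NOISE_PROP_DECREASE, stationary=False
    )


def normalize_loudness(audio: np.ndarray, sr: int) -> np.ndarray:
    """Normalize integrated loudness to TARGET_LUFS via pyloudnorm.

    pyloudnorm needs audio longer than its 0.4 s measurement block.
    """
    import pyloudnorm as pyln  # imported lazily, same reason as above

    meter = pyln.Meter(sr)
    measured = meter.integrated_loudness(audio)
    return pyln.normalize.loudness(audio, measured, TARGET_LUFS)


def preprocess(audio: np.ndarray, sr: int) -> np.ndarray:
    """Run the three stages in the order listed in the summary."""
    audio = to_mono(audio)
    audio = high_pass(audio, sr)
    audio = reduce_noise(audio, sr)
    return normalize_loudness(audio, sr)
```

Filtering before noise reduction keeps low-frequency rumble out of the noise estimate, and normalizing last means the loudness target applies to the signal that actually reaches the transcriber.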

Key changes:

New backend/services/audio_preprocessor.py service with the three-stage pipeline
preprocess_audio boolean field added to MeetingMetadata (defaults to True)
Upload endpoint accepts preprocess_audio form parameter
Transcription pipeline calls preprocessor before WhisperX, using a working copy (original preserved)
Preprocessed file is cleaned up in a finally block after transcription
Upload form includes an "Audio preprocessing" checkbox with description
New dependencies: noisereduce>=3.0.0, pyloudnorm>=0.1.1

Approach

Preprocessing is implemented as a separate service module to keep the transcriber focused on transcription. The preprocessed audio is saved as a WAV working copy alongside the original file (preserving the original per T3). Conservative noise reduction settings are used because Whisper was trained on noisy data (BR-3/T1). The preprocessed copy is fed to both transcription and diarization (T2).
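The working-copy idea can be sketched as below. The file name audio_preprocessed.wav is the one mentioned in the QA comment; the helper names and the byte-level `process` callback are hypothetical stand-ins for the real audio pipeline, used only to show that the original file is read and never written.

```python
from pathlib import Path
from typing import Callable

PREPROCESSED_NAME = "audio_preprocessed.wav"  # name taken from the QA notes


def working_copy_path(original: Path) -> Path:
    """Place the working copy in the same directory as the original."""
    return original.with_name(PREPROCESSED_NAME)


def make_working_copy(
    original: Path, process: Callable[[bytes], bytes] = lambda data: data
) -> Path:
    """Write a processed copy next to the original without touching the original.

    `process` is a hypothetical placeholder for the real filter chain.
    """
    target = working_copy_path(original)
    if target == original:
        raise ValueError("refusing to overwrite the original upload")
    target.write_bytes(process(original.read_bytes()))
    return target
```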

Verification

119 tests passing (pytest -x -q)
Unit tests cover: preprocessor output path, original preservation, valid WAV output, stereo-to-mono conversion, spec constants
Integration tests cover: preprocess_audio defaults to true, can be disabled via form field
Ruff check and format pass
Architect review: no critical or major issues


julien731 · 2026-03-14T09:28:11Z

QA Confidence Verdict

What Was Verified

Truths:

| Truth | Status | Method |
| --- | --- | --- |
| T1: Noise reduction uses `prop_decrease=0.75` | PASS | Code inspection: `NOISE_PROP_DECREASE = 0.75` constant used in `nr.reduce_noise()`. Unit test `test_constants_match_spec` asserts this. |
| T2: Preprocessing applied to both transcription and diarization | PASS | Code inspection: `audio_path` is reassigned to preprocessed path before any WhisperX calls. All downstream steps (transcribe, align, diarize) use the same variable. |
| T3: Original uploaded file is preserved; preprocessing creates a working copy | PASS | Code inspection: preprocessed file saved as `audio_preprocessed.wav` alongside original. Unit test `test_preserves_original_file` verifies original is untouched. Preprocessed copy is cleaned up in `finally` block. |

Acceptance Criteria:

| AC | Status | Method |
| --- | --- | --- |
| High-pass filter, noise reduction, loudness normalization applied | PASS | Code inspection: all three stages implemented in `audio_preprocessor.py` (80Hz Butterworth, noisereduce, pyloudnorm to -23 LUFS) |
| Preprocessing enabled by default | PASS | `preprocess_audio: bool = True` in schema. Form default is `"true"`. Integration test confirms. |
| Upload form includes toggle to disable preprocessing | PASS | Code inspection: checkbox with id `preprocess-checkbox`, checked by default, wired through `api.js` to backend. |
| Conservative noise reduction does not degrade clean audio | PASS | Enforced by `prop_decrease=0.75` and `stationary=False` settings. |
| `preprocess_audio` persisted in MeetingMetadata | PASS | Pydantic field added. Integration tests verify persistence for both true and false values. |
| Dependencies: noisereduce>=3.0.0, pyloudnorm>=0.1.1 | PASS | Both added to `requirements.txt` with correct version constraints. |

Tests: All 30 tests pass (5 unit tests for preprocessor, 2 integration tests for preprocess_audio persistence).
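As a small illustration of the default-on behaviour checked above, the sketch below shows a Pydantic model with the new field and a helper that accepts the form value as a string. Only `preprocess_audio: bool = True` and the `"true"` form default come from this PR; the helper `metadata_from_form` is hypothetical, and the real MeetingMetadata model has other fields that are omitted here.

```python
from pydantic import BaseModel


class MeetingMetadata(BaseModel):
    # Only the field added in this PR is shown; other fields are omitted.
    preprocess_audio: bool = True


def metadata_from_form(preprocess_audio: str = "true") -> MeetingMetadata:
    """Hypothetical helper: build metadata from a raw form value.

    Pydantic coerces the strings "true" and "false" to booleans.
    """
    return MeetingMetadata(preprocess_audio=preprocess_audio)
```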

What Needs Human Eyes

Visual appearance of the preprocessing checkbox in the upload form (alignment, spacing, dark/light theme)
Preprocessing hint text readability

Risk Areas

No test exercises the full transcription pipeline with preprocessing enabled (transcriber tests set preprocess_audio=False). The integration between preprocessor and transcriber is verified by code inspection only.
No test verifies that the preprocessed file is cleaned up after transcription completes.
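One possible way to close the cleanup gap is to test the cleanup path with injected fakes, so no ML dependencies are needed. The sketch below is an assumption about how such a test could look: `run_pipeline` is a hypothetical stand-in for the real transcription function, and `fake_preprocess` and `failing_transcribe` are test doubles, not project code.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_pipeline(
    audio_path: Path,
    preprocess: Optional[Callable[[Path], Path]],
    transcribe: Callable[[Path], str],
) -> str:
    """Hypothetical pipeline: transcribe a working copy, then always delete it."""
    preprocessed: Optional[Path] = None
    try:
        if preprocess is not None:
            preprocessed = preprocess(audio_path)
            audio_path = preprocessed  # downstream steps see the working copy
        return transcribe(audio_path)
    finally:
        # Runs on success and on failure; the original upload is never removed.
        if preprocessed is not None:
            preprocessed.unlink(missing_ok=True)


def fake_preprocess(original: Path) -> Path:
    """Test double: copy the upload to the working-copy name."""
    target = original.with_name("audio_preprocessed.wav")
    shutil.copyfile(original, target)
    return target


def failing_transcribe(path: Path) -> str:
    """Test double: simulate a transcription crash."""
    raise RuntimeError("transcription failed")


def test_working_copy_removed_when_transcription_fails() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "upload.wav"
        original.write_bytes(b"audio")
        try:
            run_pipeline(original, fake_preprocess, failing_transcribe)
        except RuntimeError:
            pass
        assert original.exists()
        assert not (Path(tmp) / "audio_preprocessed.wav").exists()
```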

Suggested QA Focus

Quick glance: Upload form checkbox appearance in both themes
Functional test: Upload a file with preprocessing enabled and disabled, verify both complete successfully
Estimated effort: 10 minutes manual testing


julien731 added 8 commits March 14, 2026 15:20

#13 Add implementation plan for audio preprocessing

babee71

#13 Add preprocess_audio field to schema and new dependencies

1e09aa4

#13 Add audio preprocessor service with high-pass, noise reduction, loudness normalization

c6fe7e4

#13 Wire audio preprocessor into transcription pipeline

afcdfc1

#13 Accept preprocess_audio form field in upload endpoint

b5ccc53

#13 Add audio preprocessing toggle to upload form

bf70c53

#13 Add tests for audio preprocessing and preprocess_audio persistence

6c9131e

#13 Move preprocessed file cleanup to finally block

424fd75

julien731 self-assigned this Mar 14, 2026

julien731 added the feature New feature or enhancement label Mar 14, 2026

julien731 added 2 commits March 14, 2026 16:30

#13 Skip audio preprocessor tests when numpy is not installed

b468d07

CI does not install ML dependencies. Use pytest.importorskip to gracefully skip when numpy/soundfile/noisereduce are unavailable.
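A minimal sketch of the skip pattern, assuming a test that only needs the listed packages to be importable. The test name and body are illustrative and are not taken from the project's test suite; `pytest.importorskip` returns the imported module, or skips the test when the import fails.

```python
import pytest


def test_stereo_downmix_is_mono() -> None:
    # Each call returns the module, or skips this test if it is not installed.
    np = pytest.importorskip("numpy")
    pytest.importorskip("soundfile")
    pytest.importorskip("noisereduce")

    stereo = np.zeros((1000, 2), dtype=np.float32)
    mono = stereo.mean(axis=1)
    assert mono.ndim == 1
```

Calling `pytest.importorskip` at module level instead skips every test in the file, which suits a test module that cannot run at all without the ML stack.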

#13 Document preprocessing stage in progress UI and CLAUDE.md

f8c8b67

julien731 changed the title ~~#13 Add audio preprocessing for uploaded recordings~~ [#13] Add audio preprocessing for uploaded recordings Mar 16, 2026

#13 Convert non-WAV audio formats via ffmpeg before preprocessing

c872af1

soundfile (libsndfile) cannot read m4a/AAC files, causing preprocessing to fail on common upload formats. Convert unsupported formats to WAV using ffmpeg before applying filters.
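The conversion step might look roughly like the sketch below. The set of extensions treated as readable by soundfile, the `_converted.wav` suffix, and all function names are assumptions for illustration; the PR only states that unsupported formats are converted to WAV with ffmpeg before filtering. The flags used are standard ffmpeg options: `-y` overwrites the output, `-i` names the input, and `-vn` drops any video stream.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumption: extensions passed straight to soundfile; everything else is converted.
SOUNDFILE_READABLE = {".wav", ".flac", ".ogg"}


def needs_conversion(path: Path) -> bool:
    """True when the file should go through ffmpeg first (e.g. .m4a)."""
    return path.suffix.lower() not in SOUNDFILE_READABLE


def ffmpeg_command(src: Path, dst: Path) -> List[str]:
    """Build the ffmpeg invocation; the output format follows dst's extension."""
    return ["ffmpeg", "-y", "-i", str(src), "-vn", str(dst)]


def convert_to_wav(src: Path) -> Path:
    """Convert src to a WAV file next to it and return the new path.

    Requires the ffmpeg binary on PATH; raises CalledProcessError on failure.
    """
    dst = src.with_name(src.stem + "_converted.wav")  # hypothetical naming
    subprocess.run(ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```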

julien731 merged commit c547000 into develop Mar 17, 2026
2 checks passed

julien731 deleted the feature/13-audio-preprocessing branch March 17, 2026 06:51

julien731 merged 11 commits into develop from feature/13-audio-preprocessing
