This framework is an autonomous, non-linear audio signal processing (ASP) engine specifically engineered to remediate the "acoustic pathologies" produced by transformer-based audio synthesis (e.g., Suno, Udio, Stable Audio).
Our pipeline systematically addresses:
- Metallic Vocoder Shimmer: High-frequency ringing (8-12kHz) common in latent diffusion decoders.
- Low-Mid Congestion: The "boxy" or "muddy" spectral profile (200-500Hz) inherent in over-compressed generative mixes.
- Spectral Holes: Inpainting missing high-frequency harmonics (16-24kHz) lost during generative bottlenecks.
- Rigid Quantization: Humanizing AI vocals by injecting 4.5Hz vibrato and noise-shaped masking.
We believe in "Noob-Friendly" engineering. The system is designed to be fully operational with zero manual configuration.
git clone https://github.com/mriridescent/aiMusic-PostProd.git
cd aiMusic-PostProd
./AUTOMATE.sh your_track.mp3This script isolates the environment, resolves 100% of dependencies (including Torch, Demucs, and Pedalboard), and executes the pipeline.
python wizard.pyA self-explaining wizard that validates your GPU (CUDA), FFmpeg installation, and Python environment.
- Engine: Meta AI's Demucs v4 (
htdemucs_ft). - Metric: 8.4 dB Signal-to-Distortion Ratio (SDR) for vocal extraction.
- Nuance: Hybrid time-frequency processing with 10s chunking/1s crossfade to prevent OOM on 8GB VRAM GPUs.
- Denoising:
DeepFilterNet3operating in the ERB domain (0.19 RTF). - Super-Resolution:
AudioSRlatent bridge models to synthesize missing harmonics up to 48kHz.
- Engine: Spotify Pedalboard (C++ JUCE wrapper).
- Performance: 300x faster than native Python; bypasses GIL for multi-threaded processing.
- Nuance: Dynamic EQ carving, 1-3ms lookahead compression, and "Vocal Naturalization".
- Framework:
smolagents(Hugging Face) using a ReAct (Reason + Act) loop. - Logic: The agent "listens" to spectral centroid and flatness metrics and autonomously reconfigures the mix parameters until target fidelity is reached.
For deep-dives into the research and operation of this system, please refer to:
- 📑 Granular Technical Specification
- 🧬 Research & Acoustic Pathology
- ⚙️ Operational Manual
- 📊 Interactive Architecture Infographic
Programmer & Visionary: David Akpoviroro Oke
Brand: MrIridescent (The Creative Renaissance Man)
Bridging the gap between avant-garde creative expression and high-performance software engineering.
Licensed under MIT. Built using Meta AI's Demucs, Spotify's Pedalboard, and Hugging Face's smolagents.
Designed for the future of autonomous music production.