musikalkemist/pytorchforaudio

PytorchForAudio

Code for the "PyTorch for Audio + Music Processing" series on The Sound of AI YouTube channel.

This repository is a comprehensive collection of resources and code for understanding and implementing deep learning models for audio tasks using PyTorch and Torchaudio. It serves as a practical guide, moving from basic neural network implementations to building a complete sound classification system (CNN) trained on the UrbanSound8K dataset.

Badges: Maintained | Python 3.11 | PyTorch | Torchaudio | Pandas | License

Note on Versioning

While this v2 release is fully functional and optimized for current environments, it may differ from the original version shown in the course. The codebase has been updated to reflect modern best practices and improved dependency management. Consequently, the original course version has been deprecated; however, it remains available in the legacy branch for those wishing to follow the video content exactly.


Dataset Setup (UrbanSound8K)

To run the sound classification lessons (8-10), you will need the UrbanSound8K dataset. We provide an automated downloader to handle the acquisition, path sanitization, and folder organization for you.

  • Quick Start: Run python dataset_downloader.py from the repository root.
  • Options: Pass the --COPY flag to preserve your Kaggle cache.

Full Instructions: See the Instructions UrbanSound8K file for help with the downloader script or for manual download steps.
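Once the dataset is in place, a quick sanity check can confirm that the metadata is readable. The sketch below is not part of the repository. It assumes the standard UrbanSound8K layout (a metadata/UrbanSound8K.csv file with fold and slice_file_name columns, and clips stored under audio/fold<N>/). The UrbanSound8K root path and the helper names clip_path and clips_per_fold are hypothetical, so adjust them to wherever the downloader placed the files.

```python
import csv
from collections import Counter
from pathlib import Path


def clip_path(dataset_root, row):
    """Build the path to one clip from a metadata row.

    Assumes the standard UrbanSound8K layout: audio/fold<N>/<slice_file_name>.
    """
    return Path(dataset_root) / "audio" / f"fold{row['fold']}" / row["slice_file_name"]


def clips_per_fold(metadata_csv):
    """Count how many clips the metadata file lists for each fold."""
    with open(metadata_csv, newline="") as f:
        return Counter(int(row["fold"]) for row in csv.DictReader(f))


if __name__ == "__main__":
    root = Path("UrbanSound8K")  # hypothetical location; adjust to your setup
    csv_file = root / "metadata" / "UrbanSound8K.csv"
    if csv_file.exists():
        for fold, count in sorted(clips_per_fold(csv_file).items()):
            print(f"fold{fold}: {count} clips")
    else:
        print(f"Metadata file not found at {csv_file}")
```

If the script reports that the metadata file is missing, the dataset root probably differs from the assumed path.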

Course Structure

Introduction & Basics

  1. Course Overview: Video | Slides
  2. Implementing and Training a Neural Network: Video | Code
  3. Making Predictions with PyTorch Models: Video | Code

Audio Data Processing

  4. Custom Audio PyTorch Dataset: Video | Code
  5. Extracting Mel Spectrograms: Video | Code
  6. Pre-processing Audio (Padding/Truncating): Video | Code
  7. Pre-processing on GPU: Video | Code
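As a rough illustration of the padding/truncating step, the sketch below forces every signal to a fixed number of samples so that clips can be batched. It is a minimal example, not the repository's code: the function name fix_length is hypothetical, and it assumes signals are tensors shaped (channels, samples). Truncation keeps the leading samples and padding appends zeros on the right.

```python
import torch
import torch.nn.functional as F


def fix_length(signal: torch.Tensor, num_samples: int) -> torch.Tensor:
    """Return `signal` with exactly `num_samples` samples along the last dimension."""
    length = signal.shape[-1]
    if length > num_samples:
        # Too long: keep only the first num_samples samples.
        signal = signal[..., :num_samples]
    elif length < num_samples:
        # Too short: append zeros (silence) on the right of the last dimension.
        signal = F.pad(signal, (0, num_samples - length))
    return signal


if __name__ == "__main__":
    short = torch.ones(1, 5)   # (channels, samples)
    long = torch.ones(1, 12)
    print(fix_length(short, 8).shape)
    print(fix_length(long, 8).shape)
```

Because the function only uses slicing and torch.nn.functional.pad, it should work unchanged on tensors that already live on a GPU.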

Sound Classification Project (UrbanSound8K)

  8. Implementing a CNN for Sound Classification: Video | Code
  9. Training a Sound Classifier: Video | Code
  10. Predictions with a Sound Classifier: Video | Code

How to Run the Scripts

To ensure the models and scripts execute correctly, please follow these steps from your terminal:

1. Prepare the Environment (Recommended)

Before running any script, install the required dependencies:

pip install -r requirements.txt

2. Navigate to the Lesson Folder

Each class is self-contained. Move into the specific directory for the lesson you are studying:

cd 'class/folder/name'  # Replace with the specific directory path (ensure it is enclosed in quotes).

3. Execute the Script

Run the main script using Python:

python inference.py  # Replace with the specific script name
