
Tools for helping the extraction of features from an audioset


fabiofelix/AudioTools


This code provides routines to extract acoustic features from the audio files inside a given directory. It is written in R, on top of well-known packages, with a wrapper to invoke the routines from Python. It works with the mp3, wav, and flac file formats. Some of my research has used this R code, in various versions, to study natural sound patterns, such as [1], [2], and [3].

Features

Available features and their references are grouped by their respective R packages.

  • Acoustic Complexity Index (ACI) [ref]

  • Acoustic Diversity Index (ADI) [ref]

  • Acoustic Evenness Index (AEI) [ref]

  • Bioacoustic Index (BIO) [ref]

  • Normalized Difference Soundscape Index (NDSI) [ref]

  • Acoustic Entropy Index (H), Temporal Entropy (Ht), and Frequency Entropy (Hf) [ref]

  • Acoustic Richness (AR) and Median of Amplitude Envelope (M) [ref]

  • Number of Peaks (NP) [ref]

  • Roughness [ref]

  • Rugosity [ref]

  • Zero-crossing rate (ZCR) [ref]

  • Mel-frequency Cepstrum Coefficients (MFCC) [ref]

others

  • Background noise index (BGN) [ref]

  • Power Spectral Density (PSD) [ref]

  • Signal-to-noise Ratio (SNR) [ref]

  • Mean Square Pressure (MSP) and Sound Pressure Level (SPL) [ref]

  • Root Mean Square (RMS) [ref]

The code also generates spectrogram images of the recordings.
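To give a feel for what two of the simpler listed features measure, here is a minimal Python sketch of RMS and zero-crossing rate using their textbook definitions. It is an illustration only: the function names are hypothetical, and the R packages behind this repository may use different windowing or normalization.

```python
import math

def rms(samples):
    """Root mean square of a sequence of amplitude samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

# Synthetic 440 Hz tone, 1 second at 8 kHz (illustrative values only).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
print(rms(tone), zero_crossing_rate(tone))
```

For a unit-amplitude sine sampled over whole periods, the RMS is close to the square root of 0.5, which makes a quick sanity check.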

Prerequisites

In addition to R and the packages above, you need to install doParallel, which is used for parallel processing to reduce run time. To use the Python wrapper, you must also install the rpy2 package.
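Before using the Python wrapper, it can help to confirm that the prerequisites are visible from Python. The sketch below is not part of this repository: `check_prerequisites` is a hypothetical helper, and it assumes that finding `Rscript` on the PATH is a reasonable sign that R is installed. It does not check R packages such as doParallel; inside R, `requireNamespace("doParallel")` can do that.

```python
import importlib.util
import shutil

def check_prerequisites():
    """Report which prerequisites appear to be available on this machine."""
    return {
        # Rscript ships with R; finding it on PATH suggests R is installed.
        "R": shutil.which("Rscript") is not None,
        # find_spec returns None when a top-level module cannot be found.
        "rpy2": importlib.util.find_spec("rpy2") is not None,
    }

for name, available in check_prerequisites().items():
    print(f"{name}: {'found' if available else 'missing'}")
```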

Main functions

These functions are available in R to extract features and generate spectrograms of recordings.

#Takes a directory and processes all recordings inside it.
process.dir <- function(
  source.path, #directory with the recordings to be processed
  target.path, #directory where results will be saved (default = source.path)
  aquatic, #whether the audio files come from an underwater landscape (default = FALSE). Depending on the landscape, some routines use different parameters.
  generate, #output type: indices ("index") or spectrograms ("spec") (default = "index")
  slice.size, #length, in seconds, of the slices each recording is divided into (default = 1). A slice must not be longer than 90 s.
  batch.size, #save index results in batches of batch.size (default = 100)
  img.dim, #if generate = "spec", the image size (default = c(1366, 758))
  palette #if generate = "spec", the image color palette (default = spectro.colors)
)
#Processes a specific recording.
process.file <- function(
  path, #recording to be processed
  spec.path, #if generate = "spec", where the image will be saved (default = NULL, inside path)
  aquatic, #whether the audio file comes from an underwater landscape (default = FALSE). Depending on the landscape, some routines use different parameters.
  generate, #output type: indices ("index") or spectrograms ("spec") (default = "index")
  slice.size, #length, in seconds, of the slices the recording is divided into (default = 1). A slice must not be longer than 90 s.
  img.dim, #if generate = "spec", the image size (default = c(1366, 758))
  palette, #if generate = "spec", the image color palette (default = spectro.colors)
  start.parallel #if TRUE, process the file in parallel mode (default = FALSE)
)

The Python code provides the class AcousticIndices, whose methods have the same meaning as the R functions above:

process_dir(
  source_path, 
  target_path = None, 
  aquatic = False, 
  slice_size = 1)
process_file(
  path, 
  aquatic = False, 
  slice_size = 1)
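The slicing controlled by `slice_size` can be pictured with a short sketch. This is not the repository's implementation: `slice_bounds` is a hypothetical helper that treats `slice_size` as a length in seconds, enforces the 90-second limit from the parameter description, and assumes that a shorter final slice is kept rather than dropped.

```python
import math

MAX_SLICE_SECONDS = 90  # upper limit stated in the parameter description

def slice_bounds(duration, slice_size=1):
    """Return (start, end) times in seconds for consecutive slices."""
    if not 0 < slice_size <= MAX_SLICE_SECONDS:
        raise ValueError("slice_size must be greater than 0 and at most 90 seconds")
    if duration <= 0:
        raise ValueError("duration must be positive")
    count = math.ceil(duration / slice_size)
    bounds = []
    for i in range(count):
        start = i * slice_size
        # Assumption: the last slice is truncated at the end of the recording.
        end = min(start + slice_size, duration)
        bounds.append((start, end))
    return bounds

# A 5-second recording cut into 2-second slices.
print(slice_bounds(5, slice_size=2))
```

Under these assumptions, each returned pair would correspond to one row of features in the output.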

Results

The functions generate a feature set of 35 values plus one identifier for each slice defined by slice.size. Each index is represented by a single value; the functions also return the NDSI components (anthrophony and biophony), the mean and standard deviation of the PSD, and 12 MFCCs. For stereo recordings, the functions return the mean of the two channels.
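As an illustration of how the NDSI relates to its two components, and of one way a stereo mean can be taken, consider the sketch below. The NDSI formula is the standard one; the helper names are hypothetical, the input values are arbitrary, and averaging per-channel index values (rather than averaging the signals first) is an assumption about what "mean of the two channels" means here.

```python
def ndsi(anthrophony, biophony):
    """NDSI = (biophony - anthrophony) / (biophony + anthrophony)."""
    total = biophony + anthrophony
    if total == 0:
        raise ValueError("components must not both be zero")
    return (biophony - anthrophony) / total

def stereo_mean(left_value, right_value):
    """Average an index computed separately on each channel (assumed behavior)."""
    return (left_value + right_value) / 2

# Arbitrary example component values for the two channels.
left = ndsi(anthrophony=1.0, biophony=3.0)
right = ndsi(anthrophony=2.0, biophony=2.0)
print(left, right, stereo_mean(left, right))
```

With non-negative components the index falls between -1 (only anthrophony) and 1 (only biophony).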

The functions return a matrix of features and write a CSV file to target.path. Spectrograms are saved in a subdirectory of target.path.

The function process.dir generates log files to track its progress. If the index calculation stops for any reason, call the function again and it will use the log files to resume where it left off.
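The general idea of resuming from a log can be sketched as follows. This is a generic pattern, not the log format or logic that process.dir actually uses: `process_with_log`, the one-name-per-line log, and the file names are all hypothetical.

```python
import os
import tempfile

def process_with_log(file_names, log_path, process):
    """Process each file once, skipping names already recorded in log_path."""
    done = set()
    if os.path.exists(log_path):
        with open(log_path) as log:
            done = {line.strip() for line in log if line.strip()}
    processed_now = []
    with open(log_path, "a") as log:
        for name in file_names:
            if name in done:
                continue  # finished in an earlier run
            process(name)
            # Record completion only after the work succeeded.
            log.write(name + "\n")
            log.flush()
            processed_now.append(name)
    return processed_now

# Two runs against the same log: the second run only handles the new file.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "progress.log")
    first = process_with_log(["a.wav", "b.wav"], log_path, print)
    second = process_with_log(["a.wav", "b.wav", "c.wav"], log_path, print)
    print(first, second)
```

Writing the log entry only after a file finishes means an interrupted file is simply processed again on the next run.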

Examples

source("indices.R")

#Processes all recordings inside a directory and returns a matrix with the results
files = process.dir("/home/user/recordings")

#Processes a specific recording and returns a matrix with the results
one_file = process.file("/home/user/recordings/file.wav")

#Divides recordings into 2-second clips before calculating indices
process.dir("/home/user/recordings", slice.size = 2)

#Divides recordings into 2-second clips before generating spectrograms
process.dir("/home/user/recordings", slice.size = 2, generate = "spec")

#Python wrapper: processes a directory and a single recording
from indices import AcousticIndices

extractor = AcousticIndices()

files = extractor.process_dir("/home/user/recordings")

one_file = extractor.process_file("/home/user/recordings/file.wav")

Contact
