# Agile Modeling with Perch-Hoplite

This directory contains tools for Agile bird song modeling with Perch-Hoplite.
These tools are intended to support embedding large audio datasets, adding
labels and metadata, and training and evaluating audio classifiers.

## Data Organization

The embedding pipeline assumes that audio files are organized into directories,
where each top-level directory within the `base_path` represents a
**deployment**. For example, with a directory structure like:

```
my_dataset/
├── deployment_A/
│   ├── recording01.wav
│   └── recording02.wav
├── deployment_B/
│   └── recording03.wav
...
```

`deployment_A` and `deployment_B` will be treated as deployment names.

**Recordings** are identified by their relative path from the `base_path`,
including the deployment directory (e.g., `deployment_A/recording01.wav`).
This relative path serves as the `file_id` for recordings when linking
metadata or annotations.
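
This path convention can be sketched with the standard library alone. The
snippet below is a minimal illustration of the convention, not Perch-Hoplite
code, and the paths are hypothetical:

```python
from pathlib import Path

def describe_recording(base_path: Path, audio_path: Path) -> tuple[str, str]:
    """Derive (deployment, file_id) for an audio file under base_path."""
    relative = audio_path.relative_to(base_path)
    deployment = relative.parts[0]  # top-level directory = deployment name
    file_id = relative.as_posix()   # relative path = file_id
    return deployment, file_id

base = Path("my_dataset")
deployment, file_id = describe_recording(
    base, base / "deployment_A" / "recording01.wav")
# deployment == "deployment_A", file_id == "deployment_A/recording01.wav"
```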

## Adding Metadata to the Hoplite Database

The Agile embedding pipeline supports adding metadata to deployments and
recordings in the Hoplite database. Metadata is loaded from CSV files
located in the `base_path` of each `AudioSourceConfig`.

To add metadata, create the following three files in the root of your dataset
directory:

1. **`metadata_description.csv`**: This file describes the metadata fields you
   want to add. It should contain the following columns:
   * `field_name`: The name of the metadata field (e.g., `habitat`).
   * `metadata_level`: The level at which the metadata applies, either
     `deployment` or `recording`.
   * `type`: The data type of the field. Supported types are `str`, `float`,
     `int`, and `bytes`.
   * `description`: An optional description of the field.

2. **`deployments_metadata.csv`**: This file contains metadata for each
   deployment. The first column must be the deployment identifier (which
   corresponds to the directory name when audio files follow the
   `deployment/recording.wav` layout), and subsequent columns should match
   `field_name`s from `metadata_description.csv` where `metadata_level` is
   `deployment`.

3. **`recordings_metadata.csv`**: This file contains metadata for each
   recording. The first column must be the recording identifier (e.g.
   `deployment/recording.wav`), and subsequent columns should match
   `field_name`s from `metadata_description.csv` where `metadata_level` is
   `recording`.

### Example

**`metadata_description.csv`**

```csv
field_name,metadata_level,type,description
deployment_name,deployment,str,Deployment identifier.
habitat,deployment,str,Habitat type.
latitude,deployment,float,Deployment latitude.
file_id,recording,str,Recording identifier.
mic_type,recording,str,Microphone type.
```

**`deployments_metadata.csv`**

```csv
deployment_name,habitat,latitude
DEP01,"forest",47.6
DEP02,"grassland",45.1
```

**`recordings_metadata.csv`**

```csv
file_id,mic_type
DEP01/rec001.wav,"MicA"
DEP01/rec002.wav,"MicB"
DEP02/rec001.wav,"MicA"
```

When `EmbedWorker.process_all()` is run, it will detect these files, load the
metadata, and insert it into the database alongside new deployments and
recordings. Metadata fields can then be accessed as attributes on `Deployment`
and `Recording` objects returned by the database interface (e.g.,
`deployment.habitat`, `recording.mic_type`).
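
A common failure mode is a column in a metadata CSV that has no matching entry
in `metadata_description.csv`. The sketch below cross-checks the headers before
running the pipeline; it uses only the standard library, and the validation
logic is illustrative rather than part of Perch-Hoplite:

```python
import csv
from pathlib import Path

def check_metadata_columns(base_path: Path) -> None:
    """Verify metadata CSV headers match the fields in metadata_description.csv."""
    with open(base_path / "metadata_description.csv", newline="") as f:
        rows = list(csv.DictReader(f))
    # Collect the described field names for each metadata level.
    expected = {
        level: {r["field_name"] for r in rows if r["metadata_level"] == level}
        for level in ("deployment", "recording")
    }
    for level, filename in (("deployment", "deployments_metadata.csv"),
                            ("recording", "recordings_metadata.csv")):
        with open(base_path / filename, newline="") as f:
            header = set(next(csv.reader(f)))
        unknown = header - expected[level]
        if unknown:
            raise ValueError(f"{filename}: fields not described: {sorted(unknown)}")
```

Run against the example files above, this passes; renaming `mic_type` to an
undescribed column would raise a `ValueError` naming the offending file.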

## Adding Annotations

If you have existing annotations for your audio data, Hoplite can ingest these
during the embedding process. Annotations should be stored in CSV files named
`annotations.csv` alongside your audio data. Each `annotations.csv` should
contain columns for `recording` (the filename or `file_id` of the audio),
`start_offset_s`, `end_offset_s`, `label`, and `label_type` (`positive`,
`negative`, or `uncertain`). When embeddings are generated, Hoplite will find
any relevant annotations and add them to the database, associating them with
the appropriate time windows.

### Example

**`annotations.csv`**

```csv
recording,start_offset_s,end_offset_s,label,label_type
DEP01/rec001.wav,10.0,15.0,MyBird,positive
DEP01/rec001.wav,20.0,25.0,OtherBird,negative
DEP02/rec001.wav,5.0,10.0,MyBird,positive
```
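
At its core, associating an annotation with an embedding window is an
interval-overlap check. The sketch below shows one plausible matching rule
using the example rows above; the exact rule Hoplite applies may differ:

```python
def overlapping_annotations(annotations, window_start_s, window_end_s):
    """Return annotations whose span overlaps the half-open window [start, end)."""
    return [a for a in annotations
            if a["start_offset_s"] < window_end_s
            and a["end_offset_s"] > window_start_s]

annotations = [
    {"start_offset_s": 10.0, "end_offset_s": 15.0,
     "label": "MyBird", "label_type": "positive"},
    {"start_offset_s": 20.0, "end_offset_s": 25.0,
     "label": "OtherBird", "label_type": "negative"},
]
hits = overlapping_annotations(annotations, 12.0, 17.0)
# hits contains only the MyBird annotation
```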

## Colab Notebooks

This directory includes Colab notebooks to guide users through embedding audio,
adding annotations, and training agile classifiers.

These notebooks are designed for use in Google Colab and make use of interactive
forms (e.g., dropdowns and text fields) via cell parameters (`#@param`).
While developed for Colab, the notebooks are also compatible with standard
Jupyter environments, although the interactive form elements will not be
rendered.

The notebooks provided are:

* **`01_embed_audio.ipynb`**: This notebook guides you through the process of
  embedding audio files from a dataset using a specified pre-trained model
  (e.g., Perch v2, BirdNET) and saving them into a Hoplite database. It
  handles dataset configuration, database initialization, and running the
  embedding process.
* **`02_agile_modeling.ipynb`**: This notebook focuses on the interactive
  modeling process. It allows you to search the embedding database using
  example audio, display search results, label data as positive or negative,
  and then train or retrain a simple linear classifier based on these labels.
  You can also use the trained classifier to run inference or perform
  margin-based sampling to find examples for further annotation.
* **`03_call_density.ipynb`**: This notebook shows how to use Hoplite to
  compute aggregate call density statistics, which can act as an indicator
  of species abundance in many cases
  (as described in https://arxiv.org/abs/2402.15360).
* **`99_migrate_db.ipynb`**: A utility notebook for migrating Hoplite
  databases created with `perch-hoplite < 1.0` to the format used by
  `perch-hoplite >= 1.0`.