Other training data: humpback whales
Over the winter of 2021-2022, Emily Vierling worked with Val and Scott Veirs as a Beam Reach "extern" (a mostly remote internship during COVID) to describe humpback signals in the open data from the Orcasound Lab hydrophones (Haro Strait, WA, USA). Drawing on her previous training with Helena Symonds and Paul Spong of OrcaLab, where she listened to humpbacks in Johnstone Strait (BC, Canada), Emily developed a new online Haro Humpback dictionary and annotated thousands of signals in Orcasound open data.
Emily first presented the catalogue (or catalog) at the DCLDE 2022 workshop in spring 2022. It contains the 12 signals that she found to be most common in recordings made primarily in the late fall (presumably of male humpback whales beginning to vocalize before leaving the Salish Sea for tropical wintertime habitat in Hawaii and/or Mexico). In addition to the 2022 version that Emily published via WordPress, the catalogue is shared via the signal-catalogue GitHub repo, where we hope new versions of the code can be maintained as a generic tool that lets the bioacoustic community build online and offline signal catalogues.
The 12 signal types in version 1.0 of this humpback signal dictionary are:
- Whup
- Grunt
- Ascending Moan
- Descending Moan
- Moan
- Upsweep
- Trumpet
- Growl
- Creak
- Buzz
- Shriek
- Chirp
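When working with the labels programmatically, it can help to check label strings against the 12 signal types above. The following is a minimal sketch, not part of the published dictionary or its code: the helper names `normalize_label` and `count_labels` are hypothetical, and the assumption that labels may differ in case or spacing is ours, not something documented for the annotation files.

```python
# The 12 signal types listed in version 1.0 of the dictionary.
SIGNAL_TYPES = (
    "Whup", "Grunt", "Ascending Moan", "Descending Moan", "Moan", "Upsweep",
    "Trumpet", "Growl", "Creak", "Buzz", "Shriek", "Chirp",
)

# Case-insensitive lookup from a normalized key to the canonical name.
_CANONICAL = {name.lower(): name for name in SIGNAL_TYPES}


def normalize_label(raw):
    """Return the canonical signal name for a raw label, or None if unknown.

    Hypothetical helper: collapses runs of whitespace and ignores case.
    """
    key = " ".join(raw.split()).lower()
    return _CANONICAL.get(key)


def count_labels(raw_labels):
    """Count labels per canonical signal type; unknown labels are tallied under None."""
    counts = {}
    for raw in raw_labels:
        name = normalize_label(raw)
        counts[name] = counts.get(name, 0) + 1
    return counts
```

For example, `count_labels(["whup", "Whup", "click"])` groups the first two labels under "Whup" and tallies "click" under `None`, which makes unexpected label values easy to spot.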
Emily's annotated data include ~9,000 labels and are based on ~YY hours of audio data from 20ZZ-2021. These labeled data are part of Orcasound's AWS open data registry and are freely available under Orcasound's Creative Commons license (CC BY-NC-SA). Please attribute any use of the dictionary and/or labeled data to: "Emily Vierling, 2022, Orcasound" with a link back to orcasound.net.
- License/data sharing agreement: Creative Commons license (CC BY-NC-SA)
- Data owner / source: Orcasound
- Location: Acoustic Sandbox (S3 bucket, part of the AWS open data registry)
- No. files: 7
- File length: 12-180 MB
- Time range: 03 Oct 2021 - 28 Oct 2021
- Dataset size: 993 MB
- Description: Version 1 of Haro Humpback bioacoustic bouts for annotation by Emily Vierling (winter-spring 2022)
- Coordinates:
- Water depth: 8 meters
- Format: FLAC
- Codec: FLAC
- Channels: 1
- Sample Rate: 44.1 kHz
- License/data sharing agreement
- Annotator: Emily Vierling
- Method (manual or semi-manual): manual
- Detector (if applicable): N/A
- Filelist
- Granularity (call, file, encounter)
- Resolution (species, ecotype, call type, etc.): species, possibly individual(s) in some cases, depending on sightings data
- Columns (for each column provide description of content and possible values)
We are also sharing the training data that Val developed based on Emily's work. It includes fixed-window audio clips and associated spectrograms. Preliminary documentation of his efforts can be found in the signal-annotation GitHub repo.
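To illustrate what a fixed-window clip might look like, here is a minimal sketch of one way to turn an annotation's start and end times into a clip of constant length. It is not Val's actual method, which is documented in the repo mentioned above. The function name `fixed_window`, the centering on the annotation midpoint, and the 3-second window used in the example are all assumptions made for illustration; only the 44.1 kHz sample rate comes from the dataset description.

```python
SAMPLE_RATE_HZ = 44_100  # from the dataset description (mono, 44.1 kHz)


def fixed_window(start_s, end_s, window_s, file_duration_s,
                 sample_rate=SAMPLE_RATE_HZ):
    """Return (first_sample, end_sample_exclusive) for a fixed-length clip.

    The window is centered on the annotation midpoint and then shifted,
    if needed, so that it stays inside the file. All times are in seconds.
    """
    n_window = int(round(window_s * sample_rate))
    n_file = int(file_duration_s * sample_rate)
    if n_window <= 0 or n_window > n_file:
        raise ValueError("window must be positive and no longer than the file")

    midpoint_s = (start_s + end_s) / 2.0
    first = int(round(midpoint_s * sample_rate)) - n_window // 2
    # Shift the window rather than truncating it, so every clip has the same length.
    first = max(0, min(first, n_file - n_window))
    return first, first + n_window
```

With a hypothetical 3-second window in a 60-second file, an annotation spanning 10.0 to 11.0 s yields samples 396900 to 529200 (9.0 to 12.0 s). Annotations near the start or end of a file get a shifted window instead of a shorter one, which keeps all clips and their spectrograms the same size.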