Orca ML resources
Dave Thaler edited this page Dec 18, 2025 · 3 revisions
Here you'll find a list of efforts to develop machine learning models related to orca signals, followed by catch-all sections with other resources at the bottom.
A roughly-chronological list of open efforts to develop orca-related machine learning algorithms
- Erika's talk (fall 2018 start; ML4All: Apr 29, 2019)
- "ML models overfit to the training set and are unable to generalize to other sources of data..."
- We need users to help us label data by adding markers of orca sounds to the stream or archived files.
- Val's recent contributions
- Preparations for analyzing FLAC files from EC2 instance
- SRKW call classifier trained on Ford-Osborne call catalog, deployed to process continuous WAV files from Orcasound Lab
- Classifier output for July 5, 2019, SRKW data set
- Abhishek and Jesse (Google Summer of Code 2019 project on Bigg's & Alaskan resident killer whales... & humpbacks)
- OrcaCNN data - data samples & student recruitment
- OrcaCNN project - working repository
- AK orca descriptions
- 300 field recordings with pod ID from the Gulf of Alaska collected by Dan Olsen of North Gulf Oceanic Society in Homer
- Pod-specific call catalog available for pod inference
- Dan's main goal: improve autodetection of killer whales vs. humpbacks vs. boat noise, improving on PAMGuard whistle & moan results
- Secondary goals: distinguish ecotypes; find all occurrences of one specific call type from one pod.
- Hackathon steps forward with Orcasound guidance or data
- UW ocean data hack days (Valentina, Shima, Erica, +WhaleDr group)
- Nov 20, 2019 Ocean acoustic data hack day
- Jan 2019 SRKW RSN Oregon shelf search day
- DemocracyLab hackathon ML groups
- Fall, 2018 hackathons
- SRKW call classifier - trained on Ford-Osborne call catalog samples; implemented with pyAudioAnalysis
- Winter-Spring, 2019 hackathons
- July, 2019: [UW+ group's GitHub workspace](https://github.com/orcasound/orcadata/tree/master/hackathonJuly27) for the Summer Solve-a-thon
- Dec & Jan hackathons: Subarno-led ML (all orca signals)
- Microsoft 2019-2020 hackathon contributions
- Pod.Cast group
- OrcaHello group (with David Bain of Orca Conservancy)
- UW grad students
- Orcadata working directory
- Oct 26, 2019 (Jennifer, John, Wai Sing, Yuhao, Scott)
- Nov 16, 2019 (Jennifer, John, Wai Sing, Akash, Prakruti, Scott)
- Winter 2020 hack day?
- Canadian SRKW ML efforts
- Ocean Networks Canada (ONC)
- 2020: Kristen Kanes plans to publish KW training data set via Science Data
- SRKW call category standardization work group may begin to meet quarterly in 2021?
- Department of Fisheries and Oceans (DFO)
- 2019 DFO/Google/Rainforest Connection SRKW ML activities (internal, closed-source with Rainforest Connection as of 3/2020)
- 2020 DFO/Meridian ML activities (planned, open-source)
- Initial remote-only meetings in fall 2020
- 2-year postdoc starts Jan 2021
- Other
- 2021: Earth Species Project adds orca data to their library with labeled data stored via archive.org
- Praful and AI on the Beach (CA)
- EcoAcoustics (Tina Yack in CA) web site mentions "Development of an Acoustic Killer Whale Ecotype Classifier"
You may also create a page with more details about your classifier, data labeling efforts, etc.
- MobySound (annotated and unannotated data with signals from pinnipeds, mysticetes, and odontocetes [but not killer whales!])
- Watkins marine mammal sound database
- Labeling tools
Good precedents in building synergy between human and machine learning, especially with real-time audio data
- BirdNET - live bird call audio and classifier (browser-based, works best in Firefox, not Chrome)
- Listen to the Deep (LIDO) - real-time underwater sound & spectrogram with ML classifiers (Flash-based)
- For real-time Orcasound data (link to TBD guidance in various Orcasound repositories?)
- For archived Orcasound data (link to GSoC repo?)