
Data visualization opportunities

Scott Veirs edited this page Dec 15, 2021 · 34 revisions

Here we offer Orcasound data products that data scientists and bioacousticians may enjoy visualizing and analyzing. Acoustic bouts define a time series of valuable signals in the Orcasound audio data streams. Human and machine detections indicate when community scientists or automated algorithms hear signals of interest while listening to Orcasound live audio streams.

Acoustic bouts

These are periods of time when interesting signals were detected within the Orcasound live audio streams. Some were detected by human listeners via the Orcasound web app (version 2 launched in May 2020); others were detected by automated algorithms, like OrcaHello (deployed in Sep/Oct 2020). Each acoustic bout is first defined through a combination of human and machine detections, often contextualized by observations from local sighting networks. Bioacoustic experts then inspect the audio manually and usually extend the start and end times to include weak signals and background noise conditions before and after the event.

The primary focus is on SRKW bouts, but our archive includes bouts of signals from Bigg's killer whales, humpback whales, and other soniferous species of the Salish Sea.

Human detections

15 May 2021 data clip (~2000 rows)

Version 2 of the live-listening web app offered an interactive feature to community scientists: a button to press whenever they heard anything interesting. Free-text annotations were stored along with a datetime stamp (the time at which the tag was submitted). The datetime stamps in the database (and in the exported snapshots below) are stored in the UTC time zone. Note that the administrator UI of orcasite converts the timestamps to, and displays them in, the local time zone (e.g. Pacific Standard Time, or PST, for the Orcasound network, which is based in Washington State, U.S.A.).
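Because the exported timestamps are in UTC, a visualization of listener activity by local time of day needs a conversion step first. The following is a minimal sketch of that conversion using only the Python standard library. It assumes the timestamps are ISO 8601 strings without an explicit offset (the actual export format may differ), and the helper name `utc_to_pacific` is hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data (or the tzdata package)

# Washington State observes Pacific time (PST in winter, PDT in summer).
PACIFIC = ZoneInfo("America/Los_Angeles")


def utc_to_pacific(stamp: str) -> datetime:
    """Parse a timestamp stored in UTC and return it in Pacific local time.

    Assumption: `stamp` is an ISO 8601 string such as "2021-05-15 03:30:00".
    If it carries no offset, it is treated as UTC.
    """
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(PACIFIC)


# Illustrative value only, not a row from the database.
local = utc_to_pacific("2021-05-15 03:30:00")
print(local.isoformat())
```

Using a named zone rather than a fixed UTC-8 offset keeps summer detections correct, since daylight saving time shifts the offset to UTC-7.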

This is a ~9-month clip of the Heroku-hosted PostgreSQL database that holds these human detections. It was generated and given a preliminary analysis during a DemocracyLab hackathon associated with Western Governors University. The students have provided some tips on ingesting and processing these data in the hackathon project Google doc.

Machine detections

OrcaHello live inference system

  • OrcaHello dashboard
    • summarizes raw and moderated 60-second candidates
    • offers lists of tags and comments on positive vs negative candidates, with links to audio and spectrograms

16 Sep 2021 OrcaHello detection table snapshot (~3500 candidates)

  • Raw JSON (10 MB, 3476 rows)
  • Acquired quasi-manually using Microsoft Azure Storage Explorer [Cosmos DB Accounts (deprecated)]
  • (Cosmos DB = aifororcasmetadatastore; predictions --> metadata --> Documents; query "SELECT * FROM c")
  • 3457 total candidates: 2639 moderated; 818 unmoderated.
  • Of the 2639 moderated candidates: 2280 false positives (86%); 347 true positives (13%); 12 unknown (<1%).
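The percentages in this snapshot appear to be shares of the moderated candidates rather than of all candidates. The short sketch below recomputes them from the counts listed above as a consistency check. The variable names are illustrative and do not correspond to field names in the OrcaHello JSON.

```python
# Counts copied from the 16 Sep 2021 snapshot summary above.
moderated = {"false positive": 2280, "true positive": 347, "unknown": 12}
unmoderated = 818

moderated_total = sum(moderated.values())
all_candidates = moderated_total + unmoderated

# Share of each moderation outcome, as a percentage of moderated candidates.
shares = {
    label: 100 * count / moderated_total
    for label, count in moderated.items()
}

print(f"moderated: {moderated_total}, all candidates: {all_candidates}")
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}% of moderated candidates")
```

The same tally could be a starting point for a chart of how model precision changes over time, once candidates are grouped by date.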

API access

Moderated and unmoderated output from the real-time inference system (Cosmos database via Swagger)
