
Web app for visualizing language embeddings in 3D space using the novel MCPM metric.


harryhancock/PolyGlot_AcousticBiologists

 
 


GainForest EcoHackathon - Team: Acoustic Biologists 🦜

Check out the app here: https://harryhancock.github.io/PolyGlot_AcousticBiologists/

This hackathon project is an extension of the Polyglot app, part of the PolyPhy toolkit of network-inspired data science tools (for background on the PolyPhy hub, see here). We modify the original app to display a point cloud of bird song data (with audio playback available!).

Background and Methodology

Functionally, the app enables domain experts to better understand the representation space of bioacoustic samples. In particular, we compute 128-dimensional vector representations of bird songs from the Amazon rainforest using MFCCs (Mel-frequency cepstral coefficients). Of course, such a representation space is impossible for humans to visualize. Instead, we rely on UMAP to create 3D vector representations, which are displayed in an interactive point cloud. However, embedding methods cannot preserve all the information in the original high-dimensional space. Thus, we rely on the notion of "anchor points" to recover some of that information. Read on for a deeper dive into anchor points!

Anchor Points and Recovering High Dimensional Information

Anchor points are a randomly chosen subset of the original dataset. For each anchor point, we take its 128-dimensional vector representation and compute the Euclidean similarity between it and every other point in the dataset.

Let's get a little more concrete. Denote the entire dataset of 128-dimensional vectors as {D} and our set of anchor points as {A}.

For each anchor point p in {A}
   For each point x in {D - p}  # i.e., the entire dataset except the current anchor
      Compute euclidean_dist(p, x)

In our case we had roughly 3500 points in total and just 350 anchor points. Thus, we end up computing the Euclidean distance roughly 350 * 3499 = 1,224,650 times!
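
The loop above can be sketched in Python roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `anchor_distances`, the fixed seeds, and the small random dataset are assumptions made for the example.

```python
import math
import random


def anchor_distances(dataset, num_anchors, seed=0):
    """Pick random anchors and measure their distance to every other point.

    Returns {anchor_index: {other_index: euclidean_distance}}.
    """
    rng = random.Random(seed)
    anchor_ids = rng.sample(range(len(dataset)), num_anchors)
    distances = {}
    for a in anchor_ids:
        distances[a] = {
            i: math.dist(dataset[a], x)
            for i, x in enumerate(dataset)
            if i != a  # skip the anchor itself
        }
    return distances


# Toy stand-in for the real data: 40 random 128-dimensional vectors.
data_rng = random.Random(42)
toy_dataset = [[data_rng.random() for _ in range(128)] for _ in range(40)]
toy_distances = anchor_distances(toy_dataset, num_anchors=4)

# 4 anchors * 39 other points = 156 distance computations.
num_computed = sum(len(row) for row in toy_distances.values())
print(num_computed)  # 156
```

The same counting rule, applied to 350 anchors and 3,500 points, gives the 350 * 3499 figure quoted above.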

So, what's the point of this? Well, in the app, users can switch between anchor points. For the selected anchor point, the entire 3D point cloud is colored so that points with high Euclidean similarity to the anchor are brighter (red/pink) and points with lower Euclidean similarity are darker (purple/blue). In other words, users can see exactly where the most similar points (the bright red/pink ones) sit in the embedding. This lets us peek into the mystical 128-dimensional space, all while remaining in our comfortable 3D universe!
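
One way to picture the coloring step is as a two-stage mapping: distances are rescaled into a similarity score in [0, 1], and that score is blended between a dark and a bright color. The sketch below is only an illustration under assumptions. The app's actual normalization and palette are not described here, so `to_similarity`, `similarity_to_rgb`, and the two RGB endpoints are hypothetical.

```python
def to_similarity(distances):
    """Rescale distances so the closest point scores 1.0 and the farthest 0.0.

    Assumed min-max normalization; the app may use a different formula.
    """
    lo, hi = min(distances), max(distances)
    if hi == lo:  # all points equally far: treat them as equally similar
        return [1.0 for _ in distances]
    return [(hi - d) / (hi - lo) for d in distances]


# Hypothetical palette endpoints (RGB, 0-255).
DARK_BLUE = (40, 20, 120)     # low similarity
BRIGHT_PINK = (255, 80, 140)  # high similarity


def similarity_to_rgb(s):
    """Linearly blend from DARK_BLUE (s=0) to BRIGHT_PINK (s=1)."""
    return tuple(
        round(lo + s * (hi - lo)) for lo, hi in zip(DARK_BLUE, BRIGHT_PINK)
    )


sims = to_similarity([2.0, 5.0, 8.0])
colors = [similarity_to_rgb(s) for s in sims]
print(sims)  # [1.0, 0.5, 0.0]
print(colors)
```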

How to Use & Features

1. Double click any anchor point (yellow points) to change the anchor (you will see the rest of the point cloud change color).


2. If you find a point you want to listen to, hit Control to freeze the tooltip in place. Hit Control again to go back to hovering. Note: If the tooltip does not go away after pressing Control, try clicking outside of the play/pause button first and hitting Control again.


3. Press and hold Shift to see only the anchor points (all other points turn white). If you wish to disable this feature (e.g., to take a screenshot), go to "Visual Parameters" > "Dim When Shift" and uncheck the box.


4. In case of colorblindness or trouble distinguishing the colors in the point cloud, you can also use the "Filter Euclidean" slider to filter out the least similar points (so only the most similar remain visible).


5. Open the "Search/Jump to Anchor" folder to search for any species you'd like. The "Zoom to Point" dropdown will let you change the center of the point cloud to the point you choose (just zoom in and you will see it in green!). The "Select Anchor" dropdown is just another way to change the anchor point (in case you do not know the location of your desired point and cannot double click it).

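
Conceptually, the "Filter Euclidean" slider from step 4 acts as a threshold on similarity. A minimal sketch, assuming scores are normalized to [0, 1] and that the slider value is a lower cutoff (both are assumptions, and `filter_by_similarity` is a hypothetical helper, not a function from the app):

```python
def filter_by_similarity(similarities, cutoff):
    """Return indices of points whose similarity is at least `cutoff`."""
    return [i for i, s in enumerate(similarities) if s >= cutoff]


scores = [0.95, 0.10, 0.72, 0.40, 0.88]
visible = filter_by_similarity(scores, cutoff=0.7)
print(visible)  # [0, 2, 4]
```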

Data Explainer

In case the data in the app is useful to anyone, we provide an explainer here on how it is organized. First, open the "data" folder contained in this repo. You will find:

  1. CSV file (bird_songs_128_dim.csv): this contains the original 128-dimensional MFCC embeddings

  2. Python Notebook: this contains all the code used to process our data (compute MFCC embeddings, compute Euclidean similarities, etc.)

  3. Bird Songs Folder (bird_songs_euclidean):

    a. There are 350 .txt files - the number in the name of the file represents the anchor point. The rows inside the file contain the Euclidean similarity scores for that anchor point

    b. There is a full_data file which simply contains the 3D coordinates and names of each point (plus a 0/1 boolean to indicate whether a point is an anchor point)

    c. There is a meta-main.txt file which simply contains the names of each anchor point

    d. There is a metadata-secondary.csv file which contains all the metadata per point (name, location, duration, url)
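
For anyone reusing the per-anchor files, the sketch below shows one way they might be read. It assumes each .txt file holds one similarity score per row and that the file name is just the anchor number plus the extension. Neither detail is confirmed above, and `load_anchor_scores`, `anchor_id_from_name`, and the demo file name `12.txt` are hypothetical.

```python
import tempfile
from pathlib import Path


def load_anchor_scores(path):
    """Read one similarity score per non-empty line (assumed file layout)."""
    with open(path, encoding="utf-8") as f:
        return [float(line) for line in f if line.strip()]


def anchor_id_from_name(path):
    """Extract the anchor number from a file name such as '12.txt'."""
    digits = "".join(ch for ch in Path(path).stem if ch.isdigit())
    return int(digits)


# Demo on a throwaway file so the example runs without the repo checked out.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "12.txt"  # hypothetical file name
    demo_file.write_text("0.91\n0.35\n0.60\n", encoding="utf-8")
    anchor_id = anchor_id_from_name(demo_file)
    demo_scores = load_anchor_scores(demo_file)

print(anchor_id, demo_scores)  # 12 [0.91, 0.35, 0.6]
```

If the real files use a different delimiter or include a header row, the parsing line would need to be adjusted accordingly.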

Authors

This version of Polyglot was extended as part of the Acoustic Biologists team's submission to the GainForest EcoHackathon in 2024.

Original Credits

This version of Polyglot was extended as part of Kiran Deol's 2023 Google Summer of Code project, mentored by Oskar Elek and Jasmine Otto, and is hosted as part of the PolyPhy hub of bio-inspired data science tools.

This web visualization tool was originally created by a team of researchers at the University of California, Santa Cruz, Dept. of Computational Media.

This work was published as Hongwei Zhou's M.S. thesis.

A version of the original work was published at the 2020 IEEE 5th Workshop on Visualization for the Digital Humanities (VIS4DH).
