com-480-data-visualization/WhereWereWhales

Project of Data Visualization (COM-480)

Student's name SCIPER
Camille Challier 311020
Cyrill Strassburg 377372
Eglantine Vialaneix 324293

Milestone 1 • Milestone 2 • Milestone 3

Milestone 1 (21st March, 5pm)

10% of the final grade

This is a preliminary milestone to let you set up goals for your final project and assess the feasibility of your ideas. Please, fill the following sections about your project.

(max. 2000 characters per section)

Dataset

Find a dataset (or multiple) that you will explore. Assess the quality of the data it contains and how much preprocessing / data-cleaning it will require before tackling visualization. We recommend using a standard dataset as this course is not about scraping nor data processing.

Hint: some good pointers for finding quality publicly available datasets (Google dataset search, Kaggle, OpenSwissData, SNAP and FiveThirtyEight), you could use also the DataSets proposed by the ENAC (see the Announcements section on Zulip).

For this project, we are working with multiple datasets related to cetaceans. Their preprocessing steps are detailed in the EDA part below.

  • Information about cetaceans: Basic information on all cetaceans can be found in the Wikipedia list of cetaceans. For each species, the list includes the family, genus, scientific and common names, level of endangerment, range, a size illustration, and a photograph. Additionally, each cetacean species has its own Wikipedia page with images and further information. The list has been converted to a pandas dataframe. Images can be accessed via the Wikimedia API.
  • Global sightings of cetaceans: Data on cetacean sightings was downloaded from OBIS Seamap, a data center for various marine animals. The cetacean data in OBIS Seamap is originally sourced from HappyWhale and includes sightings from 1972 to the present. Attributes include GPS coordinates, species name, unique animal ID, group size, date of sighting, locality, and environmental details.
  • Phylogenetic tree of cetaceans: We retrieved one of the latest phylogenetic trees of cetaceans from a paper published in May 2020 in Systematic Biology. We plan to manually re-transcribe its Figures S1 and S3 and display them in a more interactive and playful way alongside the rest of the information in this project.
  • To assess the potential threats to cetacean survival, we explored multiple simple datasets covering climate disruption, ship strikes, and whaling activities:
    • Climate Disruption: Copernicus provides Global Monthly Average Sea Surface Temperatures (SST) Anomalies (deviations from long-term averages) from 1993 to 2021.
    • Ship Strikes: The IWC Ship Strike Database records incidents of ship collisions with marine mammals since 1954.
    • Whaling Activities: International Whaling Commission (IWC) data on direct whale catches since 1986, including catches per year, whale species, geographic area, nation, and type of operation (Commercial, Aboriginal, Illegal, ...).
    • Marine Protected Areas: The World Database on Protected Areas (WDPA), a comprehensive global database of marine and terrestrial protected areas. The WDPA is updated monthly and provides crucial insights into the distribution and extent of protected areas.

Problematic

Frame the general topic of your visualization and the main axis that you want to develop.

  • What am I trying to show with my visualization?
  • Think of an overview for the project, your motivation, and the target audience.

More than a century after the peak of commercial whaling, most cetacean populations are still struggling to recover. According to a study published in May 2023 in the journal Conservation Biology, approximately 26% of whale, dolphin, and porpoise species were classified as threatened with extinction as of 2021.

By creating a playful, engaging and interesting way of navigating information about modern cetaceans, this project aims to make information easily accessible and raise awareness about cetaceans, their phylogeny, their current global condition and the various threats they face.

Through our visualizations, we aim to:

  1. Global Overview: Provide an overview of cetaceans around the world, highlighting the species that are extinct or endangered, using the Red-List status for reference.
  2. Phylogenetic Tree: Present a phylogenetic tree to showcase the evolutionary relationships of cetaceans, highlighting extinct species and their connections to modern counterparts.
  3. Cetacean Sightings: Display sightings of cetaceans around the globe to help users understand where they live and their migration patterns. Additionally, we aim to compare these locations with protected marine areas and regions of high-risk threats to assess conservation efforts and potential dangers.
  4. Timeline of Threats: Illustrate the cumulative and ongoing threats to cetaceans, such as the impact of climate change on oceans, maritime traffic, pollution and plastic contamination, and hunting practices over time.

By presenting a comprehensive visualization of their global distribution, their history, and the cumulative impacts of human activities, we seek to inform the public about the critical state of cetacean populations. The target audience for this project includes environmental activists, marine biologists, educators, and most importantly the general public. By creating an engaging and interactive experience, we aim to captivate a broad audience and encourage a deeper understanding of the challenges cetaceans face, with the hope of fostering greater support for their protection.

Exploratory Data Analysis

Pre-processing of the data set you chose

  • Show some basic statistics and get insights about the data

1. Global info about Cetacean

Because we retrieve this data ourselves, its quality depends on our scraping methods. Wikipedia has a clean and standardized structure for cetacean articles, and our download relies mainly on that structure to keep the extracted data consistent. As a proof of concept, a few successfully retrieved images are available in our repository, and we show some examples below. The retrieval of other images (size comparison with humans, endangerment index, world location) and textual information is still in progress.

Photograph of the animal | Size comparison with human | World location of the species
Atlantic Spotted Dolphin | Blainville's Beaked Whale | Baird's Beaked Whale
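
As an illustration of how a lead image could be requested for one species page, here is a minimal sketch. It assumes the MediaWiki Action API on en.wikipedia.org with the `pageimages` property, which may differ from what images.ipynb actually does. The function names, the thumbnail width, and the sample response are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia's Action API endpoint is used.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_thumbnail_query(title, width=500):
    """Build a query URL asking for the lead-image thumbnail of one page."""
    params = {
        "action": "query",
        "prop": "pageimages",
        "piprop": "thumbnail",
        "pithumbsize": width,  # hypothetical default width in pixels
        "titles": title,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_thumbnail_url(response):
    """Return the first thumbnail URL found in a decoded JSON response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        thumbnail = page.get("thumbnail")
        if thumbnail:
            return thumbnail["source"]
    return None


url = build_thumbnail_query("Atlantic spotted dolphin")

# Hand-written sample shaped like a pageimages response (placeholder values).
sample_response = {
    "query": {
        "pages": {
            "123": {
                "title": "Atlantic spotted dolphin",
                "thumbnail": {"source": "https://example.org/dolphin.jpg"},
            }
        }
    }
}
thumbnail_url = extract_thumbnail_url(sample_response)
```

Returning None for pages without a thumbnail keeps a batch download running when a species article has no lead image.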

2. Sighting Data

Most of the data processing was performed during the download phase on the OBIS Seamap website, where we filtered for the relevant cetacean species. Two datasets were extracted, each containing similar information but with different column names. To ensure consistency, we aligned their column names and formats and then concatenated them. The combined dataset contains over 275,000 sighting records. Location information such as country and water zone is missing for some sightings, but since we have the coordinates, we may not need it, or we can derive it if necessary. For more details on the exploratory data analysis, refer to the EDA_location.ipynb notebook.
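
The alignment and concatenation step can be sketched with pandas as follows. This is a minimal illustration and not the code from the notebook: the source column names in both mappings are hypothetical placeholders, since the actual OBIS Seamap export headers are not listed here.

```python
import pandas as pd

# Hypothetical source column names; the real exports may use different headers.
COLUMN_MAP_A = {"lat": "latitude", "lon": "longitude",
                "sp_name": "species", "count": "group_size"}
COLUMN_MAP_B = {"Latitude": "latitude", "Longitude": "longitude",
                "Species": "species", "GroupSize": "group_size"}


def align_and_concat(df_a, df_b):
    """Rename both frames to a shared schema, then stack them row-wise."""
    a = df_a.rename(columns=COLUMN_MAP_A)
    b = df_b.rename(columns=COLUMN_MAP_B)
    # Keep only the columns that exist in both frames after renaming.
    shared = [col for col in a.columns if col in b.columns]
    combined = pd.concat([a[shared], b[shared]], ignore_index=True)
    # Harmonize formats: group sizes may arrive as strings in one export.
    combined["group_size"] = pd.to_numeric(combined["group_size"], errors="coerce")
    return combined


# Tiny made-up frames standing in for the two downloads.
df_a = pd.DataFrame({"lat": [48.5], "lon": [-123.0],
                     "sp_name": ["Orcinus orca"], "count": ["3"]})
df_b = pd.DataFrame({"Latitude": [-33.9, 10.0], "Longitude": [18.4, -60.0],
                     "Species": ["Megaptera novaeangliae", "Orcinus orca"],
                     "GroupSize": [2, 1]})
sightings = align_and_concat(df_a, df_b)
```

Restricting the result to the shared columns avoids half-empty columns when one export carries attributes that the other lacks.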

Because the dataset contains a very large number of events, we group sightings by species and location in order to visualize them on a world map. We will determine whether this grouping is also necessary for the final website.

Note that the marker size represents the number of animals observed at this location.
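
One way to group sightings by species and location is to snap coordinates to a regular grid and aggregate per cell. The sketch below assumes the column names `species`, `latitude`, `longitude`, and `group_size` and a 1-degree cell size; both are illustrative choices and not necessarily those used in the notebook.

```python
import pandas as pd


def bin_sightings(df, cell_deg=1.0):
    """Aggregate sightings into lat/lon grid cells, separately for each species."""
    binned = df.assign(
        # Floor each coordinate to the south-west corner of its grid cell.
        lat_bin=(df["latitude"] // cell_deg) * cell_deg,
        lon_bin=(df["longitude"] // cell_deg) * cell_deg,
    )
    return binned.groupby(["species", "lat_bin", "lon_bin"], as_index=False).agg(
        n_sightings=("group_size", "count"),
        n_animals=("group_size", "sum"),  # could drive the marker size
    )


# Made-up sample records for illustration only.
sample = pd.DataFrame({
    "species": ["Orcinus orca", "Orcinus orca", "Megaptera novaeangliae"],
    "latitude": [48.2, 48.7, -33.9],
    "longitude": [-123.4, -123.1, 18.4],
    "group_size": [3, 2, 1],
})
cells = bin_sightings(sample)
```

A coarser `cell_deg` yields fewer markers at the cost of spatial detail, which is the trade-off to settle for the final website.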

3. Multiple Threats: Challenges to Cetacean Survival

A- Climate disruption

Anomalies represent deviations from long-term averages. For example, the January 2021 anomaly is calculated as the difference between the sea surface temperature in January 2021 and the climatological average for all January months within the dataset's time span.
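
The definition above can be written out in a few lines of pandas. This is only a sketch of the calculation on made-up temperatures: the Copernicus product already provides anomalies, and its exact reference period may differ from the simple whole-series average assumed here.

```python
import pandas as pd


def monthly_anomalies(sst):
    """Subtract each calendar month's long-term mean from a monthly SST series.

    `sst` is assumed to be a Series of monthly mean temperatures (in °C)
    indexed by a DatetimeIndex.
    """
    # Climatology: mean over all Januaries, all Februaries, and so on.
    climatology = sst.groupby(sst.index.month).transform("mean")
    return sst - climatology


# Made-up values for three Januaries and three Julys.
dates = pd.to_datetime(["2019-01-01", "2020-01-01", "2021-01-01",
                        "2019-07-01", "2020-07-01", "2021-07-01"])
sst = pd.Series([18.0, 18.2, 18.7, 22.0, 22.1, 22.5], index=dates)
anomalies = monthly_anomalies(sst)
january_2021 = anomalies.loc[pd.Timestamp("2021-01-01")]  # 18.7 minus the January mean
```

Because each month is compared only with the same calendar month, the seasonal cycle drops out and the long-term warming signal remains.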

B- Maritime traffic

C- Hunting

D- Protected Areas

Related work

  • What others have already done with the data?
  • Why is your approach original?
  • What source of inspiration do you take? Visualizations that you found on other websites or magazines (might be unrelated to your data).
  • In case you are using a dataset that you have already explored in another context (ML or ADA course, semester project...), you are required to share the report of that work to outline the differences with the submission for this class.

Related and existing work

  • Phylogenetic Tree of Cetaceans

    • OneZoom provides an interactive tree of life visualization, inspiring our effort to create a phylogenetic tree specifically for Cetaceans, incorporating additional study features.
  • Global Sightings of Cetaceans

    • OBIS Seamap offers a heatmap of species distribution presence across the world map, allowing users to filter species and examine concentration levels.
    • Whales of Guerrero labs has used this dataset to track North Pacific humpback whale movements.
  • Timeline of Threats: To represent threats to cetaceans, we plan to implement interactive line plots or 2D world maps that let users explore several variables over time. Several visualizations have already been made with the datasets mentioned above:

    • Sea Temperature: Sea Surface Temperature line plot, NASA - 2D Temperature Map
    • Ship Strikes Evolution: Ship Strikes Evolution Report
    • Whaling Activities: 2D Map

Originality

Our approach integrates interactivity: users can adjust parameters, highlight individual species with color coding, and explore seasonal migration patterns. It also links sightings of threatened species with marine protected areas, showing the relationship between cetacean presence and protected regions, and places each species in its evolutionary tree to show how cetaceans differ from one another. By combining these elements into a single, integrated visualization, we highlight how various threats collectively impact cetacean populations, offering a more comprehensive understanding of their conservation needs.

Inspiration

  • Phylogenetic Tree of Cetaceans

Similarly to OneZoom, we would like to create an interactive tree of cetacean life that displays additional information when users hover over or click on a leaf of their choice.

  • Global Sightings of Cetaceans

We aim to develop a 3D Navigable Globe for visualizing cetacean sightings and conservation efforts. Notable JavaScript-based visualizations like Populated Place Visualization in D3.js and Population Heatmap in React showcase interactive 3D globes displaying global datasets, which could be adapted for our project.

Milestone 2 (18th April, 5pm)

pdf file

Milestone 3 (30th May, 5pm)

Deliverable:

  • 📘 Process Book (PDF)
    A detailed overview of our project goals, design process, methodology, and evaluation.

  • 🎥 Presentation Video
    A short video walkthrough showcasing our data visualization project and key insights.

  • 🌐 Final Project Website
    Explore the live interactive visualization and learn more about our findings.

Organisation:

.
├── README.md
├── requirements.txt
│
├── data/                     # Raw and processed datasets
│
├── Milestone_1/              # Initial exploratory data analysis and data extraction
│   ├── figures_EDA/          # PNG images from EDA
│   ├── EDA_location.py       # Map visualization and threat exploratory analysis
│   ├── data_extractor.py     # Scripts for extracting map data
│   ├── images.ipynb          # Notebook for scraping cetacean images from Wikipedia
│   ├── utils.py              # Utility functions
│   └── wikitables.ipynb      # Wikipedia data scraping and tree of life exploration
│
├── Milestone_2/              # Second milestone deliverables
│   ├── tree_of_life/         # Figures related to the tree of life
│   ├── wiki_images/          # PNG images of cetaceans
│   └── Milestone_2.pdf       # Milestone 2 report
│
├── Milestone_3/              # Final milestone deliverables
│   ├── process book.pdf      # Process book document
│   └── cetacea_short.mp4     # Presentation video or screencast files

Project Technical Setup and Usage

The website was built using JavaScript, CSS, HTML, and D3.js.
Find the implementation here: https://github.com/eglantine-vialaneix/WhereWereWhalesLFS

Data processing and exploratory data analysis (EDA) were performed using Python.

Intended Usage

  • Explore the datasets and understand the data cleaning and extraction process through the provided scripts and notebooks.
  • Review milestone reports and visualizations to follow project progress and insights.
  • Use the interactive website to explore whale sightings, threats, and species profiles.

Late policy

  • < 24h: 80% of the grade for the milestone
  • < 48h: 70% of the grade for the milestone
