eceo-epfl/EcoWikiRS

🐸 🌿 EcoWikiRS: Learning Ecological Representations of Satellite Images from Weak Supervision with Species Observations and Wikipedia

Valerie Zermatten, Javiera Castillo-Navarro, Pallavi Jain, Devis Tuia, Diego Marcos

News

January 2026: EcoWikiRS is available on Hugging Face.

April 2025: 🎉 🎉 EcoWikiRS was accepted at the EARTHVISION 2025 Workshop in conjunction with the Computer Vision and Pattern Recognition (CVPR) 2025 Conference.

How to cite this work:

@InProceedings{Zermatten_2025_WikiRS,
    author    = { Zermatten, Valerie and Castillo-Navarro, Javiera and Jain, Pallavi and Tuia, Devis and Marcos, Diego},
    title     = {EcoWikiRS: Learning Ecological Representations of Satellite Images from Weak Supervision with Species Observations and Wikipedia},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {00-00}
}

Code requirements

We use Python 3.10 with PyTorch 2.2.0 and CUDA 12.1.

The required Python packages are listed in environment.yml, which can be used to build a conda environment with the following commands:

conda env create --file environment.yml python==3.10
conda activate wikirs

Model training

All arguments are described in more detail in the argument-parser function in utils/argparser.py. The following command is an example of how to launch an experiment with our best model:

# train model with WINCEL loss, SkyCLIP pretrained model : 
python train_multi_text.py --criterion WINCEL --model SkyCLIP 

More training options are provided in the run.sh file.

Overview

We propose a method to learn ecological properties of aerial images by learning an alignment with species habitat descriptions.

  • We release the EcoWikiRS dataset, composed of triplets:
    • high-resolution aerial images (50 cm, RGB bands),
    • a list of species observations collected from GBIF, geolocated within the footprint of the aerial image,
    • sentences describing the habitat of the observed species, extracted from the corresponding Wikipedia article.

  • We propose WINCEL, a weighted version of the InfoNCE loss. WINCEL aims to identify text passages that are relevant to the image from the descriptions. WINCEL filters out text that describes properties that are specific only to part of the species’ niche or are irrelevant to a specific image.

Formally, WINCEL is computed as follows:

$L_{WINCEL}(V_n, G_n) = - \log \frac{\exp(V_n \cdot G_n / \tau)}{\sum_{j=1}^{N} \exp(V_n \cdot G_j / \tau)}$

where

$G_n = \sum_{i=0}^K \sigma( V_n \cdot T_{n,i} / \tau) \cdot T_{n,i}$
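
As a rough illustration of the two formulas above, the following NumPy sketch computes the loss for a batch. It is not the implementation used in this repository: the function name `wincel_loss`, the array shapes and the default temperature are hypothetical, and $\sigma$ is assumed here to be a softmax over the sentences of each sample, which this README does not specify.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def wincel_loss(V, T, tau=0.07):
    """Sketch of the WINCEL loss (hypothetical helper, not the repository code).

    V: (N, D) image embeddings V_n.
    T: (N, K, D) embeddings T_{n,i} of the K sentences attached to each image.
    Returns an (N,) array with one loss value per sample.
    """
    # Similarity of each image to each of its own sentences, scaled by tau.
    sim = np.einsum("nd,nkd->nk", V, T) / tau
    # ASSUMPTION: sigma is a softmax over the K sentences of a sample.
    weights = softmax(sim, axis=1)
    # G_n: weighted sum of the sentence embeddings of sample n.
    G = np.einsum("nk,nkd->nd", weights, T)
    # InfoNCE over the batch, with G_j as the text-side embeddings.
    logits = V @ G.T / tau
    m = logits.max(axis=1, keepdims=True)
    log_denominator = (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True)))[:, 0]
    return log_denominator - np.diag(logits)


# Example with random unit-norm embeddings (illustrative data only).
rng = np.random.default_rng(0)
V = rng.normal(size=(4, 8))
V /= np.linalg.norm(V, axis=1, keepdims=True)
T = rng.normal(size=(4, 3, 8))
T /= np.linalg.norm(T, axis=2, keepdims=True)
print(wincel_loss(V, T).mean())
```

Under this assumption, a sample with a single sentence gets weight 1 for that sentence, so the loss reduces to the standard InfoNCE loss.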

We evaluate our approach on the task of zero-shot ecosystem classification, following the habitat definitions from the European Nature Information System (EUNIS). Our results show that our approach helps in understanding RS images in a more ecologically meaningful manner.
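
For readers unfamiliar with zero-shot classification, the sketch below shows the general idea: each image embedding is assigned to the class whose text embedding is most similar. This is a generic illustration, not the evaluation code of this repository. The function name `zero_shot_classify` and the toy embeddings are hypothetical; in practice the embeddings would come from the image and text encoders of a CLIP-style model, with one text embedding per EUNIS habitat.

```python
import numpy as np


def zero_shot_classify(image_emb, class_text_emb):
    """Assign each image to the class with the most similar text embedding.

    image_emb: (N, D) image embeddings.
    class_text_emb: (C, D) text embeddings, one per class.
    Returns (predicted class indices of shape (N,), similarity matrix of shape (N, C)).
    """
    # L2-normalise both sides so the dot product is the cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = class_text_emb / np.linalg.norm(class_text_emb, axis=1, keepdims=True)
    sim = img @ txt.T
    return sim.argmax(axis=1), sim


# Toy example with three classes in a 3-dimensional embedding space.
class_emb = np.eye(3)
images = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.2, 2.0]])
pred, sim = zero_shot_classify(images, class_emb)
print(pred)
```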

Visual results

We generate visual features with both the pretrained and the fine-tuned SkyCLIP models and plot the cross-modal similarity over the surface of Switzerland (one image of 100 m by 100 m per km²).

[Figure: (a) temperature map of Switzerland; (b), (c), (d) cross-modal similarity maps]

For plots (b), (c) and (d), we observe that the maps generated by the fine-tuned models correctly highlight the warmest region, plateau and coldest regions of Switzerland, which we assess using the temperature map (a) as a proxy.

We also have strong quantitative results: 🎉

  • The proposed WINCEL approach is better than InfoNCE for fine-tuning GeoRSCLIP, SkyCLIP and CLIP, illustrating its capacity to focus on more useful sentences during training.

  • We trained using different sets of passages from Wikipedia articles: sentences from the “habitat” section, sentences selected with a set of keywords, and random sentences. Passages from the “habitat” section consistently outperform the other approaches, highlighting the importance of quality over quantity for improving model performance.

Check out the paper to learn more!

EcoWikiRS dataset

The EcoWikiRS dataset can be retrieved from Zenodo:

DOI:10.1016/j.isprsjprs.2025.01.006

  • The EUNIS ecosystem type map for Switzerland, with a spatial resolution of 100 m, comprises a final set of 25 habitats.

  • Distribution of samples across EUNIS ecosystem types on a log scale.

  • Number of observations per species in our dataset after filtering. Most species were observed very few times, whereas a few species were observed over 1000 times.

  • Distribution of our samples across Switzerland. The samples are split into training (60%), testing (30%) and validation (10%) sets following a block-split approach with a block size of 20 km.
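
A block split of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used to build the dataset: the function name `block_split` is hypothetical, coordinates are assumed to be in metres in a projected coordinate system, and the 60/30/10 fractions are applied to blocks rather than to individual samples. All samples falling in the same 20 km block end up in the same set, which limits spatial leakage between sets.

```python
import numpy as np


def block_split(x_m, y_m, block_size_m=20_000, fractions=(0.6, 0.3, 0.1), seed=0):
    """Assign each sample to 'train', 'test' or 'val' by spatial block.

    x_m, y_m: sample coordinates in metres (ASSUMPTION: projected coordinates).
    fractions: share of blocks for (train, test, val); ASSUMPTION: applied to blocks.
    """
    # Index of the square block that contains each sample.
    bx = np.floor(np.asarray(x_m) / block_size_m).astype(int)
    by = np.floor(np.asarray(y_m) / block_size_m).astype(int)
    blocks = np.stack([bx, by], axis=1)
    uniq, inverse = np.unique(blocks, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Shuffle the blocks, then cut the shuffled list into the three sets.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(uniq))
    n_train = int(round(fractions[0] * len(uniq)))
    n_test = int(round(fractions[1] * len(uniq)))
    block_label = np.full(len(uniq), "val", dtype="<U5")
    block_label[order[:n_train]] = "train"
    block_label[order[n_train:n_train + n_test]] = "test"

    # Every sample inherits the label of its block.
    return block_label[inverse]


# Example with random points in a 200 km x 200 km area (illustrative data only).
rng = np.random.default_rng(1)
x = rng.uniform(0, 200_000, size=5000)
y = rng.uniform(0, 200_000, size=5000)
labels = block_split(x, y)
print({name: int((labels == name).sum()) for name in ("train", "test", "val")})
```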

Additional data information

  • More information on the EUNIS ecosystem type map is available on the European Environment Agency website: Ecosystem type map (all classes).

  • The raw aerial images with 10 cm resolution from the swissIMAGE product can be openly downloaded from the swisstopo website.

Contributing

If you are interested in contributing to one of the aforementioned points or working on a similar project and wish to collaborate, please reach out to ECEO.

For code-related contributions, suggestions or inquiries, please open a GitHub issue.

Code acknowledgments

We acknowledge the following code repositories that were useful throughout the EcoWikiRS project :

Other smaller resources are mentioned in the relevant code sections.

About

Code repository for the paper: EcoWikiRS: Learning Ecological Representations of Satellite Images from Weak Supervision with Species Observations and Wikipedia (Zermatten et al., 2025)
