Commit 49a568e

Merge pull request #21 from quantifyearth/mwd-pip-package
Refactor project for pip
2 parents cbc724a + 841b118 commit 49a568e

21 files changed: 284 additions, 118 deletions

.github/workflows/python-package.yml

Lines changed: 40 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -19,26 +19,54 @@ jobs:
1919
python-version: ["3.10"]
2020

2121
steps:
22-
- uses: actions/checkout@v4
22+
- uses: actions/checkout@v5
23+
2324
- name: Set up Python ${{ matrix.python-version }}
24-
uses: actions/setup-python@v3
25+
uses: actions/setup-python@v5
2526
with:
2627
python-version: ${{ matrix.python-version }}
27-
- name: Install system
28+
29+
- name: Install system dependencies
2830
run: |
2931
apt-get update -qqy
30-
apt-get install -y git python3-pip
31-
- name: Install dependencies
32+
apt-get install -y git python3-pip r-base libtirpc-dev
33+
34+
- name: Install R dependencies
35+
run: R -e "install.packages(c('lme4', 'lmerTest'), repos='https://cran.rstudio.com/')" || true
36+
37+
- name: Install Python dependencies
3238
run: |
3339
python -m pip install --upgrade pip
34-
python -m pip install gdal[numpy]==3.11.0
35-
python -m pip install -r requirements.txt
40+
python -m pip install 'gdal[numpy]==3.11.0'
41+
python -m pip install -e .[dev,validation]
42+
3643
- name: Lint with pylint
37-
run: |
38-
python3 -m pylint .
44+
run: python3 -m pylint .
45+
46+
- name: Type checking
47+
run: python3 -m mypy .
48+
3949
- name: Test with pytest
50+
run: python3 -m pytest --cov=aoh -vv
51+
52+
- name: Test CLI tools
4053
run: |
41-
python3 -m pytest
42-
- name: Type checking
54+
aoh-calc --help
55+
aoh-habitat-process --help
56+
aoh-species-richness --help
57+
aoh-endemism --help
58+
aoh-collate-data --help
59+
aoh-validate-prevalence --help
60+
61+
- name: Test package imports
4362
run: |
44-
python3 -m mypy .
63+
python3 -c "import aoh; print('✅ Package imports work')"
64+
python3 -c "from aoh import tidy_data; print('✅ Core functions work')"
65+
python3 -c "from aoh.summaries import species_richness; print('✅ Summaries work')"
66+
python3 -c "from aoh.validation import collate_data; print('✅ Validation works')"
67+
68+
- name: Build pip package
69+
run: python3 -m build
70+
71+
- name: Check with twine
72+
run: twine check dist/*

Dockerfile

Lines changed: 8 additions & 8 deletions
@@ -4,25 +4,25 @@ RUN apt-get update -qqy && \
44
apt-get install -qy \
55
git \
66
python3-pip \
7+
r-base \
8+
libtirpc-dev \
79
&& rm -rf /var/lib/apt/lists/* \
810
&& rm -rf /var/cache/apt/*
911

12+
RUN R -e "install.packages(c('lme4', 'lmerTest'), repos='https://cran.rstudio.com/')"
13+
1014
# You must install numpy before anything else otherwise
1115
# gdal's python bindings are sad. Pandas we pull out as its slow
1216
# to build, and this means it'll be cached
1317
RUN rm /usr/lib/python3.*/EXTERNALLY-MANAGED
14-
RUN pip install --upgrade pip
1518
RUN pip install numpy
1619
RUN pip install gdal[numpy]==3.11.0
1720
RUN pip install pandas
1821

19-
COPY requirements.txt /tmp/
20-
RUN pip install -r /tmp/requirements.txt
21-
2222
COPY ./ /root/aoh
2323
WORKDIR /root/aoh
24+
RUN pip install -e .[dev,validation]
2425

25-
RUN pylint *.py
26-
RUN python -m pytest ./tests
27-
28-
RUN chmod 755 *.py
26+
RUN python3 -m pylint .
27+
RUN python3 -m mypy .
28+
RUN python3 -m pytest .

README.md

Lines changed: 106 additions & 66 deletions
@@ -1,6 +1,50 @@
11
# AOH Calculator
22

3-
This repository contains code for making Area of Habitat (AOH) rasters from a mix of data sources, following the methodology described in [Brooks et al](https://www.cell.com/trends/ecology-evolution/fulltext/S0169-5347(19)30189-2) and adhearing to the IUCN Redlist Technical Working Group guidance on AoH production. This work is part of the [LIFE biodiversity map](https://www.cambridge.org/engage/coe/article-details/660e6f08418a5379b00a82b2) work at the University of Cambridge. It also contains some scripts for summarising AOH data into maps of species richness and species endemism.
3+
This repository contains code for making Area of Habitat (AOH) rasters from a mix of data sources, following the methodology described in [Brooks et al](https://www.cell.com/trends/ecology-evolution/fulltext/S0169-5347(19)30189-2) and adhering to the IUCN Redlist Technical Working Group guidance on AoH production. This work is part of the [LIFE biodiversity map](https://www.cambridge.org/engage/coe/article-details/660e6f08418a5379b00a82b2) work at the University of Cambridge. It also contains some scripts for summarising AOH data into maps of species richness and species endemism.
4+
5+
## Installation
6+
7+
The AOH Calculator is available as a Python package and can be installed via pip:
8+
9+
```bash
10+
pip install aoh
11+
```
12+
13+
This provides both command-line tools and a Python library for programmatic use.
14+
15+
For validation tools that require R, install with the validation extra:
16+
17+
```bash
18+
pip install aoh[validation]
19+
```
20+
21+
### Prerequisites
22+
23+
You'll need GDAL installed on your system. The Python GDAL package version should match your system GDAL version. You can check your GDAL version with:
24+
25+
```bash
26+
gdalinfo --version
27+
```
28+
29+
Then install the matching Python package:
30+
31+
```bash
32+
pip install gdal[numpy]==YOUR_VERSION_HERE
33+
```
34+
35+
### Library Usage
36+
37+
You can also use AOH Calculator as a Python library:
38+
39+
```python
40+
import aoh
41+
from aoh import tidy_data
42+
from aoh.summaries import species_richness
43+
from aoh.validation import collate_data
44+
45+
# Use core functions programmatically
46+
# See function documentation for parameters
47+
```
448

549
To generate a set of AOH rasters you will need:
650

@@ -14,41 +58,38 @@ For examples on how to run the code see the docs directory.
1458

1559
This project makes heavy use of [Yirgacheffe](https://github.com/quantifyearth/yirgacheffe) to do the numerical work, and the code in this repository is mostly for getting the data to feed to yirgacheffe. The advantages of using Yirgacheffe are that it hides all the offsetting required for the math to keep the AoH logic simple, deals with the archaic GDAL API bindings, and uses map chunking so that progress can be made with a minimal memory footprint despite some base map rasters being 150GB and up.
1660

17-
# Scripts
61+
# Command Line Tools
1862

19-
## aohcalc.py
63+
## aoh-calc
2064

21-
This is the main script designed to calculate the AOH of a single species.
65+
This is the main command designed to calculate the AOH of a single species.
2266

23-
```SystemShell
24-
$ python3 ./aohcalc.py -h
25-
usage: aohcalc.py [-h] --habitats HABITAT_PATH
26-
--elevation-min MIN_ELEVATION_PATH
27-
--elevation-max MAX_ELEVATION_PATH
28-
[--area AREA_PATH]
29-
--crosswalk CROSSWALK_PATH
30-
--speciesdata SPECIES_DATA_PATH
31-
[--force-habitat]
32-
--output_directory OUTPUT_PATH
67+
```bash
68+
$ aoh-calc --help
69+
usage: aoh-calc [-h] --habitats HABITAT_PATH
70+
--elevation-min MIN_ELEVATION_PATH
71+
--elevation-max MAX_ELEVATION_PATH [--area AREA_PATH]
72+
--crosswalk CROSSWALK_PATH --speciesdata SPECIES_DATA_PATH
73+
[--force-habitat] --output OUTPUT_PATH
3374

3475
Area of habitat calculator.
3576

3677
options:
3778
-h, --help show this help message and exit
3879
--habitats HABITAT_PATH
39-
set of habitat rasters
80+
Directory of habitat rasters, one per habitat class.
4081
--elevation-min MIN_ELEVATION_PATH
41-
min elevation raster
82+
Minimum elevation raster.
4283
--elevation-max MAX_ELEVATION_PATH
43-
max elevation raster
44-
--area AREA_PATH optional area per pixel raster. Can be 1xheight.
84+
Maximum elevation raster
85+
--area AREA_PATH Optional area per pixel raster. Can be 1xheight.
4586
--crosswalk CROSSWALK_PATH
46-
habitat crosswalk table path
87+
Path of habitat crosswalk table.
4788
--speciesdata SPECIES_DATA_PATH
48-
Single species/seasonality geojson
49-
--force-habitat If set, don't treat an empty habitat layer layer as per IRTWG.
50-
--output OUTPUT_PATH directory where area geotiffs should be stored
51-
89+
Single species/seasonality geojson.
90+
--force-habitat If set, don't treat an empty habitat layer layer as
91+
per IRTWG.
92+
--output OUTPUT_PATH Directory where area geotiffs should be stored.
5293
```
5394
5495
To calculate the AoH we need the following information:
@@ -67,46 +108,42 @@ To calculate the AoH we need the following information:
67108
- Force habitat: An optional flag. The IUCN RLTWG guidelines say that if there is zero area in the habitat layer after filtering for species habitat preferences, we should revert to range; with this flag set, the result is kept as zero instead. This allows evaluation of scenarios that might lead to extinction via land use changes.
68109
- Output directory - Two files will be output to this directory: an AoH raster with the format `{id_no}_{seasonal}.tif` and a manifest containing information about the raster `{id_no}_{seasonal}.json`.
69110
70-
## habitat_process.py
111+
## aoh-habitat-process
71112
72-
Whilst for terrestrial AOH calculations there is normally just one habitat class per pixel, for other realms like marine (which is a 3D space) this isn't the case, and so for these realms there is a requirement . To allow this code to work for all realms, we must split out terrestrial habitat maps that combine all classes into a single raster. To assist with this, we provide the `habitat_process.py` script, which also allows for rescaling and reprojecting.
113+
Whilst for terrestrial AOH calculations there is normally just one habitat class per pixel, for other realms like marine (which is a 3D space) this isn't necessarily the case. To allow this package to work for all realms, we must split terrestrial habitat maps, which combine all classes into a single raster, into per-layer rasters. To assist with this, we provide the `aoh-habitat-process` command, which also allows for rescaling and reprojecting.
73114

74-
```SystemShell
75-
$ python3 ./habitat_process.py -h
76-
usage: habitat_process.py [-h]
77-
--habitat HABITAT_PATH
78-
--output OUTPUT_PATH
79-
--scale PIXEL_SCALE
80-
[--projection TARGET_PROJECTION]
81-
[-j PROCESSES_COUNT]
115+
```bash
116+
$ aoh-habitat-process --help
117+
usage: aoh-habitat-process [-h] --habitat HABITAT_PATH --scale PIXEL_SCALE
118+
[--projection TARGET_PROJECTION]
119+
--output OUTPUT_PATH [-j PROCESSES_COUNT]
82120
83121
Downsample habitat map to raster per terrain type.
84122
85123
options:
86124
-h, --help show this help message and exit
87125
--habitat HABITAT_PATH
88126
Path of initial combined habitat map.
89-
--output OUTPUT_PATH Destination folder for raster files.
90-
--scale PIXEL_SCALE Optional output pixel scale value, otherwise same as source.
127+
--scale PIXEL_SCALE Optional output pixel scale value, otherwise same as
128+
source.
91129
--projection TARGET_PROJECTION
92130
Optional target projection, otherwise same as source.
131+
--output OUTPUT_PATH Destination folder for raster files.
93132
-j PROCESSES_COUNT Optional number of concurrent threads to use.
94133
```
95134

96-
# Summaries
135+
# Summary Tools
97136

98-
In the `summaries` directory you will find two scripts for taking a set of AOH maps and generating a single summary that can be useful for inferring things about a group of maps.
137+
These commands take a set of AOH maps and generate summary statistics useful for analysing groups of species.
99138

100-
## Species richness
139+
## aoh-species-richness
101140

102-
The species richness map is just an indicator of how many species exist in a given area. It takes each AOH map, converts it to a boolean layer to indicate precense, and then sums the resulting boolean raster layers to give you a count in each pixel of how many species are there.
141+
The species richness map is just an indicator of how many species exist in a given area. It takes each AOH map, converts it to a boolean layer to indicate presence, and then sums the resulting boolean raster layers to give you a count in each pixel of how many species are there.
103142

104-
```SystemShell
105-
$ python3 ./summaries/species_richness.py -h
106-
usage: species_richness.py [-h]
107-
--aohs_folder AOHS
108-
--output OUTPUT
109-
[-j PROCESSES_COUNT]
143+
```bash
144+
$ aoh-species-richness --help
145+
usage: aoh-species-richness [-h] --aohs_folder AOHS --output OUTPUT
146+
[-j PROCESSES_COUNT]
110147
111148
Calculate species richness
112149
@@ -117,17 +154,15 @@ options:
117154
-j PROCESSES_COUNT Number of concurrent threads to use.
118155
```
119156

120-
## Endemism
157+
## aoh-endemism
121158

122-
Endemism is an indicator of how much and area of land contributes to a species overall habitat: for a species with a small area of habitat then each pixel is more precious to it than it is for a species with a vast area over which they can be found. The endemism map takes the set of AoHs and the species richness map to generate, and for each species works out the proportion of its AoH is within a given pixel, and the calculates the geometric mean per pixel across all species in that pixel.
159+
Endemism is an indicator of how much an area of land contributes to a species' overall habitat: for a species with a small area of habitat, each pixel is more precious to it than it is for a species with a vast area over which it can be found. The endemism map is generated from the set of AoHs and the species richness map: for each species it works out the proportion of its AoH that is within a given pixel, and then calculates the geometric mean per pixel across all species in that pixel.
123160

124-
```SystemShell
125-
$ python3 ./summaries/endemism.py -h
126-
usage: endemism.py [-h]
127-
--aohs_folder AOHS
128-
--species_richness SPECIES_RICHNESS
129-
--output OUTPUT
130-
[-j PROCESSES_COUNT]
161+
```bash
162+
$ aoh-endemism --help
163+
usage: aoh-endemism [-h] --aohs_folder AOHS
164+
--species_richness SPECIES_RICHNESS --output OUTPUT
165+
[-j PROCESSES_COUNT]
131166
132167
Calculate species richness
133168
@@ -140,32 +175,35 @@ options:
140175
-j PROCESSES_COUNT Number of concurrent threads to use.
141176
```
142177

143-
# Validation
178+
# Validation Tools
179+
180+
In [Dahal et al](https://gmd.copernicus.org/articles/15/5093/2022/) there is a method described for validating a set of AoH maps. This is implemented as validation commands, and borrows heavily from work by [Francesca Ridley](https://www.researchgate.net/profile/Francesca-Ridley).
144181

145-
In [Dahal et al](https://gmd.copernicus.org/articles/15/5093/2022/) there is a method described for validating a set of AoH maps. This is implemented in the validation directory, and borrows heavily from work by [Franchesca Ridley](https://www.researchgate.net/profile/Francesca-Ridley).
182+
## aoh-collate-data
146183

147-
Before running validation, the metadata provided for each AoH map must be collated into a single table using the following script:
184+
Before running validation, the metadata provided for each AoH map must be collated into a single table using this command:
148185

149-
```SystemShell
150-
$ python3 ./validation/collate_data.py -h
151-
usage: collate_data.py [-h] --aoh_results AOHS_PATH --output OUTPUT_PATH
186+
```bash
187+
$ aoh-collate-data --help
188+
usage: aoh-collate-data [-h] --aoh_results AOHS_PATH --output OUTPUT_PATH
152189
153190
Collate metadata from AoH build.
154191
155192
options:
156193
-h, --help show this help message and exit
157194
--aoh_results AOHS_PATH
158-
Path to directory of all the AoH outputs.
195+
Path of all the AoH outputs.
159196
--output OUTPUT_PATH Destination for collated CSV.
160197
```
161198

162-
## Model validation
199+
## aoh-validate-prevalence
163200

164-
To run the model validation use the following script:
201+
To run the model validation use this command:
165202

166-
```SystemShell
167-
$ python3 ./validation/validate_map_prevalence.py -h
168-
usage: validate_map_prevalence.py [-h] --collated_aoh_data COLLATED_DATA_PATH --output OUTPUT_PATH
203+
```bash
204+
$ aoh-validate-prevalence --help
205+
usage: aoh-validate-prevalence [-h] --collated_aoh_data COLLATED_DATA_PATH
206+
--output OUTPUT_PATH
169207
170208
Validate map prevalence.
171209
@@ -177,3 +215,5 @@ options:
177215
```
178216

179217
This will produce a CSV file listing just the AoH maps that fail model validation.
218+
219+
**Note:** The validation tools require R to be installed on your system with the `lme4` and `lmerTest` packages.
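
Reviewer note: the README's descriptions of species richness and endemism can be illustrated with a small NumPy sketch. This is not the package's implementation, which operates on rasters through Yirgacheffe. The function names `richness_map` and `endemism_map`, and the use of small in-memory arrays in place of AOH rasters, are assumptions made purely for illustration.

```python
import numpy as np


def richness_map(aohs):
    """Per-pixel count of species whose AOH is non-zero there."""
    # Each AOH becomes a boolean presence layer; summing them gives a count.
    return np.sum([aoh > 0 for aoh in aohs], axis=0)


def endemism_map(aohs):
    """Per-pixel geometric mean of each present species' share of its total AOH."""
    richness = richness_map(aohs)
    log_sum = np.zeros(richness.shape, dtype=float)
    for aoh in aohs:
        total = aoh.sum()
        present = aoh > 0
        if total > 0:
            # Proportion of this species' whole AOH that falls in each pixel.
            log_sum[present] += np.log(aoh[present] / total)
    result = np.zeros(richness.shape, dtype=float)
    occupied = richness > 0
    # Geometric mean = exp(mean of logs), taken only over species present.
    result[occupied] = np.exp(log_sum[occupied] / richness[occupied])
    return result
```

Working in log space avoids multiplying many small proportions together, which would underflow quickly when many species share a pixel. Pixels with no species are left at zero in this sketch; how the real tool encodes such pixels is not stated in the diff.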

aoh/__init__.py

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
"""
2+
AOH Calculator - A library for calculating Area of Habitat for species distribution mapping.
3+
4+
This package provides tools for:
5+
- Calculating Area of Habitat from species range and habitat data
6+
- Processing habitat data for species analysis
7+
- Species richness and endemism calculations
8+
- Validation of habitat maps
9+
"""
10+
11+
from pathlib import Path
12+
13+
import tomli as tomllib
14+
15+
from .cleaning import tidy_data
16+
17+
try:
18+
from importlib import metadata
19+
__version__: str = metadata.version(__name__)
20+
except ModuleNotFoundError:
21+
pyproject_path = Path(__file__).parent.parent / "pyproject.toml"
22+
with open(pyproject_path, "rb") as f:
23+
pyproject_data = tomllib.load(f)
24+
__version__ = pyproject_data["project"]["version"]
25+
26+
# Only export basic utilities by default
27+
# Heavy dependencies are available via explicit imports
28+
__all__ = [
29+
"tidy_data"
30+
]
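
Reviewer note: the version lookup above imports `tomli` unconditionally, even when the installed package metadata is available. The `except ModuleNotFoundError` clause does catch the missing-metadata case, because `importlib.metadata.PackageNotFoundError` is a subclass of `ModuleNotFoundError`. One possible alternative, sketched below, defers the TOML import to the fallback path. The helper name `resolve_version` and its parameters are hypothetical and are not part of this commit.

```python
from importlib import metadata


def resolve_version(package_name, pyproject_path):
    """Return the installed version, or fall back to pyproject.toml."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # Only import a TOML parser when the fallback is actually needed.
        try:
            import tomllib  # standard library from Python 3.11
        except ModuleNotFoundError:
            import tomli as tomllib  # backport for older interpreters
        with open(pyproject_path, "rb") as f:
            return tomllib.load(f)["project"]["version"]
```

With this shape, `tomli` would only be required when running from a source checkout on an interpreter older than 3.11, which includes the 3.10 used in the workflow matrix above.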

aohcalc.py renamed to aoh/aohcalc.py

Lines changed: 0 additions & 3 deletions
@@ -7,10 +7,7 @@
77
from pathlib import Path
88
from typing import Dict, List, Optional, Set, Union
99

10-
# import pyshark # pylint: disable=W0611
11-
import numpy as np
1210
import pandas as pd
13-
import yirgacheffe.operators as yo # type: ignore
1411
from yirgacheffe.layers import RasterLayer, VectorLayer, ConstantLayer, UniformAreaLayer # type: ignore
1512
from geopandas import gpd # type: ignore
1613
from alive_progress import alive_bar # type: ignore
File renamed without changes.

aoh/py.typed

Whitespace-only changes.
