
Commit 3c8e47b

Fix 8 critical bugs in nwpeval.py
- Fix compute_fss(): proper spatial rolling with dimension detection
- Fix compute_bss(): correct climatology Brier Score formula
- Fix compute_eds(): correct EDS formula with log ratios
- Fix compute_rpss(): safe implementation for binary case
- Fix compute_aev(): correct formula (1 - err_var/obs_var)
- Fix compute_metrics(): updated FSS call with spatial_dims parameter
- Remove commented-out GS code (duplicate of GSS)
- Remove 'GS' from example1.py metrics list
- Clarify harmonic/geometric mean docstrings
0 parents  commit 3c8e47b

20 files changed, 3708 insertions(+), 0 deletions(-)
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
name: Upload Python Package

on:
  push:
    branches:
      - main
      - release/*
  release:
    types: [published]

permissions:
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3 # updated version

      - name: Set up Python
        uses: actions/setup-python@v3 # updated version
        with:
          python-version: '3.x'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build

      - name: Build package
        run: python -m build

      - name: Publish package to PyPI
        if: github.event_name == 'release' && github.event.action == 'published'
        uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}

.gitignore

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
# Compiled Python files
*.pyc
*.pyo
__pycache__/

# Distribution and build directories
dist/
build/
*.egg-info/

# Virtual environment
venv/
env/
.env/

# IDE and editor files
.vscode/
.idea/
*.sublime-project
*.sublime-workspace

# Jupyter Notebook checkpoints
.ipynb_checkpoints/

# macOS files
.DS_Store

# Log files
*.log

CHANGELOG.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# Changelog

## Version 1.5.1beta5

### MAJOR REVISION OF CODE

Made a major fix to `compute_rpss` so that it works for both scalar and non-scalar `base_rate` values.

```python
def compute_rpss(self, threshold, dim=None):
    """
    Compute the Ranked Probability Skill Score (RPSS) for a given threshold.

    Args:
        threshold (float): The threshold value for binary classification.
        dim (str, list, or None): The dimension(s) along which to compute the RPSS.
            If None, compute the RPSS over the entire data.

    Returns:
        xarray.DataArray: The computed RPSS values.
    """
    # Convert data to binary based on the threshold
    obs_binary = (self.obs_data >= threshold).astype(int)
    model_binary = (self.model_data >= threshold).astype(int)

    # Calculate the RPS for the model data
    rps_model = ((model_binary.cumsum(dim) - obs_binary.cumsum(dim)) ** 2).mean(dim=dim)

    # Calculate the RPS for the climatology (base rate)
    base_rate = obs_binary.mean(dim=dim)
    rps_climo = ((xr.full_like(model_binary, 0).cumsum(dim) - obs_binary.cumsum(dim)) ** 2).mean(dim=dim)
    rps_climo = rps_climo + base_rate * (1 - base_rate)

    # Calculate the RPSS
    rpss = 1 - rps_model / rps_climo

    return rpss
```

The updated `compute_rpss` method works correctly for both scalar and non-scalar `base_rate` values.
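
To make the arithmetic concrete, here is a minimal sketch that copies the logic of `compute_rpss` into a standalone function and applies it to a tiny one-dimensional series. The function name `rpss_sketch` and the sample values are assumptions made for illustration; they are not part of the package.

```python
import xarray as xr


def rpss_sketch(obs, model, threshold, dim):
    """Standalone copy of the compute_rpss logic (hypothetical helper, illustration only)."""
    # Binarise both fields at the threshold
    obs_binary = (obs >= threshold).astype(int)
    model_binary = (model >= threshold).astype(int)

    # Squared difference of the cumulative sums, averaged over `dim`
    rps_model = ((model_binary.cumsum(dim) - obs_binary.cumsum(dim)) ** 2).mean(dim=dim)

    # Reference score built from an all-zero forecast plus the base-rate term
    base_rate = obs_binary.mean(dim=dim)
    rps_climo = ((xr.full_like(model_binary, 0).cumsum(dim) - obs_binary.cumsum(dim)) ** 2).mean(dim=dim)
    rps_climo = rps_climo + base_rate * (1 - base_rate)

    return 1 - rps_model / rps_climo


# Made-up sample data: four time steps, threshold of 1.0
obs = xr.DataArray([0.0, 2.0, 3.0, 0.5], dims="time")
model = xr.DataArray([0.2, 1.5, 0.1, 0.3], dims="time")

rpss = rpss_sketch(obs, model, threshold=1.0, dim="time")
print(float(rpss))  # approximately 0.8 for this sample
```

With these values the binarised series are `[0, 1, 1, 0]` and `[0, 1, 0, 0]`, so `rps_model` is 0.5, `rps_climo` is 2.25 + 0.25 = 2.5, and the score is 1 - 0.5 / 2.5.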

In the context of xarray, a scalar is a single 0-dimensional value that does not depend on any dimension. A non-scalar value is an array or DataArray that depends on one or more dimensions and carries the corresponding coordinates.

Consider an example to illustrate the difference:

Suppose we have a dataset with dimensions "time", "lat", and "lon" that contains a variable "temperature" with coordinates for each dimension.

- Scalar value: If we calculate the mean temperature over all dimensions using `temperature.mean()`, the result is a scalar: a single value that does not depend on any dimension.

- Non-scalar value: If we calculate the mean temperature over a specific dimension, such as `temperature.mean(dim="time")`, the result is a non-scalar DataArray. It keeps the dimensions "lat" and "lon" and their coordinates, but no longer depends on the "time" dimension.
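
The two cases can be checked directly. The following is a small sketch on a made-up `temperature` array; the shape and coordinate values are assumptions chosen only to keep the example short.

```python
import numpy as np
import xarray as xr

# Made-up data: 2 time steps, 3 latitudes, 4 longitudes
temperature = xr.DataArray(
    np.arange(24, dtype=float).reshape(2, 3, 4),
    dims=("time", "lat", "lon"),
    coords={
        "time": [0, 1],
        "lat": [10.0, 20.0, 30.0],
        "lon": [0.0, 1.0, 2.0, 3.0],
    },
    name="temperature",
)

# Mean over every dimension: a 0-dimensional (scalar) DataArray
overall_mean = temperature.mean()

# Mean over "time" only: a DataArray that still has "lat" and "lon"
time_mean = temperature.mean(dim="time")

print(overall_mean.dims, overall_mean.ndim)
print(time_mean.dims, time_mean.shape)
```

Here `overall_mean` has no dimensions at all, while `time_mean` keeps the `("lat", "lon")` dimensions with shape `(3, 4)`.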

In the updated `compute_rpss` method, the line `base_rate = obs_binary.mean(dim=dim)` calculates the mean of `obs_binary` over the specified dimensions `dim`. If `dim` is None, the mean is taken over all dimensions and the result is a scalar. If `dim` is a specific dimension or a list of dimensions, the mean is taken over those dimensions only and the result is a non-scalar DataArray.

The subsequent lines of code in the `compute_rpss` method handle both cases correctly:

```python
rps_climo = ((xr.full_like(model_binary, 0).cumsum(dim) - obs_binary.cumsum(dim)) ** 2).mean(dim=dim)
rps_climo = rps_climo + base_rate * (1 - base_rate)
```

Both `rps_climo` and `base_rate` are reduced over the same `dim`, so they share the same remaining dimensions. If `base_rate` is a scalar, it is broadcast against `rps_climo` and the calculation is performed element-wise. If `base_rate` is a non-scalar DataArray, it is aligned with `rps_climo` on their common dimensions, and the calculation is again performed element-wise.
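
As a rough illustration of the two cases, the sketch below adds the `base_rate * (1 - base_rate)` term to a stand-in field. The array `field` is a hypothetical placeholder for `rps_climo`, and all values are made up.

```python
import numpy as np
import xarray as xr

lat = [10.0, 20.0]

# Made-up binary observations on a (time, lat) grid
obs_binary = xr.DataArray(
    np.array([[0, 1], [1, 1], [0, 1], [1, 0]]),
    dims=("time", "lat"),
    coords={"lat": lat},
)

# Hypothetical stand-in for rps_climo after reducing over "time"
field = xr.DataArray([2.0, 3.0], dims="lat", coords={"lat": lat})

# Scalar base rate (mean over all dimensions): broadcast to every latitude
scalar_rate = obs_binary.mean()
with_scalar = field + scalar_rate * (1 - scalar_rate)

# Per-latitude base rate (mean over "time"): aligned on the "lat" dimension
per_lat_rate = obs_binary.mean(dim="time")
with_per_lat = field + per_lat_rate * (1 - per_lat_rate)

print(with_scalar.values)
print(with_per_lat.values)
```

In both cases the result keeps the `lat` dimension of `field`; only the added term differs, being the same number everywhere in the scalar case and one number per latitude otherwise.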

Does this work with data on different coordinates? The updated `compute_rpss` method works as long as the dimensions and coordinates of `obs_binary` and `model_binary` are compatible, because it relies on xarray's broadcasting and alignment rules.

However, if the coordinates of `obs_binary` and `model_binary` only partly overlap, xarray's default arithmetic keeps just the labels common to both (an inner join), so the result may silently cover fewer points than expected; if they do not overlap at all, the result is empty. In such cases, align or resample the data onto common coordinates before calling `compute_rpss`.
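
A quick way to see what mismatched coordinates do is to subtract two small arrays whose `lat` labels only partly overlap. This is a sketch with made-up values; checking with `xr.align(..., join="exact")` is one possible guard, not something the package is known to do.

```python
import xarray as xr

# Made-up arrays whose "lat" labels overlap only at 20.0 and 30.0
obs = xr.DataArray([1, 0, 1], dims="lat", coords={"lat": [10.0, 20.0, 30.0]})
model = xr.DataArray([1, 1, 0], dims="lat", coords={"lat": [20.0, 30.0, 40.0]})

# xarray arithmetic aligns on the intersection of labels by default,
# so the non-matching latitudes are dropped without a warning
diff = model - obs
print(diff.lat.values)  # only the shared labels remain

# Optional guard: require identical coordinates before computing a score
try:
    xr.align(obs, model, join="exact")
    coords_match = True
except ValueError:
    coords_match = False

print(coords_match)
```

Running a check like this before computing a score turns a silent loss of grid points into an explicit failure that can be handled by regridding or reindexing.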

In summary, the updated `compute_rpss` method works for both scalar and non-scalar `base_rate` values, and it handles `obs_binary` and `model_binary` correctly as long as their dimensions and coordinates are compatible.

### Bug Fixes

- Fixed minor bugs and improved code stability.

### Other Changes

- The package has moved from the 3-Alpha stage to the 4-Beta stage of development, indicating that it has undergone further testing and refinement.

Please note that this is a beta release (version 1.5.1beta5). Although it includes significant enhancements and bug fixes, it may still have limitations or issues. We encourage users to provide feedback and report any bugs they encounter.

We appreciate your interest in the NWPeval package and thank you for your support!

LICENSE

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
MIT License

Copyright (c) 2024 DEBASISH MAHAPATRA

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

If the Software is used in any scientific publications or research, proper
citation must be provided as follows:

Mahapatra, D. (2024). NWPeval: A Python Package for Evaluating Numerical Weather
Prediction Models. GitHub repository. https://github.com/Debasish-Mahapatra/nwpeval
