Skip to content

Commit edd86a1

Browse files
committed
Update docs
1 parent aca212c commit edd86a1

File tree

6 files changed

+378
-42
lines changed

6 files changed

+378
-42
lines changed

docs/get-started/data-model.qmd

Lines changed: 29 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -66,9 +66,9 @@ When you pass an `xarray.Dataset` (rather than a `DataArray`), the code searches
6666

6767
| Variable type | Auto-detected names |
6868
|--------------|---------------------|
69-
| Precipitation | `precip`, `prcp`, `pr`, `ppt`, `rainfall` |
69+
| Precipitation | `precip`, `prcp`, `precipitation`, `pr`, `ppt`, `rainfall` |
7070
| PET | `pet`, `eto`, `et`, `evap` |
71-
| Temperature | `temp`, `tas`, `t2m` |
71+
| Temperature | `temp`, `tas`, `tasmin`, `tasmax`, `t2m`, `tmean`, `tmin`, `tmax` |
7272

7373
You can always override auto-detection with explicit parameters: `precip_var_name=`, `pet_var_name=`, `temp_var_name=`.
7474

@@ -88,6 +88,33 @@ The code enforces these minimums per grid cell during calibration. If a cell fai
8888
| Minimum non-zero values | 10 | Fitting skipped (NaN output) |
8989
| Maximum zero proportion | 95% | Fitting skipped (NaN output) |
9090

91+
::: {.callout-warning}
92+
## Arid regions and zero precipitation
93+
94+
**Problem:** In deserts and semi-arid regions, many months have zero precipitation. If >95% of months are zero, fitting fails entirely (NaN output). Even with 80-95% zeros, fitting quality degrades.
95+
96+
**Why this matters:** Pearson Type III distribution requires at least 4 non-zero values for L-moments computation. When this fails, the code falls back to Method of Moments, then to a normal approximation — but results may be unreliable.
97+
98+
**Diagnosis:** Use `diagnose_data()` before running SPI/SPEI to check zero proportions:
99+
100+
```python
101+
from distributions import diagnose_data
102+
103+
# Check a single location
104+
ts = precip.isel(lat=50, lon=100).values
105+
diag = diagnose_data(ts)
106+
print(f"Zero proportion: {diag.zero_proportion:.1%}")
107+
print(f"Recommendation: {diag.recommendation}")
108+
```
109+
110+
**Options for arid regions:**
111+
112+
1. **Accept NaN** — If a region is truly hyper-arid, SPI may not be meaningful there
113+
2. **Use longer time scales** — SPI-12 or SPI-24 aggregates more precipitation, reducing zero proportion
114+
3. **Mask desert pixels** — Exclude cells with mean annual precipitation < 50 mm
115+
4. **Use Gamma distribution** — More robust than Pearson III for high-zero data (but still limited)
116+
:::
117+
91118
::: {.callout-tip}
92119
## Quick sanity checks (recommended)
93120
```python

docs/user-guide/spei.qmd

Lines changed: 26 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -192,8 +192,32 @@ print(f"Mean P-PET: {water_balance.mean().values:.1f} mm/month")
192192

193193
### Issue 3: PET > Precipitation Always
194194

195-
**Problem:** In deserts, PET always exceeds precipitation
196-
**Solution:** SPEI is designed for this — shows persistent dry conditions
195+
**Problem:** In deserts, PET always exceeds precipitation
196+
**Solution:** SPEI is designed for this — shows persistent dry conditions
197+
198+
### Issue 4: NaN in Arid Regions (Distribution Fitting Failure)
199+
200+
**Problem:** Hyper-arid grid cells return NaN even with valid data
201+
**Cause:** Pearson Type III requires sufficient non-zero values for L-moments computation. When >95% of water balance values cluster near zero or are identical, fitting fails.
202+
203+
**Diagnosis:**
204+
```python
205+
from distributions import diagnose_data
206+
207+
# Check the water balance distribution
208+
water_balance = (precip - pet).isel(lat=50, lon=100).values
209+
diag = diagnose_data(water_balance)
210+
print(f"Zero proportion: {diag.zero_proportion:.1%}")
211+
print(f"Suitable for Pearson III: {diag.is_suitable_pearson3}")
212+
```
213+
214+
**Solutions:**
215+
216+
- Use **Gamma distribution** instead of Pearson III (more robust for extreme cases)
217+
- Use longer time scales (SPEI-12, SPEI-24)
218+
- Mask hyper-arid pixels where SPEI is not meaningful
219+
220+
See [Data Model — Arid Regions](../get-started/data-model.qmd#arid-regions-and-zero-precipitation) for detailed guidance.
197221

198222
## Visualization
199223

docs/user-guide/spi.qmd

Lines changed: 24 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -242,7 +242,7 @@ if precip.dims != ('time', 'lat', 'lon'):
242242

243243
### Issue 3: Memory Error
244244

245-
**Problem:** Out of memory for large datasets
245+
**Problem:** Out of memory for large datasets
246246
**Solution:** Use Dask-enabled version
247247

248248
```python
@@ -253,6 +253,29 @@ precip = xr.open_dataset('precip.nc', chunks={'time': 100})['precip']
253253
spi_12 = spi(precip, scale=12) # Automatically uses Dask if input is chunked
254254
```
255255

256+
### Issue 4: NaN in Arid Regions
257+
258+
**Problem:** Desert/semi-arid grid cells return NaN even with valid data
259+
**Cause:** Too many zero-precipitation months (>95% zeros triggers fitting failure)
260+
261+
**Diagnosis:**
262+
```python
263+
from distributions import diagnose_data
264+
265+
# Check zero proportion at a location
266+
ts = precip.isel(lat=50, lon=100).values
267+
diag = diagnose_data(ts)
268+
print(f"Zero proportion: {diag.zero_proportion:.1%}")
269+
```
270+
271+
**Solutions:**
272+
273+
- Use longer time scales (SPI-12, SPI-24) to reduce zero proportion
274+
- Mask hyper-arid pixels (mean annual precip < 50 mm)
275+
- Accept that SPI may not be meaningful for true deserts
276+
277+
See [Data Model — Arid Regions](../get-started/data-model.qmd#arid-regions-and-zero-precipitation) for detailed guidance.
278+
256279
## Visualization
257280

258281
### Quick Map

src/config.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -135,9 +135,9 @@ def unit(self) -> str:
135135

136136
# Variable name patterns for auto-detection in NetCDF datasets.
137137
# Add patterns here if your data uses different variable names.
138-
PRECIP_VAR_PATTERNS = ['precip', 'prcp', 'pr', 'ppt', 'rainfall']
138+
PRECIP_VAR_PATTERNS = ['precip', 'prcp', 'precipitation', 'pr', 'ppt', 'rainfall']
139139
PET_VAR_PATTERNS = ['pet', 'eto', 'et', 'evap']
140-
TEMP_VAR_PATTERNS = ['temp', 'tas', 't2m']
140+
TEMP_VAR_PATTERNS = ['temp', 'tas', 'tasmin', 'tasmax', 't2m', 'tmean', 'tmin', 'tmax']
141141

142142

143143
# =============================================================================

src/indices.py

Lines changed: 47 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -582,18 +582,21 @@ def spei(
582582
precip_var_name: Optional[str] = None,
583583
pet_var_name: Optional[str] = None,
584584
temp_var_name: Optional[str] = None,
585-
distribution: str = DEFAULT_DISTRIBUTION
585+
distribution: str = DEFAULT_DISTRIBUTION,
586+
pet_method: str = 'thornthwaite',
587+
temp_min: Optional[Union[np.ndarray, xr.DataArray]] = None,
588+
temp_max: Optional[Union[np.ndarray, xr.DataArray]] = None
586589
) -> Union[xr.DataArray, Tuple[xr.DataArray, Dict[str, np.ndarray]]]:
587590
"""
588591
Calculate Standardized Precipitation Evapotranspiration Index (SPEI).
589592
590593
SPEI uses the water balance (P - PET) instead of just precipitation.
591594
PET can be provided directly or calculated from temperature using
592-
the Thornthwaite method.
595+
Thornthwaite or Hargreaves-Samani method.
593596
594597
:param precip: precipitation data in mm
595598
:param pet: potential evapotranspiration in mm (optional if temperature provided)
596-
:param temperature: temperature in °C for PET calculation (optional if PET provided)
599+
:param temperature: mean temperature in C for PET calculation (optional if PET provided)
597600
:param latitude: latitude for PET calculation (required if using temperature)
598601
:param scale: accumulation period in time steps
599602
:param periodicity: 'monthly' or 'daily'
@@ -608,6 +611,11 @@ def spei(
608611
:param distribution: distribution type ('gamma', 'pearson3', 'log_logistic',
609612
'gev', 'gen_logistic'). Default: 'gamma'.
610613
Note: Pearson III or Log-Logistic are recommended for SPEI.
614+
:param pet_method: PET calculation method ('thornthwaite' or 'hargreaves').
615+
- 'thornthwaite': Uses only mean temperature (default)
616+
- 'hargreaves': Uses mean, min, max temperature (better for arid regions)
617+
:param temp_min: minimum temperature in C (required for Hargreaves method)
618+
:param temp_max: maximum temperature in C (required for Hargreaves method)
611619
:return: SPEI values as xarray DataArray, or tuple (SPEI, params)
612620
613621
Example:
@@ -617,9 +625,13 @@ def spei(
617625
>>> # With Pearson III distribution (recommended for SPEI)
618626
>>> spei_12 = spei(precip_da, pet=pet_da, scale=12, distribution='pearson3')
619627
620-
>>> # With temperature (auto-compute PET)
628+
>>> # With temperature - Thornthwaite method (default)
621629
>>> spei_12 = spei(precip_da, temperature=temp_da, latitude=lat_da, scale=12)
622630
631+
>>> # With temperature - Hargreaves method (better for arid regions)
632+
>>> spei_12 = spei(precip_da, temperature=temp_mean, latitude=lat_da, scale=12,
633+
... pet_method='hargreaves', temp_min=tmin, temp_max=tmax)
634+
623635
>>> # Save and reuse parameters
624636
>>> spei_12, params = spei(precip_da, pet=pet_da, scale=12, return_params=True)
625637
>>> save_fitting_params(params, 'spei_params.nc', scale=12,
@@ -666,11 +678,18 @@ def spei(
666678
pet_array = np.asarray(pet)
667679
elif temperature is not None:
668680
# Compute PET from temperature
669-
_logger.info("Computing PET from temperature using Thornthwaite method")
681+
pet_method = pet_method.lower()
682+
_logger.info(f"Computing PET from temperature using {pet_method.capitalize()} method")
670683

671684
if latitude is None:
672685
raise ValueError("latitude required for PET calculation from temperature")
673686

687+
if pet_method == 'hargreaves' and (temp_min is None or temp_max is None):
688+
raise ValueError(
689+
"Hargreaves method requires temp_min and temp_max parameters. "
690+
"Use pet_method='thornthwaite' if only mean temperature is available."
691+
)
692+
674693
# Handle temperature input
675694
if isinstance(temperature, xr.Dataset):
676695
if temp_var_name is None:
@@ -693,8 +712,13 @@ def spei(
693712
if data_start_year is None:
694713
raise ValueError("data_start_year required for PET calculation")
695714

696-
# Calculate PET
697-
pet_da = calculate_pet(temp_da, latitude, data_start_year)
715+
# Calculate PET using specified method
716+
pet_da = calculate_pet(
717+
temp_da, latitude, data_start_year,
718+
method=pet_method,
719+
temp_min=temp_min,
720+
temp_max=temp_max
721+
)
698722
pet_array = pet_da.values if isinstance(pet_da, xr.DataArray) else pet_da
699723
else:
700724
raise ValueError("Either 'pet' or 'temperature' (with 'latitude') must be provided")
@@ -822,14 +846,17 @@ def spei_multi_scale(
822846
pet_var_name: Optional[str] = None,
823847
temp_var_name: Optional[str] = None,
824848
distribution: str = DEFAULT_DISTRIBUTION,
825-
global_attrs: Optional[Dict] = None
849+
global_attrs: Optional[Dict] = None,
850+
pet_method: str = 'thornthwaite',
851+
temp_min: Optional[Union[np.ndarray, xr.DataArray]] = None,
852+
temp_max: Optional[Union[np.ndarray, xr.DataArray]] = None
826853
) -> Union[xr.Dataset, Tuple[xr.Dataset, Dict[int, Dict[str, np.ndarray]]]]:
827854
"""
828855
Calculate SPEI for multiple time scales.
829856
830857
:param precip: precipitation data
831858
:param pet: potential evapotranspiration (optional if temperature provided)
832-
:param temperature: temperature for PET calculation
859+
:param temperature: mean temperature for PET calculation
833860
:param latitude: latitude for PET calculation
834861
:param scales: list of accumulation scales (e.g., [1, 3, 6, 12])
835862
:param periodicity: 'monthly' or 'daily'
@@ -844,6 +871,9 @@ def spei_multi_scale(
844871
'gev', 'gen_logistic'). Default: 'gamma'
845872
:param global_attrs: optional dict of global attributes to override defaults
846873
(e.g., {'institution': 'My Org', 'source': 'My Project'})
874+
:param pet_method: PET calculation method ('thornthwaite' or 'hargreaves')
875+
:param temp_min: minimum temperature (required for Hargreaves)
876+
:param temp_max: maximum temperature (required for Hargreaves)
847877
:return: Dataset with SPEI for all scales
848878
849879
Example:
@@ -852,6 +882,10 @@ def spei_multi_scale(
852882
>>> # With Pearson III (recommended for SPEI)
853883
>>> spei_ds = spei_multi_scale(precip_da, pet=pet_da, scales=[3, 12],
854884
... distribution='pearson3')
885+
>>> # With Hargreaves PET
886+
>>> spei_ds = spei_multi_scale(precip_da, temperature=tmean, latitude=lat,
887+
... scales=[3, 12], pet_method='hargreaves',
888+
... temp_min=tmin, temp_max=tmax)
855889
"""
856890
dist = distribution.lower()
857891
_logger.info(f"Computing SPEI for scales: {scales} (distribution={dist})")
@@ -879,7 +913,10 @@ def spei_multi_scale(
879913
precip_var_name=precip_var_name,
880914
pet_var_name=pet_var_name,
881915
temp_var_name=temp_var_name,
882-
distribution=dist
916+
distribution=dist,
917+
pet_method=pet_method,
918+
temp_min=temp_min,
919+
temp_max=temp_max
883920
)
884921

885922
var_name_out = get_variable_name('spei', s, periodicity, distribution=dist)

0 commit comments

Comments
 (0)