Commit 4b90809

Add files via upload
1 parent 41923bc commit 4b90809

2 files changed

+245
-2
lines changed

README.md

Lines changed: 204 additions & 2 deletions
@@ -1,2 +1,204 @@
1-
# MannKS
2-
MannKS (Mann-Kendall Sen slope) - Robust Trend Analysis in Python
1+
<div align="center">
2+
<img src="assets/logo.png" alt="MannKS Logo" width="600"/>
3+
4+
# MannKS
5+
### (Mann-Kendall Sen)
6+
7+
**Robust Trend Analysis in Python**
8+
</div>
9+
10+
---
11+
12+
## 📦 Installation
13+
14+
```bash
15+
pip install -r requirements.txt
16+
pip install -e .
17+
```
18+
19+
**Requirements:** Python 3.7+, NumPy, pandas, SciPy, Matplotlib, seaborn
20+
21+
---
22+
23+
## ✨ What is MannKS?
24+
25+
**MannKS** (Mann-Kendall Sen) is a Python package for detecting trends in time series data using non-parametric methods. It's specifically designed for environmental monitoring, water quality analysis, and other fields where data is messy, irregular, or contains detection limits.
26+
27+
### When to Use MannKS
28+
29+
Use this package when your data has:
30+
- **Irregular sampling intervals** (daily → monthly → quarterly)
31+
- **Censored values** (measurements like `<5` or `>100`)
32+
- **Seasonal patterns** you need to account for
33+
- **Non-normal distributions** (non-parametric methods don't assume normality)
34+
- **Small to moderate sample sizes** (n < 5,000 recommended)
35+
36+
**Don't use** MannKS on highly autocorrelated data (test for autocorrelation first) or on series with more than 46,340 observations.
37+
38+
---
39+
40+
## 🚀 Quick Start
41+
42+
```python
43+
import pandas as pd
44+
from MannKS import prepare_censored_data, trend_test
45+
46+
# 1. Prepare data with censored values
47+
# Converts strings like '<5' into a structured format
48+
values = [10, 12, '<5', 14, 15, 18, 20, '<5', 25, 30]
49+
dates = pd.date_range(start='2020-01-01', periods=len(values), freq='ME')  # 'ME' (month end) requires pandas >= 2.2; use 'M' on older versions
50+
data = prepare_censored_data(values)
51+
52+
# 2. Run trend test
53+
# slope_scaling converts slope from "per second" to "per year"
54+
result = trend_test(
55+
    x=data,
56+
    t=dates,
57+
    slope_scaling='year',
58+
    x_unit='mg/L',
59+
    plot_path='trend.png'
60+
)
61+
62+
# 3. Interpret results
63+
print(f"Trend: {result.classification}")
64+
print(f"Slope: {result.slope:.2f} {result.slope_units}")
65+
print(f"Confidence: {result.C:.2%}")
66+
```
67+
68+
**Output:**
69+
```
70+
Trend: Highly Likely Increasing
71+
Slope: 24.57 mg/L per year
72+
Confidence: 98.47%
73+
```
74+
75+
![Trend Analysis Plot](assets/quick_start_trend.png)
76+
77+
---
78+
79+
## 🎯 Key Features
80+
81+
### Core Functionality
82+
- **Mann-Kendall Trend Test**: Detect monotonic trends with statistical significance
83+
- **Sen's Slope Estimator**: Calculate trend magnitude with confidence intervals
84+
- **Seasonal Analysis**: Separate seasonal signals from long-term trends
85+
- **Regional Aggregation**: Combine results across multiple monitoring sites
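
As a rough illustration of the first two items, the Mann-Kendall S statistic and Sen's slope can be written in a few lines of standard-library Python. This is a minimal sketch for uncensored data, not the MannKS implementation: the function names `mann_kendall_s` and `sen_slope` are made up for this example, and ties, censoring, and the variance correction are deliberately left out.

```python
from itertools import combinations
from statistics import median


def mann_kendall_s(x):
    """Mann-Kendall S: count of increasing pairs minus decreasing pairs."""
    s = 0
    for earlier, later in combinations(x, 2):  # pairs in time order
        s += (later > earlier) - (later < earlier)
    return s


def sen_slope(x, t):
    """Sen's slope: median of all pairwise slopes, using real time gaps."""
    slopes = [
        (x[j] - x[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(x)), 2)
        if t[j] != t[i]  # skip pairs sampled at the same time
    ]
    return median(slopes)


# Hypothetical, unequally spaced series (times in arbitrary units).
x = [2.0, 3.0, 1.0, 5.0, 6.0]
t = [0.0, 1.0, 3.0, 4.0, 8.0]
print(mann_kendall_s(x), sen_slope(x, t))
```

Dividing by the actual time differences, rather than by rank differences, is what lets the slope stay meaningful when sampling is irregular.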
86+
87+
### Data Handling
88+
- **Censored Data Support**: Native handling of detection limits (`<5`, `>100`)
89+
  - Three methods: Standard, LWP-compatible, Akritas-Theil-Sen (ATS)
90+
  - Handles left-censored, right-censored, and mixed censoring
91+
- **Unequal Spacing**: Uses actual time differences (not just rank order)
92+
- **Missing Data**: Automatically handles NaN values and missing seasons
93+
- **Temporal Aggregation**: Multiple strategies for high-frequency data
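
To make the idea of censored values concrete, here is a small sketch of how strings such as `<5` or `>100` might be split into a numeric value and a censoring flag. The helper `parse_censored` and the labels `'lt'`, `'gt'`, and `'none'` are hypothetical and chosen for this example only; the actual structure returned by `prepare_censored_data` may differ.

```python
def parse_censored(raw):
    """Split a raw measurement into (numeric value, censoring label).

    Labels are illustrative: 'lt' = left-censored (below a detection
    limit), 'gt' = right-censored (above a limit), 'none' = observed.
    """
    text = str(raw).strip()
    if text.startswith('<'):
        return float(text[1:]), 'lt'
    if text.startswith('>'):
        return float(text[1:]), 'gt'
    return float(text), 'none'


values = [10, 12, '<5', 14, '>100']
parsed = [parse_censored(v) for v in values]
for value, label in parsed:
    print(value, label)
```

Keeping the limit and the censoring direction as separate fields is what allows a pairwise comparison to be treated as a tie when the ordering of two values cannot be determined.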
94+
95+
### Statistical Features
96+
- **Continuous Confidence**: Reports likelihood ("Highly Likely Increasing") not just p-values
97+
- **Data Quality Checks**: Automatic warnings for tied values, long runs, insufficient data
98+
- **Robust Methods**: ATS estimator for heavily censored data
99+
- **Flexible Testing**: Kendall's Tau-a or Tau-b, custom significance levels
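
One common way to turn the Mann-Kendall S statistic into a continuous confidence in trend direction is the normal approximation with `C = 1 - p/2`, where `p` is the two-sided p-value. The sketch below assumes that formulation and assumes no ties or censored values; the README does not spell out how MannKS computes `result.C`, and `direction_confidence` is a name invented for this example.

```python
import math


def direction_confidence(s, n):
    """Confidence that the trend direction matches the sign of S.

    Assumes no ties, so Var(S) = n(n-1)(2n+5)/18, and applies the usual
    continuity correction before the normal approximation.
    """
    if s == 0:
        return 0.5  # no evidence for either direction
    var_s = n * (n - 1) * (2 * n + 5) / 18
    z = (abs(s) - 1) / math.sqrt(var_s)
    # Standard normal CDF at z, which equals 1 - p/2 for two-sided p.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


print(direction_confidence(s=10, n=5))
```

Unlike a significance threshold, this value moves smoothly from 0.5 (no directional evidence) towards 1, which is what allows descriptive labels to be attached to ranges of confidence.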
100+
101+
---
102+
103+
## 📊 Example Use Cases
104+
105+
### Seasonal Water Quality Trend
106+
```python
107+
from MannKS import seasonal_trend_test, check_seasonality
108+
109+
# Check if seasonality exists
110+
seasonality = check_seasonality(x=data, t=dates, period=12, season_type='month')
111+
print(f"Seasonal pattern detected: {seasonality.is_seasonal}")
112+
113+
# Run seasonal trend test
114+
result = seasonal_trend_test(
115+
    x=data,
116+
    t=dates,
117+
    period=12,
118+
    season_type='month',
119+
    agg_method='robust_median',  # Aggregates multiple samples per month
120+
    slope_scaling='year'
121+
)
122+
```
123+
124+
### Regional Analysis Across Sites
125+
```python
126+
from MannKS import regional_test, trend_test
127+
128+
# Run trend tests for each site (site_data maps site name -> values; assumed loaded earlier)
129+
site_results = []
130+
for site in ['Site_A', 'Site_B', 'Site_C']:
131+
    result = trend_test(x=site_data[site], t=dates)
132+
    site_results.append({
133+
        'site': site,
134+
        's': result.s,
135+
        'C': result.C
136+
    })
137+
138+
# Aggregate regional trend
139+
regional = regional_test(
140+
trend_results=pd.DataFrame(site_results),
141+
time_series_data=all_site_data,
142+
site_col='site'
143+
)
144+
print(f"Regional trend: {regional.DT}, confidence: {regional.CT:.2%}")
145+
```
146+
147+
---
148+
149+
## ⚠️ Important Limitations
150+
151+
### Sample Size
152+
- **Recommended maximum: n = 5,000** (larger samples trigger a memory warning)
153+
- **Hard limit: n = 46,340** (prevents integer overflow)
154+
- For larger datasets, use `regional_test()` to aggregate multiple smaller sites
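
The two thresholds can be checked with simple arithmetic. The README does not say which quantity overflows; one reading that matches the number exactly is that n² must fit in a signed 32-bit integer, and the sketch below works under that assumption. The memory estimate likewise assumes one 8-byte float per pairwise comparison, which may not match what MannKS actually allocates.

```python
import math

INT32_MAX = 2**31 - 1  # largest signed 32-bit integer


def n_pairs(n):
    """Number of pairwise comparisons among n observations."""
    return n * (n - 1) // 2


# Largest n whose square still fits in int32 (assumed source of the limit).
largest_n = int(math.sqrt(INT32_MAX))
print(largest_n, largest_n**2 <= INT32_MAX, (largest_n + 1) ** 2 > INT32_MAX)

# Rough memory for one float64 array of pairwise values at n = 5,000.
pairs = n_pairs(5000)
print(pairs, round(pairs * 8 / 1e6), "MB")
```

Because the number of pairs grows quadratically, doubling n roughly quadruples both run time and memory.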
155+
156+
### Statistical Assumptions
157+
- **Independence**: Data points must be serially independent
158+
  - Autocorrelation violates this and causes spurious significance
159+
  - Pre-test with ACF or use block bootstrap methods if autocorrelated
160+
- **Monotonic trend**: Cannot detect U-shaped or cyclical patterns
161+
- **Homogeneous variance**: Most powerful when variance is constant over time
162+
163+
---
164+
165+
## 📚 Documentation
166+
167+
### Detailed Guides
168+
- **[Trend Test Parameters](./Examples/Detailed_Guides/trend_test_parameters_guide.md)** - Full parameter reference
169+
- **[Seasonal Analysis](./Examples/Detailed_Guides/seasonal_trend_test_parameters_guide.md)** - Season types and aggregation
170+
- **[Regional Tests](./Examples/Detailed_Guides/regional_test_guide/README.md)** - Multi-site aggregation
171+
- **[Analysis Notes](./Examples/Detailed_Guides/analysis_notes_guide.md)** - Interpreting data quality warnings
172+
- **[Trend Classification](./Examples/Detailed_Guides/trend_classification_guide.md)** - Understanding confidence levels
173+
174+
### Examples
175+
The [Examples](./Examples/README.md) folder contains step-by-step tutorials from basic to advanced usage.
176+
177+
---
178+
179+
## 🔬 Validation
180+
181+
Extensively validated against:
182+
- **LWP-TRENDS R script** (34 test cases, 99%+ agreement)
183+
- **NADA2 R package** (censored data methods)
184+
- Edge cases: missing data, tied values, all-censored data, insufficient samples
185+
186+
See [validation/](./validation/) for detailed comparison reports.
187+
188+
---
189+
190+
## 🙏 Acknowledgments
191+
192+
This package is heavily inspired by the excellent work of **[LandWaterPeople (LWP)](https://landwaterpeople.co.nz/)**. The robust censored data handling and regional aggregation methods are based on their R scripts and methodologies.
193+
194+
---
195+
196+
## 📖 References
197+
198+
1. **Helsel, D.R. (2012).** *Statistics for Censored Environmental Data Using Minitab and R* (2nd ed.). Wiley.
199+
2. **Gilbert, R.O. (1987).** *Statistical Methods for Environmental Pollution Monitoring*. Wiley.
200+
3. **Hirsch, R.M., Slack, J.R., & Smith, R.A. (1982).** Techniques of trend analysis for monthly water quality data. *Water Resources Research*, 18(1), 107-121.
201+
4. **Mann, H.B. (1945).** Nonparametric tests against trend. *Econometrica*, 13(3), 245-259.
202+
5. **Sen, P.K. (1968).** Estimates of the regression coefficient based on a particular kind of rank correlation. *Journal of the American Statistical Association*, 63(324), 1379-1389.
203+
6. **Fraser, C., & Whitehead, A. L. (2022).** Continuous measures of confidence in direction of environmental trends at site and other spatial scales. *Environmental Challenges*, 9, 100601.
204+
7. **Fraser, C., Snelder, T., & Matthews, A. (2018).** State and trends of river water quality in the Manawatu-Whanganui region. Report for Horizons Regional Council.

pyproject.toml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
[build-system]
2+
requires = ["setuptools"]
3+
build-backend = "setuptools.build_meta"
4+
5+
[project]
6+
name = "MannKS"
7+
version = "0.1.0"
8+
description = "Non-parametric trend analysis for unequally spaced time series with censored data"
9+
authors = [{name = "Luke Fullard"}]
10+
readme = "README.md"
11+
requires-python = ">=3.7"
12+
license = {text = "MIT"}
13+
keywords = ["mann-kendall", "sen-slope", "trend-analysis", "censored-data", "time-series"]
14+
classifiers = [
15+
"Development Status :: 4 - Beta",
16+
"Intended Audience :: Science/Research",
17+
"Topic :: Scientific/Engineering :: Mathematics",
18+
"Programming Language :: Python :: 3.7",
19+
"Programming Language :: Python :: 3.8",
20+
"Programming Language :: Python :: 3.9",
21+
"Programming Language :: Python :: 3.10",
22+
]
23+
24+
dependencies = [
25+
"numpy>=1.19.0",
26+
"pandas>=1.1.0",
27+
"scipy>=1.5.0",
28+
"matplotlib>=3.3.0",
29+
"seaborn>=0.11.0",
30+
]
31+
32+
[project.optional-dependencies]
33+
dev = ["pytest>=6.0", "pytest-cov"]
34+
35+
[project.urls]
36+
Homepage = "https://github.com/LukeAFullard/MannKS"
37+
Documentation = "https://github.com/LukeAFullard/MannKS/tree/main/Examples"
38+
Repository = "https://github.com/LukeAFullard/MannKS"
39+
40+
[tool.setuptools]
41+
packages = ["MannKS"]
