Releases: FZJ-IEK3-VSA/tsam
v3.1.2
What's Changed
- durationRepresentation by @jo-omoyele in #172
- fix(deps): update dependency pyomo to >=6.4.8,<=6.10.0 by @renovate[bot] in #170
- Add tests for same clusters as input data in `same_cluster_as_input_data.py`; update CI by @julian-belina in #173
Full Changelog: v3.1.1...v3.1.2
v3.1.1
TSAM v3.1.1 is the first stable v3 release. Versions 3.0.0 and 3.1.0 were removed from PyPI.
It introduces a modern functional API and makes significant improvements to performance.
Plotting, hyperparameter tuning, and overall code quality have also been improved.
For further details, see the migration guide: https://tsam.readthedocs.io/en/latest/migrationGuideDoc.html.
v3.1.0
- Added a `preserve_n_clusters` option to `ExtremeConfig`. When set to `True`, extreme periods count toward `n_clusters` instead of being added on top, giving exact control over the final number of representative periods. Default behavior is unchanged.
- Removed the `matplotlib` dependency; all plotting and notebooks now use `plotly`.
- Reduced the tqdm test pipeline due to the number of versions and the maturity of the library.
- Renamed the examples for more clarity.
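The `preserve_n_clusters` bookkeeping can be illustrated with a small stand-in (a hypothetical helper, not part of tsam): with `preserve_n_clusters=True` the total number of representative periods stays fixed, otherwise each extreme period is appended on top.

```python
def total_representatives(n_clusters: int, n_extremes: int,
                          preserve_n_clusters: bool) -> int:
    """Hypothetical helper mirroring the documented counting rule."""
    if preserve_n_clusters:
        # Extreme periods replace regular clusters, so the total is fixed.
        return n_clusters
    # Default behavior: extreme periods are added on top.
    return n_clusters + n_extremes

print(total_representatives(8, 2, preserve_n_clusters=True))   # 8
print(total_representatives(8, 2, preserve_n_clusters=False))  # 10
```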
v3.0.0
Summary
New Features

- Redesigned API with a functional `aggregate()` function and structured configuration objects (`ClusterConfig`, `SegmentConfig`, `ExtremeConfig`)
- New `AggregationResult` object providing:
  - `cluster_representatives`, `cluster_weights`, `cluster_assignments` properties
  - `reconstruct()` method for time series reconstruction
  - `accuracy` metrics (RMSE, MAE, etc.)
  - Built-in plotting via `result.plot.*`
- `ClusteringResult` for clustering transfer and persistence:
  - Save/load clustering with `to_json()`/`from_json()`
  - Apply saved clustering to new data with `apply()`
  - All clustering state bundled in one immutable object
- Interactive Plotly-based visualizations (`tsam.plot.*`): heatmaps, duration curves, time slices, comparisons
- New `cluster_assignments()` plot to visualize which cluster each period belongs to
- Hyperparameter tuning with `find_optimal_combination()`:
  - Sparse Pareto frontier exploration (faster than exhaustive search)
  - Parallel execution with file-based data passing (memory safe)
- Segment transfer: ability to reuse segment configurations across aggregations
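The idea behind `reconstruct()` can be sketched in plain NumPy (a conceptual sketch, not tsam's implementation): every original period is substituted by the representative of the cluster it was assigned to.

```python
import numpy as np

# Toy setup: 4 periods of 3 timesteps each, clustered into 2 groups.
representatives = np.array([[1.0, 2.0, 3.0],      # representative of cluster 0
                            [10.0, 20.0, 30.0]])  # representative of cluster 1
cluster_assignments = [0, 1, 1, 0]  # cluster index of each original period

# Reconstruction: replace each period with its cluster's representative.
reconstructed = np.concatenate(
    [representatives[c] for c in cluster_assignments]
)
print(reconstructed)
```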
Documentation

- Comprehensive getting started guide with new API examples
- 10 example notebooks moved to `docs/source/examples_notebooks/` for Sphinx integration
- New notebooks: `tuning_example.ipynb`, `aggregation_method_showcase.ipynb`
Chores

- Version bump to 3.0.0
- Made Matplotlib optional; added Plotly (`>=5.0.0`) as the primary plotting backend
- Reorganized test fixtures to `test/data/`
- Added GitHub issue templates (bug report, feature request)
Breaking Changes

- The new API is additive; the existing `TimeSeriesAggregation` class remains unchanged
- Notebooks moved from `examples/` to `docs/source/examples_notebooks/`
New Modules
| Module | Purpose |
|---|---|
| `src/tsam/api.py` | Functional `aggregate()` entry point |
| `src/tsam/config.py` | `ClusterConfig`, `SegmentConfig`, `ExtremeConfig`, `ClusteringResult` dataclasses |
| `src/tsam/result.py` | `AggregationResult` with reconstruction & metrics |
| `src/tsam/plot.py` | Plotly-based visualization functions |
| `src/tsam/tuning.py` | `find_optimal_combination()` with parallelization |
New vs Old API Comparison
```python
# Old API
aggregation = tsam.TimeSeriesAggregation(
    raw, noTypicalPeriods=8, hoursPerPeriod=24,
    clusterMethod='hierarchical',
    extremePeriodMethod='new_cluster',
    addPeakMin=['T'], addPeakMax=['Load']
)
aggregation.createTypicalPeriods()
typical = aggregation.typicalPeriods
```

```python
# New API
result = tsam.aggregate(
    raw, n_clusters=8, period_duration=24,  # or '24h', '1d'
    cluster=ClusterConfig(method="hierarchical"),
    extremes=ExtremeConfig(
        method="new_cluster",
        min_value=["T"],
        max_value=["Load"]
    )
)
typical = result.cluster_representatives
```

Clustering Transfer
```python
# Run aggregation on subset of data
result = tsam.aggregate(df_wind, n_clusters=8)

# Save clustering for later use
result.clustering.to_json("clustering.json")

# Load and apply to different data - all parameters come from the file
clustering = ClusteringResult.from_json("clustering.json")
result2 = clustering.apply(df_all)

# Visualize cluster assignments
result.plot.cluster_assignments()
```

API Naming Conventions
v2 → v3 Parameter Mapping
| Old (v2) | New (v3) | Reason |
|---|---|---|
| `noTypicalPeriods` | `n_clusters` | Clearer terminology: we create clusters, each with one representative. |
| `hoursPerPeriod` | `period_duration` | Accepts int/float (hours) or pandas Timedelta strings (`'24h'`, `'1d'`). |
| `resolution` | `timestep_duration` | More explicit about what "resolution" refers to. |
| `typicalPeriods` | `cluster_representatives` | Renamed to reflect that each cluster has one representative period. |
| `clusterPeriodNoOccur` | `cluster_weights` | Clearer: weights indicate how often each cluster occurs. |
| `clusterOrder` | `cluster_assignments` | "Assignments" better describes mapping periods to clusters. |
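The relationship between `cluster_assignments` and `cluster_weights` can be shown with a tiny stand-in (conceptual only, not tsam code): a cluster's weight is simply the number of original periods assigned to it.

```python
from collections import Counter

# Toy assignment of 6 periods to 3 clusters.
cluster_assignments = [0, 2, 1, 0, 0, 2]

# Weight of each cluster = how often it occurs in the original series.
cluster_weights = dict(Counter(cluster_assignments))
print(cluster_weights)  # {0: 3, 2: 2, 1: 1}
```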
Naming Patterns
| Pattern | Usage | Examples |
|---|---|---|
| `n_*` | Count parameters | `n_clusters`, `n_segments`, `n_timesteps_per_period` |
| `*_assignments` | Mapping arrays (which cluster/segment each item belongs to) | `cluster_assignments`, `segment_assignments` |
| `*_centers` | Representative indices | `cluster_centers` |
| `*_durations` | Duration/length arrays | `segment_durations` |
| `*_duration` | Time length parameters (accept int/float hours or pandas strings) | `period_duration`, `timestep_duration` |
| `find_*` | Search/optimization functions | `find_optimal_combination`, `find_pareto_front` |
| `*Config` | Configuration dataclasses | `ClusterConfig`, `SegmentConfig`, `ExtremeConfig` |
Glossary
Note: The new API uses "cluster representative" instead of "typical period" to better reflect that each cluster is represented by one period.
| Concept | Description |
|---|---|
| Period | A fixed-length time window (e.g., 24 hours = 1 day). The original time series is divided into periods for clustering. |
| Cluster | A group of similar periods. Each cluster has one representative period (`cluster_representatives`). |
| Segment | A subdivision within a period. Consecutive timesteps are grouped into segments to reduce temporal resolution. |
| Timestep | A single time point within a period (e.g., one hour in a 24-hour period). |
| Duration Curve | A sorted representation of values within a period (highest to lowest). Used with `use_duration_curves=True` or `representation="distribution"`. |
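The duration-curve representation from the glossary amounts to sorting a period's values from highest to lowest; a minimal sketch:

```python
import numpy as np

period = np.array([0.2, 0.9, 0.5, 0.1])  # values within one period
duration_curve = np.sort(period)[::-1]   # sort descending: highest to lowest
print(duration_curve)  # [0.9 0.5 0.2 0.1]
```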
Motivation and Context
The tsam API is quite complex and doesn't follow modern Python best practices (naming conventions like lower_case_with_underscores). The large number of parameters makes it hard to discover related options. I grouped parameters into small dataclasses (ClusterConfig, SegmentConfig, ExtremeConfig) for better discoverability and IDE autocomplete.
The same applies to results - I found it unintuitive to retrieve results and relevant metrics. The new AggregationResult object provides a clean interface with properties and methods.
I added Plotly-based plotting methods for quick interactive inspection.
The hyperparameter tuning was useful but unnecessarily slow. I added:
- Sparse Pareto frontier search (explores promising configurations first)
- Parallel execution with file-based data passing (avoids pickling large DataFrames)
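The Pareto frontier idea behind the tuning can be illustrated with a generic dominance filter (a standalone sketch, not tsam's tuning code): a configuration stays on the front if no other configuration is at least as good in both objectives and different, here minimizing both cluster count and error.

```python
def pareto_front(points):
    """Keep (cost1, cost2) points not dominated by any other point (minimization)."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# (n_clusters, rmse) candidates: fewer clusters and lower error are both better.
candidates = [(4, 0.30), (8, 0.12), (8, 0.20), (16, 0.05)]
print(pareto_front(candidates))  # [(4, 0.3), (8, 0.12), (16, 0.05)]
```

The sparse search explores configurations along this frontier first instead of evaluating the full grid.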
I was also missing the option to predefine segments and to transfer clustering results between datasets. The new `ClusteringResult` class with `to_json()`/`from_json()` and `apply()` methods addresses this.
v2.3.9
- Improve time series aggregation speed with segmentation, mentioned in issue #96 by @phil-fzj
- Fix issue #99 reported by @adbuerger
2.3.7
2.3.6
- I have transitioned from `setup.py` to `pyproject.toml`. #89
- Changed the layout from flat to src; see here for the advantages
- Updated the installation description in the `README.md`
- Fixed a couple of deprecation and future warnings, reported in #91
2.3.5
Replaces 2.3.4 due to differences between GitHub and PyPI.
2.3.4
- Extend the reporting if time series tolerances are exceeded and add the option to silence them with a tolerance value.
- Set the default tolerance value to 1e-13.
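A sketch of that kind of tolerance check (hypothetical code, not tsam's actual implementation): deviations at or below the threshold stay silent, larger ones are reported as a warning.

```python
import warnings

def check_deviation(original_sum, reconstructed_sum, tolerance=1e-13):
    """Warn only if the relative deviation exceeds the tolerance (hypothetical)."""
    deviation = abs(original_sum - reconstructed_sum) / abs(original_sum)
    if deviation > tolerance:
        warnings.warn(
            f"Time series deviation {deviation:.2e} exceeds tolerance {tolerance:.0e}"
        )
    return deviation

# A tiny numerical residue below the default tolerance is silenced.
check_deviation(1000.0, 1000.0 + 1e-11)
```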