
Commit b6291cb

jvdd and jonasvdd authored
🖊️ improve readme documentation (#22)
* 🖊️ first draft of updated readme
* ♻️ change to table
* 🧹
* 🖊️ review
* 🧹

Co-authored-by: Jonas Van Der Donckt <[email protected]>
1 parent f0abd6c commit b6291cb

File tree: 1 file changed (+43, -5 lines)


README.md

Lines changed: 43 additions & 5 deletions
@@ -7,13 +7,14 @@
 [![Testing](https://github.com/predict-idlab/tsdownsample/actions/workflows/ci-tsdownsample.yml/badge.svg)](https://github.com/predict-idlab/tsdownsample/actions/workflows/ci-tsdownsample.yml)
 <!-- TODO: codecov -->

-Extremely fast **📈 time series downsampling** for visualization, written in Rust.
+Extremely fast **time series downsampling 📈** for visualization, written in Rust.

 ## Features ✨

 * **Fast**: written in rust with PyO3 bindings
   - leverages optimized [argminmax](https://github.com/jvdd/argminmax) - which is SIMD accelerated with runtime feature detection
   - scales linearly with the number of data points
+    <!-- TODO check if it scales sublinearly -->
   - multithreaded with Rayon (in Rust)
     <details>
       <summary><i>Why we do not use Python multiprocessing</i></summary>
@@ -62,14 +63,51 @@ s_ds = MinMaxLTTBDownsampler().downsample(y, n_out=1000)
 s_ds = MinMaxLTTBDownsampler().downsample(x, y, n_out=1000)
 ```

-## Limitations
+## Downsampling algorithms & API
+
+### Downsampling API 📑
+
+Each downsampling algorithm is a class that implements a `downsample` method.
+The signature of the `downsample` method:
+
+```
+downsample([x], y, n_out, **kwargs) -> ndarray[uint64]
+```
+
+**Arguments**:
+- `x` is optional
+- `x` and `y` are both positional arguments
+- `n_out` is a mandatory keyword argument that defines the number of output values<sup>*</sup>
+- `**kwargs` are optional keyword arguments *(see [table below](#downsampling-algorithms-📈))*:
+  - `parallel`: whether to use multi-threading (default: `False`)<sup>**</sup>
+  - ...
+
+**Returns**: a `ndarray[uint64]` of indices that can be used to index the original data.
+
+<sup>*</sup><i>When there are gaps in the time series, fewer than `n_out` indices may be returned.</i>
+<sup>**</sup><i>`parallel` is not supported for the `LTTBDownsampler`.</i>
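To make the contract above concrete, here is a minimal sketch of how the returned index array is meant to be used. The `every_nth_downsample` function below is a hypothetical stand-in (it is *not* part of tsdownsample); it only mimics the signature and the `ndarray[uint64]` return type:

```python
import numpy as np

def every_nth_downsample(y: np.ndarray, n_out: int) -> np.ndarray:
    """Toy downsampler mirroring the `downsample(y, n_out=...)` contract:
    it returns uint64 *indices* into the original array, not values."""
    return np.linspace(0, len(y) - 1, n_out).astype(np.uint64)

y = np.sin(np.linspace(0, 10, 10_000))
idx = every_nth_downsample(y, n_out=100)
y_ds = y[idx]  # the indices are used to select the downsampled values
```

Because indices (and not values) are returned, the same index array can also be used to slice a matching `x` array.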
+### Downsampling algorithms 📈
+
+The following downsampling algorithms (classes) are implemented:
+
+| Downsampler | Description | `**kwargs` |
+| ---:| --- |--- |
+| `MinMaxDownsampler` | selects the **min and max** value in each bin | `parallel` |
+| `M4Downsampler` | selects the [**min, max, first and last**](https://dl.acm.org/doi/pdf/10.14778/2732951.2732953) value in each bin | `parallel` |
+| `LTTBDownsampler` | performs the [**Largest Triangle Three Buckets**](https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf) algorithm | |
+| `MinMaxLTTBDownsampler` | (*new two-step algorithm 🎉*) first selects `n_out` * `minmax_ratio` **min and max** values, then further reduces these to `n_out` values using the **Largest Triangle Three Buckets** algorithm | `parallel`, `minmax_ratio`<sup>*</sup> |
+
+<sup>*</sup><i>The default value for `minmax_ratio` is 30, which has empirically proven to be a good default. (More details in our upcoming paper.)</i>
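To illustrate the binning idea behind `MinMaxDownsampler`, here is a pure-NumPy sketch (an illustrative reimplementation, not the library's Rust code; it assumes equally sized bins and an even `n_out`):

```python
import numpy as np

def minmax_downsample(y: np.ndarray, n_out: int) -> np.ndarray:
    """Select the min and max value per bin; returns sorted uint64 indices.
    n_out must be even: each of the n_out // 2 bins contributes 2 indices."""
    assert n_out % 2 == 0, "n_out must be even"
    bins = np.array_split(np.arange(len(y), dtype=np.uint64), n_out // 2)
    idxs = []
    for b in bins:
        idxs.append(b[np.argmin(y[b])])  # index of the bin's minimum
        idxs.append(b[np.argmax(y[b])])  # index of the bin's maximum
    return np.sort(np.array(idxs, dtype=np.uint64))

y = np.sin(np.linspace(0, 20, 10_000))
idx = minmax_downsample(y, n_out=200)
```

Selecting both extremes per bin is what preserves the visual envelope of the series; `MinMaxLTTBDownsampler` then feeds such a preselection into LTTB.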
+
+
+## Limitations & assumptions 🚨

 Assumes:
-(i) x-data monotinically increasing (i.e., sorted)
-(ii) no NaNs in the data
+1. `x`-data is (non-strictly) monotonically increasing (i.e., sorted)
+2. no `NaNs` in the data

 ---

 <p align="center">
 👤 <i>Jeroen Van Der Donckt</i>
-</p>
+</p>
