Code accompanying the paper [1]. The approach builds on the time series classification algorithms MiniROCKET [2], MultiROCKET [3], and HYDRA [4] and extends them with explicit time encoding via Hyperdimensional Computing (HDC).
[1] K. Schlegel, D. A. Rachkovskij, D. Kleyko, R. W. Gayler, P. Protzel, and P. Neubert, “Structured temporal representation in time series classification with ROCKETs and Hyperdimensional Computing,” Data Mining and Knowledge Discovery, 2025.
[2] A. Dempster, D. F. Schmidt, and G. I. Webb, “MiniRocket: A Very Fast (Almost) Deterministic Transform for Time Series Classification,” Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 248–257, 2021.
[3] C. W. Tan, A. Dempster, C. Bergmeir, and G. I. Webb, “MultiRocket: multiple pooling operators and transformations for fast and effective time series classification,” Data Mining and Knowledge Discovery, 2022, doi: 10.1007/s10618-022-00844-1.
[4] A. Dempster, D. F. Schmidt, and G. I. Webb, “Hydra: competing convolutional kernels for fast and accurate time series classification,” Data Mining and Knowledge Discovery, vol. 37, no. 5, pp. 1779–1805, 2023, doi: 10.1007/s10618-023-00939-3.
```
.
├── LICENSE
├── README.md
├── requirements.txt
├── configs/
│   ├── defaults.yaml
│   ├── HYDRA.yaml
│   ├── MINIROCKET.yaml
│   ├── MULTIROCKET.yaml
│   └── MULTIROCKET_HYDRA.yaml
├── create_figures_tables/
│   ├── cd_diagram.py
│   ├── create_eval_data.py
│   ├── dataset_shapes_UCR_NEW.csv
│   ├── figure_sec_4_3.py
│   ├── figures_sec_5_2.py
│   ├── figures_sec_5_3.py
│   ├── figures_sec_5_3_example_data.py
│   ├── figures_sec_5_56.py
│   ├── plot_config.py
│   ├── plot_ramifications.py
│   ├── tables/
│   └── images/
├── data/
│   ├── constants.py
│   └── dataset_utils.py
├── experimental_runs/
│   ├── runs_hydra.sh
│   ├── runs_minirocket.sh
│   ├── runs_multirocket.sh
│   └── runs_multirocket_hydra.sh
├── models/
│   ├── HYDRA_utils/
│   ├── Minirocket_utils/
│   ├── Multirocket_utils/
│   ├── Multirocket_HYDRA/
│   ├── hdc_utils.py
│   ├── fit_beta_utils.py
│   ├── Model_Pipeline.py
│   └── model.pth
├── main.py
├── net_trail.py
└── results/
```
The table below provides an overview of the main configuration parameters defined in configs/defaults.yaml. These parameters control dataset loading, encoding, model settings, and experimental execution.
| Category | Parameter | Description |
|---|---|---|
| Dataset and Execution | dataset | Dataset name to use (UCR, UCR_NEW, synthetic, etc.) |
| | dataset_idx | Index of dataset to use (for ensemble-style runs) |
| | complete_UCR | Whether to run all datasets in the UCR collection |
| | complete_UCR_NEW | Whether to run all datasets in the UCR_NEW collection |
| | complete_UEA | Whether to run all datasets in the UEA collection |
| | hard_case | Use challenging version of synthetic dataset |
| | dataset_path | Path to dataset folder |
| | results_path | Path to output results |
| | log_level | Logging level (e.g., INFO) |
| General Parameters | model | Model type to run (MINIROCKET, MULTIROCKET, etc.) |
| | variant | Variant of the model (e.g., orig, hdc_oracle, hdc_auto) |
| | seed | Random seed for reproducibility |
| | seed_for_HDC | Whether to seed HDC vector generation |
| | seed_for_splits | Whether to seed data splits |
| | stat_iterations | Number of iterations (e.g., seeds) to evaluate |
| | n_time_measures | How many times to repeat timing for computational runtime |
| | batch_size | Evaluation batch size |
| | max_jobs | Maximum parallel jobs for multiprocessing |
| | vsa | Vector Symbolic Architecture (MAP) |
| | fpe_method | Fractional Power Encoding method (sinusoid) |
| | beta | Temporal encoding beta value |
| | multi_beta | Run multiple beta values at once |
| | best_beta | Automatically select best beta based on grid search |
| | kernels | Number of kernels to use |
| | nan_to_zero | Replace NaNs with zero in data |
| Normalization | normalize_input | Enable input normalization |
| | predictors_min_max_norm | Min-max normalize predictors |
| | predictors_z_score_with_mean | Apply mean normalization |
| | predictors_z_score_with_std | Apply std normalization |
| | predictors_norm | Apply vector length normalization |
| | norm_test_individually | Normalize test samples independently |
| MultiROCKET | multi_dim | Output dimension for MultiROCKET |
| | predictors_sparse_scaler | Use sparse scaler for predictors (if available) |
| Hydra | HDC_dim_hydra | Dimensionality used in HYDRA variants |
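The parameters fpe_method=sinusoid and beta control how a time stamp is mapped to a high-dimensional vector. The following is a minimal, self-contained sketch of sinusoidal fractional power encoding, not the repository's implementation: the dimensionality, frequency distribution, and normalization here are illustrative assumptions. The key property it demonstrates is that nearby time stamps map to similar hypervectors, and beta scales how quickly similarity decays with temporal distance.

```python
import math
import random

def fpe_encode(t, beta, freqs, phases):
    """Encode a scalar time stamp t as a D-dimensional sinusoid vector.

    Each component is sqrt(2)*cos(beta*t*w_j + b_j); with random frequencies
    w_j and random phases b_j this yields a shift-invariant similarity kernel.
    """
    return [math.sqrt(2) * math.cos(beta * t * w + b)
            for w, b in zip(freqs, phases)]

def similarity(u, v):
    """Normalized dot product; close to 1.0 for identical time stamps."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

random.seed(0)
D = 4096                                            # illustrative dimensionality
freqs = [random.gauss(0.0, 1.0) for _ in range(D)]  # random base frequencies
phases = [random.uniform(0.0, 2 * math.pi) for _ in range(D)]

beta = 4.0
v0 = fpe_encode(0.0, beta, freqs, phases)
v1 = fpe_encode(0.1, beta, freqs, phases)
v5 = fpe_encode(0.5, beta, freqs, phases)
# Similarity decays with temporal distance; beta controls the decay rate.
print(similarity(v0, v0), similarity(v0, v1), similarity(v0, v5))
```

With a larger beta the kernel narrows, so temporally distant samples become nearly orthogonal; beta=0 removes the temporal encoding entirely.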
Dependencies
Dependencies are listed in the requirements.txt file.
We recommend running the code in a virtual environment with Python 3.10.
- Clone the repository:
git clone https://github.com/scken/HDC_ROCKETS.git
- Change to the project directory:
cd HDC_ROCKETS
- Install the dependencies. We recommend using mamba (or conda) to create the environment with Python 3.10:
mamba create -n hdc-rockets python=3.10
mamba activate hdc-rockets
mamba install numpy=1.26 scipy=1.10.0 pandas=2.0.3 matplotlib seaborn=0.13.2 scikit-learn=1.5.2 sktime=0.34.0 aeon=0.11.1 h5py
pip install multi-comp-matrix rocket-fft tsai torch-hd pytorch-lightning hydra-core==1.3.2 tables
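After installation, a small check like the following can confirm that the key packages are present (a hedged sketch: the package list is taken from the install commands above, and importlib.metadata is part of the Python standard library):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_versions(packages):
    """Return a mapping of package name -> installed version (None if missing)."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

# Packages from the install commands above.
required = ["numpy", "scipy", "pandas", "scikit-learn", "sktime",
            "aeon", "torch-hd", "pytorch-lightning", "hydra-core"]
for pkg, ver in installed_versions(required).items():
    print(f"{pkg}: {ver or 'MISSING'}")
```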
You can download the datasets and resample indices in two ways:
- Official TSML Benchmark Repository: follow the instructions at https://tsml-eval.readthedocs.io/en/latest/publications/2023/tsc_bakeoff/tsc_bakeoff_2023.html to download the full benchmark datasets (112 + 30) and the corresponding resample indices.
- Direct download from our server: alternatively, download a prepackaged version directly from our cloud server at https://tuc.cloud/index.php/s/z5kZKSxose35sdK:
wget https://tuc.cloud/index.php/s/z5kZKSxose35sdK/download -O dataset.zip
unzip dataset.zip -d data
rm dataset.zip
After downloading, extract and store all files in a single folder, e.g., data/.
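As a quick sanity check that the archive landed in the expected place, something like the following can be used (a sketch using only the standard library; the exact subfolder names inside data/ depend on the downloaded archive and are not specified here):

```python
from pathlib import Path

def list_dataset_folders(root="data"):
    """Return the names of dataset subfolders under the data directory."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist; extract the archive first")
    return sorted(p.name for p in root.iterdir() if p.is_dir())

# Only report if the folder exists, so the snippet is safe to run anywhere.
if Path("data").is_dir():
    print(len(list_dataset_folders("data")), "dataset folders found")
```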
- main.py reads the config files in the configs folder, which contain the parameters for running an experiment (defaults plus model-specific ones)
- all parameters not specified on the command line are taken from the config files
- to overwrite a parameter from the config files, specify it on the command line (e.g., beta=0) or change it in the files themselves
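Conceptually, this override behavior is a dict merge in which command-line values win. The sketch below only illustrates that precedence; the actual parsing, typing, and config composition in main.py are handled by the hydra-core framework:

```python
def resolve_config(defaults, cli_args):
    """Merge config-file defaults with key=value command-line overrides (CLI wins)."""
    config = dict(defaults)
    for arg in cli_args:
        key, _, value = arg.partition("=")
        config[key] = value
    return config

# Illustrative defaults; real values live in configs/defaults.yaml.
defaults = {"variant": "orig", "beta": "2", "dataset": "UCR_NEW"}
config = resolve_config(defaults, ["variant=hdc_auto", "beta=0"])
print(config)  # command-line values replace the defaults, the rest stay
```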
UCR / NEW-UCR: single dataset vs full benchmark & original vs HDC variants
The extended UCR archive, containing the original UCR datasets plus 30 new ones as explained in the paper, is denoted UCR_NEW.
Variants:
- variant=orig → Original MiniROCKET (no HDC temporal encoding).
- variant=hdc_oracle → HDC-MiniROCKET, evaluates a grid of betas and reports the best per dataset ("oracle").
- variant=hdc_auto → HDC-MiniROCKET, automatically selects the best beta during the run.

Scope:
- Single dataset → set dataset_idx=<i>
- Full UCR benchmark → set complete_UCR_NEW=True
Run one dataset with the original MiniROCKET (here: dataset index 0):
python3 main.py --config-name=MINIROCKET variant=orig dataset_idx=0
Run one dataset with HDC‑MiniROCKET (auto beta):
python3 main.py --config-name=MINIROCKET variant=hdc_auto dataset_idx=0
Run the full UCR benchmark with HDC‑MiniROCKET (oracle betas):
python3 main.py --config-name=MINIROCKET variant=hdc_oracle complete_UCR=True
Run the full UCR benchmark with HDC‑MiniROCKET (auto beta):
python3 main.py --config-name=MINIROCKET variant=hdc_auto complete_UCR=True
Note: If you are using the NEW-UCR archive (default in configs/defaults.yaml), add dataset=UCR_NEW to the command line to be explicit, e.g.:
python3 main.py --config-name=MINIROCKET variant=hdc_auto dataset=UCR_NEW complete_UCR_NEW=True
- Run synthetic dataset 1 (normal version):
python3 main.py --config-name=MINIROCKET variant=orig dataset=synthetic1
- Run synthetic dataset 1 (hard version):
python3 main.py --config-name=MINIROCKET variant=orig dataset=synthetic1 hard_case=True
To run multiple model configurations, use the scripts in the experimental_runs folder. They cover the synthetic datasets and the UCR benchmark, each with several model configurations.
e.g.:
sh ./experimental_runs/runs_minirocket.sh
- this will run all synthetic datasets and UCR with different model configurations of MiniROCKET
- the results will be written to results/
- for each run, a new folder is created
- within this folder, the results are saved in the form of Excel spreadsheets, text files for logging, and a JSON file storing the used hyperparameters
- in addition, the code copies all Python files to a "Code" subfolder of the result folder for reproducibility (a snapshot of the code)
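Because each run folder stores its hyperparameters as JSON, completed runs can be gathered programmatically. The sketch below assumes a hypothetical file name params.json; the actual file name written by the code may differ:

```python
import json
from pathlib import Path

def collect_run_params(results_root="results"):
    """Read the hyperparameter JSON of every run folder under results/."""
    runs = {}
    root = Path(results_root)
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        params_file = run_dir / "params.json"  # hypothetical file name
        if params_file.is_file():
            runs[run_dir.name] = json.loads(params_file.read_text())
    return runs

# Only scan if the folder exists, so the snippet is safe to run anywhere.
if Path("results").is_dir():
    for name, params in collect_run_params().items():
        print(name, params)
```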
The folder create_figures_tables/ contains scripts for visualizing results and generating LaTeX tables.
For example, run the following to generate synthetic result plots:
python3 ./create_figures_tables/figures_sec_5_2.py
This project is licensed under the GNU General Public License v3.0.
See the LICENSE file for details.