The code depends heavily on, and is compatible with, the HoTPP framework.
Install with:

```
pip install --no-build-isolation .
```

The code for HT-Transformer can be found at:
- `pretpp/nn/encoder/history_token_transformer.py`
- `pretpp/nn/encoder/history_token_strategy.py`
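The core idea behind HT-Transformer is to accumulate prefix information at dedicated history tokens placed inside the event sequence. A purely illustrative sketch of such insertion is shown below; the token name, the insertion interval, and the function itself are hypothetical and are not part of the repository's API:

```python
# Conceptual sketch only, NOT the repository implementation:
# placeholder tokens are inserted into the event sequence at
# regular intervals, giving the transformer fixed positions at
# which prefix (history) information can be accumulated.
HISTORY_TOKEN = "<HIST>"  # hypothetical placeholder name

def insert_history_tokens(events, every=4):
    """Insert a history token after every `every` events."""
    out = []
    for i, event in enumerate(events, start=1):
        out.append(event)
        if i % every == 0:
            out.append(HISTORY_TOKEN)
    return out

print(insert_history_tokens(["e1", "e2", "e3", "e4", "e5"], every=2))
# -> ['e1', 'e2', '<HIST>', 'e3', 'e4', '<HIST>', 'e5']
```

See `pretpp/nn/encoder/history_token_strategy.py` for the actual insertion strategies used by the code.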
Some datasets are inherited from HoTPP. For these, simply create a symlink to the HoTPP data folder:
```
cd experiments/DATASET
ln -s <hotpp>/experiments/DATASET/data .
```

To build the datasets specific to PreTPP, use the following command:
```
cd experiments/DATASET
spark-submit --driver-memory 16g -c spark.network.timeout=100000s --master 'local[12]' scripts/make-dataset.py
```

All configs are placed in `experiments/DATASET/configs`.
All results are stored in `experiments/DATASET/results`.
Example training of HT-Transformer on the Churn dataset:
```
cd experiments/transactions-rosbank-full-3s
CUDA_VISIBLE_DEVICES=0 python3 -m hotpp.train_multiseed --config-dir configs --config-name next_item_hts_transformer
```

Fine-tune:

```
CUDA_VISIBLE_DEVICES=0 python3 -m hotpp.train_multiseed --config-dir configs --config-name htl_transformer_ft_multi base_name=next_item_hts_transformer
```

Example training of NTP-Transformer on the Taobao dataset:
```
cd experiments/taobao
CUDA_VISIBLE_DEVICES=0 python3 -m hotpp.train_multiseed --config-dir configs --config-name next_item_transformer
```

Fine-tune:

```
CUDA_VISIBLE_DEVICES=0 python3 -m hotpp.train_multiseed --config-dir configs --config-name transformer_ft_multi base_name=next_item_transformer
```

If you encounter problems with downstream evaluation, such as seeing the message "waiting XX unfinished evaluation jobs" while CPU usage remains at zero, try setting the following environment variable:
```
export OMP_NUM_THREADS=1
```

Citation:

```bibtex
@article{karpukhin2025httransformer,
  title={HT-Transformer: Event Sequences Classification by Accumulating Prefix Information with History Tokens},
  author={Karpukhin, Ivan and Savchenko, Andrey},
  journal={arXiv preprint arXiv:2508.01474v1},
  year={2025},
  url={https://arxiv.org/abs/2508.01474v1}
}
```
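As an alternative to exporting it in the shell, the `OMP_NUM_THREADS` workaround above can also be applied from inside a launcher script, provided the variable is set before the numerical libraries are imported. A minimal sketch:

```python
import os

# Setting the variable in Python only takes effect if it happens
# BEFORE numpy/torch are imported, because their OpenMP thread
# pools are sized at import time.
os.environ["OMP_NUM_THREADS"] = "1"

# ... import numpy / torch / hotpp only after this point ...

print(os.environ["OMP_NUM_THREADS"])  # prints: 1
```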