GitHub - Qingrenn/TSFM-ScalingLaws: [ICLR 2025] Official implementation of "Towards Neural Scaling Laws for Time Series Foundation Models"

(ICLR'25) Towards Neural Scaling Laws for Time Series Foundation Models

[Paper Page] [Poster Page] [Chinese-language explainer by 时序人]

1. 🚀 Install dependencies

pip install -r requirements.txt

2. 📚 Prepare the dataset

Download the datasets from Qingren/TSFM-ScalingLaws-Dataset. The directory structure is as follows:

- dataset_train
    |- Lotsa16B
    |- Lotsa1B
    |- Lotsa100M
    |- Lotsa10M
- dataset_test
    |- Lotsa16B
    |- Lotsa1B
    |- Lotsa100M
    |- Lotsa10M
    |- LSF
    |- Monash
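
Before training, it can help to confirm that the download matches this layout. The following is a minimal sketch, not part of the repository: the helper name missing_dirs and the choice of the current directory as the root are assumptions for illustration.

```python
from pathlib import Path

# Expected layout, copied from the directory tree above.
EXPECTED = {
    "dataset_train": ["Lotsa16B", "Lotsa1B", "Lotsa100M", "Lotsa10M"],
    "dataset_test": ["Lotsa16B", "Lotsa1B", "Lotsa100M", "Lotsa10M", "LSF", "Monash"],
}


def missing_dirs(root: Path) -> list[Path]:
    """Return every expected sub-directory that does not exist under root."""
    return [
        root / split / name
        for split, names in EXPECTED.items()
        for name in names
        if not (root / split / name).is_dir()
    ]


if __name__ == "__main__":
    # Assumption: the datasets were downloaded into the current directory.
    for path in missing_dirs(Path(".")):
        print(f"missing: {path}")
```

An empty result means all ten directories are in place.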

Create a .env file to indicate the pretraining dataset paths.

LOTSA_16B_PATH=PATH/TO/LOTSA_16B
LOTSA_1B_PATH=PATH/TO/LOTSA_1B
LOTSA_100M_PATH=PATH/TO/LOTSA_100M
LOTSA_10M_PATH=PATH/TO/LOTSA_10M
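
A typo in one of these paths may only surface once training starts, so a quick standalone check can be useful. The sketch below parses the file with the standard library only; how the repository itself loads .env is not shown here, and the helper names parse_env and unresolved are hypothetical.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def unresolved(entries: dict[str, str]) -> list[str]:
    """Return the keys whose value is not an existing directory."""
    return [key for key, value in entries.items() if not Path(value).is_dir()]


if __name__ == "__main__":
    env_file = Path(".env")  # assumption: .env sits in the working directory
    if env_file.exists():
        for key in unresolved(parse_env(env_file.read_text())):
            print(f"{key} does not point to an existing directory")
```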

The test data has three parts: in-distribution data in dataset_test/Lotsa[DataSize], and out-of-distribution data in dataset_test/LSF and dataset_test/Monash.

Taking the Lotsa16B test data as an example, the storage_path fields in the config file cli/conf/pretrain/val_data/Lotsa16B_multi.yaml specify the test data paths. The defaults are as follows:

- _target_: tsfm.data.builder.ConcatDatasetBuilder
  _args_:
    - _target_: tsfm.data.builder.simple.SimpleEvalDatasetBuilder
      ...
      storage_path: dataset_test/Monash
- _target_: tsfm.data.builder.ConcatDatasetBuilder
  _args_:
    - _target_: tsfm.data.builder.simple.SimpleEvalDatasetBuilder
      ...
      storage_path: dataset_test/LSF
- _target_: tsfm.data.builder.ConcatDatasetBuilder
  _args_:
    - _target_: tsfm.data.builder.simple.SimpleEvalDatasetBuilder
      ...
      storage_path: dataset_test/Lotsa16B
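
If the test data lives somewhere else, every storage_path entry has to be updated. The sketch below lists the entries in a config file and reports whether each one exists. It uses a plain regular expression rather than a YAML parser, which is an assumption that each storage_path sits on its own line as in the excerpt above; the function name storage_paths is hypothetical.

```python
import re
from pathlib import Path

# Matches lines of the form "storage_path: some/dir" at any indentation.
STORAGE_PATH = re.compile(r"^\s*storage_path:\s*(\S+)\s*$", re.MULTILINE)


def storage_paths(yaml_text: str) -> list[str]:
    """Return every storage_path value in the config text, in file order."""
    return STORAGE_PATH.findall(yaml_text)


if __name__ == "__main__":
    config = Path("cli/conf/pretrain/val_data/Lotsa16B_multi.yaml")
    if config.exists():
        for value in storage_paths(config.read_text()):
            status = "ok" if Path(value).is_dir() else "missing"
            print(f"{value}: {status}")
```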

3. 🛠 Training Models

The hyperparameters of the model are defined in cli/conf/pretrain/model/[Model]_[ModelSize].yaml.

The general training config is defined in cli/conf/pretrain/default_[ddp/fsdp]_val.yaml

# train an encoder
python -m cli.train_val -cp conf/pretrain -cn default_ddp_val_enc \
model=encoder_10M \
data=lotsa16B_weighted \
val_data=lotsa16B_lsf_monash \
trainer.logger.project=demo_scalinglaws \
run_name=encoder10M_lotsa16B

# train a decoder
python -m cli.train_val -cp conf/pretrain -cn default_ddp_val_dec \
model=decoder_10M \
data=lotsa16B_weighted \
val_data=lotsa16B_lsf_monash \
trainer.logger.project=demo_scalinglaws \
run_name=decoder10M_lotsa16B
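
A scaling-law study needs many such runs, so it may be convenient to generate the commands. The sketch below reproduces the two commands above from their parts. Only the 10M model and 16B data configs are confirmed by this README; the assumption that other sizes follow the same naming pattern should be checked against the files under cli/conf/pretrain before extending the lists. The helper build_command is hypothetical.

```python
import shlex


def build_command(arch: str, model_size: str, data_size: str) -> list[str]:
    """Build the training command for one (architecture, model size, data size)."""
    suffix = {"encoder": "enc", "decoder": "dec"}[arch]
    return [
        "python", "-m", "cli.train_val",
        "-cp", "conf/pretrain",
        "-cn", f"default_ddp_val_{suffix}",
        f"model={arch}_{model_size}",
        # Assumption: data and val_data configs are named after the data size.
        f"data=lotsa{data_size}_weighted",
        f"val_data=lotsa{data_size}_lsf_monash",
        "trainer.logger.project=demo_scalinglaws",
        f"run_name={arch}{model_size}_lotsa{data_size}",
    ]


if __name__ == "__main__":
    # Only values confirmed above; extend with sizes that have config files.
    model_sizes = ["10M"]
    data_sizes = ["16B"]
    for arch in ("encoder", "decoder"):
        for model_size in model_sizes:
            for data_size in data_sizes:
                print(shlex.join(build_command(arch, model_size, data_size)))
```

Each printed line can be pasted into a shell or passed to a job scheduler.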

4. 📈 Data Analysis

When you train models with different parameter counts and pretraining data sizes, the loss and metrics are logged to wandb. Rename each run in wandb to follow the format [encoder/decoder]_[ModelSize]_[DataSize], for example encoder_10M_16B.
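
The naming format makes it possible to recover the architecture, model size, and data size from the run name alone. The sketch below shows one way to parse it; the K/M/B suffix handling and the function name parse_run_name are assumptions, and the analysis notebooks may parse names differently.

```python
import re

# [encoder/decoder]_[ModelSize]_[DataSize], e.g. encoder_10M_16B
RUN_NAME = re.compile(
    r"^(encoder|decoder)_(\d+(?:\.\d+)?)([KMB])_(\d+(?:\.\d+)?)([KMB])$"
)
SCALE = {"K": 1e3, "M": 1e6, "B": 1e9}


def parse_run_name(name: str) -> tuple[str, int, int]:
    """Split a run name into (architecture, model size, data size)."""
    match = RUN_NAME.match(name)
    if match is None:
        raise ValueError(f"run name does not follow the expected format: {name!r}")
    arch, model_value, model_unit, data_value, data_unit = match.groups()
    model_size = round(float(model_value) * SCALE[model_unit])
    data_size = round(float(data_value) * SCALE[data_unit])
    return arch, model_size, data_size


if __name__ == "__main__":
    print(parse_run_name("encoder_10M_16B"))
```

Names that do not match, such as the default run_name values used during training, raise a ValueError, which makes runs that have not been renamed easy to spot.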

After collecting a series of experiments, download the wandb logs and use the Jupyter notebooks under analysis to fit and visualize the scaling laws.
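
For readers who want a feel for the fitting step before opening the notebooks, here is a minimal sketch of a generic power-law fit, y = a * x^(-alpha), done as a straight-line fit in log-log space. This is an assumption for illustration: the functional form and fitting procedure in the analysis notebooks may differ (for example, by including an irreducible-loss term), and the data points below are synthetic, not experimental results.

```python
import math


def fit_power_law(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares fit of y = a * x**(-alpha) in log-log space.

    Returns (a, alpha).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic points generated from a known power law, for illustration only.
    sizes = [1e6, 1e7, 1e8, 1e9]
    losses = [2.0 * n ** -0.3 for n in sizes]
    a, alpha = fit_power_law(sizes, losses)
    print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```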

5. 📦 Well-trained Models

The trained models are available at PeacefulData/TSFM-ScalingLaws-Checkpoints. You can try them out with the Jupyter notebooks in the demo directory.

Citation

🙋 Please let us know if you find a mistake or have any suggestions!

🌟 If you find the codebase helpful in your research, please consider starring this repository and citing the corresponding paper:

@misc{yao2024towards,
      title={Towards Neural Scaling Laws for Time Series Foundation Models},
      author={Yao, Qingren and Yang, Chao-Han Huck and Jiang, Renhe and Liang, Yuxuan and Jin, Ming and Pan, Shirui},
      year={2024},
      eprint={2410.12360},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2410.12360}
}

@inproceedings{shi2024time,
  title={Time-moe: Billion-scale time series foundation models with mixture of experts},
  author={Shi, Xiaoming and Wang, Shiyu and Nie, Yuqi and Li, Dianqi and Ye, Zhou and Wen, Qingsong and Jin, Ming},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2025}
}

🌟 Please also check out our team’s latest research projects listed below.

TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models [paper]
T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models, in IJCAI 2025. [paper] [GitHub Repo]
Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement, in ACL 2025. [paper] [Hugging Face]
TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis, in ICLR 2025. [paper] [GitHub Repo]

Acknowledgments

Our implementation builds upon the Uni2ts codebase, which we have extensively modified to suit our requirements. We thank its authors for sharing their code and providing related resources, which have been invaluable to this work.
