Evaluation of TimesFM-2.5 and clarification about leakage.

Firstly, thanks a lot for creating this great benchmark. 

We have a question about the leakage column for TimesFM-2.5.

According to the leaderboard, TiReX has 1% leakage and TimesFM-2.5 has 8% leakage. However, according to the model cards of TimesFM-2.5 and TiReX, the datasets they use are:

1. TimesFM-2.5
   - [GiftEvalPretrain](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain)
   - [Wikimedia Pageviews](https://meta.wikimedia.org/wiki/Pageviews_Analysis), cutoff Nov 2023 (see [paper](https://arxiv.org/abs/2310.10688) for details).
   - [Google Trends](https://trends.google.com/trends/) top queries, cutoff EoY 2022 (see [paper](https://arxiv.org/abs/2310.10688) for details).
   - Synthetic and augmented data.

2. TiReX
   - [chronos_datasets](https://huggingface.co/datasets/autogluon/chronos_datasets) (Subset - Zero Shot Benchmark data is not used for training - details in the paper)
   - [GiftEvalPretrain](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain) (Subset - details in the paper)
   - Synthetic Data

Since TimesFM-2.5 does not use chronos_datasets and GiftEvalPretrain is common to both models, we were wondering why the leakage percentages differ. Could you clarify which 7 datasets are in TimesFM-2.5's pretraining corpus but not in TiReX's? This might also affect the evaluation rankings, since results on the leaked datasets are replaced by chronos-bolt numbers. Thanks in advance.

Also, thanks for pointing out the inference-speed issues [here](https://github.com/google-research/timesfm/issues/313). We have made inference around 7x faster.
