Evaluation of TimesFM-2.5 and clarification about leakage. #55

@rajatsen91

Firstly, thanks a lot for creating this great benchmark.

We have a small question about the leakage column for TimesFM-2.5.

According to the leaderboard, TiReX has 1% leakage and TimesFM-2.5 has 8% leakage. However, according to the model cards of TimesFM-2.5 and TiReX, the datasets they use are:

  1. TimesFM-2.5
    --GiftEvalPretrain
    --Wikimedia Pageviews, cutoff Nov 2023 (see paper for details).
    --Google Trends top queries, cutoff EoY 2022 (see paper for details).
    --Synthetic and augmented data.

  2. TiReX
    --chronos_datasets (Subset - Zero Shot Benchmark data is not used for training - details in the paper)
    --GiftEvalPretrain (Subset - details in the paper)
    --Synthetic Data

Since TimesFM-2.5 does not use chronos_datasets, and GiftEvalPretrain is common to both, we were wondering why the leakage percentages differ. Would you be able to clarify which 7 datasets are in TimesFM-2.5's pretraining corpus and not in TiReX's? Thanks in advance. This might also affect the evaluation rankings, since results on the leaked datasets are replaced by chronos-bolt numbers.
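To make the question concrete, here is a minimal sketch of the comparison we have in mind. It assumes leakage is the share of benchmark datasets that also appear in a model's pretraining corpus; that definition, the helper names, and every dataset name below are hypothetical placeholders, not the leaderboard's actual logic or the real corpus contents.

```python
# All names below are hypothetical placeholders, not real benchmark
# or pretraining dataset names.
BENCHMARK = {"bench_a", "bench_b", "bench_c", "bench_d"}


def leaked(pretraining: set[str], benchmark: set[str]) -> set[str]:
    """Benchmark datasets that also appear in a pretraining corpus."""
    return pretraining & benchmark


def leakage_pct(pretraining: set[str], benchmark: set[str]) -> float:
    """Assumed definition: leaked datasets as a percentage of the benchmark."""
    return 100.0 * len(leaked(pretraining, benchmark)) / len(benchmark)


# Two hypothetical models that share one pretraining source.
model_x = {"shared_pretrain", "bench_a", "bench_b"}
model_y = {"shared_pretrain", "bench_a"}

# Datasets flagged as leaked for model_x but not for model_y.
only_in_x = leaked(model_x, BENCHMARK) - leaked(model_y, BENCHMARK)

print(sorted(only_in_x))
print(leakage_pct(model_x, BENCHMARK), leakage_pct(model_y, BENCHMARK))
```

A list like `only_in_x` for TimesFM-2.5 versus TiReX is what we are hoping to get.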

Also, thanks for pointing out the inference-speed issues here. We have since made inference around 7x faster.
