FEAT: Training window filtering #1344

Merged
marcopeix merged 19 commits into main from feature/training-window-filtering on Jul 17, 2025
Conversation

@marcopeix
Contributor

Currently, we create the maximum possible number of training windows, so some windows may contain only one available insample data point and one available outsample data point.

Such windows are low quality: they carry very little real data for training.

This PR adds the parameter available_sample_fractions, which sets the minimum number of available data points a window must contain. The minimum is expressed as a fraction of the input size for the insample part and as a fraction of the horizon for the outsample part.
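
The filtering idea described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code merged in this PR: the function name `filter_windows`, its arguments, and the convention of requiring at least one available point on each side are all hypothetical, and the availability mask is assumed to be 1 for real observations and 0 for padding.

```python
import numpy as np


def filter_windows(available_mask, input_size, h,
                   insample_fraction=0.0, outsample_fraction=0.0):
    """Return start indices of windows with enough available data points.

    Hypothetical helper for illustration only; it does not reproduce
    the neuralforecast implementation.
    """
    window_size = input_size + h
    # One row per candidate window, each of length input_size + h.
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(available_mask), window_size
    )
    # Count available points in the insample and outsample parts.
    insample_count = windows[:, :input_size].sum(axis=1)
    outsample_count = windows[:, input_size:].sum(axis=1)
    # Assumption: a window always needs at least one point on each side.
    min_insample = max(1, int(np.ceil(insample_fraction * input_size)))
    min_outsample = max(1, int(np.ceil(outsample_fraction * h)))
    keep = (insample_count >= min_insample) & (outsample_count >= min_outsample)
    return np.flatnonzero(keep)


# Six real observations, padded with three zeros on the left and one on the right.
mask = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 0])
all_windows = filter_windows(mask, input_size=4, h=2)
strict_windows = filter_windows(mask, input_size=4, h=2,
                                insample_fraction=0.5, outsample_fraction=1.0)
```

With fractions of 0 every window that has at least one point on each side is kept, which corresponds to the behaviour described as the current default. Raising the fractions drops the windows that are mostly padding.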

@marcopeix
Contributor Author

It's basically a rework of #1059. Many thanks to @jasminerienecker for the initial idea!

@marcopeix marcopeix marked this pull request as ready for review June 20, 2025 13:47
@marcopeix marcopeix requested a review from elephaint June 20, 2025 13:47
Contributor

@elephaint elephaint left a comment

Nice work, a few comments!

@marcopeix marcopeix requested a review from elephaint July 9, 2025 16:16
Contributor

@elephaint elephaint left a comment

Thanks, good work! 2 things:

  • It needs a proper test (the current one has no value)
  • I think the implementation needs a minor adjustment for multivariate models (this becomes apparent in a test)

@marcopeix marcopeix requested a review from elephaint July 16, 2025 17:14
elephaint previously approved these changes Jul 17, 2025
Contributor

@elephaint elephaint left a comment

Great work! Good for me if you agree with the final changes I made (they are cosmetic, plus I changed the tests and added explanations to them).

Contributor

@elephaint elephaint left a comment

@marcopeix good to go once you have verified my explanations in the test

@elephaint elephaint dismissed their stale review July 17, 2025 08:43

Incorrect

@elephaint elephaint self-requested a review July 17, 2025 08:43
@marcopeix marcopeix merged commit f0eeab6 into main Jul 17, 2025
18 checks passed
@marcopeix marcopeix deleted the feature/training-window-filtering branch July 17, 2025 14:56
@Antoine-Schwartz

Sampling quality is a really interesting subject. Thank you for taking the time to add this first option, @marcopeix!

By the way, it would be great to have a dedicated documentation section.
Sampling with time series, and particularly the neuralforecast implementation with its mask system, is not easy to understand.
And once you also consider the interactions with other options such as start_padding_enable, you can quickly get lost :)

3 participants