soffer-anyscale
Contributor

Why are these changes needed?

XGBoostTrainer currently has to materialize the full dataset in memory, which limits how far it scales. This PR adds support for XGBoost's external memory feature, so training can iterate over a dataset larger than memory.
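The batch-iteration pattern behind external memory training can be sketched without any dependencies. The example below is a minimal illustration, not the code in this PR: `BatchIter`, `chunked`, and `demo_rows` are hypothetical names. The class only mirrors the `next`/`reset` callback shape of `xgboost.DataIter`, where `next` hands one batch to the supplied `input_data` callback and `reset` rewinds the stream.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (features, label)


def chunked(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of at most batch_size rows without loading all rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


class BatchIter:
    """Hypothetical stand-in that mirrors the next/reset shape of xgboost.DataIter."""

    def __init__(self, make_batches: Callable[[], Iterable[List[Row]]]) -> None:
        # Store a factory, not the batches, so reset() can restart the
        # stream without keeping earlier batches in memory.
        self._make_batches = make_batches
        self._stream: Optional[Iterator[List[Row]]] = None
        self.reset()

    def next(self, input_data: Callable[..., None]) -> bool:
        """Pass one batch to input_data; return False once exhausted."""
        batch = next(self._stream, None)
        if batch is None:
            return False
        features = [list(x) for x, _ in batch]
        labels = [y for _, y in batch]
        input_data(data=features, label=labels)
        return True

    def reset(self) -> None:
        """Rewind to the first batch by rebuilding the stream."""
        self._stream = iter(self._make_batches())


def demo_rows() -> Iterator[Row]:
    """Assumed toy data source: 10 rows generated lazily."""
    return (([float(i), float(i * 2)], float(i % 2)) for i in range(10))
```

A consumer would call `next` repeatedly with its own callback until it returns `False`, then call `reset` before each further pass. Only one batch is held at a time, which is the property that lets training proceed on data that does not fit in memory.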

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: soffer-anyscale <[email protected]>
@soffer-anyscale soffer-anyscale requested a review from a team as a code owner August 12, 2025 22:04
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and valuable feature to XGBoostTrainer, enabling it to handle datasets larger than memory by leveraging XGBoost's external memory capabilities. The implementation is well-structured, with new utility modules for system detection, parameter optimization, and data iteration. The high-level APIs in train_loop_utils.py make this advanced feature very easy to use.

My review includes a few suggestions to improve robustness and clarity. I've pointed out a potential memory issue in the custom iterator, suggested improvements to error handling and docstrings, and noted some minor code quality issues like unused imports. Overall, this is a great contribution that significantly enhances the scalability of XGBoost training in Ray.

@ray-gardener ray-gardener bot added the docs (An issue or change related to documentation) and train (Ray Train Related Issue) labels Aug 14, 2025