fix: error for large sample sizes in Multinomial NB#51

Merged
yoshoku merged 1 commit into yoshoku:main from keegnotrub:multinomial-nb-with-large-sample-sizes
Oct 21, 2025

Conversation

@keegnotrub
Contributor

I was seeing SystemStackError issues with Multinomial NB when running with larger sample sizes. I believe this is due to splatting the samples into Numo::DFloat as params. With large sample sets that param list becomes very large, which overflows Ruby's default stack size.

I think instead we can use Numo::DFloat.cast here, passing in the array, as that will just be a single param.
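
The difference between the two call styles can be sketched in plain Ruby. This is only an illustration under stated assumptions: `build_from_args` and `build_from_array` are hypothetical stand-ins (not part of Numo or Rumale) that mirror how `Numo::DFloat[*rows]` receives one argument per row while `Numo::DFloat.cast(rows)` receives the whole array as one argument. Whether a large splat actually raises SystemStackError depends on the Ruby version and the array size, so the sketch does not try to reproduce the error.

```ruby
# Hypothetical stand-ins for the two call styles discussed above.
# They are NOT the Numo API; they only mirror how arguments are passed.

# Variadic, like Numo::DFloat[*rows]: every row arrives as its own argument.
def build_from_args(*rows)
  rows.length
end

# Single parameter, like Numo::DFloat.cast(rows): the whole array is one argument.
def build_from_array(rows)
  rows.length
end

# Small illustrative data set: 1,000 samples with 2 features each.
rows = Array.new(1_000) { |i| [i.to_f, (i + 1).to_f] }

splat_count  = build_from_args(*rows)  # argument list grows with the sample count
single_count = build_from_array(rows)  # always exactly one argument

puts "rows seen via splat: #{splat_count}"
puts "rows seen via array: #{single_count}"
puts "arity (splat): #{method(:build_from_args).arity}"
puts "arity (array): #{method(:build_from_array).arity}"
```

Both calls see the same data; only the number of arguments placed on the call stack differs, which is why passing the array itself scales to large sample counts.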

@keegnotrub keegnotrub marked this pull request as ready for review October 20, 2025 15:08
@yoshoku
Owner

yoshoku commented Oct 21, 2025

@keegnotrub Thank you for your contribution. The CI is failing due to RuboCop and commitlint validations. Please fix the issues detected by RuboCop. Also, this project adopts conventional commits. Please squash the commits and revise the commit messages accordingly.

@keegnotrub
Contributor Author

> Please squash the commits and revise the commit messages accordingly.

No problem. I should be able to get to this today. CI wasn't running yesterday (AWS issues) so I didn't notice these, apologies!

@keegnotrub keegnotrub force-pushed the multinomial-nb-with-large-sample-sizes branch from 9ed577d to 73e9296 October 21, 2025 15:20
@keegnotrub keegnotrub changed the title from "Fix SystemStackError for large sample sizes in Multinomial NB" to "fix: SystemStackError for large sample sizes in Multinomial NB" Oct 21, 2025
@keegnotrub
Contributor Author

Should be ready now, TY!

I was seeing SystemStackError issues with Multinomial NB when running
with larger sample sizes. I believe this is due to splatting the
samples into `Numo::DFloat` as params. That param list can be very
large with large sample sets, which overflows Ruby's default stack
size.

I think instead we can use `Numo::DFloat.cast` here, passing in the
array, as that will just be a single param.

I've also lifted this cast to the `bin_x` variable so we don't have to
do it per class.
@keegnotrub keegnotrub force-pushed the multinomial-nb-with-large-sample-sizes branch from 73e9296 to 3be5ed0 October 21, 2025 22:45
@keegnotrub keegnotrub changed the title from "fix: SystemStackError for large sample sizes in Multinomial NB" to "fix: error for large sample sizes in Multinomial NB" Oct 21, 2025
Owner

@yoshoku yoshoku left a comment

Thank you for your work 👍

@yoshoku yoshoku merged commit 60d2b48 into yoshoku:main Oct 21, 2025
7 checks passed
@yoshoku
Owner

yoshoku commented Oct 22, 2025

@keegnotrub Thanks for this fix. This has been included in version 2.0.1: https://rubygems.org/gems/rumale-naive_bayes
