
[file_packager] split data files when file size exceeds 2Gi ArrayBuffer limit #24802

Open
wants to merge 15 commits into main

Conversation

arsnyder16
Contributor

@arsnyder16 arsnyder16 commented Jul 29, 2025

Addresses #24691

Currently an ArrayBuffer cannot be larger than Number.MAX_SAFE_INTEGER, and in practice browsers refuse allocations well below that (around 2 GiB). So when a large set of files is bundled with file_packager.py, the resulting .data file cannot be loaded. Breaking the payload up into multiple data files bypasses this issue.

error while handling : http://localhost:3040/Gtest.data Error: Unexpected error while handling : http://localhost:3040/Gtest.data 
RangeError: Array buffer allocation failed
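The splitting idea can be sketched roughly as follows (a minimal illustration only; the helper names, shard naming scheme, and the 2 GiB constant are assumptions for this sketch, not the actual file_packager.py implementation):

```python
# Sketch: carve a packed payload into shards, each under a size limit,
# so no single .data file exceeds what an ArrayBuffer can hold.
CHUNK_LIMIT = 2 * 1024 ** 3  # assumed 2 GiB ArrayBuffer ceiling

def split_into_shards(payload: bytes, limit: int = CHUNK_LIMIT):
    """Yield successive byte slices of `payload`, each at most `limit` bytes."""
    for start in range(0, len(payload), limit):
        yield payload[start:start + limit]

def write_shards(payload: bytes, basename: str, limit: int = CHUNK_LIMIT):
    """Write the payload as basename.data, basename.data.1, ... shards."""
    names = []
    for i, shard in enumerate(split_into_shards(payload, limit)):
        name = basename + ".data" + ("" if i == 0 else f".{i}")
        with open(name, "wb") as f:
            f.write(shard)
        names.append(name)
    return names
```

The loader side would then fetch each shard separately and mount the pieces, so no single fetch ever needs an ArrayBuffer above the limit.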

@arsnyder16
Contributor Author

arsnyder16 commented Jul 29, 2025

@sbc100 Is this aligned with what you were thinking?

Collaborator

@sbc100 sbc100 left a comment

Can we add some tests for this? What are the specific limits you are running into? Are those limits not fixed? i.e. does it need to be configurable?

@arsnyder16
Contributor Author

Can we add some tests for this? What are the specific limits you are running into? Are those limits not fixed? i.e. does it need to be configurable?

I wanted to make sure I was on the right track before adding tests.
The limits are described in the conversation on the issue I mentioned. We can hard-code the limit to 2 GiB as a likely guess at the true limit; if that is what you prefer, it's simpler from a testing perspective. It seemed more flexible to make it configurable for anyone calling the utility, but I get the simplicity argument.

@sbc100
Collaborator

sbc100 commented Jul 29, 2025

This does seem like a reasonable approach.

I'm still a little fuzzy on exactly why and when this might be needed in the real world, I think I need to go re-read our original discussion, but I also think including more information in the PR (i.e. in the description, or in comments) would be good.

@arsnyder16
Contributor Author

This does seem like a reasonable approach.

I'm still a little fuzzy on exactly why and when this might be needed in the real world, I think I need to go re-read our original discussion, but I also think including more information in the PR (i.e. in the description, or in comments) would be good.

Updated the description. Based on that, if you want me to hard-code the limit into the packager, I can take that approach.

@arsnyder16 arsnyder16 changed the title [file_packager] split data files when files exceeds configured limit [file_packager] split data files when file size exceeds 2Gi ArrayBuffer limit Jul 31, 2025
@arsnyder16
Contributor Author

@sbc100 Going to look at adding tests for this. Do you have any recommendations? Would you like me to generate temporary file(s) so that I can reach the 2 GiB limit?

@arsnyder16 arsnyder16 marked this pull request as ready for review August 10, 2025 18:48
@arsnyder16
Contributor Author

@sbc100 This should be ready for review. I'm not sure whether the test failures are flaky or related to my change; I don't see them fail locally.
