16612 download large files #16627
Conversation
Force-pushed from e993ed2 to f42d85e.
I added a new Grid endpoint `/se/files/:name` which allows downloading a file directly, without encoding it to Base64 and adding it to JSON. That transformation kills performance and causes OutOfMemory errors for large files (e.g. 256+ MB). NB! Make sure that the `toString()` method of objects (HttpRequest, HttpResponse, Contents.Supplier) never returns an overly long string: it spams debug logs and can cause OOM during debugging.
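For context, here is a minimal JDK-only sketch of what downloading "directly" means from a client's point of view: the response body is streamed straight to disk instead of being Base64-decoded out of a JSON payload. The host, port, and file name below are assumptions, and this uses the JDK's `java.net.http` client rather than Selenium's own `HttpRequest`/`HttpResponse` types.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class DirectDownloadSketch {
  public static void main(String[] args) throws Exception {
    // Assumed Grid address and file name; the real route shape comes from the PR.
    URI uri = URI.create("http://localhost:4444/se/files/report.zip");

    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder(uri).GET().build();

    // BodyHandlers.ofFile() writes the body to disk as it arrives,
    // so heap usage stays flat even for multi-gigabyte downloads.
    HttpResponse<Path> response =
        client.send(request, HttpResponse.BodyHandlers.ofFile(Path.of("report.zip")));

    System.out.println("Saved to " + response.body() + " (HTTP " + response.statusCode() + ")");
  }
}
```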
…ier` to separate classes
It makes debugging easier: you can easily see what the instances are and where they come from.
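As an illustration of that debugging benefit (not the PR's actual classes), a named supplier might look like the following. The class name is hypothetical, it implements the plain `java.util.function.Supplier` rather than Selenium's `Contents.Supplier`, and its `toString()` deliberately reports metadata only, never the content:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

/** Hypothetical named supplier: visible by name in debuggers and stack traces. */
final class FileContentSupplier implements Supplier<InputStream> {
  private final Path file;

  FileContentSupplier(Path file) {
    this.file = file;
  }

  @Override
  public InputStream get() {
    try {
      return Files.newInputStream(file);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public String toString() {
    // Keep this short: logging frameworks may call toString() on every debug line.
    return "FileContentSupplier(" + file + ")";
  }
}
```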
Instead of reading the whole file into a byte array, save the given InputStream directly to the file. Now it can download large files (I tried 4GB) while consuming very little memory.
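A minimal sketch of that idea using only JDK APIs (the helper name is illustrative, not the PR's code): copy the stream to disk in chunks rather than buffering it in a `byte[]`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class StreamToFile {
  /**
   * Copies the stream to disk in small chunks instead of holding the whole
   * body in memory; heap usage stays constant even for multi-gigabyte files.
   */
  static void save(InputStream in, Path target) throws IOException {
    try (InputStream source = in) {
      Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```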
… deleted
After stopping a Grid node, the folder is deleted asynchronously (by a cache removal listener), so we need to wait for it in the test.
…opped
At least on my machine, stopping the node takes some time, and checks made right after `node.stop(sessionId)` can often fail.
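Both of the last two notes boil down to waiting for an asynchronous effect in a test. A plain-JDK sketch of such a wait (illustrative only; the actual tests may use a different wait utility):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.BooleanSupplier;

final class TestWaits {
  /**
   * Polls until the condition holds or the timeout expires; returns whether it held.
   * Useful after asynchronous cleanup (e.g. a cache removal listener deleting a
   * folder) or right after node.stop(sessionId), when state changes lag behind.
   */
  static boolean await(BooleanSupplier condition, Duration timeout) throws InterruptedException {
    Instant deadline = Instant.now().plus(timeout);
    while (!condition.getAsBoolean()) {
      if (Instant.now().isAfter(deadline)) {
        return false;
      }
      Thread.sleep(100);
    }
    return true;
  }
}
```

For example, `TestWaits.await(() -> !Files.exists(downloadsDir), Duration.ofSeconds(10))` would cover the deleted-folder case (with `java.nio.file.Files` imported).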
Force-pushed from f42d85e to 2c190cc.
User description
🔗 Related Issues
Fixes #16612
💥 What does this PR do?
This PR allows downloading very large files (I tried up to 4GB) from Selenium Grid while consuming very little memory.
🔄 Types of changes
PR Type
Enhancement, Bug fix
Description
- Add new Grid endpoint `/se/files/:name` for direct file downloads without Base64 encoding
- Optimize `RemoteWebDriver.downloadFile()` to stream files directly instead of loading them into memory
- Extract anonymous `Contents.Supplier` implementations into separate named classes for better debugging
- Support downloading large files (tested up to 4GB) with minimal memory consumption
- Add URL decoding support and improve file download error messages with a list of available files (see the sketch after this list)
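A hedged sketch of the URL-decoding and error-message idea from the last bullet, using only JDK APIs; the class, method, and directory names are assumptions, not the PR's code:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DownloadLookup {
  /**
   * Decodes the :name path segment (e.g. "my%20report.zip" -> "my report.zip") and,
   * if the file is missing, fails with a message listing what is actually available.
   */
  static Path resolve(Path downloadsDir, String rawName) throws Exception {
    String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
    Path candidate = downloadsDir.resolve(name);
    if (Files.exists(candidate)) {
      return candidate;
    }
    List<String> available;
    try (Stream<Path> files = Files.list(downloadsDir)) {
      available = files.map(p -> p.getFileName().toString()).collect(Collectors.toList());
    }
    throw new IllegalArgumentException(
        "Cannot find file '" + name + "'. Available files: " + available);
  }
}
```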
Diagram Walkthrough
File Walkthrough
19 files
- Add toString method for better debugging
- Add new GET endpoint for file downloads
- Implement streaming file download with URL decoding
- Add URL decoding utility method
- Extract anonymous Contents.Supplier implementation
- Use FileBackedOutputStreamContentSupplier and thread-safe length tracking (see the sketch after this walkthrough)
- Add new GET_DOWNLOADED_FILE command constant
- Stream file downloads directly without Base64 encoding
- Map new GET_DOWNLOADED_FILE command to endpoint
- Handle binary OCTET_STREAM responses and avoid logging large content
- Extract anonymous bytes supplier implementation
- Change length to long, add file and stream suppliers, deprecate unsafe methods
- New supplier for streaming file content with size metadata
- Add toString and contentAsString methods
- Improve toString to include parent content information
- Use parent toString to avoid loading large content
- New supplier for streaming input with size limit protection
- Use InputStream handler instead of byte array for responses
- Create response with streaming content from InputStream

1 file
- Add test for file name extraction from URI

1 file
- Add jspecify dependency for annotations
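As an aside on the "length to long" and "thread-safe length tracking" items above, here is a JDK-only illustration of the general pattern (the class is hypothetical and is not Selenium's FileBackedOutputStreamContentSupplier): count bytes as they are written so the length can later be reported as a `long` without re-reading or re-buffering the content.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical counting stream: tracks the written length with an AtomicLong. */
final class CountingOutputStream extends FilterOutputStream {
  private final AtomicLong count = new AtomicLong();

  CountingOutputStream(OutputStream out) {
    super(out);
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    count.incrementAndGet();
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // Write the whole chunk at once and count it in bulk.
    out.write(b, off, len);
    count.addAndGet(len);
  }

  long length() {
    return count.get();
  }
}
```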