Multipart upload: read directly from file #494

@ALRBP

Description

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue, please leave a comment

Prerequisites

Question Description

I am currently developing a program that will (among various things) upload large files to S3.

I found how to perform normal uploads directly from files (without having to load them in RAM) in the doc, but I have an issue with multipart uploads (required for larger files).

If I understand correctly, Client.put_object() does not perform a multipart upload and, unlike the SDKs for other languages, this SDK does not provide a high-level API that does them automatically. You have to do them manually using Client.create_multipart_upload(), Client.upload_part() and Client.complete_multipart_upload().
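To make the manual flow concrete, here is a minimal sketch of the bookkeeping that would come before any Client.upload_part() call: splitting a file of known length into numbered byte ranges. It uses only the standard library and makes no SDK calls. The names `PartRange` and `plan_parts` are hypothetical, introduced for illustration. The two limits encoded as constants are the documented S3 ones (every part except the last must be at least 5 MiB, and an upload can have at most 10,000 parts).

```rust
/// S3 requires every part except the last to be at least 5 MiB.
const MIN_PART_SIZE: u64 = 5 * 1024 * 1024;
/// S3 allows at most 10,000 parts per multipart upload.
const MAX_PARTS: u64 = 10_000;

/// One part of a planned multipart upload (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PartRange {
    /// 1-based part number, as S3 expects.
    pub part_number: u32,
    /// Byte offset of the part within the file.
    pub offset: u64,
    /// Number of bytes in the part.
    pub length: u64,
}

/// Splits `file_len` bytes into consecutive parts of `part_size` bytes;
/// the last part holds whatever remains.
pub fn plan_parts(file_len: u64, part_size: u64) -> Result<Vec<PartRange>, String> {
    if part_size < MIN_PART_SIZE {
        return Err(format!("part size must be at least {MIN_PART_SIZE} bytes"));
    }
    if file_len == 0 {
        return Err("nothing to split: the file is empty".to_string());
    }
    // Ceiling division written so that it cannot overflow.
    let count = file_len / part_size + u64::from(file_len % part_size != 0);
    if count > MAX_PARTS {
        return Err(format!("{count} parts needed, but S3 allows at most {MAX_PARTS}"));
    }

    let mut parts = Vec::with_capacity(count as usize);
    let mut offset = 0u64;
    let mut part_number = 1u32;
    while offset < file_len {
        let length = part_size.min(file_len - offset);
        parts.push(PartRange { part_number, offset, length });
        offset += length;
        part_number += 1;
    }
    Ok(parts)
}
```

Each resulting range would then correspond to one Client.upload_part() call, with the collected part numbers passed on to Client.complete_multipart_upload().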

The issue is that, while for a normal upload I can just do .body(ByteStream::from_path("some/path")), it does not seem possible to do the same with a multipart upload. Since .body() must be called for each part, creating a new ByteStream from the path each time won't work (only the beginning of the file would be read), and I did not find a proper way to seek in a ByteStream (except reading individual bytes in a loop, which would lead to useless IO). The .take() function of the StreamExt trait sounds interesting, but its result is not usable as a parameter for .body(). Simply passing the same ByteStream at each iteration does not work either, as the Rust compiler considers, likely for good reasons, that the ByteStream is "moved" when it is passed to .body() (not to mention that the function may not even know when to stop reading, since it is not clear to me whether .content_length() has an effect on stream reading). The only workaround I can imagine is to read the whole part from the file into RAM and use ByteStream::from(), but this is definitely not a proper way to perform an upload from a file.
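For reference, the workaround described above (reading one part into RAM) might look like the following sketch. It uses only the standard library, seeking to the part's offset and reading at most `length` bytes. The function name `read_part` is hypothetical. This is not the streaming solution being asked for: it only bounds memory use to one part at a time, and the returned buffer is what would be handed to ByteStream::from().

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Reads up to `length` bytes starting at byte `offset` of the file at `path`.
/// Returns fewer bytes if the file ends before `offset + length`.
/// (Hypothetical helper; holds exactly one part in memory.)
pub fn read_part(path: &Path, offset: u64, length: u64) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // Jump straight to the start of the part instead of reading and
    // discarding everything that precedes it.
    file.seek(SeekFrom::Start(offset))?;

    let mut buf = Vec::with_capacity(length as usize);
    // `take` caps the reader at `length` bytes, so `read_to_end`
    // stops at the end of the part rather than the end of the file.
    file.take(length).read_to_end(&mut buf)?;
    Ok(buf)
}
```

Opening the file once and reusing the handle across parts would avoid repeated opens; the per-call open here just keeps the sketch short.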

So, did I miss something? Is there a proper way to perform a multipart upload that reads data directly from a file, is there an issue with the API, or is this a feature left for a future version?

I am not very experienced with Rust, and I am discovering the AWS SDK as well as the third-party libraries it relies on, so maybe I just do not understand how to do things properly. If so, I would appreciate some help; otherwise, I hope this SDK can be improved to allow multipart uploads directly from a file.

Platform/OS/Device

GNU/Linux (amd64)

Language Version

Rust 1.59.0

Metadata

Assignees

No one assigned

    Labels

    feature-request: A feature should be added or improved.
    p2: This is a standard priority issue.
