Description
Which package is the feature request for? If unsure which one to select, leave blank
@crawlee/core
Feature
It would be great to have a plugin or API that allows accessing the binary data directly, or at least the option to store it in a Dataset without serialization.
I see that the current interface allows restoring data via Buffer.from, but I’m not sure about the efficiency of this approach or the acceptable data size it can handle.
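To make the efficiency question concrete, here is a minimal sketch of what a JSON round trip costs for binary data. It uses only Node's built-in Buffer, not any Crawlee API. The 3,000-byte buffer is a made-up stand-in for a screenshot, and the assumption is that a JSON-only store would hold either the Buffer's default JSON form or a base64 string.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical stand-in for a screenshot: 3,000 bytes of arbitrary binary data.
const screenshot: Buffer = Buffer.alloc(3000, 0xab);

// Option 1: the default JSON form, {"type":"Buffer","data":[...]}.
// Every byte becomes a decimal number plus a comma.
const asJson: string = JSON.stringify(screenshot);
const fromJson: Buffer = Buffer.from(JSON.parse(asJson));

// Option 2: a base64 string, which costs roughly 4 characters per 3 bytes.
const asBase64: string = screenshot.toString("base64");
const fromBase64: Buffer = Buffer.from(asBase64, "base64");

console.log({
  rawBytes: screenshot.length,
  jsonChars: asJson.length,
  base64Chars: asBase64.length,
  jsonRoundTripOk: fromJson.equals(screenshot),
  base64RoundTripOk: fromBase64.equals(screenshot),
});
```

Both forms restore the original bytes, but each one inflates the payload and needs a full copy in memory on the way in and out, which is the overhead I'm worried about for larger files.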
Motivation
I’d like to share my use case.
I have the Crawlee logic extracted into a separate package, and the result of this package’s work is a screenshot.
Right now, I have to upload it to S3 directly inside the requestHandler, which feels like a terrible anti-pattern.
Ideal solution or implementation, and any additional constraints
Ideally, the run method would return a promise with the result from the requestHandler.
I understand this isn’t a simple task and could potentially lead to race conditions.
The most straightforward solution would be to either allow storing binary data in a Dataset, or provide an interface for working with S3.
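As a rough illustration of the "run returns the result" idea, the sketch below collects handler output in a map and hands it back once the run finishes. Everything in it is hypothetical: `runCrawl` is a stand-in for a crawler's run method and `crawlAndCollect` is an invented wrapper, neither is a Crawlee API. Keying results by URL is one way to avoid the race conditions mentioned above, since concurrent handlers never write to the same slot.

```typescript
// Hypothetical handler signature: one call per URL.
type Handler = (url: string) => Promise<void>;

// Hypothetical stand-in for a crawler's run method: invokes the handler
// for every URL concurrently and resolves when all of them have finished.
async function runCrawl(urls: string[], handler: Handler): Promise<void> {
  await Promise.all(urls.map((url) => handler(url)));
}

// Invented wrapper: resolves with everything the handlers produced.
async function crawlAndCollect(urls: string[]): Promise<Map<string, Uint8Array>> {
  // Keyed by URL, so concurrent handlers write to distinct entries.
  const results = new Map<string, Uint8Array>();

  await runCrawl(urls, async (url) => {
    // Placeholder for taking a screenshot: here just the URL's UTF-8 bytes.
    const bytes = new TextEncoder().encode(url);
    results.set(url, bytes);
  });

  return results;
}

// The caller decides what to do with the binary data (for example, upload it).
crawlAndCollect(["https://example.com/a", "https://example.com/b"]).then((results) => {
  for (const [url, bytes] of results) {
    console.log(url, bytes.length);
  }
});
```

With this shape, the upload lives in the calling code rather than in the requestHandler. The trade-off is that all results stay in memory until the run finishes, so it only suits small crawls.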
Alternative solutions or implementations
No response
Other context
No response