Decouple file upload from BE #155
Replies: 3 comments
-
What we should take into consideration is that in this case, when the bucket/file is already present in S3, we would also need an accessKey/secretKey for a user/policy that can access that bucket/object.
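To make that concrete, here is a minimal sketch of how the BE could validate user-supplied keys against a pre-existing bucket/object. This assumes the AWS SDK for Java v2; the class name, region and parameters are placeholders, not names from this project.

```java
// Sketch only, assuming the AWS SDK for Java v2; bucket, key and the credential
// values are placeholders, not names from this project.
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.S3Exception;

public class ExternalBucketAccessCheck {

    public static boolean canRead(String accessKey, String secretKey,
                                  String bucket, String key) {
        // Build a client scoped to the credentials the user provided for the
        // pre-existing bucket, instead of the BE's own credentials.
        try (S3Client s3 = S3Client.builder()
                .region(Region.EU_CENTRAL_1) // placeholder region
                .credentialsProvider(StaticCredentialsProvider.create(
                        AwsBasicCredentials.create(accessKey, secretKey)))
                .build()) {
            // HEAD the object: succeeds only if the policy behind these keys
            // allows access to this bucket/key.
            s3.headObject(HeadObjectRequest.builder()
                    .bucket(bucket)
                    .key(key)
                    .build());
            return true;
        } catch (S3Exception e) {
            // e.g. 403 = keys valid but policy does not cover this object, 404 = not found
            return false;
        }
    }
}
```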
-
Investigate how the current logic works if the file is not yet uploaded. Can you obtain the presigned URL if the upload is still ongoing? Can you start downloading while the file is still uploading? The same goes for listing all objects from S3.
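For reference, a minimal sketch of generating a presigned GET URL, assuming the AWS SDK for Java v2 (the actual BE code may differ). As far as I understand, presigning is a local signature computation and does not check whether the object exists, so a URL can be produced while the upload is still in progress; the GET against it only starts working once the upload is completed.

```java
// Minimal sketch, assuming the AWS SDK for Java v2; bucket/key are placeholders.
import java.time.Duration;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
import software.amazon.awssdk.services.s3.presigner.model.GetObjectPresignRequest;
import software.amazon.awssdk.services.s3.presigner.model.PresignedGetObjectRequest;

public class PresignedUrlSketch {

    public static String presignDownload(String bucket, String key) {
        try (S3Presigner presigner = S3Presigner.create()) {
            GetObjectPresignRequest presignRequest = GetObjectPresignRequest.builder()
                    .signatureDuration(Duration.ofMinutes(15))
                    .getObjectRequest(GetObjectRequest.builder()
                            .bucket(bucket)
                            .key(key)
                            .build())
                    .build();
            // Signing happens client-side: no call to S3 is made here, so this
            // works even if the object has not finished uploading yet. The GET
            // against the URL fails with NoSuchKey until the upload completes.
            PresignedGetObjectRequest presigned = presigner.presignGetObject(presignRequest);
            return presigned.url().toString();
        }
    }
}
```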
-
An idea might be to have one flow for dataset creation:
This would result in a single, unified dataset creation flow and avoid this use case:
-
When adding a new dataset through the UI, we can type in an external URL or select a file. That file is sent through the UI to the BE and uploaded to S3.
For the moment the size limit on the BE is effectively "unlimited" (tested with a 7 GB file). This is achieved by chunking the file's inputStream and setting these properties (not sure if all of them are needed):
spring.servlet.multipart.max-file-size=-1
spring.servlet.multipart.max-request-size=-1
spring.servlet.multipart.enabled=true
server.tomcat.max-http-form-post-size=-1
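For context, the streaming path roughly corresponds to an S3 multipart upload driven from the request's InputStream. A sketch of that idea follows, assuming the AWS SDK for Java v2; the actual BE code may use a different client or helper, and the chunk size and names are placeholders.

```java
// Illustrative sketch only, assuming the AWS SDK for Java v2; the real BE code
// may differ. Chunk size and names are placeholders.
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;

public class ChunkedS3Upload {

    private static final int CHUNK_SIZE = 8 * 1024 * 1024; // 8 MB parts (S3 minimum is 5 MB, except the last part)

    public static void upload(S3Client s3, String bucket, String key, InputStream in)
            throws IOException {
        // Start a multipart upload so the whole file is never held in memory.
        String uploadId = s3.createMultipartUpload(
                CreateMultipartUploadRequest.builder().bucket(bucket).key(key).build())
                .uploadId();

        List<CompletedPart> parts = new ArrayList<>();
        int partNumber = 1;
        byte[] chunk;
        // Read the request stream chunk by chunk and push each chunk as one part.
        while ((chunk = in.readNBytes(CHUNK_SIZE)).length > 0) {
            UploadPartResponse resp = s3.uploadPart(
                    UploadPartRequest.builder()
                            .bucket(bucket).key(key)
                            .uploadId(uploadId)
                            .partNumber(partNumber)
                            .build(),
                    RequestBody.fromBytes(chunk));
            parts.add(CompletedPart.builder()
                    .partNumber(partNumber)
                    .eTag(resp.eTag())
                    .build());
            partNumber++;
        }

        // The object only becomes visible in the bucket once this call succeeds.
        s3.completeMultipartUpload(CompleteMultipartUploadRequest.builder()
                .bucket(bucket).key(key)
                .uploadId(uploadId)
                .multipartUpload(CompletedMultipartUpload.builder().parts(parts).build())
                .build());
    }
}
```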
Another possibility and improvement could be to first upload the file to S3 and create an API on the BE that returns the list of files from the preferred bucket to the UI; then, when creating a new Dataset in the UI, we would directly select a file and "attach" it to the Dataset (see the listing-endpoint sketch after the pros/cons below).
This could make the upload process faster and allow easier integration with solutions that already use S3 as storage.
Pros:
Cons:
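A rough sketch of what the listing API on the BE could look like, assuming Spring Web and the AWS SDK for Java v2; the endpoint path, property name and controller are made-up for illustration:

```java
// Sketch only; the endpoint path, property name and controller are hypothetical.
import java.util.List;
import java.util.stream.Collectors;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Object;

@RestController
public class BucketFilesController {

    private final S3Client s3;

    @Value("${app.s3.dataset-bucket}") // hypothetical property for the preferred bucket
    private String bucket;

    public BucketFilesController(S3Client s3) {
        this.s3 = s3;
    }

    // Returns the keys of objects in the preferred bucket so the UI can let the
    // user pick one and "attach" it to a new Dataset.
    // Note: a single ListObjectsV2 call returns at most 1000 keys; real code
    // would paginate over the continuation token.
    @GetMapping("/api/datasets/files")
    public List<String> listFiles() {
        return s3.listObjectsV2(ListObjectsV2Request.builder().bucket(bucket).build())
                .contents()
                .stream()
                .map(S3Object::key)
                .collect(Collectors.toList());
    }
}
```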