[Feature] Enable Entropy Inject for data file path to prevent being throttled by object storage #6825

@qingfei1994

Description

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

When the Hadoop S3 filesystem uploads a file larger than 128MB, it switches from a single put-object request to a multipart upload, and the upload sometimes gets stuck. This is especially likely during full compaction, when Paimon uploads more files to object storage and gets throttled.
Iceberg provides an option, write.object-storage.enabled, that adds a computed hash component to the data file path to avoid being throttled.
https://iceberg.apache.org/docs/nightly/docs/configuration/?h=write.object+storage.enabled#write-properties
It would be better if Paimon also provided the same functionality.

Solution

  • Add a computed hash component to the data file path, leveraging the ExternalPathProvider.
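
A rough sketch of the idea, to make the proposal concrete. This is not Paimon's or Iceberg's actual implementation and does not use the real ExternalPathProvider API. The class name, method names, 4-character prefix length, and the choice of CRC32 as the hash are all assumptions made for illustration. The point is only that a short, deterministic hash of the file name is placed near the front of the object key so that writes spread across many key prefixes.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Paimon: shows one way to inject entropy into a data file path.
final class EntropyPathSketch {

    // Derive a short, deterministic hex prefix from the file name.
    // CRC32 is an assumption here; any well-distributed hash would do.
    static String entropyPrefix(String fileName) {
        CRC32 crc = new CRC32();
        crc.update(fileName.getBytes(StandardCharsets.UTF_8));
        // Keep the low 16 bits and render them as exactly 4 lowercase hex characters.
        return String.format("%04x", crc.getValue() & 0xFFFF);
    }

    // Build <basePath>/<entropy>/<relativeDir>/<fileName>.
    // Putting the hash right after the base path spreads objects across key prefixes.
    static String injectEntropy(String basePath, String relativeDir, String fileName) {
        return basePath + "/" + entropyPrefix(fileName) + "/" + relativeDir + "/" + fileName;
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println(
                injectEntropy("s3://my-bucket/warehouse", "db.db/t/bucket-0", "data-1.parquet"));
    }
}
```

Because the prefix is computed from the file name alone, the same file always maps to the same path, so readers that know the file name can reconstruct the location without extra metadata. Whether the hash should sit before or after the table directory is a design choice this sketch does not settle.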

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
