[enhancement] Detect blob-name collisions in sinks #2053

@mkeskells

Description

When a sink uploads a file, there is a chance that the blob name will collide with that of an existing blob.
In that case the existing blob is currently overwritten with the new one.

I suggest that this behavior be enhanced so that the outcome is determined by a policy.

A collision could occur because of a task restart, recovery from a timeout, etc., where the content is the same, or in other situations where the content has changed.

e.g.

  • OVERWRITE (default, current behaviour) - overwrite the existing blob, generating a new version
  • NEW_NAME - generate a new blob name, and retry
  • CHECK_NEW_NAME - check the existing blob's content; if it matches then do nothing, otherwise do the same as NEW_NAME

There may be other options. I presume that for exactly-once delivery there may be a CHECK_FAIL, as the content must be the same.
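
To make the proposal concrete, here is a minimal Python sketch of how the policies could behave. It is illustrative only and is not the sink's actual code: the `dict` stands in for the blob store, and `CollisionPolicy`, `BlobCollisionError`, `upload`, and the `-1`, `-2` renaming scheme are all hypothetical names invented for this example. Content is compared via a SHA-256 digest here; a real sink would more likely compare a checksum already reported by the storage service.

```python
import hashlib
from enum import Enum
from typing import Dict, Optional


class CollisionPolicy(Enum):
    OVERWRITE = "overwrite"            # current behaviour
    NEW_NAME = "new_name"              # always write under a fresh name
    CHECK_NEW_NAME = "check_new_name"  # skip if identical, else fresh name
    CHECK_FAIL = "check_fail"          # skip if identical, else raise


class BlobCollisionError(Exception):
    """Raised by CHECK_FAIL when the existing blob has different content."""


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _next_free_name(store: Dict[str, bytes], name: str) -> str:
    # Hypothetical renaming scheme: append -1, -2, ... until the name is unused.
    suffix = 1
    while f"{name}-{suffix}" in store:
        suffix += 1
    return f"{name}-{suffix}"


def upload(store: Dict[str, bytes], name: str, data: bytes,
           policy: CollisionPolicy) -> Optional[str]:
    """Write data to the store; return the name written, or None if skipped."""
    # No collision, or the policy says to overwrite regardless.
    if name not in store or policy is CollisionPolicy.OVERWRITE:
        store[name] = data
        return name

    if policy is CollisionPolicy.NEW_NAME:
        target = _next_free_name(store, name)
        store[target] = data
        return target

    # CHECK_* policies: identical content means there is nothing to upload.
    if _digest(store[name]) == _digest(data):
        return None

    if policy is CollisionPolicy.CHECK_FAIL:
        raise BlobCollisionError(f"blob {name!r} exists with different content")

    # CHECK_NEW_NAME with differing content behaves like NEW_NAME.
    target = _next_free_name(store, name)
    store[target] = data
    return target
```

Returning `None` for a skipped upload lets the caller distinguish "nothing written" from "written under a different name", which matters if the sink records the final blob name anywhere.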

Having CHECK_NEW_NAME would reduce the number of blobs uploaded, and therefore the TCO.

Happy to assist with a PR or discuss further if the maintainers consider this a viable direction.
