Revert to a single COPY operation per table instead of per chunk #308

@wemrysi

Description

The per-chunk strategy appears to result in rather poor performance once data size increases. We'd like to revert to using a single COPY operation per table to attempt to reclaim some of that performance. Some of the previous reliability concerns can be ameliorated via source buffering.
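
A rough illustration of the buffering idea, assuming an fs2-based source (the helper name and the choice of `prefetchN` are assumptions, not the connector's actual API): keeping up to `n` chunks pulled ahead decouples an uneven upstream from the COPY writer, so the open operation stalls less often.

```scala
import cats.effect.IO
import fs2.Stream

// Hypothetical helper: keep up to `n` chunks buffered ahead of the COPY
// writer so a bursty source is less likely to stall the open operation.
def buffered(source: Stream[IO, Array[Byte]], n: Int): Stream[IO, Array[Byte]] =
  source.prefetchN(n)
```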

If we still run into reliability issues due to timeouts or long-running transactions, a possible solution would be to define a maximum duration between writes to the COPY stream. If the threshold is reached, we commit the current operation and begin anew on the next chunk from upstream. This should avoid timeouts for slow sources while preserving performance where possible.
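
A minimal sketch of that time-bounded strategy, using the PostgreSQL JDBC driver's CopyManager API. All names here (`TimeBoundedCopy`, `Msg`, `maxQuietMillis`, the producer/consumer queue) are illustrative assumptions rather than the connector's actual machinery, error handling and table-name escaping are omitted, and chunks are assumed to end on row boundaries so a COPY can be completed between them.

```scala
import java.sql.Connection
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

import org.postgresql.PGConnection
import org.postgresql.copy.CopyIn

sealed trait Msg
final case class Data(bytes: Array[Byte]) extends Msg // one upstream chunk
case object End extends Msg

final class TimeBoundedCopy(
    conn: Connection,    // autoCommit disabled by the caller
    table: String,
    maxQuietMillis: Long // maximum duration between writes before we commit
) {
  private val queue   = new LinkedBlockingQueue[Msg]()
  private val copyApi = conn.unwrap(classOf[PGConnection]).getCopyAPI

  /** Called by the producer as chunks arrive from the buffered source. */
  def offer(chunk: Array[Byte]): Unit = queue.put(Data(chunk))
  def finish(): Unit = queue.put(End)

  /** Consumer loop: a single COPY per table while data flows steadily,
    * committed and restarted whenever the source goes quiet too long. */
  def run(): Unit = {
    var copyIn: Option[CopyIn] = None

    def commitOpenCopy(): Unit = copyIn.foreach { c =>
      c.endCopy()   // complete the COPY operation
      conn.commit() // end the transaction so it can't time out while idle
      copyIn = None
    }

    var done = false
    while (!done)
      queue.poll(maxQuietMillis, TimeUnit.MILLISECONDS) match {
        case null =>
          // Threshold reached with no new write: commit the current operation.
          commitOpenCopy()
        case Data(bytes) =>
          // Begin anew on the next chunk if no COPY is currently open.
          val c = copyIn.getOrElse {
            val fresh = copyApi.copyIn(s"COPY $table FROM STDIN")
            copyIn = Some(fresh)
            fresh
          }
          c.writeToCopy(bytes, 0, bytes.length)
        case End =>
          commitOpenCopy()
          done = true
      }
  }
}
```

Fast sources never hit the timeout, so they get one COPY per table as before; only a source that stays quiet past the threshold pays the cost of an extra commit-and-restart.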
