[Postgres] Optimize transaction replication throughput #228
Previously, if we received 100k transactions each containing a single operation, we would commit/flush to the bucket storage 100k times. Flushing is slow: on a storage cluster under load it can take 100-500ms per flush, limiting throughput to roughly 2-10 transactions per second.
Lucky for us, Postgres already chunks messages together in the replication stream. So now we look ahead in the current chunk to see if there are any more commit messages. If there are, we only flush/commit on the last one. This means if we get many transactions in a single chunk, they are all batched together in a single flush to the bucket storage.
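Roughly, the look-ahead works like the minimal sketch below. This is not the actual implementation; the `ReplicationMessage` type and the `applyToBatch`/`flushToBucketStorage` helpers are illustrative placeholders for the real batch/storage APIs.

```ts
// Sketch: flush only on the last commit message in a replication chunk.
// All names here are illustrative, not the real service APIs.

type ReplicationMessage =
  | { tag: 'begin' }
  | { tag: 'insert' | 'update' | 'delete'; data: unknown }
  | { tag: 'commit'; lsn: string };

async function processChunk(
  chunk: ReplicationMessage[],
  applyToBatch: (msg: ReplicationMessage) => void,
  flushToBucketStorage: (lsn: string) => Promise<void>
): Promise<void> {
  // Look ahead: find the last commit in this chunk - the only flush point.
  const lastCommitIndex = chunk.map((m) => m.tag).lastIndexOf('commit');

  for (let i = 0; i < chunk.length; i++) {
    const msg = chunk[i];
    if (msg.tag === 'insert' || msg.tag === 'update' || msg.tag === 'delete') {
      // Accumulate the change in the in-memory batch; no I/O yet.
      applyToBatch(msg);
    } else if (msg.tag === 'commit' && i === lastCommitIndex) {
      // Only the last commit in the chunk triggers a (slow) flush, so every
      // transaction in this chunk shares a single flush to bucket storage.
      await flushToBucketStorage(msg.lsn);
    }
    // Commits before the last one fall through without flushing; their
    // changes stay in the batch until the final flush.
  }
}
```

The effect is that intermediate commits within a chunk no longer trigger storage I/O; one flush at the last commit persists everything up to that point.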
For reference, we already have similar behavior in our MongoDB replication implementation.
In theory we could also batch transactions across multiple replication chunks, but I don't expect significant further gains from that, and it could increase complexity and memory usage.
This also adds Postgres 17 (released September 2024) to the test matrix.