Throttle full sync replication on master

When testing replication with an SSD-tiered replica, we saw latency on the master jump.

The reason is that back pressure from the replica propagates back to the master. Write commands then cannot proceed because they wait on the socket, even with very modest write traffic, since most of the bandwidth is consumed by the snapshotting process.

Recently we had a similar problem with outgoing slot migrations, and I fixed it inside `RestoreStreamer::Run` by adding the `(pending_buf_.Size() >= replication_stream_output_limit_cached / 3)` check, which postpones the migration when the socket is overfilled. This made slot migrations less intrusive to the outgoing writes (SETs).
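To make the shape of that check concrete, here is a minimal, standalone sketch of the same predicate. It is not the actual `RestoreStreamer` code: `BackpressurePolicy` and `ShouldPostpone` are hypothetical names, and `output_limit_bytes` only stands in for `replication_stream_output_limit_cached`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the code base: isolates the
// "pending >= limit / 3" condition so it can be reasoned about on its own.
struct BackpressurePolicy {
  // Stand-in for replication_stream_output_limit_cached (assumed to be in bytes).
  std::size_t output_limit_bytes;

  // Returns true when bulk traffic (migration or snapshot data) should be
  // postponed because the pending output buffer already holds at least
  // one third of the allowed limit.
  bool ShouldPostpone(std::size_t pending_bytes) const {
    assert(output_limit_bytes > 0);
    return pending_bytes >= output_limit_bytes / 3;
  }
};
```

With a limit of 3000 bytes, for example, bulk traffic would yield once 1000 bytes are pending, leaving the remaining two thirds of the budget to regular outgoing writes.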

It's not straightforward to add such a check to full sync replication, because we do not use the async interface there. Still, I would love to add a heuristic that regulates the snapshotting pace so that outgoing writes are not throttled. With the async interface we have an explicit `JournalStreamer::ThrottleIfNeeded`, but with the sync interface throttling happens at the OS level, when the networking output buffer is full and the OS is still pushing the packets out.

## Read latency
Another problem that also requires throttling the writes inside full sync is the outgoing bandwidth limit of the host. Replication easily reaches GB/s of bandwidth, which can saturate the network limits of the master host. When that happens under a read-heavy workload, the cloud provider throttles the outbound bandwidth and the latency of read requests (MGETs) skyrockets as well. I would like us to introduce an `outgoing_replication_bandwidth_limit` flag that would let us limit the pace of the snapshotting (but not of the outgoing writes).
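One possible way to enforce such a limit is a token bucket applied only to snapshot data. The sketch below is an illustration under assumptions, not a proposed implementation: `SnapshotRateLimiter` is a hypothetical class, the limit is assumed to be expressed in bytes per second, and the burst size is assumed to be one second of budget. The caller passes the current time explicitly, so the logic is independent of how the snapshotting fiber actually sleeps or yields.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical token bucket for snapshot traffic only; journal writes would
// bypass it entirely.
class SnapshotRateLimiter {
 public:
  using Clock = std::chrono::steady_clock;

  // bytes_per_sec plays the role of the proposed
  // outgoing_replication_bandwidth_limit flag (units are an assumption).
  SnapshotRateLimiter(std::uint64_t bytes_per_sec, Clock::time_point now)
      : rate_(static_cast<double>(bytes_per_sec)), tokens_(rate_), last_(now) {
    assert(bytes_per_sec > 0);
  }

  // Accounts for `bytes` of snapshot data and returns how long the producer
  // should wait before sending more. Zero means "send right away".
  // The balance may go negative; the returned delay is the time needed to
  // pay that debt back at the configured rate.
  std::chrono::nanoseconds Consume(std::uint64_t bytes, Clock::time_point now) {
    Refill(now);
    tokens_ -= static_cast<double>(bytes);
    if (tokens_ >= 0) return std::chrono::nanoseconds::zero();
    std::chrono::duration<double> wait(-tokens_ / rate_);
    return std::chrono::duration_cast<std::chrono::nanoseconds>(wait);
  }

 private:
  // Adds tokens for the elapsed time, capped at one second of budget.
  void Refill(Clock::time_point now) {
    std::chrono::duration<double> elapsed = now - last_;
    last_ = now;
    tokens_ = std::min(rate_, tokens_ + elapsed.count() * rate_);
  }

  double rate_;    // bytes per second
  double tokens_;  // current balance in bytes, may be negative
  Clock::time_point last_;
};
```

In this sketch the snapshot producer would call `Consume` before each chunk and pause for the returned duration, while regular outgoing writes never touch the limiter, which matches the intent of limiting the snapshotting pace without limiting the writes.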


Throttle full sync replication on master #5874
