
Throttle full sync replication on master #5874

@romange

When testing replication with an SSD-tiered replica, we saw that the latency on the master jumps.

The reason is that back pressure from the replica propagates to the master. Write commands then cannot proceed because they wait on the socket, even with very modest write traffic, since the majority of the bandwidth is used by the snapshotting process.

Recently we had a similar problem with outgoing slot migrations, and I fixed it inside RestoreStreamer::Run by adding the (pending_buf_.Size() >= replication_stream_output_limit_cached / 3) check, which postpones the migration if the socket is overfilled. This succeeded in making slot migrations less intrusive to the outgoing writes (SETs).
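To make the threshold concrete, here is a minimal standalone sketch of that check. The helper name ShouldPostponeMigration and its parameters are hypothetical and only stand in for pending_buf_.Size() and replication_stream_output_limit_cached; this is not the actual code in RestoreStreamer::Run.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not Dragonfly code.
// pending_bytes stands in for pending_buf_.Size(),
// output_limit stands in for replication_stream_output_limit_cached.
inline bool ShouldPostponeMigration(std::size_t pending_bytes, std::size_t output_limit) {
  assert(output_limit > 0);  // a zero limit would postpone forever
  // Postpone once the pending buffer reaches a third of the output limit,
  // leaving the rest of the budget for regular outgoing writes.
  return pending_bytes >= output_limit / 3;
}
```

With a limit of 900 bytes, for example, the migration would be postponed once 300 or more bytes are pending.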

It's not straightforward to do such a check with full sync replication, as we do not employ the async interface there. But I would love to add some heuristic that regulates the snapshotting pace so that outgoing writes won't be throttled. With the async interface we have an explicit JournalStreamer::ThrottleIfNeeded, but with the sync interface throttling happens at the OS level, when the networking output buffer is full and the OS is pushing the packets out.

Read latency

Another problem that also requires throttling the writes inside full sync is the outgoing bandwidth limit of the host. Replication easily reaches GB/s bandwidth, which can saturate the network limits of the master host. When that happens under read-heavy workloads, the outbound bandwidth is throttled by the cloud provider and the latency of read requests (MGETs) skyrockets as well. I would like us to introduce an outgoing_replication_bandwidth_limit flag that would allow us to limit the pace of the snapshotting (but not the outgoing writes).
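One possible shape for such a limit is a token bucket on the snapshot producer. The sketch below is only an illustration under assumptions: the class SnapshotRateLimiter, its Reserve method and the burst parameter are hypothetical names, not existing Dragonfly code, and the time is passed in explicitly so the logic can be tested without sleeping. The caller would wait for the returned duration before writing the next snapshot chunk, while regular outgoing writes bypass the limiter.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical token bucket for pacing snapshot bytes; not Dragonfly code.
class SnapshotRateLimiter {
 public:
  using Clock = std::chrono::steady_clock;

  // bytes_per_sec would come from the proposed outgoing_replication_bandwidth_limit flag.
  SnapshotRateLimiter(uint64_t bytes_per_sec, uint64_t burst_bytes, Clock::time_point now)
      : rate_(static_cast<double>(bytes_per_sec)),
        burst_(static_cast<double>(burst_bytes)),
        tokens_(static_cast<double>(burst_bytes)),
        last_(now) {
    assert(bytes_per_sec > 0);
  }

  // Charges `bytes` against the bucket and returns how long the snapshot
  // producer should wait before sending them. Zero means "send now".
  // The balance may go negative; the debt is repaid by later refills.
  std::chrono::nanoseconds Reserve(uint64_t bytes, Clock::time_point now) {
    Refill(now);
    tokens_ -= static_cast<double>(bytes);
    if (tokens_ >= 0) return std::chrono::nanoseconds{0};
    const double wait_sec = -tokens_ / rate_;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::duration<double>(wait_sec));
  }

 private:
  // Adds tokens for the time elapsed since the last call, capped at the burst size.
  void Refill(Clock::time_point now) {
    const double elapsed = std::chrono::duration<double>(now - last_).count();
    last_ = now;
    tokens_ = std::min(burst_, tokens_ + elapsed * rate_);
  }

  double rate_;    // bytes per second
  double burst_;   // maximum accumulated budget in bytes
  double tokens_;  // current budget in bytes, negative when in debt
  Clock::time_point last_;
};
```

Because only the snapshotting path would call Reserve, journal traffic and replies to clients would not be delayed by the limiter itself; they would only benefit from the bandwidth it leaves free.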


Labels: enhancement
