The problem
If a client dispatches a lot of pipelined commands before it attempts to read any of the responses back, it might deadlock itself. Why? The AsyncFb will eventually block on writing a command response because the socket's send buffers are saturated (the client is not reading). This halts command execution on the AsyncFb because the fiber is blocked on I/O. The ConnFb will keep reading from the socket and dispatching requests asynchronously to the AsyncFb until it eventually saturates as well. At this point, neither ConnFb nor AsyncFb is making progress, while the client is still busy sending more pipelined commands. The client will not start reading until it has sent all of its pipelined requests, and Dragonfly will not read any of them because its dispatch_q_ is saturated and its ConnFb is blocked.
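For intuition, here is a minimal, self-contained sketch of the cycle. This is not Dragonfly code: plain threads and a toy bounded queue stand in for the fibers, dispatch_q_ and the socket send buffer, and all names are made up.

```cpp
// Toy model of the deadlock: two threads stand in for ConnFb and AsyncFb,
// bounded queues stand in for dispatch_q_ and the socket send buffer, and the
// "client" never reads because it is still busy pipelining.
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdlib>
#include <deque>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

template <typename T> class BoundedQueue {
 public:
  explicit BoundedQueue(std::size_t cap) : cap_(cap) {}

  void Push(T v) {  // blocks while the queue is full
    std::unique_lock lk(mu_);
    not_full_.wait(lk, [&] { return q_.size() < cap_; });
    q_.push_back(std::move(v));
    not_empty_.notify_one();
  }

  T Pop() {  // blocks while the queue is empty
    std::unique_lock lk(mu_);
    not_empty_.wait(lk, [&] { return !q_.empty(); });
    T v = std::move(q_.front());
    q_.pop_front();
    not_full_.notify_one();
    return v;
  }

 private:
  std::size_t cap_;
  std::deque<T> q_;
  std::mutex mu_;
  std::condition_variable not_full_, not_empty_;
};

int main() {
  BoundedQueue<std::string> dispatch_q(4);  // ConnFb -> AsyncFb
  BoundedQueue<std::string> send_buf(4);    // AsyncFb -> client ("socket send buffer")

  // "AsyncFb": executes dispatched commands and writes replies. Once send_buf
  // is full (client not reading), it blocks and stops draining dispatch_q.
  std::thread async_fb([&] {
    for (;;) send_buf.Push("reply for " + dispatch_q.Pop());
  });

  // "ConnFb": keeps reading pipelined requests and dispatching them. Once
  // dispatch_q is full, it blocks as well.
  std::thread conn_fb([&] {
    for (int i = 0;; ++i) dispatch_q.Push("cmd " + std::to_string(i));
  });

  // Nobody ever pops send_buf, so after a moment both loops are parked on full
  // queues and no further progress is possible.
  std::this_thread::sleep_for(std::chrono::seconds(2));
  std::cout << "both loops are blocked; no progress is being made\n";
  std::_Exit(0);  // skip cleanup on purpose: the worker loops are intentionally stuck
}
```

With both loops parked on full queues and the client still in write mode, nobody drains anything, which is exactly the deadlock described above.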
The solution
The idea is to unblock the ConnFb by offloading socket backpressure to disk. Client requests won't freeze that way, and the ConnFb will keep serving until the client switches to reader mode (and when that happens, the AsyncFb will eventually unblock).
First approach: offload parsed commands to disk directly
This is the approach taken in the original prototype in #6011. To summarize the flow:
a. ConnFb -> b. ReadFromSocket -> c. ParseRequest -> d. Offload to disk or SendAsync
The problem with this approach is that the utility class that writes to and reads from disk in step (d) has to serialize the command again before offloading it. The reason is that our in-memory structure for a parsed command cannot be written to disk as is and deserialized later, because its binary strings lack separation characters. We would like to avoid this ping-pong behavior. Note that RESP3 is already a serialization protocol, so parsing and re-serializing a given command twice is, at the very least, redundant and inefficient.
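To make the ping-pong concrete, here is a hedged sketch (hypothetical names, not the actual prototype code) of what step (d) forces: the parsed command is just a list of binary-safe blobs, so it has to be framed again before it can be written to disk, and parsed a second time when it is read back.

```cpp
// Hypothetical illustration of step (d) in the first approach: the parsed
// command is a list of binary-safe blobs with no separators of its own, so it
// has to be framed again before hitting disk and unframed when read back.
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using ParsedCommand = std::vector<std::string>;

// Re-serialize with explicit length prefixes, because raw binary arguments
// cannot be concatenated and split apart again reliably.
std::string FrameForDisk(const ParsedCommand& cmd) {
  std::string out;
  for (const std::string& arg : cmd) {
    std::uint32_t len = static_cast<std::uint32_t>(arg.size());
    out.append(reinterpret_cast<const char*>(&len), sizeof(len));
    out.append(arg);
  }
  return out;
}

// ...and parse it a second time when it comes back from disk.
ParsedCommand UnframeFromDisk(const std::string& blob) {
  ParsedCommand cmd;
  std::size_t pos = 0;
  while (pos + sizeof(std::uint32_t) <= blob.size()) {
    std::uint32_t len;
    std::memcpy(&len, blob.data() + pos, sizeof(len));
    pos += sizeof(len);
    cmd.emplace_back(blob.substr(pos, len));
    pos += len;
  }
  return cmd;
}

int main() {
  ParsedCommand cmd = {"SET", "key", std::string("bin\0val", 7)};  // binary-safe value
  return UnframeFromDisk(FrameForDisk(cmd)) == cmd ? 0 : 1;        // round trip works
}
```

The client already shipped exactly this information in RESP framing, which is why doing it a second time is pure overhead.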
Second approach: offload socket data to disk as is. This way we don't need to serialize anything.
This avoids the redundancies discussed in the previous section, but it comes with its own set of side effects.
Notice that the flow within the ConnFb is synchronous: the fiber first reads from the socket and then dispatches. What happens if the ConnFb is stuck on a client read, the AsyncFb must make progress, and the only available data exists on disk (because it has drained the in-memory backpressure)? Naturally the answer should be that the AsyncFb reads from disk. Great, but who does the parsing if the ConnFb is blocked waiting for client input?
The problem here is that parsing is the responsibility of the ConnFb -- the AsyncFb is just there to execute. Without that separation of concerns, the producer/consumer behaviour of the two falls apart. And it's not only the logical semantics that matter: had we delegated some of the parsing responsibility to the AsyncFb, we would have ended up with a much more complicated state machine shared between the two. This is not a good direction.
#6006 discusses a redesign of pipelining.
Specifically, I quote:
- Stop using FiberRead and switch to OnRecv interface (chore: support onrecv hook for tcp sockets romange/helio#480). As OnRecv separates read notification from reading data, we can reduce reliance on per connection io_buf.
OnRecv can be invoked at any time a fiber is preempted, meaning the connection will be able to read data during its other preemption points, such as cmd.dispatch.
By switching the ConnFb socket read path to use OnRecv, we can unblock it from waiting for data to become available on the socket. That way, the ConnFb can still do the parsing, which solves the "parsing in separate fibers" issue.
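As a rough illustration only (the actual helio OnRecv signature in romange/helio#480 differs, and the parser below is a stand-in rather than real RESP), the read path could take roughly this shape: the hook appends bytes into the per-connection buffer, and the ConnFb keeps parsing at its own pace instead of blocking inside a socket read.

```cpp
// Sketch only: a hypothetical OnRecv-style hook feeding the per-connection
// buffer. The real helio interface (romange/helio#480) differs in details; the
// point is that ConnFb no longer blocks inside a socket read -- bytes arrive via
// the hook, and parsing stays inside ConnFb at its own preemption points.
#include <deque>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

class Connection {
 public:
  // Registered with the socket as its receive hook (signature is made up).
  void OnRecv(std::string_view data) { io_buf_.append(data); }

  // Called from the ConnFb loop: parse whatever is buffered and dispatch it.
  // Backpressure handling (e.g. spilling to disk) would hook in at dispatch.
  void ParseAndDispatch() {
    while (std::optional<std::string> cmd = TryParseOne()) {
      DispatchAsync(std::move(*cmd));
    }
  }

 private:
  // Placeholder parser: treats '\n' as a message boundary instead of real RESP.
  std::optional<std::string> TryParseOne() {
    auto pos = io_buf_.find('\n');
    if (pos == std::string::npos) return std::nullopt;
    std::string cmd = io_buf_.substr(0, pos);
    io_buf_.erase(0, pos + 1);
    return cmd;
  }

  void DispatchAsync(std::string cmd) { dispatch_q_.push_back(std::move(cmd)); }

  std::string io_buf_;                  // per-connection buffer fed by OnRecv
  std::deque<std::string> dispatch_q_;  // stands in for the real dispatch_q_
};

int main() {
  Connection conn;
  conn.OnRecv("PING\nSET k v\n");  // the hook delivers bytes as they arrive
  conn.ParseAndDispatch();         // ConnFb consumes them at its own pace
}
```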
A bonus is that the first step for #6006 is now checked off:
- Integrate and test OnRecv in ConnFb (#6028)
- Implement a utility class that writes or reads a stream of data to disk (disk based queue, #6029) -- a rough sketch follows after this list
- TlsSocket and OnRecv in helio (https://github.com/romange/helio/pull/480/files#diff-e23cdb5eee2e5d12ad3b9853e48dcfdc99ddbb3670ee960206c491dbcbe79c5aR472)
- Offload connection backpressure to disk + test (#6030)
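For the disk-based queue item (#6029), here is a hedged sketch of the general shape such a utility could take, assuming a simple length-prefixed append-only file. The real class would sit on the proactor's async file I/O and needs size limits, truncation and error handling.

```cpp
// Sketch of a disk-backed FIFO for raw connection bytes: append length-prefixed
// chunks at the tail, read them back from the head. Names, layout and the use of
// blocking fstreams are illustrative only.
#include <cstdint>
#include <fstream>
#include <optional>
#include <string>
#include <string_view>

class DiskBackedQueue {
 public:
  explicit DiskBackedQueue(const std::string& path)
      : out_(path, std::ios::binary | std::ios::trunc),
        in_(path, std::ios::binary) {}

  // Producer side (ConnFb): spill a chunk of unparsed socket data to disk.
  void Push(std::string_view chunk) {
    std::uint32_t len = static_cast<std::uint32_t>(chunk.size());
    out_.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out_.write(chunk.data(), len);
    out_.flush();  // only complete records ever become visible to Pop()
  }

  // Consumer side: return the next chunk in FIFO order, or nullopt if the
  // on-disk queue is currently drained.
  std::optional<std::string> Pop() {
    std::uint32_t len = 0;
    if (!in_.read(reinterpret_cast<char*>(&len), sizeof(len))) {
      in_.clear();  // reached the end of what was written so far; retry later
      return std::nullopt;
    }
    std::string chunk(len, '\0');
    in_.read(chunk.data(), len);
    return chunk;
  }

 private:
  std::ofstream out_;  // tail (append position)
  std::ifstream in_;   // head (read position)
};

int main() {
  DiskBackedQueue q("/tmp/conn_backpressure.bin");  // path is illustrative
  q.Push("raw pipelined bytes read from the socket");
  auto chunk = q.Pop();
  return chunk && !chunk->empty() ? 0 : 1;
}
```

The sketch assumes Push and Pop run on the same thread (as the fibers would) and that whole records are flushed together, so Pop only ever fails cleanly at a record boundary; the real utility has to be more careful than that.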
We can also use #6011 to experiment and prototype.