
io: Scatter/gather / zero-copy APIs #677

@chrysn

While I was checking with the Ariel OS team whether anything is amiss in embedded-io, a request for vectored I/O came up, revolving around the question "Are those APIs suitable for avoiding needless intermediate copies?".

There is some vagueness in what this entails; some aspects that seem important are:

  • Doing the equivalent of std::io::Write::write_vectored is almost trivial: we write to &mut self (so there is no danger of interleaving), and, apart from the optimization of using a single syscall, a write_vectored can be provided by issuing individual writes.
  • Same goes for a simple read_vectored that merely writes to multiple caller-allocated buffers.
  • A more powerful form of a vectored read (call it zero-copy read) would produce an iterator view of the data as it sits e.g. in some TCP buffer; that data may be non-contiguous, e.g. when straddling a ring buffer's wraparound. Unlike the std::io form, it would not populate the moral equivalent of &mut [&mut [u8]], but would return roughly an impl Iterator<Item = &[u8]>.
  • There might be a powerful form of the Write as well, where some longer-lived type (something something AsRef<[u8]>) is sent to the Writer, which it may then consume at the rate of the device writing, but there's some uncertainty about whether that is relevant.
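
To make the two simple forms concrete, here is a minimal sketch of how they could look as provided methods. The `Write` and `Read` traits below are simplified stand-ins defined locally for illustration, not embedded-io's actual definitions, and the method names and `&[&[u8]]` / `&mut [&mut [u8]]` signatures are assumptions (there is no `IoSlice` in no_std). The default bodies just forward the first non-empty buffer to a single `write` / `read`, which is also the strategy std's default implementations use. `Sink` and `SliceReader` are hypothetical test helpers.

```rust
use core::convert::Infallible;

/// Simplified stand-in for `embedded_io::Write` (illustration only).
pub trait Write {
    type Error;

    /// Writes some prefix of `buf` and returns how many bytes were accepted.
    fn write(&mut self, buf: &[u8]) -> Result<usize, Self::Error>;

    /// Hypothetical provided method: forwards the first non-empty buffer
    /// to a single `write` call. Implementations with real scatter/gather
    /// support could override it.
    fn write_vectored(&mut self, bufs: &[&[u8]]) -> Result<usize, Self::Error> {
        match bufs.iter().find(|b| !b.is_empty()) {
            Some(buf) => self.write(buf),
            None => Ok(0),
        }
    }
}

/// Simplified stand-in for `embedded_io::Read` (illustration only).
pub trait Read {
    type Error;

    /// Reads into `buf` and returns how many bytes were stored.
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error>;

    /// Hypothetical provided method: fills the first non-empty
    /// caller-allocated buffer through a single `read` call.
    fn read_vectored(&mut self, bufs: &mut [&mut [u8]]) -> Result<usize, Self::Error> {
        match bufs.iter_mut().find(|b| !b.is_empty()) {
            Some(buf) => self.read(&mut **buf),
            None => Ok(0),
        }
    }
}

/// Hypothetical writer that appends everything to a `Vec`.
pub struct Sink(pub Vec<u8>);

impl Write for Sink {
    type Error = Infallible;

    fn write(&mut self, buf: &[u8]) -> Result<usize, Self::Error> {
        self.0.extend_from_slice(buf);
        Ok(buf.len())
    }
}

/// Hypothetical reader that hands out the bytes of a slice.
pub struct SliceReader<'a>(pub &'a [u8]);

impl Read for SliceReader<'_> {
    type Error = Infallible;

    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error> {
        let n = buf.len().min(self.0.len());
        let (head, tail) = self.0.split_at(n);
        buf[..n].copy_from_slice(head);
        self.0 = tail;
        Ok(n)
    }
}
```

Neither `Sink` nor `SliceReader` mentions the vectored methods, which is the point: existing implementations would keep compiling if such methods were added later.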

The pressing first question, in my current context of #566 (embedded-io 1.0), is: Would any of that need breaking changes if we released the current main as 1.0?

I think not, because:

  • For the simple forms, a read_vectored / write_vectored could be added to the trait at a later time as provided methods.
  • For the zero-copy read, a user implementing an application on a generic embedded_io::Read implementation could not know whether vectored reads are available -- for all they know, the data would most likely come from a byte-oriented device that has no buffers of its own. (From that PoV, those who use embedded-io as the interface to a TCP connection are the odd ones out, because their reads are actually buffered internally in the network stack.)
    Nonetheless, an interface could be added as a provided method: The user would provide a fallback buffer to read into, which the provided method would populate through a regular read call. This may look wasteful when used with zero-copy-capable backends (after all, the user may be motivated not by the CPU cycles that copying takes, but by their limited stack). However, an example implementation at https://godbolt.org/z/heEs3jG64 shows that, through monomorphization, if the concrete Read impl's method never produces the "I wrote to your buffer" outcome, that allocation is dead code and gets eliminated at build time.
  • I expect that any shenanigans with powerful Write would work with provided methods that just instantly block on writing the data; at worst, they'd need an associated type, and would thus depend on the stabilization of associated type defaults (RFC 2532, tracking issue rust-lang/rust#29661).
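
The fallback-buffer idea for the zero-copy read can be sketched as follows. This is not the code behind the godbolt link, just one possible shape under stated assumptions: the `Read` trait is a local stand-in, `read_zero_copy` is a hypothetical name, and the return type is simplified to two slices (enough for one ring-buffer wraparound) instead of an `impl Iterator`. `Ring` and `ByteSource` are hypothetical backends with and without an internal buffer.

```rust
use core::convert::Infallible;

/// Simplified stand-in for `embedded_io::Read` (illustration only).
pub trait Read {
    type Error;

    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error>;

    /// Hypothetical provided method: returns up to two chunks of data.
    /// The default copies into the caller's `fallback` via a regular `read`.
    fn read_zero_copy<'a>(
        &'a mut self,
        fallback: &'a mut [u8],
    ) -> Result<[&'a [u8]; 2], Self::Error> {
        let n = self.read(fallback)?;
        let empty: &[u8] = &[];
        Ok([&fallback[..n], empty])
    }
}

/// Hypothetical unbuffered device; relies on the provided default.
pub struct ByteSource<'a>(pub &'a [u8]);

impl Read for ByteSource<'_> {
    type Error = Infallible;

    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error> {
        let n = buf.len().min(self.0.len());
        let (head, tail) = self.0.split_at(n);
        buf[..n].copy_from_slice(head);
        self.0 = tail;
        Ok(n)
    }
}

/// Hypothetical ring buffer, e.g. the receive side of a network stack.
pub struct Ring {
    buf: [u8; 8],
    start: usize,
    len: usize,
}

impl Ring {
    pub fn new(buf: [u8; 8], start: usize, len: usize) -> Self {
        assert!(start < buf.len() && len <= buf.len());
        Ring { buf, start, len }
    }
}

impl Read for Ring {
    type Error = Infallible;

    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error> {
        let n = buf.len().min(self.len);
        for slot in buf[..n].iter_mut() {
            *slot = self.buf[self.start];
            self.start = (self.start + 1) % self.buf.len();
        }
        self.len -= n;
        Ok(n)
    }

    /// Lends out the stored data in place and never touches `fallback`.
    fn read_zero_copy<'a>(
        &'a mut self,
        _fallback: &'a mut [u8],
    ) -> Result<[&'a [u8]; 2], Self::Error> {
        let cap = self.buf.len();
        let first_len = self.len.min(cap - self.start);
        let second_len = self.len - first_len;
        let (head, tail) = self.buf.split_at(self.start);
        let chunks = [&tail[..first_len], &head[..second_len]];
        // Marking the data as consumed now is fine: the returned borrow
        // keeps `self` locked until the caller drops the chunks.
        self.start = (self.start + self.len) % cap;
        self.len = 0;
        Ok(chunks)
    }
}

/// Generic user code: sums all bytes of one read without knowing
/// whether the backend is zero-copy capable.
pub fn checksum<R: Read>(reader: &mut R) -> Result<u32, R::Error> {
    let mut fallback = [0u8; 16];
    let chunks = reader.read_zero_copy(&mut fallback)?;
    let mut sum = 0u32;
    for chunk in chunks.iter() {
        for &byte in chunk.iter() {
            sum += u32::from(byte);
        }
    }
    Ok(sum)
}
```

In `checksum::<Ring>`, nothing ever writes to or reads from `fallback`, so the optimizer is free to drop it; whether it actually does so is what the godbolt experiment above is about, and this sketch does not verify it.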

So my questions to the community / maintainers are:

  • Is this something you'd like to see considered / added / would review PRs for?
    Note that while this is primarily about embedded-io now, it may also help us build the equivalent tools for embedded-nal's UDP side. (Its TCP side is what'd profit from these methods already).
  • Do you agree that anything that'd be added could be added after a 1.0 release, as argued above?

CC'ing @kaspar030 with whom I've discussed this for the Ariel use cases.
