-
Notifications
You must be signed in to change notification settings - Fork 24
Description
Proposal
Problem statement
There exists in the ecosystem structs that are both:
- Iterators (they implement
std::iter::Iterator) - "Entry-point" of some crate specific logic. This logic may even be independent of the Iterator's state of advancement.
Sometimes, one wants to wrap such an iterator in another struct to implement new logic on it via methods. This is all very fine until one wants to also use iterator combinator (say filter). As of now, once an iterator I is wrapped in an adapter A<I>, there is no way to access the I again.
This proposal discusses adding a method inner() and a method into_inner() to some iterator adapters of the standard library.
Motivation, use-cases
The example in my case is xmlparser::Tokenizer. This is an iterator on XML Tokens, but also the entry point of the crate, created with Tokenizer::from(&str). I am using it to parse a format built on top of XML.
In a yet to be published version of the xmlparser crate, Tokenizer has a stream(&self) -> Stream<'a> method, which in turns gives access to the underlying Stream's gen_text_pos_from(&self, pos: usize) which allows to compute a position in line/col from a position in bytes in the text. This is useful to get clear error messages when there are error in the outmost format.
Initially, my FormatParser wrapped a Peekable<Filter<Tokenizer, _>>, because some tokens are irrelevant for this format, and because the architecture required a look ahead of one. My problem was that I could not access the Stream this way.
Solution sketches
Proposed solution
The solution would be to add an inner(&self) -> &I to Adapter<I> in cases where it is possible.
Addition of a into_inner(self) -> I method can be also discussed but is more worthy of attention. Indeed, as pointed out in discussion elsewhere (see links), the state of the inner iterator is not necessarily obvious from the wrapper. Giving back ownership of the iterator could lead to logic error. Only allowing a & getter mitigates these issues, as one cannot iterate on a non mutable reference. (Clonable iterator would however be an issue).
Another issue to discuss is, quoting @the8472 comment:
At least for
Zipand possibly other adapters this is unsound becauseTrustedRandomAccessspecializations bring the iterator into a state that is not consistent unless used exclusively through the methods whitelisted in theTrustedRandomAccessdocumentation.It would also preclude some future optimizations that would similarly assume that internals are not observable once an iterator has been wrapped into an adapter.
Other examples of the stronger than first thought commitment that such an API would create is given by @scottmcm here
For a bunch of them, though, this being an infallible API is more of a commitment than you might be thinking. For example,
Take<I>is currently(usize, I)internally, but that's not really fundamental. It could also beOption<(NonZeroUsize, I)>internally, for example, so that it drops the wrapped iterator once it'll no longer take anything from it. And that would mean thatinto_inner(self) -> Icould no longer be implemented.Even for
Skip, where taking out the inner thing is comparatively reasonable because after the first call it doesn't do anything, the behaviour isn't necessarily obvious. It would be very easy for someone to writeit = it.skip(10).into_inner();and expect that to be equivalent toit.advance_by(10), but it's very much not.
In conclusion, much of the discussion to be had revolves around deciding for which, if any, adapter the proposed methods would make sense. It coudl also be envisionned that once the reasons preventing the implementation of these methods are clearer, a alternative and more constraining API is designed.
Alternative solution
The alternative is to preserve the status quo, which, in my opinion, officialize that iterator adapters are intended to be used as "lightweight and discardable".
Links and related work
- Draft: Add inner and into_inner methods to iterator adapters rust#103294
- Tracking Issue for (into_)inner() methods on iterator adapters rust#103302
- https://rust-lang.zulipchat.com/#narrow/stream/122652-new-members/topic/.5BPre.20RFC.5D.20.28into_.29inner.28.29.20for.20Iterator.20Adapters
What happens now?
This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.