-
Notifications
You must be signed in to change notification settings - Fork 36
Description
I understand this would be breaking change, and maybe worth discussing how this could be done, but currently if I want to put a buffered reader as the source for Saphyr parser (0.0.6), I finally need to implement T: Iterator<Item = char>. My source is T in this your code:
pub struct BufferedInput<T: Iterator<Item = char>> {
/// The iterator source,
input: T,
/// Buffer for the next characters to consume.
buffer: ArrayDeque<char, BUFFER_LEN>,
}This means my T must provide the method
fn next(&mut self) -> Option<char>
The problem with this approach is:
- What should I do if I encounter IO error? The value
Noneis reserved for EOF. As YAML end of document marker is optional, returning EOF on IO error may silently pass the wrong (truncated) YAML document as valid, potentially with far reaching consequences.panic!I also do not like. - Same if my budget checker concludes that our raw input is already too long (this is how I actually discovered the limitation). While
Stringcan be checked for the total length before parsing, theReadernot.
This can be fixed by changing the signature to return either Result<char, Error> where Error is the enum with one value reserved for EOF, or returning Result<Option<char>, io::Error> when None in place of the char would mean EOF has been reached while error should also be reported.
Finally, "non breaking" but still not without concerns idea I am currently attempting is to return FFFD that is invalid Unicode char. This also forces parser to stop, but the only way to report the reason why I did so is eprintln!() that is also not very elegant.
I suggest to think about possibility of error reporting. The parser itself could return this error as any other kind of ScanError, this functionality on the event listening is already implemented.