-
Notifications
You must be signed in to change notification settings - Fork 5
Description
Description
buffered::bufreader::ReadBuffer implements axio::Read in arceos-main/modules/axio/src/buffered/bufreader.rs. In the implementation of read_to_string:
axio/src/buffered/bufreader.rs
Lines 110 to 143 in 1ea2440
| // The inner reader might have an optimized `read_to_end`. Drain our buffer and then | |
| // delegate to the inner implementation. | |
| #[cfg(feature = "alloc")] | |
| fn read_to_string(&mut self, buf: &mut String) -> Result<usize> { | |
| // In the general `else` case below we must read bytes into a side buffer, check | |
| // that they are valid UTF-8, and then append them to `buf`. This requires a | |
| // potentially large memcpy. | |
| // | |
| // If `buf` is empty--the most common case--we can leverage `append_to_string` | |
| // to read directly into `buf`'s internal byte buffer, saving an allocation and | |
| // a memcpy. | |
| if buf.is_empty() { | |
| // `append_to_string`'s safety relies on the buffer only being appended to since | |
| // it only checks the UTF-8 validity of new data. If there were existing content in | |
| // `buf` then an untrustworthy reader (i.e. `self.inner`) could not only append | |
| // bytes but also modify existing bytes and render them invalid. On the other hand, | |
| // if `buf` is empty then by definition any writes must be appends and | |
| // `append_to_string` will validate all of the new bytes. | |
| unsafe { crate::append_to_string(buf, |b| self.read_to_end(b)) } | |
| } else { | |
| // We cannot append our byte buffer directly onto the `buf` String as there could | |
| // be an incomplete UTF-8 sequence that has only been partially read. We must read | |
| // everything into a side buffer first and then call `from_utf8` on the complete | |
| // buffer. | |
| let mut bytes = Vec::new(); | |
| self.read_to_end(&mut bytes)?; | |
| let string = core::str::from_utf8(&bytes).map_err(|_| { | |
| axerrno::ax_err_type!(InvalidData, "stream did not contain valid UTF-8") | |
| })?; | |
| *buf += string; | |
| Ok(string.len()) | |
| } | |
| } | |
| } |
When buf is empty, read_to_string calls append_to_string, which reads raw bytes directly into the buf and performs UTF-8 validation afterwards. If validation fails, the already written bytes are not rolled back, leaving the buf in an invalid UTF-8 state and violating Rust’s String invariant.
When the destination buf is not empty, append_to_string uses a temporary buffer and validates UTF-8 before appending, so this issue does not occur. This makes the behavior depend on buf.is_empty(), which is inconsistent.
Reproduction
This issue can be reproduced by the PoC following:
#[test]
fn poc_read_to_string_empty() {
let a = [0xFFu8];
let mut s = String::new();
let mut reader = BufReader::new(&a[..]);
let res = reader.read_to_string(&mut s);
// read_to_string return Err, but s is already changed!
assert!(res.is_err());
// Now s = [0xFF], which is an invalid string
// String::from_utf8 will fail
String::from_utf8( s.as_bytes().to_vec()).unwrap(); // fail here
}