Commit 6d079ff

Fix a crash involving unknown YAML aliases
I finally tried fuzzing xt and immediately discovered this issue in the chunker. It assumed that every YAML document would contain at least one scalar or collection between its start and end, and unwrapped an Option based on that faulty assumption. A document consisting solely of an alias to an undefined anchor (e.g. `*y`) breaks it. I haven't committed the fuzzing setup yet, but I probably should after releasing this.
1 parent b623788 commit 6d079ff
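
The failure mode described in the commit message can be reproduced in isolation. The following is a minimal sketch, not xt's actual code: `Kind`, `finish_old`, and `finish_new` are hypothetical names that only mirror the shape of the change (an `Option` that used to be unwrapped is now passed through).

```rust
use std::panic;

// Hypothetical stand-in for the chunker's document kind.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Scalar,
    Collection,
}

/// Old shape: assumes a scalar or collection was always recorded.
fn finish_old(current: &mut Option<Kind>) -> Kind {
    current.take().unwrap() // panics when `current` is `None`
}

/// New shape: hands the `Option` to the caller instead of unwrapping.
fn finish_new(current: &mut Option<Kind>) -> Option<Kind> {
    current.take()
}

fn main() {
    // Normal case: a kind was recorded, and `take` resets the slot.
    let mut seen = Some(Kind::Scalar);
    assert_eq!(finish_old(&mut seen), Kind::Scalar);
    assert_eq!(seen, None);

    let mut seen = Some(Kind::Collection);
    assert_eq!(finish_new(&mut seen), Some(Kind::Collection));

    // Nothing was recorded (as with an alias-only document).
    let mut nothing: Option<Kind> = None;
    assert_eq!(finish_new(&mut nothing), None);

    // The old shape panics on the same state.
    let outcome = panic::catch_unwind(|| {
        let mut nothing: Option<Kind> = None;
        finish_old(&mut nothing)
    });
    assert!(outcome.is_err());
}
```

Running this prints a panic message to stderr from the last case; that is the behavior the commit removes.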

3 files changed, 31 insertions(+), 9 deletions(-)

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,3 +1,15 @@
+## Unreleased
+
+### Fixed
+
+- **Crashes on certain invalid YAML inputs.** Previous versions of xt may have
+  aborted with a Rust panic message when handling a YAML document consisting
+  solely of an alias to an undefined anchor (e.g. `*anchor`). This bug was
+  discovered through automated [fuzzing][fuzzing]; with the fix applied, xt now
+  survives far longer fuzzing runs without further crashes.
+
+[fuzzing]: https://rust-fuzz.github.io/book/introduction.html
+
 ## v0.19.3 (2025-02-21)

 ### Changed

src/yaml.rs

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ pub(crate) fn input_matches(mut input: Ref) -> io::Result<bool> {
         Ref::Reader(r) => Chunker::new(Encoder::new(BufReader::new(r), encoding)).next(),
     };
     match chunk {
-        Some(Ok(doc)) => Ok(!doc.is_scalar()),
+        Some(Ok(doc)) => Ok(doc.is_collection()),
         Some(Err(err)) if err.kind() == io::ErrorKind::InvalidData => Ok(false),
         Some(Err(err)) => Err(err),
         None => Ok(false),
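
The decision table in this hunk can be exercised on its own. This is a sketch under assumptions: `Doc` and `first_chunk_matches` are hypothetical stand-ins for the chunker's document type and for the tail of `input_matches`; only the `match` arms are taken from the diff.

```rust
use std::io;

// Hypothetical stand-in for the chunker's `Document`.
struct Doc {
    collection: bool,
}

impl Doc {
    fn is_collection(&self) -> bool {
        self.collection
    }
}

/// Maps the first chunk (if any) to a "looks like YAML" verdict.
fn first_chunk_matches(chunk: Option<io::Result<Doc>>) -> io::Result<bool> {
    match chunk {
        // Only a collection (sequence or mapping) counts as a match.
        Some(Ok(doc)) => Ok(doc.is_collection()),
        // Malformed data means "not this format", not a hard failure.
        Some(Err(err)) if err.kind() == io::ErrorKind::InvalidData => Ok(false),
        // Any other I/O error is propagated to the caller.
        Some(Err(err)) => Err(err),
        // Empty input does not match.
        None => Ok(false),
    }
}

fn main() {
    assert!(first_chunk_matches(Some(Ok(Doc { collection: true }))).unwrap());
    assert!(!first_chunk_matches(Some(Ok(Doc { collection: false }))).unwrap());
    assert!(!first_chunk_matches(None).unwrap());

    let invalid = io::Error::new(io::ErrorKind::InvalidData, "bad input");
    assert!(!first_chunk_matches(Some(Err(invalid))).unwrap());

    let other = io::Error::from(io::ErrorKind::NotFound);
    assert!(first_chunk_matches(Some(Err(other))).is_err());
}
```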

src/yaml/chunker.rs

Lines changed: 18 additions & 8 deletions
@@ -98,7 +98,7 @@ where
         let chunk = self.parser.reader_mut().take_to_offset(event.end_offset());
         self.last_document = Some(Document {
             content: String::from_utf8(chunk).unwrap(),
-            kind: self.current_document_kind.take().unwrap(),
+            kind: self.current_document_kind.take(),
         });
     }
     YAML_STREAM_END_EVENT => {
@@ -114,7 +114,7 @@ where
 /// A UTF-8 encoded YAML document.
 pub(super) struct Document {
     content: String,
-    kind: DocumentKind,
+    kind: Option<DocumentKind>,
 }

 /// The type of content contained in a YAML document.
@@ -129,10 +129,10 @@ impl Document {
         &self.content
     }

-    /// Returns true if the content of the document is a scalar rather than a
-    /// collection (sequence or mapping).
-    pub(super) fn is_scalar(&self) -> bool {
-        matches!(self.kind, DocumentKind::Scalar)
+    /// Returns true if the content of the document is a collection (sequence or
+    /// mapping).
+    pub(super) fn is_collection(&self) -> bool {
+        matches!(self.kind, Some(DocumentKind::Collection))
     }
 }

@@ -219,8 +219,18 @@ test: true
             ]
         );

-        let scalars = docs.iter().map(|doc| doc.is_scalar()).collect::<Vec<_>>();
-        assert_eq!(&scalars, &[false, true, false]);
+        let collections: Vec<_> = docs.iter().map(Document::is_collection).collect();
+        assert_eq!(&collections, &[true, false, true]);
+    }
+
+    // Tests that a YAML document consisting solely of an unknown anchor doesn't
+    // crash the chunker. Fuzzing revealed this bug (via this exact input) in a
+    // previous implementation.
+    #[test]
+    fn chunker_unknown_anchor() {
+        const INPUT: &str = "*y";
+        let chunker = Chunker::new(INPUT.as_bytes());
+        chunker.collect::<Result<Vec<_>, io::Error>>().unwrap();
     }

     #[test]
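
One plausible reading of why the predicate flipped from `is_scalar` to `is_collection`: once `kind` is optional, a document has three states, and negating "is scalar" would classify the kind-less state as a collection. The sketch below illustrates that reading. The types mirror the diff, but the `is_scalar` method shown here is a hypothetical adaptation of the removed one and does not appear in the commit.

```rust
#[derive(Debug, Clone, Copy)]
enum DocumentKind {
    Scalar,
    Collection,
}

struct Document {
    kind: Option<DocumentKind>,
}

impl Document {
    /// As in the commit: true only for a recorded collection.
    fn is_collection(&self) -> bool {
        matches!(self.kind, Some(DocumentKind::Collection))
    }

    /// Hypothetical: the removed predicate adapted to an optional kind.
    fn is_scalar(&self) -> bool {
        matches!(self.kind, Some(DocumentKind::Scalar))
    }
}

fn main() {
    let collection = Document { kind: Some(DocumentKind::Collection) };
    let scalar = Document { kind: Some(DocumentKind::Scalar) };
    let alias_only = Document { kind: None };

    // For the two original states, both predicates agree.
    assert_eq!(collection.is_collection(), !collection.is_scalar());
    assert_eq!(scalar.is_collection(), !scalar.is_scalar());

    // For the new kind-less state they diverge: `!is_scalar()` would
    // report a match, while `is_collection()` does not.
    assert!(!alias_only.is_scalar());
    assert!(!alias_only.is_collection());
}
```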
