PE: Add support for ARM64 unwind info#517
Conversation
|
Yesss! I’ll try to give this a review this weekend, thanks @kkent030315 :) |
|
Thanks, this patch would need some generalization over ARM and ARM64 (e.g., the scale of function length is 2 and xdata header is 1-2 bytes in ARM) so we can add support for ARM as well? I don't think we should support 32-bit ARM (i.e, Windows CE binaries) though...
Footnotes |
m4b
left a comment
There was a problem hiding this comment.
this looks good to me, is there any blocking issue you can think of/why we shouldn't merge?
| } | ||
|
|
||
| /// Returns an iterator over ARM64 runtime function entries. | ||
| pub fn functions_arm64(&self) -> Arm64RuntimeFunctionIterator<'a> { |
There was a problem hiding this comment.
it might be slightly more idiomatic to write iter_functions_arm64 ?
I would be tempted to rename this to arm64_functions ? though the downside is it starts with less interesting information (arm64() as opposed to (functions) 🤔
There was a problem hiding this comment.
Ideally, overall, we would do something like made up of wrapper enum, but it's obviously breaking change. arm64_functions is acceptable, however as you mentioned, it's less informative for developers, specifically when type func... in their IDE.
pub enum RuntimeFunction {
X86(X86RuntimeFunction),
Arm(ArmRuntimeFunction),
}| &self, | ||
| function: Arm64RuntimeFunction, | ||
| sections: &[section_table::SectionTable], | ||
| ) -> Option<error::Result<Arm64UnwindInfo<'a>>> { |
There was a problem hiding this comment.
Might be worth checking what other parts of goblin return in cases like this, Option<Result> or Result<Option> is probably better? Also, it might even be worth considering if the error is actionable or useful at all (e.g., effectively whether it's an OOB), to just ok() it into an Option and drop any error message info (again if it's not useful or actionable for the user)
I say this believing that more information the better, but if it's just satisfying the type system, might be better to coalesce the return type into a single one?
There was a problem hiding this comment.
Uh, I agree it's a bit unfortunate to introduce Option<Result<T>. The reason for that is, for binary parsing, "absent of the data" v. "data present, but parsing fail" can be informative. For example, if I want to make a binary visualizer, I don't really want to display the data as "absent" when it might be an error parsing the data.
This might only cost a bit for developers while being informative, so I'd rather keep it, since it's public API we can't make changes later (meaning I'd want to keep the "absent" v. "error" meta regardless of what errors are in).
Especially when it comes to unwind info, it's often more of more important for the developer to know whether the info is absent or parsing fails!
foo.get_unwind_info_arm64(...) // Option<Result<T>>
.map(Result::ok) // Option<Option<T>>
.flatten() // Option<T>
foo.get_unwind_info_arm64(...)
.and_then(|r| r.ok())| (1u32 << bits) - 1 | ||
| } | ||
|
|
||
| const fn bits(v: u32, shr: u32, bits: u32) -> u32 { |
There was a problem hiding this comment.
I assume the const part guarantees this is inlined?
There was a problem hiding this comment.
Unfortunately, it has no such guarantee. We can introduce #[inline(always)].
| self.bytes | ||
| .gread_with::<u32>(&mut self.offset, scroll::LE) | ||
| .map(Arm64EpilogScope) | ||
| .map_err(Into::into), |
There was a problem hiding this comment.
nice; this pattern comes up so much it makes me want to write a ScrollIterator<T> that does all this for you.
There was a problem hiding this comment.
It... might, might be made better, by just doing bytes.chunks_exact(size_of::<Foo>()).map(...) so we can fold redundant Iterator impls. Here is the siginifcant example from my own mutable-aware PE/ELF parser that every parser is dereffered and has only single slice reference (so-called View auto datastructures).
impl<'buf, T> PortableExecutable<'buf, T>
where
T: BufRef,
{
/// Returns an iterator over sections.
pub fn sections(
&self,
) -> error::Result<impl Iterator<Item = ImageSectionHeaderView<'_, &[u8]>> + '_> {
// ...
Ok(buf
.chunks_exact(size_of::<ImageSectionHeader>())
.map(ImageSectionHeaderView::new))
}
}impl<'buf, T> PortableExecutable<'buf, T>
where
T: BufMut,
{
/// Returns a mutable iterator over sections.
pub fn sections_mut(
&mut self,
) -> error::Result<impl Iterator<Item = ImageSectionHeaderView<'_, &mut [u8]>> + '_> {
// ...
Ok(buf
.chunks_exact_mut(size_of::<ImageSectionHeader>())
.map(ImageSectionHeaderView::new))
}
}| epilog_scope_bytes: &'a [u8], | ||
| /// Bytes of unwind codes. | ||
| unwind_code_bytes: &'a [u8], |
There was a problem hiding this comment.
nit: I like to put non-pub fields after pub ones but ymmv
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub struct Arm64UnwindExtension(pub u32); | ||
|
|
||
| const _: () = assert!(size_of::<Arm64UnwindExtension>() == 4); |
There was a problem hiding this comment.
neat! I wanted this so badly back in early days of rust, in this library and in dryad specifically :)
addresses #512
This is draft for now until plenty of more unwind code covered by tests (and of course for malformed ones as well).