arielb1
Contributor

@arielb1 arielb1 commented Dec 1, 2024

Adds a native C demangler to allow users with a build system that does not support Rust to demangle Rust symbols.

There are fuzz tests that check output compatibility between the native C demangler and the Rust demangler.

@arielb1
Contributor Author

arielb1 commented Dec 1, 2024

cc @Mark-Simulacrum

@bjorn3
Member

bjorn3 commented Dec 1, 2024

How does this compare with https://github.com/LykenSol/rust-demangle.c?

@arielb1
Contributor Author

arielb1 commented Dec 1, 2024

  1. @Mark-Simulacrum was not able to find the other one
  2. This one is byte-for-byte identical to the Rust one including fuzzing and recursion limits, up to the Unicode table problem, with fuzz tests asserting equivalence. The other crate claims that it's not.
  3. This one, like the Rust one, doesn't use malloc, tho it's not that important for my needs.

I think that being fuzzed to be equivalent to the Rust implementation rather than having specific test vectors makes it more maintainable.
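For readers unfamiliar with the approach, the idea of checking a port against a reference implementation instead of against fixed test vectors can be sketched as below. This is only an illustration: `reference_impl`, `ported_impl`, `XorShift`, and `first_divergence` are hypothetical names, the two functions are trivial stand-ins and not the demangler, and the PR itself uses a real fuzzer, not a hand-rolled generator.

```rust
/// Reference implementation (hypothetical stand-in): uppercase ASCII letters.
fn reference_impl(input: &[u8]) -> Vec<u8> {
    input.iter().map(|b| b.to_ascii_uppercase()).collect()
}

/// Port under test (hypothetical stand-in), written in a different style.
fn ported_impl(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    for &b in input {
        out.push(if b.is_ascii_lowercase() { b - 32 } else { b });
    }
    out
}

/// Small deterministic xorshift generator, so the sketch needs no external crates.
/// The seed must be nonzero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Feed the same pseudo-random inputs to both implementations and
/// return the first input on which their outputs differ, if any.
fn first_divergence(rounds: usize, seed: u64) -> Option<Vec<u8>> {
    let mut rng = XorShift(seed);
    for _ in 0..rounds {
        let len = (rng.next_u64() % 32) as usize;
        let input: Vec<u8> = (0..len).map(|_| rng.next_u64() as u8).collect();
        if reference_impl(&input) != ported_impl(&input) {
            return Some(input);
        }
    }
    None
}
```

A divergence returned by `first_divergence` doubles as a reproducer, which is part of why this style tends to need less maintenance than a hand-curated list of expected outputs.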

@arielb1
Contributor Author

arielb1 commented Dec 1, 2024

Personally, having the demangler fuzzable against the Rust one makes me much more comfortable putting it in the rust-lang org.

@tgross35
Contributor

tgross35 commented Dec 3, 2024

Fyi crates/native-c/Cargo.toml and crates/native-c/README need a trailing newline

/// Write the string in a `struct demangle` into a buffer.
///
/// Return `OverflowOk` if the output buffer was sufficiently big, `OverflowOverflow` if it wasn't.
/// This function is `O(n)` in the length of the input + *output* [$], but the demangled output of demangling a symbol can
Member

IIUC, output here refers specifically to the produced demangling, not the output buffer's size, right? We don't do anything with unused bytes in the output buffer.

(Just to check my understanding, no changes needed here I think).

Contributor Author

@arielb1 arielb1 Dec 3, 2024

Yea, "output-sensitive algorithm"
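The "output-sensitive" point can be illustrated with a fixed-capacity writer: the work done depends on the bytes actually produced, and bytes of the buffer beyond the written prefix are never touched. The following is a minimal Rust sketch under stated assumptions. `BoundedBuf` and `render` are hypothetical names, and `render` is a stand-in that joins path segments, not the PR's C function.

```rust
use std::fmt::{self, Write};

/// Fixed-capacity sink: a write fails once it would exceed the buffer.
struct BoundedBuf<'a> {
    buf: &'a mut [u8],
    len: usize,
}

impl<'a> Write for BoundedBuf<'a> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > self.buf.len() {
            // Overflow: report it without writing anything further.
            return Err(fmt::Error);
        }
        self.buf[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Hypothetical stand-in for a demangler's display routine:
/// joins path segments with `::` and returns the number of bytes written.
fn render(parts: &[&str], buf: &mut [u8]) -> Result<usize, fmt::Error> {
    let mut sink = BoundedBuf { buf, len: 0 };
    for (i, part) in parts.iter().enumerate() {
        if i > 0 {
            sink.write_str("::")?;
        }
        sink.write_str(part)?;
    }
    Ok(sink.len)
}
```

With a 64-byte buffer and a 16-byte result, only the first 16 bytes are written. The cost tracks the produced text, not the capacity passed in.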

@@ -0,0 +1,5 @@
A portable native C demangler, which should mostly have byte-for-byte identical outputs to the Rust one.
Member

It might be worth noting what our expectations on safety are (e.g., do we fuzz etc sufficiently that we feel confident there's no UB on untrusted inputs)? And maybe whether we also fail in the same cases as the Rust one.

Contributor Author

added

Member

Accidentally added file?

Contributor Author

Removed

}

fuzz_target!(|data: &[u8]| {
if data.len() == 0 {
Member

Can we add a test to make sure an empty input is OK too?

Contributor Author

will add

match rustc_demangle_native_c::rust_demangle_display_demangle(
&demangle,
buf.as_mut_ptr().cast(),
buf.len() / 4,
Member

Why the divide by 4?

Contributor Author

added a comment

if rust_overflowed.is_err() {
return; // rust overflowed as well, OK
}
// call C again with larger buffer. If it fits in an 1020-byte Rust buffer, it will fit in a 4096-byte C buffer
Member

I'm not sure I understand why we have this special cased -- it feels like we ought to catch discrepancies of output with an assert on equivalent outputs; we don't really need differently sized buffers to catch that?

If we want to assert the overflow handling is correctly implemented maybe having a unit test that confirms output sizes less than the required output size all fail (e.g., 0..4kb for a 4kb output or so). Probably worth a test with zero-sized output buffer as well.
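A unit test of the kind suggested here could look roughly like the sketch below. It is an assumption-laden illustration, not the test that was added in the PR: `Status`, `write_bounded`, and `check_all_capacities` are hypothetical names, and `write_bounded` is a plain bounded copy standing in for the real display function (it does not model any terminator byte the C API may require).

```rust
/// Outcome of a bounded write, loosely mirroring an OK/overflow status pair.
#[derive(Debug, PartialEq)]
enum Status {
    /// The text fit; carries the number of bytes written.
    Fits(usize),
    /// The buffer was too small; its contents are left unchanged.
    Overflow,
}

/// Hypothetical stand-in for the display function: copy `text` into `out`
/// if and only if it fits completely.
fn write_bounded(text: &str, out: &mut [u8]) -> Status {
    if text.len() > out.len() {
        return Status::Overflow;
    }
    out[..text.len()].copy_from_slice(text.as_bytes());
    Status::Fits(text.len())
}

/// Check that every capacity below the required size (including zero)
/// reports overflow, and that the exact size succeeds.
fn check_all_capacities(text: &str) -> bool {
    let needed = text.len();
    let mut storage = vec![0u8; needed];
    for cap in 0..needed {
        if write_bounded(text, &mut storage[..cap]) != Status::Overflow {
            return false;
        }
    }
    write_bounded(text, &mut storage) == Status::Fits(needed)
}
```

Sweeping every capacity from zero up to the required size is cheap for outputs of a few kilobytes and exercises the boundary that a single "too small" case would miss.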

Contributor Author

> If we want to assert the overflow handling is correctly implemented maybe having a unit test that confirms output sizes less than the required output size all fail (e.g., 0..4kb for a 4kb output or so). Probably worth a test with zero-sized output buffer as well.

I'll add a unit test for that as well, but the fuzzing helps too

Member

It would probably also make sense to update the top-level README noting this exists and indicating that expected usage is probably copy/paste or git submodule?

Contributor Author

added it

@Mark-Simulacrum Mark-Simulacrum merged commit 735451f into rust-lang:main Dec 3, 2024
6 checks passed
@github-actions github-actions bot mentioned this pull request Jun 12, 2025