Merged
Changes from 3 commits
15 changes: 15 additions & 0 deletions crates/native-c/Cargo.toml
@@ -0,0 +1,15 @@
[package]
name = "rustc-demangle-native-c"
version = "0.1.0"
authors = ["automatically generated"]
description = """
Native C version of the rustc_demangle crate
"""
license = "MIT/Apache-2.0"
repository = "https://github.com/rust-lang/rustc-demangle"

[lib]
name = "rustc_demangle_native_c"

[build-dependencies]
cc = "1"
5 changes: 5 additions & 0 deletions crates/native-c/README
Member

It would probably also make sense to update the top-level README noting this exists and indicating that expected usage is probably copy/paste or git submodule?

Contributor Author

added it

@@ -0,0 +1,5 @@
A portable native C demangler, which should mostly have byte-for-byte identical outputs to the Rust one.
Member

It might be worth noting what our expectations on safety are (e.g., do we fuzz etc sufficiently that we feel confident there's no UB on untrusted inputs)? And maybe whether we also fail in the same cases as the Rust one.

Contributor Author

added


The only difference is that since it's hard to include up-to-date Unicode tables in portable C code, strings in constants (do you know that feature exists?) have all non-ASCII characters escaped (as `\u{ABCD}`) rather than having only non-printable characters escaped. Unicode in identifiers is still translated as-is, allowing non-printable characters just like rustc. If you care, the code intentionally includes `unicode_isprint` and `unicode_isgraphemextend` that can be replaced with actual Unicode tables.

This has a Cargo.toml to make it easy to test, but people whose build systems can use Rust are expected to use the `rustc-demangle-capi` crate which uses the Rust `rustc-demangle` implementation instead. Since the crate is intended only for users with weird build systems, there is no build system provided.
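
The escaping rule the README describes for string constants can be sketched roughly as follows. This is a minimal illustration, not the code in `src/demangle.c`: the function name `sketch_escape_const_char` is hypothetical, lowercase hex without leading zeros is an assumption, and the handling of quotes, backslashes, and ASCII control characters is deliberately simplified (everything outside printable ASCII is escaped).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of the crate): write one code point of a
 * string constant into `out`. Printable ASCII is copied as-is; every other
 * code point becomes `\u{...}` because no Unicode tables are consulted.
 * Returns what snprintf returns: the number of characters the full result
 * needs, excluding the NUL terminator. */
int sketch_escape_const_char(uint32_t cp, char *out, size_t len)
{
    assert(out != NULL);
    if (cp >= 0x20 && cp <= 0x7e) {
        return snprintf(out, len, "%c", (char)cp);
    }
    /* Assumption: lowercase hex, no zero padding. */
    return snprintf(out, len, "\\u{%lx}", (unsigned long)cp);
}
```

A version with real Unicode tables would presumably replace the ASCII range check with a call to something like the `unicode_isprint` hook the README mentions.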
8 changes: 8 additions & 0 deletions crates/native-c/build.rs
@@ -0,0 +1,8 @@
fn main() {
    cc::Build::new()
        .file("src/demangle.c")
        .include("include")
        .compile("demangle_native_c");
    println!("cargo::rerun-if-changed=src/demangle.c");
    println!("cargo::rerun-if-changed=include/demangle.h");
}
72 changes: 72 additions & 0 deletions crates/native-c/include/demangle.h
@@ -0,0 +1,72 @@
#ifndef _H_DEMANGLE_V0_H
#define _H_DEMANGLE_V0_H

#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define DEMANGLE_NODISCARD __attribute__((warn_unused_result))
#else
#define DEMANGLE_NODISCARD
#endif

typedef enum {
    OverflowOk,
    OverflowOverflow
} overflow_status;

enum demangle_style {
    DemangleStyleUnknown = 0,
    DemangleStyleLegacy,
    DemangleStyleV0,
};

// Not using a union here to make the struct easier to copy-paste if needed.
struct demangle {
    enum demangle_style style;
    // points to the "mangled" part of the name,
    // not including `ZN` or `R` prefixes.
    const char *mangled;
    size_t mangled_len;
    // In DemangleStyleLegacy, is the number of path elements
    size_t elements;
    // while it's called "original", it will not contain `.llvm.9D1C9369@@16` suffixes
    // that are to be ignored.
    const char *original;
    size_t original_len;
    // Contains the part after the mangled name that is to be outputted,
    // which can be `.exit.i.i` suffixes LLVM sometimes adds.
    const char *suffix;
    size_t suffix_len;
};

// if the length of the output buffer is less than `output_len-OVERFLOW_MARGIN`,
// the demangler will return `OverflowOverflow` even if there is no overflow.
#define OVERFLOW_MARGIN 4

/// Demangle a C string that refers to a Rust symbol and put the intermediate demangling result in `res`.
/// Beware that `res` contains references into `s`. If `s` is modified (or freed) before calling
/// `rust_demangle_display_demangle`, behavior is undefined.
///
/// Use `rust_demangle_display_demangle` to convert it to an actual string.
void rust_demangle_demangle(const char *s, struct demangle *res);

/// Write the string in a `struct demangle` into a buffer.
///
/// Return `OverflowOk` if the output buffer was sufficiently big, `OverflowOverflow` if it wasn't.
/// This function is `O(n)` in the length of the input + *output* [$], but the demangled output of demangling a symbol can
Member

IIUC, output here refers specifically to the produced demangling, not the output buffer's size, right? We don't do anything with unused bytes in the output buffer.

(Just to check my understanding, no changes needed here I think).

Contributor Author

@arielb1 arielb1 Dec 3, 2024

Yea, "output-sensitive algorithm"

/// be exponentially[$$] large, therefore it is recommended to have a sane bound (`rust-demangle`
/// uses 1,000,000 bytes) on `len`.
///
/// `alternate`, if true, uses the less verbose alternate formatting (Rust `{:#}`), which does not show
/// symbol hashes and types of constant ints.
///
/// [$] It's `O(n * MAX_DEPTH)`, but `MAX_DEPTH` is a constant 300 and therefore it's `O(n)`
/// [$$] Technically, bounded by `O(n^MAX_DEPTH)`, but this is practically exponential.
DEMANGLE_NODISCARD overflow_status rust_demangle_display_demangle(struct demangle const *res, char *out, size_t len, bool alternate);

/// Returns true if `res` refers to a known valid Rust demangling style, false if it's an unknown style.
bool rust_demangle_is_known(struct demangle *res);

#undef DEMANGLE_NODISCARD

#endif
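
The header above reports a too-small output buffer through `overflow_status` rather than by writing past `len`. A minimal sketch of that bounded-output pattern, under stated assumptions, is shown here. The names `sketch_status`, `sketch_writer`, and `sketch_append` are hypothetical and do not appear in the crate, and NUL-terminating the buffer after each append is an assumption of this sketch, not something the header states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the header's `overflow_status`. */
typedef enum { SketchOk, SketchOverflow } sketch_status;

struct sketch_writer {
    char *out;   /* destination buffer supplied by the caller */
    size_t len;  /* total capacity of `out` in bytes */
    size_t used; /* bytes written so far, excluding the NUL terminator */
};

/* Append `s` to the buffer, always leaving room for a NUL terminator.
 * On overflow nothing is written and the buffer keeps its previous contents. */
sketch_status sketch_append(struct sketch_writer *w, const char *s)
{
    assert(w != NULL && w->out != NULL && s != NULL);
    size_t n = strlen(s);
    if (w->len == 0 || n >= w->len - w->used) {
        return SketchOverflow;
    }
    memcpy(w->out + w->used, s, n);
    w->used += n;
    w->out[w->used] = '\0';
    return SketchOk;
}
```

Going by the declarations in the header, a caller would presumably call `rust_demangle_demangle`, check `rust_demangle_is_known`, and then pass a buffer with a capped `len` (the doc comment suggests a bound such as 1,000,000 bytes) to `rust_demangle_display_demangle`, treating `OverflowOverflow` as a failure. Since the header as shown uses `bool` but includes only `<stddef.h>`, a C caller on a pre-C23 compiler would likely need to include `<stdbool.h>` before it.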
Empty file added crates/native-c/src/build.rs
Member

Accidentally added file?

Contributor Author

Removed

Empty file.