-
Notifications
You must be signed in to change notification settings - Fork 95
Embed markers for used types/events to enable spec filtering #1672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
0c24a90 to
3b7ef81
Compare
mootz12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. Leaving review as comment for now as I haven't looked at the CLI side or done much testing.
One thing to note is this might have some impacts if the spec-marker ever changes to release complexity between the CLI and SDK.
Might not be anything we can do about it :/
| /// 2. Extract the hash from each marker | ||
| /// 3. Match against specs in contractspecv0 section (by hashing each spec) | ||
| /// 4. Strip unused specs from contractspecv0 | ||
| // TODO: Move the spec marker logic into a crate that can be shared with the CLI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed yet?
| use super::*; | ||
|
|
||
| #[test] | ||
| fn test_spec_marker() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(picky) can we add a test here to assert against a fixed spec_marker? It should fail if the structure of spec_market changes, whereas this one won't as long as the first 4 bytes and len don't change.
| /// Spec marker generation for identifying used spec entries. | ||
| /// | ||
| /// The marker is a byte array in the data section with a distinctive pattern: | ||
| /// - 4 bytes: "SpEc" prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it worthwhile to version tag this at all?
SpEcV1
In the event this ever changes, would we need to maintain backwards compatibility? Or is it more of a "use the correct CLI version for your given SDK version"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add this. The SDK will also have the SDK version embedded inside the meta custom section and I was planning to update the stellar-cli, so I think it can work out from that version what it should do.
| // Create a marker that identifies this spec entry. The marker is a byte array | ||
| // in the data section with a distinctive pattern: "SpEc" + truncated SHA256. | ||
| // Post-build tools can scan the data section for "SpEc" markers and match | ||
| // against specs in contractspecv0. | ||
| let marker = spec_marker::spec_marker(spec_xdr); | ||
| let marker_lit = proc_macro2::Literal::byte_string(&marker); | ||
| let marker_len = marker.len(); | ||
| Some(quote! { | ||
| impl #path::IncludeSpecMarker for #ident { | ||
| #[doc(hidden)] | ||
| #[inline(always)] | ||
| fn include_spec_marker() { | ||
| // Include markers for nested field types. | ||
| #(<#field_types as #path::IncludeSpecMarker>::include_spec_marker();)* | ||
| #[cfg(target_family = "wasm")] | ||
| { | ||
| // Marker in data section. Post-build tools can scan for "SpEc" | ||
| // patterns and match against specs in contractspecv0. | ||
| static MARKER: [u8; #marker_len] = *#marker_lit; | ||
| // Volatile read prevents DCE within live function. | ||
| let _ = unsafe { ::core::ptr::read_volatile(MARKER.as_ptr()) }; | ||
| } | ||
| } | ||
| } | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we extract this logic out of the derive_* files to reduce some code duplication?
|
Is it worth considering putting this behind a build flag of some sort? This way, the CLI could provide it for users so the process is hidden. If some1 builds directly with cargo and doesn't provide the flag, the spec markers won't be added to avoid polluting the WASM. |
|
That's a good idea, at least during a rollout period. Something to note though is that part of this change is changing specs so they always output to the custom section, even specs from libraries and imports, so that the cli can then narrow them down to the ones being used. If you import a library today you get way more pollution than what these markers introduce. If you import with contractimport you get no type exporting which is really broken. So any world where we keep the old way is going to propogate continued broken experiences. |
I had a look at doing this, so that the sdk worked the way it does today if not used with a new cli. The approach I tried was that the cli would set a special cfg via rustflags, that the sdk would use to control if it should use this new approach or not. But it results in much more complex macro code because there's essentially two paths for everything. Part of the problem is that as part of this change more specs, all specs actually including those from I think it'd be preferred if the cargo build output was at least a superset, so I don't think we should try and keep the imperfect spec selection around, and all used specs are always included, and the cli is just an optimisation step now. So instead I've added a cfg that the cli sets to indicate it supports optimising contract specs, and the SDK can output a warning if its being built with an old CLI or no CLI. |
What
Embed markers in the WASM data section for each type/event that is used at external boundaries. These markers enable post-build tools to filter unused entries from the
contractspecv0section.Changes:
spec_markermodule with marker generation logicderive_struct,derive_struct_tuple,derive_enum,derive_enum_int,derive_error_enum_int,derive_event) to embed markers__include_spec_marker()functions with volatile readsWhy
Contract WASM files can contain unused type and event definitions in the
contractspecv0custom section that inflate binary size unnecessarily.The problem: Any exported
contracttypesorcontracteventsdefined in the SDK or any library that don't get used in the importing contract still end up in the contract spec. This results in awkward behaviour where we avoid exporting contracttypes in the SDK, even though doing so would be convenient.Why the SDK can't fix this alone: Because of how proc-macros run in isolation and cannot coordinate in WASM builds in Rust, there's nothing the soroban-sdk can do to identify whether a contracttype that sometimes needs to be exported hasn't been used and can be excluded.
Solution: The SDK embeds markers in the regular data section for each type/event when it's used. These markers survive DCE only if the corresponding code is reachable. A post-build tool (stellar-cli) extracts these markers and filters the spec accordingly.
Close #1569
Close stellar/stellar-cli#2030
Side-effects
A side-effect of this change, is that types/events brought into a contract from a
contractimportcan now be automatically exported from that contract if they use the types at their boundary. This is why the change also removes theexport = falsefrom types/events that come from acontractimport.Close #1330
How
The SDK embeds markers using volatile reads in conversion/publish functions:
Marker format (12 bytes):
SpEcmagic prefix (alternating case to avoid false positives)Embedding mechanism:
When markers are included:
__include_spec_marker()is called fromTryFromValimplementations viaIncludeSpectrait__include_spec_marker()is called from thepublish()methodWhy this works: If a type/event is never used at an external boundary, its
__include_spec_marker()function is never called, and DCE removes both the function and its marker. If the type is used, the marker survives in the data section.CLI Support Required
This feature requires CLI support for filtering specs using these markers. See: stellar/stellar-cli#2353
Try It Out
Install the modified
stellar-clithat strips unused specs:Import the modified
soroban-sdkthat marks used specs:Build using the stellar-cli:
Inspect the contract interface and it should only show types actually used at the boundary of the contract (inputs, outputs, and events):
Todo
#[used]