madhav-madhusoodanan
Contributor

@madhav-madhusoodanan madhav-madhusoodanan commented Aug 31, 2025

What does stdarch-gen-wasm32 do?

  1. First it collects the intrinsic definitions from the wasm_simd128.h file (for the definitions in C)
  2. Then it collects the intrinsic definitions from the Rust source files
  3. It extracts details (such as the intrinsic name, function arguments, return type, etc.) of the C and the Rust intrinsics by parsing their definitions into abstract syntax trees (using the tree-sitter crate)
  4. It matches the C and the Rust definitions and creates a spec sheet like the one below (for an example intrinsic):
/// u16x8_extract_lane
c-intrinsic-name = wasm_u16x8_extract_lane
c-arguments = __a, __i
c-arguments-data-types = v128_t, int
c-return-type = 
rust-intrinsic-name = u16x8_extract_lane
rust-arguments = a
rust-arguments-data-types = v128
rust-const-generic-arguments = N
rust-const-generic-arguments-data-types = usize
rust-return-type = u16
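
As a rough illustration of how one matched entry could be turned into the text above, here is a minimal sketch. The `SpecEntry` struct, its field names, and the `render` function are hypothetical and are not taken from the PR; only the output keys come from the example spec sheet.

```rust
/// One matched intrinsic (hypothetical type; field names are assumptions).
/// Each argument is stored as a (name, data type) pair.
pub struct SpecEntry {
    pub c_name: String,
    pub c_args: Vec<(String, String)>,
    pub c_return: String,
    pub rust_name: String,
    pub rust_args: Vec<(String, String)>,
    pub rust_const_generics: Vec<(String, String)>,
    pub rust_return: String,
}

/// Joins the names of all (name, type) pairs with ", ".
fn join_names(pairs: &[(String, String)]) -> String {
    pairs.iter().map(|(name, _)| name.as_str()).collect::<Vec<_>>().join(", ")
}

/// Joins the data types of all (name, type) pairs with ", ".
fn join_types(pairs: &[(String, String)]) -> String {
    pairs.iter().map(|(_, ty)| ty.as_str()).collect::<Vec<_>>().join(", ")
}

/// Renders one entry using the same keys as the spec sheet example.
pub fn render(e: &SpecEntry) -> String {
    let lines = vec![
        format!("/// {}", e.rust_name),
        format!("c-intrinsic-name = {}", e.c_name),
        format!("c-arguments = {}", join_names(&e.c_args)),
        format!("c-arguments-data-types = {}", join_types(&e.c_args)),
        format!("c-return-type = {}", e.c_return),
        format!("rust-intrinsic-name = {}", e.rust_name),
        format!("rust-arguments = {}", join_names(&e.rust_args)),
        format!("rust-arguments-data-types = {}", join_types(&e.rust_args)),
        format!("rust-const-generic-arguments = {}", join_names(&e.rust_const_generics)),
        format!("rust-const-generic-arguments-data-types = {}", join_types(&e.rust_const_generics)),
        format!("rust-return-type = {}", e.rust_return),
    ];
    lines.join("\n")
}
```

Storing arguments as (name, type) pairs keeps the two comma-separated lines for each side in the same order by construction.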

How to run

cd crates/stdarch-gen-wasm
cargo run -- --c ../../intrinsics_data/wasm_simd128.h --rust ../core_arch/src/wasm32/simd128.rs --rust ../core_arch/src/wasm32/relaxed_simd.rs > wasm32.spec

Context

C Abstract Syntax Tree

Take an intrinsic definition for example:

static __inline__ v128_t __DEFAULT_FN_ATTRS wasm_u32x4_make(uint32_t __c0, uint32_t __c1, uint32_t __c2, uint32_t __c3) {...}

For a C intrinsic, the immediate children would have their grammar names as:

  • storage_class_specifier: which is static
  • storage_class_specifier: which is __inline__
  • identifier: which is v128_t. The parser doesn't recognize that it is a type and instead treats it as an identifier.
  • ERROR: which points to the macro __DEFAULT_FN_ATTRS. The parser doesn't recognize it as a valid part of the tree and annotates it as ERROR.
  • function_declarator: points to wasm_u32x4_make(uint32_t __c0, uint32_t __c1, uint32_t __c2, uint32_t __c3)
  • compound_statement: the body of the function
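
To make the list above concrete, here is a minimal sketch of how the return type and the declarator could be picked out of those immediate children. It does not use the tree-sitter crate: the children are modelled as plain (grammar name, source text) pairs, and the `CSignature` type and `summarize_c_definition` function are hypothetical names, not the PR's code.

```rust
/// Summary of a C function definition's immediate children (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct CSignature<'a> {
    pub return_type: &'a str,
    pub declarator: &'a str,
}

/// `children` holds (grammar name, source text) pairs in source order.
/// Returns None if no return type or no function_declarator is found.
pub fn summarize_c_definition<'a>(children: &[(&'a str, &'a str)]) -> Option<CSignature<'a>> {
    let mut return_type = None;
    let mut declarator = None;
    for &(kind, text) in children {
        match kind {
            // `static`, `__inline__`, and the unparsed attribute macro carry no signature data.
            "storage_class_specifier" | "ERROR" => {}
            // The first type-like child seen before the declarator is taken as the return type.
            "identifier" | "primitive_type" if return_type.is_none() && declarator.is_none() => {
                return_type = Some(text)
            }
            "function_declarator" if declarator.is_none() => declarator = Some(text),
            // Everything else (for example the compound_statement body) is ignored.
            _ => {}
        }
    }
    Some(CSignature { return_type: return_type?, declarator: declarator? })
}
```

Accepting both identifier and primitive_type for the return type covers typedef'd types such as v128_t as well as built-in ones such as int.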

The immediate children of the function_declarator node would have their grammar names as follows:

  • identifier : which is the intrinsic name wasm_u32x4_make
  • parameter_list : which represents the arguments to the intrinsic (uint32_t __c0, uint32_t __c1, uint32_t __c2, uint32_t __c3)

The immediate children of a parameter_list node would have their grammar names as follows:

  • ( : The opening parenthesis that denotes the start of the arguments definition.
  • parameter_declaration : The definition for the first argument uint32_t __c0
  • , : The comma that separates the first and the second arguments.
  • parameter_declaration : The definition for the second argument uint32_t __c1
  • , : The comma that separates the second and the third arguments.
  • parameter_declaration : The definition for the third argument uint32_t __c2
  • , : The comma that separates the third and the fourth arguments.
  • parameter_declaration : The definition for the fourth argument uint32_t __c3
  • ) : The closing parenthesis that denotes the end of the arguments definition.
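
Since the punctuation nodes carry no information, a consumer of this list only needs the parameter_declaration children. A minimal sketch, again modelling the children as (grammar name, source text) pairs rather than real tree-sitter nodes (the function name is hypothetical):

```rust
/// Keeps only the source text of `parameter_declaration` children,
/// dropping the `(`, `,` and `)` punctuation nodes.
/// `children` holds (grammar name, source text) pairs in source order.
pub fn parameter_declarations<'a>(children: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    children
        .iter()
        .filter(|(kind, _)| *kind == "parameter_declaration")
        .map(|&(_, text)| text)
        .collect()
}
```

Filtering by grammar name rather than by position means the same function works for any number of arguments, including an empty list.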

Each node with the grammar name parameter_declaration could have its children structured in a few ways:

  1. In the case of int x:
  • primitive_type : Points to int
  • identifier : Points to x
  2. In the case of v128_t x:
  • identifier : Points to v128_t, which is actually a type (though the parser is unaware of this).
  • identifier : Points to x.
  3. In the case of const void *__mem:
  • type_qualifier : Points to const.
  • primitive_type : Points to void.
  • pointer_declarator : Breaks down into * and identifier (which is __mem).
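
The three shapes above can be handled by one rule: the last child is the declarator, and everything before it is the type. The following is a minimal sketch under that assumption. `Child` and `split_parameter` are hypothetical names, and the children are plain structs rather than tree-sitter nodes.

```rust
/// A simplified stand-in for a tree-sitter node (hypothetical type).
pub struct Child<'a> {
    pub kind: &'a str,
    pub text: &'a str,
}

/// Splits the children of a `parameter_declaration` into (type, name).
/// Returns None for shapes not covered by the three cases above.
pub fn split_parameter(children: &[Child<'_>]) -> Option<(String, String)> {
    // The declarator is assumed to be the last child; the rest forms the type.
    let (declarator, type_parts) = children.split_last()?;
    if type_parts.is_empty() {
        return None;
    }
    let mut ty = type_parts.iter().map(|c| c.text).collect::<Vec<_>>().join(" ");
    let name = match declarator.kind {
        // `int x` and `v128_t x`: the declarator is the argument name itself.
        "identifier" => declarator.text.to_string(),
        // `const void *__mem`: move the `*` from the declarator onto the type.
        "pointer_declarator" => {
            let stars = declarator.text.chars().filter(|c| *c == '*').count();
            ty.push(' ');
            ty.push_str(&"*".repeat(stars));
            declarator
                .text
                .trim_start_matches(|c: char| c == '*' || c.is_whitespace())
                .to_string()
        }
        _ => return None,
    };
    Some((ty, name))
}
```

With this rule, const void *__mem yields the type "const void *" and the name "__mem", so the pointer is recorded as part of the data type rather than the argument name.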

Rust Abstract Syntax Tree

Take a Rust intrinsic definition for example:

pub unsafe fn v128_load64_splat(m: *const u64) -> v128 {
    u64x2_splat(ptr::read_unaligned(m))
}

For this Rust intrinsic, the immediate children would have their grammar names as:

  • visibility_modifier: For pub
  • function_modifiers : For unsafe. May not always be present
  • fn : The actual keyword fn
  • identifier : the name of the function v128_load64_splat
  • type_parameters : the const generic arguments. (This is not always present)
  • parameters : The arguments passed to the function (m: *const u64)
  • -> : The arrow used to specify the return type
  • identifier : The return type of the function v128
  • block: The body of the function

The children of the type_parameters node have their grammar names as the following (assuming 2 generic arguments):

  • <: The opening angle bracket that starts the generic arguments definition
  • const_parameter: The first const generic argument
  • ,: The comma that separates the generic arguments
  • const_parameter: The second const generic argument
  • >: The closing angle bracket that concludes the generic arguments definition

The children of the parameters node have their grammar_names as the following (assuming 2 arguments):

  • (: The opening parenthesis that starts the arguments definition
  • parameter : The first argument
  • ,: The comma that separates the arguments
  • parameter : The second argument
  • ) : The closing parenthesis that concludes the arguments definition
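
Each parameter and const_parameter node then has to be split into a name and a data type for the spec sheet. As a minimal sketch, the helpers below work on the source text of a node rather than on its child nodes, which is a simplification; the function names are hypothetical.

```rust
/// Splits the text of a `parameter` node such as `m: *const u64`
/// into (name, type). Returns None if there is no `:` separator.
pub fn split_rust_parameter(text: &str) -> Option<(String, String)> {
    // Split at the first `:` so that paths inside the type stay intact.
    let (name, ty) = text.split_once(':')?;
    Some((name.trim().to_string(), ty.trim().to_string()))
}

/// Splits the text of a `const_parameter` node such as `const N: usize`
/// into (name, type). Returns None if the `const` keyword is missing.
pub fn split_const_parameter(text: &str) -> Option<(String, String)> {
    let rest = text.trim().strip_prefix("const")?;
    // Require whitespace after the keyword so `constant: T` is not misread.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    split_rust_parameter(rest)
}
```

For the example intrinsic, `m: *const u64` gives the name m with the type *const u64, and a generic written as `const N: usize` gives the name N with the type usize.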

cc: @Amanieu @folkertdev

@rustbot
Collaborator

rustbot commented Aug 31, 2025

r? @folkertdev

rustbot has assigned @folkertdev.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@madhav-madhusoodanan madhav-madhusoodanan marked this pull request as draft August 31, 2025 18:45
@madhav-madhusoodanan
Contributor Author

madhav-madhusoodanan commented Aug 31, 2025

I just realized that for this specific intrinsic, it may be difficult to match the C version of the arguments with its Rust version, because the C lane index __i is a regular argument while the Rust lane index N is a const generic:

/// u16x8_extract_lane
c-intrinsic-name = wasm_u16x8_extract_lane
c-arguments = __a, __i
c-arguments-data-types = v128_t, int
c-return-type = 
rust-intrinsic-name = u16x8_extract_lane
rust-arguments = a
rust-arguments-data-types = v128
rust-const-generic-arguments = N
rust-const-generic-arguments-data-types = usize
rust-return-type = u16

(I also noticed a bug where c-return-type is empty in this case when it should have a value. I'll look into it.)
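
One possible first check, shown purely as an illustration and not something this PR implements, is to count the Rust const generics together with the regular Rust arguments when comparing against the C argument list. The function name is hypothetical, and the sketch only compares arity; it does not try to decide which C argument corresponds to which Rust one.

```rust
/// Returns true when the number of C arguments equals the number of Rust
/// arguments plus the number of Rust const generic arguments.
/// This is an arity check only; argument order is not considered.
pub fn arity_matches(c_args: &[&str], rust_args: &[&str], rust_const_generics: &[&str]) -> bool {
    c_args.len() == rust_args.len() + rust_const_generics.len()
}
```

For the entry above, the C side has two arguments (__a, __i) and the Rust side has one argument (a) plus one const generic (N), so the check passes.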

@madhav-madhusoodanan madhav-madhusoodanan changed the title stdarch-gen-wasm32: Tool that creates spec sheet from C and Rust source files. stdarch-gen-wasm32: Tool that creates spec sheet from wasm32's C and Rust source files. Aug 31, 2025