
rustc_scalable_vector(N) #143924


Draft
wants to merge 8 commits into
base: master
from


davidtwco
Member

@davidtwco davidtwco commented Jul 14, 2025

Supersedes #118917.

Initial experimental implementation of rust-lang/rfcs#3838. Introduces a rustc_scalable_vector(N) attribute that can be applied to types with a single [$ty] field (for u{16,32,64}, i{16,32,64}, f{32,64}, bool). rustc_scalable_vector types are lowered to scalable vectors in the codegen backend.

As with any unstable feature, there will necessarily be follow-ups as we experiment and find cases that we've not considered or still need some logic to handle, but this aims to be a decent baseline to start from.

See #145052 for request for a lang experiment.

@rustbot
Collaborator

rustbot commented Jul 14, 2025

r? @compiler-errors

rustbot has assigned @compiler-errors.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-attributes Area: Attributes (`#[…]`, `#![…]`) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. WG-trait-system-refactor The Rustc Trait System Refactor Initiative (-Znext-solver) labels Jul 14, 2025
@rustbot
Collaborator

rustbot commented Jul 14, 2025

Some changes occurred in compiler/rustc_attr_parsing

cc @jdonszelmann

Some changes occurred in compiler/rustc_attr_data_structures

cc @jdonszelmann

Some changes occurred in compiler/rustc_passes/src/check_attr.rs

cc @jdonszelmann

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred to the platform-builtins intrinsics. Make sure the
LLVM backend as well as portable-simd gets adapted for the changes.

cc @antoyo, @GuillaumeGomez, @bjorn3, @calebzulawski, @programmerjake

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

changes to the core type system

cc @compiler-errors, @lcnr

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

@davidtwco davidtwco marked this pull request as draft July 14, 2025 11:21
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 14, 2025
@davidtwco
Member Author

I've changed this back to a draft and marked it as S-waiting-on-author while I rebase it and get the approval for an experimental implementation to go ahead.

@davidtwco davidtwco force-pushed the sve-infrastructure branch 2 times, most recently from cf9474d to d58c634 Compare July 14, 2025 11:53

@davidtwco davidtwco force-pushed the sve-infrastructure branch 2 times, most recently from 0c22701 to 3ad0898 Compare July 14, 2025 12:12

@davidtwco davidtwco force-pushed the sve-infrastructure branch from 5c92874 to 3edf1b6 Compare July 14, 2025 13:30

@davidtwco davidtwco force-pushed the sve-infrastructure branch from 3edf1b6 to 4f6b823 Compare July 14, 2025 14:20
Member

@workingjubilee workingjubilee left a comment


I do not intend to have repr(simd) survive this year, so I do not think this should be added.

@davidtwco
Member Author

> I do not intend to have repr(simd) survive this year, so I do not think this should be added.

Could you elaborate?

@workingjubilee
Member

I intend to replace it with an approach based on lang items for a variety of reasons, one of them being that to start with, the repr(simd) "attribute" does not compose and should not.

r? @workingjubilee

@workingjubilee
Member

workingjubilee commented Jul 14, 2025

Likewise, scalable vectors do not really have a diversity in their representation. They either go in the scalable vector registers or they don't. They are one type, even if it is parameterized by the element type (and possibly the CPU's Matrix state, idk).

@davidtwco
Copy link
Member Author

> I intend to replace it with an approach based on lang items for a variety of reasons, one of them being that to start with, the repr(simd) "attribute" does not compose and should not.

I think this discussion is better for the RFC - rust-lang/rfcs#3838 - rather than the implementation.

With that said, I definitely don't think this should be blocked by your intentions to change repr(simd) - extending our existing, RFC-approved SIMD infrastructure is a natural way to go about supporting scalable vectors. Unless there's an alternative proposal for changing repr(simd) out there that has been accepted or is on the cusp of being accepted and I've missed it, working from what we have today is all I can do.

Nothing in rust-lang/rfcs#3838 should make your plans impossible as it's just more internal infrastructure that could easily be changed to something else with a well-motivated proposal, no different than repr(simd). It'll be your responsibility to update your plans/proposal if repr(scalable) makes progress first. It goes the other way too, if you make progress changing repr(simd) prior to this making progress then I'll have to update my proposal, that's fine. Unless something I'm proposing completely shuts the door on your plans - which would be worth discussing! - and we decide we'd like to avoid that, then it shouldn't block this.

@workingjubilee
Member

So you want me to race you?

@workingjubilee
Member

workingjubilee commented Jul 14, 2025

  1. My understanding has always been that "an RFC approved it" doesn't mean much when it concerns internal-facing details. We've changed things away from RFC-approved implementation details without redoing the RFC. e.g. #[track_caller] only barely resembles the original implementation described in the RFC.
  2. Specifically, we have removed or heavily modified several parts of RFC1199 already, and each one was a lot of work.
  3. We have had to issue many patches, like "Ban projecting into SIMD types [MCP838]" (#143833), to fix how repr(simd) is compiled. Yes, we have this existing infrastructure, which is how we know how flawed it is.
  4. This is not some random idea I had, it was an issue from some time ago. I am surprised to find out I am the only one who reviews related issues before commencing work on things. See "Remove repr(simd) attribute and use a lang-item instead" (#63633).
  5. As someone paid to work on this full-time and also empowered as compiler team lead, I fear you have more ability to push through things than me, especially because for various reasons I take a great deal of care before pushing for things. For that reason, while I am willing to concede the changes here might not obstruct a future change, I do expect to be allowed to review this to make that determination, and hoped for a more collaborative approach instead of possibly fighting to rebase over each other.


davidtwco and others added 7 commits August 12, 2025 10:05
Extend parsing of `ReprOptions` with `rustc_scalable_vector(N)` which
optionally accepts a single literal integral value - the base multiple of
lanes that are in a scalable vector. Can only be applied to structs.

Co-authored-by: Jamie Cunliffe <[email protected]>
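
To make "base multiple of lanes" concrete: LLVM writes a scalable vector as `<vscale x N x T>`, which holds `vscale * N` elements, with `vscale` fixed by the hardware at run time. The sketch below is plain Rust arithmetic rather than the attribute itself. It assumes SVE's 128-bit granule, and both function names are hypothetical.

```rust
/// Width in bits of the SVE granule: vector lengths are multiples of this.
const GRANULE_BITS: u32 = 128;

/// Hypothetical helper: the `N` in `<vscale x N x T>` for an element that is
/// `elem_bits` wide, assuming one 128-bit granule per unit of `vscale`.
fn base_lanes(elem_bits: u32) -> u32 {
    GRANULE_BITS / elem_bits
}

/// Hypothetical helper: total lane count for a hardware vector length given
/// in bits (a multiple of 128 on SVE).
fn total_lanes(vector_bits: u32, elem_bits: u32) -> u32 {
    let vscale = vector_bits / GRANULE_BITS;
    vscale * base_lanes(elem_bits)
}
```

For 32-bit elements `base_lanes(32)` is 4, which matches the `<vscale x 4 x i32>` type in the codegen test output further down this thread; on a 512-bit implementation `total_lanes(512, 32)` gives 16 lanes.
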
Extend well-formedness checking and HIR analysis to prohibit the use of
scalable vectors in structs, enums, unions, tuples and arrays. LLVM does
not support scalable vectors being members of other types, so these
restrictions are necessary.

Co-authored-by: Jamie Cunliffe <[email protected]>
`simd_reinterpret` is a replacement for `transmute`, specifically for
use with scalable SIMD types. It is used in the tests for scalable
vectors and in stdarch.

Co-authored-by: Jamie Cunliffe <[email protected]>
Introduces `BackendRepr::ScalableVector` corresponding to scalable
vector types annotated with `repr(scalable)` which lowers to a scalable
vector type in LLVM.

Co-authored-by: Jamie Cunliffe <[email protected]>
LLVM doesn't handle stores on `<vscale x N x i1>` for `N != 16`, a type
used internally in SVE intrinsics. Spilling to the stack to create
debuginfo will cause errors during instruction selection. These types
are an internal implementation detail of the intrinsics, so users
should never see them and won't need any debuginfo.

Co-authored-by: Jamie Cunliffe <[email protected]>
Scalable vectors cannot be members of ADTs and thus cannot be kept over
await points in async functions.

Scalable vector types only work with the relevant target features
enabled, so require this for any function with the types in its
signature.
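
As a rough illustration of the target-feature requirement, the sketch below uses only the stable mechanism such a check would sit on top of: a function opts in with `#[target_feature]`, and callers guard it with run-time detection. It is not the new check itself. The function names are hypothetical, and no scalable vector type appears because those need this PR's unstable attribute.

```rust
/// Hypothetical kernel compiled with SVE enabled. A function with a scalable
/// vector type in its signature would need an attribute like this one.
#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "sve")]
unsafe fn sum_with_sve(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// Portable entry point: take the SVE path only when the CPU reports the
/// feature, otherwise fall back to ordinary code.
fn sum(xs: &[f32]) -> f32 {
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("sve") {
            // Sound because the feature was detected just above.
            return unsafe { sum_with_sve(xs) };
        }
    }
    xs.iter().sum()
}
```

On targets other than AArch64 the SVE function is compiled out entirely and `sum` always takes the fallback.
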
@davidtwco davidtwco changed the title from repr(scalable) to rustc_scalable_vector(N) Aug 12, 2025
@davidtwco
Member Author

Updated this to reflect the changes in rust-lang/rfcs#3838. It doesn't implement everything in it yet, and is definitely incomplete, but it's a starting point.


Trivial changes to rust-analyzer to keep it compiling with changes to
`ReprOptions`.
@rust-log-analyzer
Collaborator

The job aarch64-gnu-llvm-19-1 failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
failures:

---- [codegen] tests/codegen-llvm/scalable-vectors/simple.rs stdout ----

error: verification with 'FileCheck' failed
status: exit status: 1
command: "/usr/lib/llvm-19/bin/FileCheck" "--input-file" "/checkout/obj/build/aarch64-unknown-linux-gnu/test/codegen-llvm/scalable-vectors/simple/simple.ll" "/checkout/tests/codegen-llvm/scalable-vectors/simple.rs" "--check-prefix=CHECK" "--allow-unused-prefixes" "--dump-input-context" "100"
stdout: none
--- stderr -------------------------------
/checkout/tests/codegen-llvm/scalable-vectors/simple.rs:35:11: error: CHECK: expected string not found in input
// CHECK: define <vscale x 4 x i32> @pass_as_ref(ptr noalias noundef readonly align 16 captures(none) dereferenceable(16) %a, <vscale x 4 x i32> %b)
          ^
/checkout/obj/build/aarch64-unknown-linux-gnu/test/codegen-llvm/scalable-vectors/simple/simple.ll:1:1: note: scanning from here
; ModuleID = 'simple.762e3d9a4d4c6830-cgu.0'
^
/checkout/obj/build/aarch64-unknown-linux-gnu/test/codegen-llvm/scalable-vectors/simple/simple.ll:33:6: note: possible intended match here
 %_0 = call <vscale x 4 x i32> @pass_as_ref(ptr noalias noundef nonnull readonly align 16 dereferenceable(16) %a, <vscale x 4 x i32> %b)
     ^

Input file: /checkout/obj/build/aarch64-unknown-linux-gnu/test/codegen-llvm/scalable-vectors/simple/simple.ll
Check file: /checkout/tests/codegen-llvm/scalable-vectors/simple.rs

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            1: ; ModuleID = 'simple.762e3d9a4d4c6830-cgu.0' 
check:35'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
            2: source_filename = "simple.762e3d9a4d4c6830-cgu.0" 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            3: target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32" 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            4: target triple = "aarch64-unknown-linux-gnu" 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            5:  
check:35'0     ~
            6: ; simple::svdup_n_s32 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~
            7: ; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) uwtable 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            8: define <vscale x 4 x i32> @_ZN6simple11svdup_n_s3217ha891f46349370f05E(i32 noundef %op) unnamed_addr #0 { 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            9: start: 
check:35'0     ~~~~~~~
           10:  %.splatinsert = insertelement <vscale x 4 x i32> poison, i32 %op, i64 0 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           11:  %_0 = shufflevector <vscale x 4 x i32> %.splatinsert, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           12:  ret <vscale x 4 x i32> %_0 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           13: } 
check:35'0     ~~
           14:  
check:35'0     ~
           15: ; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(argmem: read) uwtable 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           16: define <vscale x 4 x i32> @pass_as_ref(ptr noalias nocapture noundef readonly align 16 dereferenceable(16) %a, <vscale x 4 x i32> %b) unnamed_addr #1 { 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           17: start: 
check:35'0     ~~~~~~~
           18:  %_3 = load <vscale x 4 x i32>, ptr %a, align 16 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           19:  %_0 = tail call <vscale x 4 x i32> @llvm.aarch64.sve.xar.nxv4i32(<vscale x 4 x i32> %_3, <vscale x 4 x i32> %b, i32 noundef 1) #5 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           20:  ret <vscale x 4 x i32> %_0 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           21: } 
check:35'0     ~~
           22:  
check:35'0     ~
           23: ; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           24: define <vscale x 4 x i32> @test() unnamed_addr #2 { 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           25: start: 
check:35'0     ~~~~~~~
           26:  %a = alloca <vscale x 4 x i32>, align 16 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           27:  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %a) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           28: ; call simple::svdup_n_s32 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  %0 = tail call <vscale x 4 x i32> @_ZN6simple11svdup_n_s3217ha891f46349370f05E(i32 noundef 1) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  store <vscale x 4 x i32> %0, ptr %a, align 16 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: ; call simple::svdup_n_s32 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           32:  %b = tail call <vscale x 4 x i32> @_ZN6simple11svdup_n_s3217ha891f46349370f05E(i32 noundef 2) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           33:  %_0 = call <vscale x 4 x i32> @pass_as_ref(ptr noalias noundef nonnull readonly align 16 dereferenceable(16) %a, <vscale x 4 x i32> %b) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:35'1          ?                                                                                                                                    possible intended match
           34:  call void @llvm.lifetime.end.p0(i64 16, ptr nonnull %a) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           35:  ret <vscale x 4 x i32> %_0 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           36: } 
check:35'0     ~~
           37:  
check:35'0     ~
           38: ; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(none) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           39: declare <vscale x 4 x i32> @llvm.aarch64.sve.xar.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, i32 immarg) unnamed_addr #3 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           40:  
check:35'0     ~
           41: ; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           42: declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #4 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           43:  
check:35'0     ~
           44: ; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           45: declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #4 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           46:  
check:35'0     ~
           47: attributes #0 = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) uwtable "frame-pointer"="non-leaf" "probe-stack"="inline-asm" "target-cpu"="generic" "target-features"="+v8a,+outline-atomics,+neon,+fp-armv8,+sve" } 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           48: attributes #1 = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(argmem: read) uwtable "frame-pointer"="non-leaf" "probe-stack"="inline-asm" "target-cpu"="generic" "target-features"="+v8a,+outline-atomics,+neon,+fp-armv8,+sve,+neon,+fp-armv8,+sve,+sve2" } 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           49: attributes #2 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "frame-pointer"="non-leaf" "probe-stack"="inline-asm" "target-cpu"="generic" "target-features"="+v8a,+outline-atomics,+neon,+fp-armv8,+sve,+neon,+fp-armv8,+sve,+sve2" } 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           50: attributes #3 = { mustprogress nocallback nofree nosync nounwind willreturn memory(none) } 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           51: attributes #4 = { mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) } 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           52: attributes #5 = { nounwind } 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           53:  
check:35'0     ~
           54: !llvm.module.flags = !{!0} 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           55: !llvm.ident = !{!1} 
check:35'0     ~~~~~~~~~~~~~~~~~~~~
           56:  
check:35'0     ~
           57: !0 = !{i32 8, !"PIC Level", i32 2} 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           58: !1 = !{!"rustc version 1.91.0-nightly (311dd95da 2025-08-12)"} 
check:35'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
------------------------------------------
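
The mismatch is easy to miss in a dump this long. The sketch below copies the parameter attributes for `%a` from the CHECK line and from line 16 of the dump and reports the tokens that appear on only one side. The helper name is hypothetical. A plausible reading, stated as an assumption, is that the LLVM 19 used by this job still prints the older `nocapture` spelling where the test expects `captures(none)`.

```rust
use std::collections::BTreeSet;

/// Returns (tokens only in `expected`, tokens only in `actual`), splitting
/// both attribute lists on whitespace.
fn attr_diff<'a>(expected: &'a str, actual: &'a str) -> (Vec<&'a str>, Vec<&'a str>) {
    let e: BTreeSet<&str> = expected.split_whitespace().collect();
    let a: BTreeSet<&str> = actual.split_whitespace().collect();
    (
        e.difference(&a).copied().collect(),
        a.difference(&e).copied().collect(),
    )
}

// Attributes on `%a` as written in the CHECK line of simple.rs.
const EXPECTED: &str =
    "ptr noalias noundef readonly align 16 captures(none) dereferenceable(16) %a";
// Attributes on `%a` in the `define` at line 16 of the emitted IR.
const ACTUAL: &str =
    "ptr noalias nocapture noundef readonly align 16 dereferenceable(16) %a";
```

Calling `attr_diff(EXPECTED, ACTUAL)` leaves `captures(none)` on the expected side and `nocapture` on the actual side; every other attribute agrees.
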




@Ruanyx1823 Ruanyx1823 left a comment


Hello, I'm a student interested in the Rust compiler. I'd like to ask how exactly one can write an SVE intrinsic, or how to write the following C code in Rust:

#include <stdio.h>
#include <stdlib.h>
#include <arm_sve.h>

/* Predicated element-wise add: c[i] = a[i] + b[i] for 0 <= i < n. */
void vec_add(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; i += svcntw()) {
        /* Lanes are active while i + lane < n, so the tail needs no scalar loop. */
        svbool_t pg = svwhilelt_b32(i, n);
        svfloat32_t va = svld1(pg, &a[i]);
        svfloat32_t vb = svld1(pg, &b[i]);
        svfloat32_t vc = svadd_f32_m(pg, va, vb);
        svst1(pg, &c[i], vc);
    }
}

void print_array(const char* name, const float* arr, int n) {
    printf("%s: [", name);
    for (int i = 0; i < n; ++i) {
        printf("%.1f", arr[i]);
        if (i < n - 1) printf(", ");
    }
    printf("]\n");
}

int main() {
#ifdef __ARM_FEATURE_SVE
    printf("SVE is supported!\n");
    /* svcntw() returns uint64_t, so cast before printing with %d. */
    printf("SVE vector length: %d floats (%d bits)\n",
           (int)svcntw(), (int)(svcntw() * 32));
#else
    printf("SVE not supported.\n");
    return 1;
#endif

    const int n = 16;
    float a[n], b[n], c[n];

    for (int i = 0; i < n; ++i) {
        a[i] = (float)i;
        b[i] = (float)(2 * i);
    }

    printf("\nBefore vector addition:\n");
    print_array("A", a, n);
    print_array("B", b, n);

    vec_add(a, b, c, n);

    printf("\nAfter vector addition:\n");
    print_array("C = A+B", c, n);

    /* Check every element against the scalar result. */
    int errors = 0;
    for (int i = 0; i < n; ++i) {
        if (c[i] != a[i] + b[i]) {
            printf("Error at index %d: expected %.1f, got %.1f\n",
                   i, a[i] + b[i], c[i]);
            errors++;
        }
    }

    if (errors == 0) {
        printf("\nAll results correct!\n");
    } else {
        printf("\nFound %d errors!\n", errors);
    }

    return 0;
}
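
An intrinsic-level Rust version depends on the SVE types and intrinsics that this PR and stdarch are still building, so none of those names are used here. As a stand-in, the following is a scalar model of the same loop in plain Rust. It assumes a fixed lane count instead of asking the hardware, and it only mirrors the structure of the C code: a `whilelt`-style predicate followed by predicated load, add, and store.

```rust
/// Assumed lane count for the model. Real SVE hardware decides this at run
/// time (the value `svcntw()` reports in the C code).
const LANES: usize = 4;

/// Scalar model of the C `vec_add`: each iteration builds a predicate like
/// `svwhilelt_b32(i, n)` and touches only the active lanes.
fn vec_add(a: &[f32], b: &[f32], c: &mut [f32]) {
    let n = a.len().min(b.len()).min(c.len());
    let mut i = 0;
    while i < n {
        // pg[lane] is true while i + lane < n, as with svwhilelt_b32.
        let pg: [bool; LANES] = std::array::from_fn(|lane| i + lane < n);
        for lane in 0..LANES {
            if pg[lane] {
                // Predicated load, add, and store collapsed into one step.
                c[i + lane] = a[i + lane] + b[i + lane];
            }
        }
        i += LANES;
    }
}
```

With 10 elements and 4 lanes the last iteration has only two active lanes, which is the tail case the predicate exists to handle.
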

@bors
Collaborator

bors commented Aug 15, 2025

☔ The latest upstream changes (presumably #145085) made this pull request unmergeable. Please resolve the merge conflicts.
