[AArch64] Add FeatureZCRegMoveFPR128 subtarget feature #152906

tomershafir
Contributor

Adds a subtarget feature, FeatureZCRegMoveFPR128, that makes it possible to query whether the target supports zero-cycle register moves for FPR128 NEON registers, and adds it to the appropriate processors.

This is preparation for future optimizations.
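
For readers unfamiliar with how such a feature is consumed, here is a minimal, self-contained sketch of the kind of query this feature is meant to enable. It does not use LLVM itself: `MockSubtarget`, `isFPR128CopyFree`, and `runSketchChecks` are hypothetical names invented for illustration, and the getter name `hasZeroCycleRegMoveFPR128` is an assumption derived from the `HasZeroCycleRegMoveFPR128` field named in the diff. The PR does not add any code that queries the feature.

```cpp
#include <cassert>

// Hypothetical stand-in for the real AArch64 subtarget class. In LLVM the
// field and its getter would be generated from the SubtargetFeature
// definition; the getter name here is an assumption based on the field name.
struct MockSubtarget {
  bool HasZeroCycleRegMoveFPR64 = false;
  bool HasZeroCycleRegMoveFPR128 = false;

  bool hasZeroCycleRegMoveFPR64() const { return HasZeroCycleRegMoveFPR64; }
  bool hasZeroCycleRegMoveFPR128() const { return HasZeroCycleRegMoveFPR128; }
};

// Hypothetical helper: reports whether a full-width (128-bit) NEON register
// copy is expected to cost zero cycles on the given target.
bool isFPR128CopyFree(const MockSubtarget &ST) {
  return ST.hasZeroCycleRegMoveFPR128();
}

// Small self-check covering one target with the feature and one without it.
void runSketchChecks() {
  MockSubtarget WithFeature;
  WithFeature.HasZeroCycleRegMoveFPR128 = true; // as if zcm-fpr128 were enabled

  MockSubtarget WithoutFeature; // feature left at its default (false)

  assert(isFPR128CopyFree(WithFeature));
  assert(!isFPR128CopyFree(WithoutFeature));
}
```

A later optimization could branch on such a query to decide how to lower a register copy; which lowering it would choose is outside the scope of this PR.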

@tomershafir tomershafir marked this pull request as ready for review August 10, 2025 10:10
@tomershafir tomershafir requested a review from jroelofs August 10, 2025 10:10
@llvmbot
Member

llvmbot commented Aug 10, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Tomer Shafir (tomershafir)

Changes

Adds a subtarget feature, FeatureZCRegMoveFPR128, that makes it possible to query whether the target supports zero-cycle register moves for FPR128 NEON registers, and adds it to the appropriate processors.

This is preparation for future optimizations.


Full diff: https://github.com/llvm/llvm-project/pull/152906.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64Features.td (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64Processors.td (+10)
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index c1c1f0a1024d0..55aea17d29f55 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -621,6 +621,9 @@ def FeatureZCRegMoveGPR64 : SubtargetFeature<"zcm-gpr64", "HasZeroCycleRegMoveGP
 def FeatureZCRegMoveGPR32 : SubtargetFeature<"zcm-gpr32", "HasZeroCycleRegMoveGPR32", "true",
                                         "Has zero-cycle register moves for GPR32 registers">;
 
+def FeatureZCRegMoveFPR128 : SubtargetFeature<"zcm-fpr128", "HasZeroCycleRegMoveFPR128", "true",
+                                        "Has zero-cycle register moves for FPR128 registers">;
+
 def FeatureZCRegMoveFPR64 : SubtargetFeature<"zcm-fpr64", "HasZeroCycleRegMoveFPR64", "true",
                                         "Has zero-cycle register moves for FPR64 registers">;
 
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index 1bc1d98a6f65b..1f5a6ff1328e7 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -321,6 +321,7 @@ def TuneAppleA7  : SubtargetFeature<"apple-a7", "ARMProcFamily", "AppleA7",
                                     FeatureFuseAES, FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing,
                                     FeatureZCZeroingFPWorkaround]>;
@@ -335,6 +336,7 @@ def TuneAppleA10 : SubtargetFeature<"apple-a10", "ARMProcFamily", "AppleA10",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -348,6 +350,7 @@ def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -361,6 +364,7 @@ def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -374,6 +378,7 @@ def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -392,6 +397,7 @@ def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -410,6 +416,7 @@ def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -428,6 +435,7 @@ def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -446,6 +454,7 @@ def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
+                                    FeatureZCRegMoveFPR128,
                                     FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
@@ -463,6 +472,7 @@ def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
                                      FeatureFuseCryptoEOR,
                                      FeatureFuseLiterals,
                                      FeatureZCRegMoveGPR64,
+                                     FeatureZCRegMoveFPR128,
                                      FeatureZCRegMoveFPR64,
                                      FeatureZCZeroing
                                      ]>;

@tomershafir tomershafir requested a review from davemgreen August 10, 2025 10:11
@jroelofs
Contributor

> Its a preparation for future optimizations.

I have a small hesitation with this approach: since it comes with no functional change, it cannot be tested. I think it would be a lot healthier if this came with that change, as well as a test covering all the CPUs that have the feature (and one that shows what happens when a CPU doesn't). These feature lists can be a bit fragile with respect to refactoring and downstream changes; it is way too easy to lose things.
