[RISCV] Relax codegen predicates for HINT-based instructions #179872

Merged
kito-cheng merged 9 commits into llvm:main from kito-cheng:kitoc/hint-op-codegen
Mar 2, 2026

@kito-cheng
Member

Following the assembler/disassembler changes in #178609, this patch also relaxes the codegen predicates for HINT-based instructions. Since these instructions use encodings that are architecturally guaranteed not to trap, the compiler can safely generate them regardless of extension availability.

Changes:

  • int_riscv_pause: Remove HasStdExtZihintpause predicate. The pause intrinsic now generates the FENCE hint encoding unconditionally.
  • NTL hints: Remove hasStdExtZihintntl() check in emitNTLHint(). Non-temporal locality hints are now emitted for all nontemporal memory operations.
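
To make the "guaranteed not to trap" argument concrete, here is a minimal sketch of the encodings involved. The helper names (`encodeFence`, `encodeAdd`, `checkHintEncodings`) are made up for illustration and are not LLVM APIs. `pause` is a FENCE with pred = W and succ = 0, and the Zihintntl hints are ADDs that write x0, so a core without either extension should simply execute them as an ordinary fence or a no-op.

```cpp
#include <cassert>
#include <cstdint>

// FENCE layout: fm[31:28] | pred[27:24] | succ[23:20] | rs1=0 | funct3=000 |
// rd=0 | opcode=0001111. (Illustrative helper, not an LLVM function.)
constexpr uint32_t encodeFence(uint32_t pred, uint32_t succ, uint32_t fm = 0) {
  return (fm << 28) | (pred << 24) | (succ << 20) | 0x0Fu;
}

// R-type ADD rd, rs1, rs2: funct7=0 | rs2 | rs1 | funct3=000 | rd |
// opcode=0110011. (Illustrative helper, not an LLVM function.)
constexpr uint32_t encodeAdd(uint32_t rd, uint32_t rs1, uint32_t rs2) {
  return (rs2 << 20) | (rs1 << 15) | (rd << 7) | 0x33u;
}

// pause == FENCE with pred = W (0b0001) and succ = 0, i.e. the
// (FENCE 0x1, 0x0) pattern used for int_riscv_pause.
constexpr uint32_t Pause = encodeFence(0x1, 0x0);

// Zihintntl hints are ADD x0, x0, x2..x5; the destination is x0, so the
// result is discarded on cores that do not implement the extension.
constexpr uint32_t NtlP1 = encodeAdd(0, 0, 2);
constexpr uint32_t NtlPall = encodeAdd(0, 0, 3);
constexpr uint32_t NtlS1 = encodeAdd(0, 0, 4);
constexpr uint32_t NtlAll = encodeAdd(0, 0, 5);

// Sanity check against the encodings given in the ISA manual.
inline void checkHintEncodings() {
  assert(Pause == 0x0100000Fu);
  assert(NtlP1 == 0x00200033u);
  assert(NtlPall == 0x00300033u);
  assert(NtlS1 == 0x00400033u);
  assert(NtlAll == 0x00500033u);
}
```

Calling `checkHintEncodings()` from any test driver is enough to exercise the sketch.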

@llvmbot
Member

llvmbot commented Feb 5, 2026

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-backend-mips

Author: Kito Cheng (kito-cheng)

Patch is 27.34 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/179872.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp (+4-4)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/prefetch.ll (+84)
  • (modified) llvm/test/CodeGen/RISCV/xmips-cbop.ll (+8)
diff --git a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
index 9740123d3859e..67e9c9585e25c 100644
--- a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+++ b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
@@ -273,11 +273,11 @@ bool RISCVAsmPrinter::EmitToStreamer(MCStreamer &S, const MCInst &Inst,
 // instructions) auto-generated.
 #include "RISCVGenMCPseudoLowering.inc"
 
-// If the target supports Zihintntl and the instruction has a nontemporal
-// MachineMemOperand, emit an NTLH hint instruction before it.
+// If the instruction has a nontemporal MachineMemOperand, emit an NTL hint
+// instruction before it. NTL hints are always safe to emit since they use
+// HINT encodings that are guaranteed not to trap
+// (riscv-non-isa/riscv-elf-psabi-doc#474).
 void RISCVAsmPrinter::emitNTLHint(const MachineInstr *MI) {
-  if (!STI->hasStdExtZihintntl())
-    return;
 
   if (MI->memoperands_empty())
     return;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index 156e41ede2d1e..71b88ebbba072 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -2316,8 +2316,8 @@ def : Pat<(i64 (add GPR:$rs1, negImm:$rs2)), (SUB GPR:$rs1, negImm:$rs2)>;
 // Zihintpause
 //===----------------------------------------------------------------------===//
 
-// Zihintpause
-let Predicates = [HasStdExtZihintpause] in
+// int_riscv_pause is always available since pause is a HINT encoding that is
+// guaranteed not to trap (riscv-non-isa/riscv-elf-psabi-doc#474).
 def : Pat<(int_riscv_pause), (FENCE 0x1, 0x0)>;
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/RISCV/prefetch.ll b/llvm/test/CodeGen/RISCV/prefetch.ll
index ba33ed7ac1a59..79cb5e0c6e85f 100644
--- a/llvm/test/CodeGen/RISCV/prefetch.ll
+++ b/llvm/test/CodeGen/RISCV/prefetch.ll
@@ -21,11 +21,13 @@ define void @test_prefetch_read_locality_0(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_read_locality_0:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_read_locality_0:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -49,11 +51,13 @@ define void @test_prefetch_write_locality_0(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_write_locality_0:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.w 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_write_locality_0:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.w 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -77,11 +81,13 @@ define void @test_prefetch_instruction_locality_0(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_instruction_locality_0:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.i 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_instruction_locality_0:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.i 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -105,11 +111,13 @@ define void @test_prefetch_read_locality_1(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_read_locality_1:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.pall
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_read_locality_1:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.pall
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -133,11 +141,13 @@ define void @test_prefetch_write_locality_1(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_write_locality_1:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.pall
 ; RV32ZICBOP-NEXT:    prefetch.w 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_write_locality_1:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.pall
 ; RV64ZICBOP-NEXT:    prefetch.w 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -161,11 +171,13 @@ define void @test_prefetch_instruction_locality_1(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_instruction_locality_1:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.pall
 ; RV32ZICBOP-NEXT:    prefetch.i 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_instruction_locality_1:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.pall
 ; RV64ZICBOP-NEXT:    prefetch.i 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -189,11 +201,13 @@ define void @test_prefetch_read_locality_2(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_read_locality_2:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.p1
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_read_locality_2:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.p1
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -217,11 +231,13 @@ define void @test_prefetch_write_locality_2(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_write_locality_2:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.p1
 ; RV32ZICBOP-NEXT:    prefetch.w 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_write_locality_2:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.p1
 ; RV64ZICBOP-NEXT:    prefetch.w 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -245,11 +261,13 @@ define void @test_prefetch_instruction_locality_2(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_instruction_locality_2:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.p1
 ; RV32ZICBOP-NEXT:    prefetch.i 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_instruction_locality_2:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.p1
 ; RV64ZICBOP-NEXT:    prefetch.i 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -354,11 +372,13 @@ define void @test_prefetch_offsetable_0(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_0:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 2016(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_0:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 2016(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -383,11 +403,13 @@ define void @test_prefetch_offsetable_1(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_1:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -2048(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_1:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -2048(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -412,11 +434,13 @@ define void @test_prefetch_offsetable_2(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_2:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 32(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_2:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 32(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -441,11 +465,13 @@ define void @test_prefetch_offsetable_3(ptr %a) nounwind {
 ;
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_3:
 ; RV32ZICBOP:       # %bb.0:
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -32(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_3:
 ; RV64ZICBOP:       # %bb.0:
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -32(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -471,12 +497,14 @@ define void @test_prefetch_offsetable_4(ptr %a) nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_4:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi a0, a0, 32
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 2016(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_4:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi a0, a0, 32
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 2016(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -503,12 +531,14 @@ define void @test_prefetch_offsetable_5(ptr %a) nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_5:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi a0, a0, -1
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -2048(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_5:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi a0, a0, -1
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -2048(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -535,12 +565,14 @@ define void @test_prefetch_offsetable_6(ptr %a) nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_6:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi a0, a0, 16
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_6:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi a0, a0, 16
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -567,12 +599,14 @@ define void @test_prefetch_offsetable_7(ptr %a) nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_offsetable_7:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi a0, a0, -16
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_offsetable_7:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi a0, a0, -16
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -600,6 +634,7 @@ define void @test_prefetch_offsetable_9(ptr %a) nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a1, 1
 ; RV32ZICBOP-NEXT:    add a0, a0, a1
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 64(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
@@ -607,6 +642,7 @@ define void @test_prefetch_offsetable_9(ptr %a) nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a1, 1
 ; RV64ZICBOP-NEXT:    add a0, a0, a1
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 64(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -635,6 +671,7 @@ define void @test_prefetch_offsetable_8(ptr %a) nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a1, 1048575
 ; RV32ZICBOP-NEXT:    add a0, a0, a1
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -64(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
@@ -642,6 +679,7 @@ define void @test_prefetch_offsetable_8(ptr %a) nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a1, 1048575
 ; RV64ZICBOP-NEXT:    add a0, a0, a1
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -64(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -673,6 +711,7 @@ define void @test_prefetch_frameindex_0() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_frameindex_0:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(sp)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -680,6 +719,7 @@ define void @test_prefetch_frameindex_0() nounwind {
 ; RV64ZICBOP-LABEL: test_prefetch_frameindex_0:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(sp)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -725,6 +765,7 @@ define void @test_prefetch_frameindex_1() nounwind {
 ; RV32ZICBOP-NEXT:    addi a0, a0, 16
 ; RV32ZICBOP-NEXT:    sub sp, sp, a0
 ; RV32ZICBOP-NEXT:    addi a0, sp, 16
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    lui a0, 1
 ; RV32ZICBOP-NEXT:    addi a0, a0, 16
@@ -737,6 +778,7 @@ define void @test_prefetch_frameindex_1() nounwind {
 ; RV64ZICBOP-NEXT:    addi a0, a0, 16
 ; RV64ZICBOP-NEXT:    sub sp, sp, a0
 ; RV64ZICBOP-NEXT:    addi a0, sp, 16
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    lui a0, 1
 ; RV64ZICBOP-NEXT:    addi a0, a0, 16
@@ -778,6 +820,7 @@ define void @test_prefetch_frameindex_2() nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
 ; RV32ZICBOP-NEXT:    addi a0, sp, 16
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -786,6 +829,7 @@ define void @test_prefetch_frameindex_2() nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
 ; RV64ZICBOP-NEXT:    addi a0, sp, 16
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -822,6 +866,7 @@ define void @test_prefetch_frameindex_3() nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
 ; RV32ZICBOP-NEXT:    addi a0, sp, -16
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -830,6 +875,7 @@ define void @test_prefetch_frameindex_3() nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
 ; RV64ZICBOP-NEXT:    addi a0, sp, -16
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -865,6 +911,7 @@ define void @test_prefetch_frameindex_4() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_frameindex_4:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 32(sp)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -872,6 +919,7 @@ define void @test_prefetch_frameindex_4() nounwind {
 ; RV64ZICBOP-LABEL: test_prefetch_frameindex_4:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 32(sp)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -906,6 +954,7 @@ define void @test_prefetch_frameindex_5() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_frameindex_5:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -32(sp)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -913,6 +962,7 @@ define void @test_prefetch_frameindex_5() nounwind {
 ; RV64ZICBOP-LABEL: test_prefetch_frameindex_5:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -32(sp)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -947,6 +997,7 @@ define void @test_prefetch_frameindex_6() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_frameindex_6:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 2016(sp)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -954,6 +1005,7 @@ define void @test_prefetch_frameindex_6() nounwind {
 ; RV64ZICBOP-LABEL: test_prefetch_frameindex_6:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 2016(sp)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -988,6 +1040,7 @@ define void @test_prefetch_frameindex_7() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_frameindex_7:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -2048(sp)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -995,6 +1048,7 @@ define void @test_prefetch_frameindex_7() nounwind {
 ; RV64ZICBOP-LABEL: test_prefetch_frameindex_7:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -2048(sp)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -1030,6 +1084,7 @@ define void @test_prefetch_frameindex_8() nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
 ; RV32ZICBOP-NEXT:    addi a0, sp, 2020
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -1038,6 +1093,7 @@ define void @test_prefetch_frameindex_8() nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
 ; RV64ZICBOP-NEXT:    addi a0, sp, 2020
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -1075,6 +1131,7 @@ define void @test_prefetch_frameindex_9() nounwind {
 ; RV32ZICBOP-NEXT:    addi sp, sp, -512
 ; RV32ZICBOP-NEXT:    mv a0, sp
 ; RV32ZICBOP-NEXT:    addi a0, a0, -4
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r -2048(a0)
 ; RV32ZICBOP-NEXT:    addi sp, sp, 512
 ; RV32ZICBOP-NEXT:    ret
@@ -1084,6 +1141,7 @@ define void @test_prefetch_frameindex_9() nounwind {
 ; RV64ZICBOP-NEXT:    addi sp, sp, -512
 ; RV64ZICBOP-NEXT:    mv a0, sp
 ; RV64ZICBOP-NEXT:    addi a0, a0, -4
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r -2048(a0)
 ; RV64ZICBOP-NEXT:    addi sp, sp, 512
 ; RV64ZICBOP-NEXT:    ret
@@ -1116,12 +1174,14 @@ define void @test_prefetch_constant_address_0() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_constant_address_0:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a0, 1
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 32(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_constant_address_0:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a0, 1
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 32(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -1149,6 +1209,7 @@ define void @test_prefetch_constant_address_1() nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a0, 1
 ; RV32ZICBOP-NEXT:    addi a0, a0, 31
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
@@ -1156,6 +1217,7 @@ define void @test_prefetch_constant_address_1() nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a0, 1
 ; RV64ZICBOP-NEXT:    addi a0, a0, 31
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -1183,12 +1245,14 @@ define void @test_prefetch_constant_address_2() nounwind {
 ; RV32ZICBOP-LABEL: test_prefetch_constant_address_2:
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a0, 1048561
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 32(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
 ; RV64ZICBOP-LABEL: test_prefetch_constant_address_2:
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a0, 1048561
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 32(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -1216,6 +1280,7 @@ define void @test_prefetch_constant_address_3() nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a0, 1048561
 ; RV32ZICBOP-NEXT:    addi a0, a0, 31
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
@@ -1223,6 +1288,7 @@ define void @test_prefetch_constant_address_3() nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a0, 1048561
 ; RV64ZICBOP-NEXT:    addi a0, a0, 31
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -1253,6 +1319,7 @@ define void @test_prefetch_global_0() nounwind {
 ; RV32ZICBOP:       # %bb.0:
 ; RV32ZICBOP-NEXT:    lui a0, %hi(g)
 ; RV32ZICBOP-NEXT:    addi a0, a0, %lo(g)
+; RV32ZICBOP-NEXT:    ntl.all
 ; RV32ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV32ZICBOP-NEXT:    ret
 ;
@@ -1260,6 +1327,7 @@ define void @test_prefetch_global_0() nounwind {
 ; RV64ZICBOP:       # %bb.0:
 ; RV64ZICBOP-NEXT:    lui a0, %hi(g)
 ; RV64ZICBOP-NEXT:    addi a0, a0, %lo(g)
+; RV64ZICBOP-NEXT:    ntl.all
 ; RV64ZICBOP-NEXT:    prefetch.r 0(a0)
 ; RV64ZICBOP-NEXT:    ret
 ;
@@ -1288,6 ...
[truncated]

@lenary
Member

lenary commented Feb 5, 2026

We should probably also update the check lines in llvm/test/CodeGen/RISCV/riscv-zihintpause.ll.

@kito-cheng
Member Author

This change reminds me that prefetch is also a HINT (it reuses the ORI encoding). However, it differs from the other HINT instructions, which come from the Zihint* extensions, so I think it would be better to address it later rather than mix more things into this patch.

Ref:
https://docs.riscv.org/reference/isa/unpriv/cmo.html
https://docs.riscv.org/reference/isa/unpriv/cmo.html#Zicbop

A cache-block prefetch instruction is a HINT to the hardware that software expects to perform a particular type of memory access in the near future. Additional details are described in [Cache-Block Prefetch Instructions](https://docs.riscv.org/reference/isa/unpriv/cmo.html#Zicbop).
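
For reference, a small sketch of why prefetch counts as a HINT: the Zicbop prefetches occupy the ORI encoding space with rd = x0, with the prefetch kind stored in imm[4:0] and the 32-byte-aligned offset in imm[11:5]. The helper names below are hypothetical and only illustrate the bit layout; they are not taken from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Prefetch kind as stored in imm[4:0] of the ORI encoding (Zicbop).
enum class PrefetchKind : uint32_t { I = 0b00000, R = 0b00001, W = 0b00011 };

// I-type ORI rd, rs1, imm: imm[11:0] | rs1 | funct3=110 | rd | opcode=0010011.
// (Illustrative helper, not an LLVM function.)
constexpr uint32_t encodeOri(uint32_t rd, uint32_t rs1, int32_t imm) {
  return ((static_cast<uint32_t>(imm) & 0xFFFu) << 20) | (rs1 << 15) |
         (0b110u << 12) | (rd << 7) | 0x13u;
}

// prefetch.{i,r,w} offset(rs1): same layout as ORI with rd = x0. The offset
// must be a multiple of 32, so its low five bits are free to hold the kind.
constexpr uint32_t encodePrefetch(PrefetchKind kind, uint32_t rs1,
                                  int32_t offset) {
  return (((static_cast<uint32_t>(offset) & 0xFE0u) |
           static_cast<uint32_t>(kind))
          << 20) |
         (rs1 << 15) | (0b110u << 12) | 0x13u;
}

// prefetch.r 32(a0) has the same bits as ori x0, a0, 33 (a0 is x10).
inline void checkPrefetchIsOri() {
  assert(encodePrefetch(PrefetchKind::R, 10, 32) == encodeOri(0, 10, 33));
  assert(encodePrefetch(PrefetchKind::W, 10, -2048) ==
         encodeOri(0, 10, -2048 + 3));
}
```

Because the destination is x0, a core without Zicbop would execute the instruction as an ORI whose result is thrown away.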

Changes:

  • Add more tests to riscv-zihintpause.ll
  • Merge check lines in prefetch.ll because ntl.* can now be emitted unconditionally.
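
As a reading aid for the updated prefetch.ll check lines, the mapping visible in the diff can be summarised as below. This is only a sketch inferred from the test names (`test_prefetch_*_locality_0/1/2`) and the `ntl.*` lines added before each prefetch; the function name is hypothetical, and the behaviour for locality 3 is an assumption because that part of the diff is truncated.

```cpp
#include <cassert>
#include <string>

// Returns the NTL hint mnemonic that the updated check lines expect in front
// of a prefetch with the given locality, or an empty string for "no hint".
// (Illustrative helper, not an LLVM function.)
inline std::string ntlHintForPrefetchLocality(unsigned locality) {
  switch (locality) {
  case 0:
    return "ntl.all"; // seen in test_prefetch_*_locality_0
  case 1:
    return "ntl.pall"; // seen in test_prefetch_*_locality_1
  case 2:
    return "ntl.p1"; // seen in test_prefetch_*_locality_2
  default:
    return ""; // assumption: highest locality is not marked nontemporal
  }
}

// Sanity check of the three mappings that the diff shows explicitly.
inline void checkLocalityMapping() {
  assert(ntlHintForPrefetchLocality(0) == "ntl.all");
  assert(ntlHintForPrefetchLocality(1) == "ntl.pall");
  assert(ntlHintForPrefetchLocality(2) == "ntl.p1");
}
```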

@kito-cheng
Member Author

Changes:

  • Simplify xmips-cbop.ll as well

@kito-cheng
Member Author

Changes:

  • Regen prefetch.ll

Member

@lenary lenary left a comment

LGTM

@kito-cheng
Member Author

Changes:

  • Add requireNTLHint
  • Fix getInstSizeInBytes for NTL Hint
  • Exclude MIPS prefetch instruction
    • Restore MIPS prefetch test
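
A toy model of how these three changes might fit together, written against plain structs rather than the real `MachineInstr` and `RISCVInstrInfo` classes. Everything here is an assumption for illustration: the `ToyInst` type, its fields, and the 2-byte versus 4-byte hint sizes (a compressed `c.ntl.*` versus a full-width `ntl.*`) are not copied from the patch.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a machine instruction.
struct ToyInst {
  bool hasNontemporalMemOperand; // models a nontemporal MachineMemOperand
  bool isMIPSPrefetch;           // models the vendor prefetch that is excluded
  unsigned baseSize;             // size of the instruction itself, in bytes
};

// One predicate shared by the emitter and the size computation, so both
// agree on when a hint precedes the instruction.
inline bool requiresNTLHint(const ToyInst &I) {
  return I.hasNontemporalMemOperand && !I.isMIPSPrefetch;
}

// If a hint is emitted, the reported size has to include it; otherwise
// passes that rely on instruction sizes (e.g. branch relaxation) would
// undercount. Assumes a 2-byte compressed hint when available, else 4 bytes.
inline unsigned sizeWithHint(const ToyInst &I, bool hasCompressedHints) {
  if (!requiresNTLHint(I))
    return I.baseSize;
  return I.baseSize + (hasCompressedHints ? 2u : 4u);
}

// Sanity check of the model.
inline void checkToyModel() {
  assert(sizeWithHint(ToyInst{true, false, 4}, false) == 8u);
  assert(sizeWithHint(ToyInst{true, false, 4}, true) == 6u);
  assert(sizeWithHint(ToyInst{true, true, 4}, false) == 4u);
  assert(sizeWithHint(ToyInst{false, false, 4}, false) == 4u);
}
```

The point of the shared predicate is that emission and size accounting cannot drift apart once the subtarget check is gone.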

void RISCVAsmPrinter::emitNTLHint(const MachineInstr *MI) {
if (!STI->hasStdExtZihintntl())
return;
const auto *TII = static_cast<const RISCVInstrInfo *>(STI->getInstrInfo());
Collaborator

Is this cast needed?

return (!RCFractional && LMul == RCLMul) || (RCFractional && LMul == 1);
}

bool RISCVInstrInfo::requireNTLHint(const MachineInstr &MI) const {
Collaborator

Suggested change
bool RISCVInstrInfo::requireNTLHint(const MachineInstr &MI) const {
bool RISCVInstrInfo::requiresNTLHint(const MachineInstr &MI) const {

/// or any kind of vector registers when \p LMul is zero.
bool isVRegCopy(const MachineInstr *MI, unsigned LMul = 0) const;

/// Return true if the instruction need come with a NTL hint.
Collaborator

Suggested change
/// Return true if the instruction need come with a NTL hint.
/// Return true if the instruction requires an NTL hint to be emitted.

@kito-cheng
Member Author

Changes:

  • Address Craig's comments
    • Update comment
    • Rename requireNTLHint to requiresNTLHint
    • Drop unnecessary cast

Collaborator

@topperc topperc left a comment

LGTM

@kito-cheng kito-cheng merged commit 4922ab9 into llvm:main Mar 2, 2026
10 checks passed
@kito-cheng kito-cheng deleted the kitoc/hint-op-codegen branch March 2, 2026 09:24
sahas3 pushed a commit to sahas3/llvm-project that referenced this pull request Mar 4, 2026
sujianIBM pushed a commit to sujianIBM/llvm-project that referenced this pull request Mar 5, 2026