Skip to content

Conversation

@dfukalov
Copy link
Collaborator

For some targets, the optimization X == Const ? X : Y → X == Const ? Const : Y
can cause extra register usage or redundant immediate encoding for the constant
in cndmask generated from the ternary operation.

This patch detects such cases and reuses the register from the compare instruction
that already holds the constant, instead of materializing it again for cndmask.

The optimization avoids immediates that can be encoded into cndmask instruction
(including +-0.0), as well as !isNormal() constants.

@dfukalov dfukalov requested a review from Copilot July 11, 2025 12:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds an AMDGPU DAG combiner optimization that detects when a SELECT reuses the same constant as a preceding SETCC and folds it to avoid redundant constant materialization. It introduces a helper to extract the constant and argument, implements foldCmpSelectWithSharedConstant, and hooks it into performSelectCombine. Extensive new FileCheck tests for integer and floating-point types verify both foldable and non-foldable patterns.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp Implement foldCmpSelectWithSharedConstant and integrate it
llvm/test/CodeGen/AMDGPU/select-cmp-shared-constant-int.ll Add Integer SELECT/SETCC folding tests
llvm/test/CodeGen/AMDGPU/select-cmp-shared-constant-fp.ll Add Floating-point SELECT/SETCC folding tests
Comments suppressed due to low confidence (1)

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:4947

  • This optimization is tested for scalar integer and floating-point types. Consider adding tests for vector types (e.g., v4i32, v2f32) to ensure the combiner handles vector SELECT patterns correctly.
  if (SDValue Folded = foldCmpSelectWithSharedConstant(N, DCI, Subtarget))

@dfukalov dfukalov force-pushed the cmp-select-const-reuse branch from 1ef8b84 to 0617b6e Compare July 13, 2025 19:25
@dfukalov dfukalov requested a review from Copilot July 13, 2025 19:25

This comment was marked as outdated.

@dfukalov dfukalov changed the title [AMDGPU] Try to reuse in v_cndmask register with constant from compare. [AMDGPU] Try to reuse in v_cndmask the register with constant from compare. Jul 14, 2025
@dfukalov dfukalov requested a review from Copilot July 14, 2025 22:10
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds a DAG combine to merge SETCC+SELECT patterns that reuse the same constant, so the register loaded for the compare can be reused in the conditional mask. Also includes a comprehensive set of new tests covering integer and floating-point types (I8/I16/I32/I64, F16, F32, F64, BF16) to validate both folded and non-folded cases.

  • Introduce foldCmpSelectWithSharedConstant to detect and fold SELECTs that share a constant with their SETCC.
  • Hook the new fold into performSelectCombine in AMDGPUISelLowering.
  • Add new LLVM IR tests in select-cmp-shared-constant-int.ll and select-cmp-shared-constant-fp.ll.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp Added foldCmpSelectWithSharedConstant and integrate it into select combining logic.
llvm/test/CodeGen/AMDGPU/select-cmp-shared-constant-int.ll New tests for integer constant reuse in cndmask folding.
llvm/test/CodeGen/AMDGPU/select-cmp-shared-constant-fp.ll New tests for floating-point constant reuse in cndmask folding.
Comments suppressed due to low confidence (1)

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:4845

  • [nitpick] The comment refers to patterns like "setccinv", but the function currently only handles ISD::SETCC with equality and inequality codes. Consider updating the comment to accurately describe the supported patterns or extending the code to explicitly handle inverted SETCC nodes if needed.
// Detect when CMP and SELECT use the same constant and fold them to avoid

Copy link

Copilot AI Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Casting the AMDGPUSubtarget pointer to GCNSubtarget may obscure the actual type relationship. Consider using the AMDGPUSubtarget interface directly or adding a runtime check/assertion to ensure the cast is safe, improving clarity for future maintainers.

Suggested change
const GCNSubtarget *GCNST = static_cast<const GCNSubtarget *>(ST);
const GCNSubtarget *GCNST = dyn_cast<GCNSubtarget>(ST);
assert(GCNST && "ST is not of type GCNSubtarget");

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The code duplicates logic for floating-point and integer immediate eligibility checks. Refactoring the common pattern into a helper or template function could reduce duplication and improve readability.

Copilot uses AI. Check for mistakes.
For some targets, the optimization X == Const ? X : Y → X == Const ? Const : Y
can cause extra register usage or redundant immediate encoding for the constant
in cndmask generated from the ternary operation.

This patch detects such cases and reuses the register from the compare instruction
that already holds the constant, instead of materializing it again for cndmask.

The optimization avoids immediates that can be encoded into cndmask instruction
(including +-0.0), as well as !isNormal() constants.
@dfukalov dfukalov force-pushed the cmp-select-const-reuse branch from 0617b6e to 57436f3 Compare July 14, 2025 22:19
@dfukalov dfukalov closed this Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant