Commit 0c9ff2f

update description
1 parent 096a801 commit 0c9ff2f

1 file changed: +8 -19 lines changed
llvm/lib/Target/AMDGPU/AMDGPUUniformIntrinsicCombine.cpp

Lines changed: 8 additions & 19 deletions
@@ -8,25 +8,14 @@
 //
 /// \file
 /// This pass simplifies certain intrinsic calls when the arguments are uniform.
-/// also, this pass relies on the fact that uniformity analysis remains safe
-/// across valid transformations in LLVM. A transformation does not alter
-/// program behavior across threads: each instruction in the original IR
-/// continues to have a well-defined counterpart in the transformed IR, both
-/// statically and dynamically.
-///
-/// Valid transformations respect three invariants:
-/// 1. Use-def relationships are preserved. If one instruction produces a value
-/// and another consumes it, that dependency must remain intact.
-/// 2. Uniformity classification is preserved. Certain values are always uniform
-/// (constants, kernel arguments, convergent operations), while others are
-/// always divergent (atomics, most function calls). Transformations may turn
-/// divergent computations into uniform ones, but never the reverse.
-/// 3. Uniformity must hold not only at the point of value computation but also
-/// at all later uses of that value, consistently across the same set of
-/// threads.
-///
-/// Together, these invariants ensure that transformations in this pass are
-/// correctness-preserving and remain safe for uniformity analysis.
+/// It's true that this pass has transforms that can lead to a situation where
+/// some instruction whose operand was previously recognized as statically
+/// uniform is later on no longer recognized as statically uniform. However, the
+/// semantics of how programs execute don't (and must not, for this precise
+/// reason[0]) care about static uniformity, they only ever care about dynamic
+/// uniformity. And every instruction that's downstream and cares about dynamic
+/// uniformity must be convergent (and isel will introduce v_readfirstlane for
+/// them if their operands can't be proven statically uniform).
 //===----------------------------------------------------------------------===//

 #include "AMDGPU.h"
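
The new comment turns on the difference between static and dynamic uniformity. As a rough illustration only, the sketch below models a 4-lane wave in plain C++ and shows why dropping a readfirstlane-style broadcast is harmless when its operand is dynamically uniform, that is, when every active lane already holds the same value at run time. Everything here (`kLanes`, `LaneValues`, `ExecMask`, `readFirstLane`, `isDynamicallyUniform`, `sameOnActiveLanes`) is a hypothetical toy model invented for this example; none of it is LLVM API or the pass's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy model of a 4-lane wave; not LLVM API.
constexpr std::size_t kLanes = 4;
using LaneValues = std::array<std::uint32_t, kLanes>;
using ExecMask = std::array<bool, kLanes>;

// Returns the index of the first active lane in the execution mask.
std::size_t firstActiveLane(const ExecMask &exec) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (exec[i])
      return i;
  assert(false && "at least one lane must be active");
  return 0;
}

// Models a readfirstlane-style broadcast: every active lane receives the
// value held by the first active lane. Inactive lanes are left unchanged
// (a simplification of this toy model).
LaneValues readFirstLane(const LaneValues &v, const ExecMask &exec) {
  const std::uint32_t first = v[firstActiveLane(exec)];
  LaneValues out = v;
  for (std::size_t i = 0; i < kLanes; ++i)
    if (exec[i])
      out[i] = first;
  return out;
}

// Dynamic uniformity: all active lanes hold the same value at run time,
// regardless of whether a static analysis could prove it.
bool isDynamicallyUniform(const LaneValues &v, const ExecMask &exec) {
  const std::uint32_t first = v[firstActiveLane(exec)];
  for (std::size_t i = 0; i < kLanes; ++i)
    if (exec[i] && v[i] != first)
      return false;
  return true;
}

// Compares two per-lane values on the active lanes only.
bool sameOnActiveLanes(const LaneValues &a, const LaneValues &b,
                       const ExecMask &exec) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (exec[i] && a[i] != b[i])
      return false;
  return true;
}
```

In this model, `readFirstLane(x, exec)` agrees with `x` on every active lane exactly when `isDynamicallyUniform(x, exec)` holds, so replacing the broadcast with its operand changes nothing an active lane can observe. Whether a later static analysis still proves the operand uniform is a separate question, which is the distinction the comment draws.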
