 //
 /// \file
 /// This pass simplifies certain intrinsic calls when the arguments are uniform.
-/// also, this pass relies on the fact that uniformity analysis remains safe
-/// across valid transformations in LLVM. A transformation does not alter
-/// program behavior across threads: each instruction in the original IR
-/// continues to have a well-defined counterpart in the transformed IR, both
-/// statically and dynamically.
-///
-/// Valid transformations respect three invariants:
-/// 1. Use-def relationships are preserved. If one instruction produces a value
-/// and another consumes it, that dependency must remain intact.
-/// 2. Uniformity classification is preserved. Certain values are always uniform
-/// (constants, kernel arguments, convergent operations), while others are
-/// always divergent (atomics, most function calls). Transformations may turn
-/// divergent computations into uniform ones, but never the reverse.
-/// 3. Uniformity must hold not only at the point of value computation but also
-/// at all later uses of that value, consistently across the same set of
-/// threads.
-///
-/// Together, these invariants ensure that transformations in this pass are
-/// correctness-preserving and remain safe for uniformity analysis.
+/// It's true that this pass has transforms that can lead to a situation where
+/// some instruction whose operand was previously recognized as statically
+/// uniform is later on no longer recognized as statically uniform. However, the
+/// semantics of how programs execute don't (and must not, for this precise
+/// reason[0]) care about static uniformity, they only ever care about dynamic
+/// uniformity. And every instruction that's downstream and cares about dynamic
+/// uniformity must be convergent (and isel will introduce v_readfirstlane for
+/// them if their operands can't be proven statically uniform).
 //===----------------------------------------------------------------------===//

 #include "AMDGPU.h"