
Commit 9f271dd

[AMDGPU][SplitModule] Fix a couple of issues
A static analysis tool found that ModuleCost could be zero, which would cause a division by zero when the cost ratio is printed. This may be unreachable in practice, but the fix is straightforward and unlikely to be a performance concern.

The same tool warned that a division was always performed as integer division, so the result was always either 0.0 or 1.0. This does not seem intentional, so the code now uses floating-point division to return a true ratio. This has a knock-on effect on how a test splits modules.
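The two issues described above can be reproduced outside LLVM. The following is a minimal standalone sketch, not the code from the patch: `CostType` is assumed here to be a signed 64-bit integer, and `truncatedRatio`, `trueRatio`, and `percentOf` are hypothetical names introduced only for illustration (`std::snprintf` stands in for LLVM's `format`).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption: a stand-in for the CostType alias used in the patch.
using CostType = std::int64_t;

// Old behaviour: both operands are integers, so the quotient is truncated
// before it is converted to double. For 0 <= Num <= Dem this yields only
// 0.0 or 1.0.
double truncatedRatio(CostType Num, CostType Dem) {
  assert(Dem != 0 && "caller must guarantee a nonzero denominator");
  return static_cast<double>(Num / Dem);
}

// Fixed behaviour: converting one operand first forces floating-point
// division, so the fractional part survives.
double trueRatio(CostType Num, CostType Dem) {
  assert(Dem != 0 && "caller must guarantee a nonzero denominator");
  return static_cast<double>(Num) / Dem;
}

// Percentage formatting with the same guard as the patch: a zero
// denominator is replaced by 1 so the division is always defined.
std::string percentOf(CostType Num, CostType Dem) {
  CostType DemOr1 = Dem ? Dem : 1;
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "%0.2f",
                (static_cast<double>(Num) / DemOr1) * 100);
  return Buf;
}
```

With these definitions, `truncatedRatio(3, 4)` evaluates to `0.0` while `trueRatio(3, 4)` evaluates to `0.75`, which is why a threshold comparison against the ratio can change outcome after the fix. `percentOf(0, 0)` returns `"0.00"` instead of dividing by zero.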
1 parent 99fd1c5 commit 9f271dd


2 files changed: 10 additions, 8 deletions


llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp

Lines changed: 3 additions & 2 deletions
@@ -149,7 +149,8 @@ static constexpr unsigned InvalidPID = -1;
 /// \param Dem denominator
 /// \returns a printable object to print (Num/Dem) using "%0.2f".
 static auto formatRatioOf(CostType Num, CostType Dem) {
-  return format("%0.2f", (static_cast<double>(Num) / Dem) * 100);
+  CostType DemOr1 = Dem ? Dem : 1;
+  return format("%0.2f", (static_cast<double>(Num) / DemOr1) * 100);
 }
 
 /// Checks whether a given function is non-copyable.
@@ -1101,7 +1102,7 @@ void RecursiveSearchSplitting::pickPartition(unsigned Depth, unsigned Idx,
 // Check if the amount of code in common makes it worth it.
 assert(SimilarDepsCost && Entry.CostExcludingGraphEntryPoints);
 const double Ratio =
-    SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
+    (double)SimilarDepsCost / Entry.CostExcludingGraphEntryPoints;
 assert(Ratio >= 0.0 && Ratio <= 1.0);
 if (LargeFnOverlapForMerge > Ratio) {
   // For debug, just print "L", so we'll see "L3=P3" for instance, which

llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll

Lines changed: 7 additions & 6 deletions
@@ -15,19 +15,20 @@
 ; Also check w/o large kernels processing to verify they are indeed handled
 ; differently.
 
-; P0 is empty
-; CHECK0: declare
+; CHECK0: define internal void @HelperC()
+; CHECK0: define amdgpu_kernel void @C
 
-; CHECK1: define internal void @HelperC()
-; CHECK1: define amdgpu_kernel void @C
+; CHECK1: define internal void @large2()
+; CHECK1: define internal void @large1()
+; CHECK1: define internal void @large0()
+; CHECK1: define internal void @HelperB()
+; CHECK1: define amdgpu_kernel void @B
 
 ; CHECK2: define internal void @large2()
 ; CHECK2: define internal void @large1()
 ; CHECK2: define internal void @large0()
 ; CHECK2: define internal void @HelperA()
-; CHECK2: define internal void @HelperB()
 ; CHECK2: define amdgpu_kernel void @A
-; CHECK2: define amdgpu_kernel void @B
 
 ; NOLARGEKERNELS-CHECK0: define internal void @HelperC()
 ; NOLARGEKERNELS-CHECK0: define amdgpu_kernel void @C
