Commit 6d35135

[AMDGPU] Enable expensive unroll trip count.
This patch enables unrolling of the innermost loop of the PyTorch reduce function in Normalization.cuh.
1 parent 0a66411 commit 6d35135

1 file changed: +2 -0 lines changed

llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp

Lines changed: 2 additions & 0 deletions
@@ -271,6 +271,8 @@ void AMDGPUTTIImpl::getUnrollingPreferences(
     if (L->isInnermost() && BB->size() < UnrollMaxBlockToAnalyze)
       UP.MaxIterationsCountToAnalyze = 32;
   }
+
+  UP.AllowExpensiveTripCount = true;
 }
 
 void AMDGPUTTIImpl::getPeelingPreferences(Loop *L, ScalarEvolution &SE,
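
For context, AllowExpensiveTripCount lets the unroller emit expensive instructions (such as divisions) when computing a loop's trip count for runtime unrolling. The sketch below illustrates what that means on a hypothetical strided reduction. It is not the code from Normalization.cuh, and the function names, the unroll factor of 4, and the main-loop-plus-remainder layout are assumptions chosen for illustration, not what the compiler necessarily generates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduction with a stride known only at run time.
// Its trip count is ceil((n - start) / stride), which needs an integer division.
float strided_sum(const std::vector<float> &data, std::size_t start,
                  std::size_t stride) {
  assert(stride != 0);
  float sum = 0.0f;
  for (std::size_t i = start; i < data.size(); i += stride)
    sum += data[i];
  return sum;
}

// Hand-written approximation of runtime unrolling by 4 (assumed factor).
// The trip count is computed up front with a division, then the loop is split
// into an unrolled main loop and a remainder loop for the leftover iterations.
float strided_sum_unrolled(const std::vector<float> &data, std::size_t start,
                           std::size_t stride) {
  assert(stride != 0);
  const std::size_t n = data.size();
  // The "expensive" part: a division to obtain the trip count.
  const std::size_t trips = start < n ? (n - start + stride - 1) / stride : 0;
  const std::size_t main_trips = trips - trips % 4;

  float sum = 0.0f;
  std::size_t i = start;
  // Main loop: four iterations of the original body per pass.
  for (std::size_t t = 0; t < main_trips; t += 4, i += 4 * stride) {
    sum += data[i];
    sum += data[i + stride];
    sum += data[i + 2 * stride];
    sum += data[i + 3 * stride];
  }
  // Remainder loop: at most three leftover iterations.
  for (std::size_t t = main_trips; t < trips; ++t, i += stride)
    sum += data[i];
  return sum;
}
```

Both functions add the elements in the same order, so they return the same value. The unrolled form trades a one-time division for fewer loop-control operations per element, which is the trade-off this flag allows the unroller to make.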
