Description
I am struggling to understand how the occupancy with respect to LDS size is computed for a given function, and in particular how it fits into the overall occupancy computation for that function. It seems to me that the achievable occupancy number computed from the number of used SGPRs/VGPRs means something different from the one computed from the LDS size. The current implementation of GCNSubtarget::computeOccupancy is pasted below for future reference.
unsigned GCNSubtarget::computeOccupancy(const Function &F, unsigned LDSSize,
                                        unsigned NumSGPRs,
                                        unsigned NumVGPRs) const {
  unsigned Occupancy =
      std::min(getMaxWavesPerEU(), getOccupancyWithLocalMemSize(LDSSize, F));
  if (NumSGPRs)
    Occupancy = std::min(Occupancy, getOccupancyWithNumSGPRs(NumSGPRs));
  if (NumVGPRs)
    Occupancy = std::min(Occupancy, getOccupancyWithNumVGPRs(NumVGPRs));
  return Occupancy;
}

On gfx908, for example, getMaxWavesPerEU() == 10. If the number of SGPRs and VGPRs used in the function is low enough to support maximum occupancy, getOccupancyWithNumSGPRs(NumSGPRs) == getOccupancyWithNumVGPRs(NumVGPRs) == 10 as well, which makes sense to me. However, even if LDSSize == 0 (or any number low enough that the LDS should not restrict occupancy), getOccupancyWithLocalMemSize(LDSSize, F) == 8, and as a consequence GCNSubtarget::computeOccupancy returns 8. This seems to be because getOccupancyWithLocalMemSize performs a semantically different calculation than the two other occupancy-computing methods (and, as the FIXMEs in the method suggest, perhaps not the one we should do). I would have assumed that no or low LDS usage would return the same number as getMaxWavesPerEU().
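To make the discrepancy concrete, here is a minimal standalone sketch of how the four limits combine. It is not the LLVM implementation: OccupancyLimits, combineOccupancy and checkReportedCase are hypothetical names, the limits are plain numbers instead of GCNSubtarget queries, and the register counts are arbitrary nonzero placeholders. The values 10, 8, 10, 10 are the ones I observe on gfx908 as described above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the per-resource occupancy limits. In LLVM these
// come from GCNSubtarget queries, which are not reproduced here.
struct OccupancyLimits {
  unsigned MaxWavesPerEU; // stands in for getMaxWavesPerEU()
  unsigned FromLDS;       // stands in for getOccupancyWithLocalMemSize(LDSSize, F)
  unsigned FromSGPRs;     // stands in for getOccupancyWithNumSGPRs(NumSGPRs)
  unsigned FromVGPRs;     // stands in for getOccupancyWithNumVGPRs(NumVGPRs)
};

// Same shape as computeOccupancy: the LDS-derived limit is always applied,
// while each register-derived limit is applied only for a nonzero count.
unsigned combineOccupancy(const OccupancyLimits &L, unsigned NumSGPRs,
                          unsigned NumVGPRs) {
  unsigned Occupancy = std::min(L.MaxWavesPerEU, L.FromLDS);
  if (NumSGPRs)
    Occupancy = std::min(Occupancy, L.FromSGPRs);
  if (NumVGPRs)
    Occupancy = std::min(Occupancy, L.FromVGPRs);
  return Occupancy;
}

// Values reported in this issue for gfx908 with low register and LDS usage.
void checkReportedCase() {
  OccupancyLimits Gfx908{10, 8, 10, 10};
  // The LDS-derived limit alone pulls the result down to 8.
  assert(combineOccupancy(Gfx908, 16, 16) == 8);
  // What I would have expected if the LDS limit matched getMaxWavesPerEU().
  OccupancyLimits Expected{10, 10, 10, 10};
  assert(combineOccupancy(Expected, 16, 16) == 10);
}
```

Because the result is a plain minimum, the LDS-derived limit of 8 caps the final occupancy regardless of what the register-derived limits allow.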
I just wanted to check if I am missing something here or if there is indeed a problem in this context.