Description
Summary
After commit 54ec8bc, one of the AMDGPU benchmarks regressed because the generated assembly became longer.
Reduced Input IR
See input.ll in the attached archive
Steps to Reproduce
- Run opt with these parameters:
  opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false -S -o output.ll input.ll
- Then run llc with these parameters:
  llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false -o output.s output.ll
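To put a number on "the assembler became longer", the two steps above can be wrapped in a small script and run against a build from before the commit and one from after it. This is a minimal sketch, not part of the original report: the helper names `build_asm` and `count_insns` and the bin-directory argument are assumptions made for illustration. The instruction count is a rough heuristic that skips blank lines, `;` comments, `.` directives and labels.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and directory layout are hypothetical.

# Flags copied verbatim from the reproduction steps in this report.
FLAGS="-mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -O3 -vectorize-loops -vectorize-slp -amdgpu-early-inline-all=true -amdgpu-function-calls=false"

# build_asm BIN_DIR INPUT_LL OUT_PREFIX
# Runs opt, then llc, using the tools found in BIN_DIR.
# Writes OUT_PREFIX.ll and OUT_PREFIX.s.
build_asm() {
  local bin="$1" input="$2" out="$3"
  # FLAGS is intentionally unquoted so it splits into separate arguments.
  "$bin/opt" $FLAGS -S -o "$out.ll" "$input" &&
    "$bin/llc" $FLAGS -o "$out.s" "$out.ll"
}

# count_insns ASM_FILE
# Prints the number of lines that look like instructions, i.e. lines that
# are not blank, not comments (;), not directives (.), and not labels.
count_insns() {
  grep -Ev '^[[:space:]]*($|;|\.|[A-Za-z_$][A-Za-z0-9_.$]*:)' "$1" | wc -l | tr -d ' '
}
```

A possible way to use it, assuming two hypothetical build directories `before/bin` and `after/bin`: run `build_asm before/bin input.ll before` and `build_asm after/bin input.ll after`, then compare `count_insns before.s` with `count_insns after.s`.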
Additional Information
The input IR includes a complicated set of binary operations. The commit tries to fold the instructions but instead produces extra code, leading to the regression.
Please find the attachments for detailed comparison and further analysis.