
[Flang] read statement is slower than Gfortran in SPEC CPU 2017 #134026

@s-watanabe314

When running 649.fotonik3d_s from SPEC CPU 2017 on a Grace (AArch64) machine, the LLVM (Flang) build became slower relative to the GCC (Gfortran) build as the number of threads increased. The options specified were -Ofast -mcpu=neoverse-v2.

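According to our investigation, one of the reasons seems to be that the read statement, which is called in the non-parallelized initialization phase, is slower in Flang than in Gfortran. Because this phase does not speed up with more threads, its share of the total run time grows as the thread count increases.
The read statement is called approximately 54 million times within a nested loop, so its impact is significant.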

The difference in read performance was verified with the test program test.f90 shown below. The measurements used LLVM 20.1.0 built with the build option -DCMAKE_BUILD_TYPE=Release.
To minimize the impact of file systems such as NFS, the input file test.dat was placed in a local directory on the machine.
test.dat can be generated with the following command:

python3 -c 'for i in range(54000000): print(i,i,i)' > test.dat
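For a quicker trial run, the same data can be produced by a small script with an adjustable line count. This is only a sketch: the function name `write_test_data` and its parameters are hypothetical and not part of the original report, and the output format simply mirrors what `print(i,i,i)` emits in the one-liner above.

```python
def write_test_data(path, num_lines=54_000_000):
    """Write num_lines lines of the form 'i i i' (hypothetical helper).

    The format matches the one-liner in the report: three copies of the
    line index, separated by single spaces.
    """
    with open(path, "w") as f:
        for i in range(num_lines):
            f.write(f"{i} {i} {i}\n")


# Example call (commented out because the full-size file is large):
# write_test_data("test.dat")             # same size as in the report
# write_test_data("test_small.dat", 1000) # reduced size for a smoke test
```

If a reduced line count is used, the `num` parameter in test.f90 has to be lowered to match, otherwise the read loop runs past the end of the file.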

  • test.f90
program main
  implicit none
  integer, parameter :: num = 54000000

  integer, dimension(num) :: a,b,c
  integer :: t1,t2, count_rate, count_max
  integer :: ii, ios

  open(unit=9, file='test.dat', status='old', iostat=ios)
  if (ios /= 0) stop 'cannot open test.dat'

  ! Time only the read loop; count_rate is the number of clock ticks per second.
  call system_clock(t1, count_rate, count_max)

  do ii=1,num
    read(9,*) a(ii),b(ii),c(ii)
  end do

  call system_clock(t2)
  print*, "init time : ",real(t2 - t1)/count_rate,"sec"

  close(9)
end program main
  • Compilation commands
$ flang-new -O3 -ffast-math -mcpu=neoverse-v2 test.f90
$ gfortran -O3 -ffast-math -mcpu=neoverse-v2 test.f90
  • Measurement results
Version                        Time [s]
GCC 14.2.0                     22.496
LLVM 20.1.0 (Release build)    42.341

In this test, the read statement appears to take nearly twice as long with Flang as with Gfortran (42.341 s vs. 22.496 s).
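To put the totals in per-call terms, the measured times can be divided by the roughly 54 million read statements executed. The sketch below uses only the two timings and the loop count reported above; the helper names `per_read_ns` and `slowdown` are made up for illustration, and the result is an average that includes loop overhead.

```python
NUM_READS = 54_000_000  # loop count from test.f90

# Total loop times reported in the measurement results, in seconds.
timings_s = {
    "GCC 14.2.0": 22.496,
    "LLVM 20.1.0": 42.341,
}


def per_read_ns(total_s, reads=NUM_READS):
    """Average time per read statement, in nanoseconds."""
    return total_s / reads * 1e9


def slowdown(slow_s, fast_s):
    """How many times longer the slower run takes."""
    return slow_s / fast_s


for name, total in timings_s.items():
    print(f"{name}: {per_read_ns(total):.1f} ns per read")

ratio = slowdown(timings_s["LLVM 20.1.0"], timings_s["GCC 14.2.0"])
print(f"LLVM / GCC: {ratio:.2f}x")
```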
