CI misses runtime crashes of ifx 2025.2.2+ #162

@bonachea

In the process of developing PR #157, I discovered that CI is not correctly detecting runtime crashes of Fortran multi-image binaries built using ifx 2025.2.2 or later.

Example CI shows a case where ifx 2025.2.2 appears to have "passed", but if you open the logs you'll see it actually crashed at runtime. Excerpt:

+ build/ifx_042072F58D4AEA91/test/driver 

Append '-- --help' or '-- -h' to your `fpm test` command to display usage information.

Running all tests.
(Add '-- --contains <string>' to run only tests with subjects or descriptions containing the specified string.)
forrtl: severe (174): SIGSEGV, segmentation fault occurred
In coarray image 3
Image              PC                Routine      forrtl: severe (174): SIGSEGV, segmentation fault occurred
In coarray image 2
Image              PC                Routine            Line        Source             
libc.so.6          00007F416340C330  Unknown               Unknown  Unknown
driver             00000000004154FC  Unknown               Unknown  Unknown
driver             000000000046F01B  Unknown               Unknown  Unknown
driver             000000000046D79A  Unknown               Unknown  Unknown
driver             000000000044022E  Unknown               Unknown  Unknown
driver             00000000004066D1  Unknown               Unknown  Unknown
driver             000000000040539D  Unknown               Unknown  Unknown
libc.so.6          00007F41633F11CA  Unknown               Unknown  Unknown
libc.so.6          00007F41633F128B  __libc_start_main     Unknown  Unknown
driver             00000000004052B5  Unknown               Unknown  Unknown

I've replicated this problem in manual runs of the same Intel-provided intel/fortran-essentials:2025.2.2-0-devel-ubuntu24.04 docker container used by CI.

The problem appears to be that Fortran binaries built with recent ifx (at least in multi-image mode) do not propagate a crash of an image process back to the invoking parent process. In the case shown above, fpm launches a driver process, which then internally uses Intel MPI to fork the image processes. When one or more of those image processes hits a SIGSEGV, that failure is not reflected in the exit code of the driver process, which returns 0. fpm subsequently treats this as a "passed" test, as does CI.

This weakens CI coverage for ifx because runtime crashes may go unnoticed.
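
One possible stopgap, sketched here as an assumption rather than a fix adopted by this project, is to wrap the test command in a script that also scans the output for the runtime's crash banner. The names `output_shows_crash`, `effective_exit_code`, and `run_checked` are hypothetical, and the pattern is based only on the `forrtl: severe (174)` lines in the excerpt above, so other failure messages would need their own patterns.

```python
import re
import subprocess
import sys

# Crash banner as it appears in the log excerpt above. The search is not
# anchored to the start of a line because output from several images can
# interleave mid-line. Assumption: other runtime failures may print
# different messages and would need additional patterns.
CRASH_PATTERN = re.compile(r"forrtl: severe \(\d+\)")


def output_shows_crash(text: str) -> bool:
    """Return True if the captured output contains a forrtl severe-error banner."""
    return CRASH_PATTERN.search(text) is not None


def effective_exit_code(returncode: int, output: str) -> int:
    """Keep a nonzero exit code; turn a zero exit code into 1 if the output shows a crash."""
    if returncode != 0:
        return returncode
    return 1 if output_shows_crash(output) else 0


def run_checked(command: list[str]) -> int:
    """Run the command, echo its combined stdout/stderr, and return the adjusted exit code."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    sys.stdout.write(proc.stdout)
    return effective_exit_code(proc.returncode, proc.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_crash.py fpm test <your usual options>
    sys.exit(run_checked(sys.argv[1:]))
```

Scanning the log is a heuristic: it only catches crashes that print a recognizable message, and it buffers all output until the command finishes. It does not address the underlying exit-code propagation problem.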
