Potential FPE bug (divide-by-zero) in pstrord() and pdtrord() #146

We have a new Fortran compiler under development, and we have included building and testing ScaLAPACK in our nightly regression testing.

Two of the tests, `xshseqr` and `xdhseqr`, fail with the development compiler because of a divide-by-zero FPE. The failure occurs at line 1087 of both `pstrord.f` and `pdtrord.f`:

```
               IF( FLOPS.NE.0 .AND.
     $              ( FLOPS*100 ) / ( 2*NWIN*NWIN ) .GE. MMULT ) THEN
```

When `NWIN` is 0, the divisor `2*NWIN*NWIN` is also 0, which triggers the divide-by-zero FPE.

Many compilers short-circuit the second operand of `.AND.` when `FLOPS` is 0, but the language does not require them to. According to the Fortran 2003 Handbook, p. 222:

> The rules for equivalent evaluation schemes allow the compiler to elide evaluating any part of an expression that has no effect on the resulting value of the expression. Consider the expression X ∗ F(Y), where F is a function and X has the value 0. The result will be the same regardless of the value of F(Y); therefore, it need not be evaluated. This shortened evaluation is allowed in all cases, even if F(Y) has side effects. In this case every data object that F could affect is undefined after the expression is evaluated—that is, it does not have a predictable value.
> This normally applies to functions in logical expressions where expression evaluation is often “short-circuited”. Some processors evaluate every term in a logical expression, others use run-time tests and skip further evaluation once the result is clear.
> 
> Consider
> 
> PRESENT( A ) .AND. A > 0 .AND. LOG( A ) < 3.5
> 
> where A is an optional argument. If A is not present, the processor is allowed to evaluate the A > 0 term, and the program is invalid. Similarly, if A is present and has a negative value, the processor is allowed to evaluate LOG(A) and the program is again invalid.
> 
> The conclusion to be drawn from all of this is that the result of a program using a function with side effects is not predictable and hence not portable. To be completely safe and portable, a subroutine should be used in place of a function when a procedure is needed with a side effect. However, in practice, the side effect will occur as expected in most cases.

Hence, a compiler that evaluates every part of the expression is not doing anything invalid.

We propose guarding the code against compilers that evaluate every part of the expression, as follows:
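To make the handbook's `PRESENT( A )` example concrete, here is a minimal free-form sketch of one way to write it portably, using nested `IF` blocks so that each test runs only after the previous one has passed. The module and function names (`safe_log_mod`, `small_log`) are hypothetical and exist only for this illustration.

```fortran
module safe_log_mod
  implicit none
contains
  ! Hypothetical helper: .true. only when a is present, positive,
  ! and log(a) < 3.5. Nesting the tests fixes the evaluation order,
  ! so a is never referenced when absent and log() never sees a <= 0.
  logical function small_log(a)
    real, intent(in), optional :: a
    small_log = .false.
    if (present(a)) then
      if (a > 0.0) then
        small_log = log(a) < 3.5
      end if
    end if
  end function small_log
end module safe_log_mod

program demo
  use safe_log_mod
  implicit none
  print *, small_log()       ! argument absent
  print *, small_log(-1.0)   ! present but negative
  print *, small_log(2.0)    ! present and positive
end program demo
```

Only the last call should report true; the first two return false without ever touching the unsafe sub-expression, regardless of how the compiler treats `.AND.`.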

```
               IF( FLOPS.NE.0 .AND.
     $              ( FLOPS*100 ) / MAX(1, 2*NWIN*NWIN ) .GE. MMULT )
     $              THEN
```

With this change the divisor is guaranteed to be at least 1, so no divide by zero can occur. For any nonzero `NWIN` the divisor is unchanged, so the condition evaluates as before.

This change eliminates the FPE and allows both tests to pass with our development compiler.
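As a stand-alone check of the guarded expression, the following free-form sketch evaluates it with both `FLOPS` and `NWIN` set to 0. The value `mmult = 50` and the variable name `guarded` are assumptions made for illustration; they are not taken from the ScaLAPACK source.

```fortran
program guard_demo
  implicit none
  ! Illustrative values only; mmult = 50 is an assumption.
  integer :: flops, nwin, mmult
  logical :: guarded

  flops = 0
  nwin  = 0
  mmult = 50

  ! max() keeps the divisor at 1 or more, so the quotient is safe to
  ! evaluate even when the compiler does not short-circuit .and.
  guarded = flops /= 0 .and. &
            (flops*100) / max(1, 2*nwin*nwin) >= mmult

  print *, 'guarded = ', guarded
end program guard_demo
```

The condition comes out false here because `flops` is 0, and it does so whether or not the second operand is evaluated.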

Thanks in advance.
