Skip to content

Fixes for flang in MOM_remapping.F90#1692

Open
mathomp4 wants to merge 2 commits intomom-ocean:mainfrom
GEOS-ESM:bugfix/1689-flang-fix
Open

Fixes for flang in MOM_remapping.F90#1692
mathomp4 wants to merge 2 commits intomom-ocean:mainfrom
GEOS-ESM:bugfix/1689-flang-fix

Conversation

@mathomp4
Copy link
Collaborator

@mathomp4 mathomp4 commented Feb 28, 2026

Closes #1689

This PR fixes #1689 by renaming the instance of a type. Fortran does not allow (or is unhappy with):

type(PCM) :: PCM

Both ifort/ifx and GNU seem to allow this, but stricter compilers like nagfor and flang do not.

So we do:

type(PCM) :: PCM_instance

and then rename the variable use below, e.g.:

  call test%test( PCM_instance%unit_tests(verbose, test%stdout, test%stderr), 'PCM unit test')

NOTE: GEOS does not build all of MOM6, for example, the NUOPC code, but we do a good chunk. This was the only error Flang flagged.

Copy link
Collaborator

@adcroft adcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it fixes #1689 and I get the gist of the change. It's short and sweet, but it could be even shorter. How about if renaming the local variables in remapping_unit_tests()? There, they are declared once and used once, whereas this PR currently renames the import at the top of the module, changes the allocation in setReconstructionType(), and then changes the local declaration remapping_unit_tests().

type(PPM_hybgen) :: PPM_hybgen
type(PPM_CWK) :: PPM_CWK
type(EPPM_CWK) :: EPPM_CWK
type(PCM_t) :: PCM
Copy link
Collaborator

@adcroft adcroft Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable is used once at line 2727. If (e.g.) you changed this line (2098) to

type(PCM) :: PCM_r

then line 2727 becomes

call test%test( PCM_r%unit_tests(verbose, test%stdout, test%stderr), 'PCM unit test')

and there will be overall fewer changes, and no need to rename types when importing them or allocating them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adcroft Since I wasn't sure of the right style you'd like, for now I've pushed:

type(PCM) :: PCM_instance

so I don't forget. I can change it to whatever you'd like easily

@marshallward
Copy link
Collaborator

How about if renaming the local variables in remapping_unit_tests()?

When something like this came up for AMD a year or so ago, this is how we decided to resolve the issue.

As for the naming, I'd prefer not to see _t suffix, and instead consider names that describe themselves as types. There is also the fact that _t is reserved for POSIX types, albeit in C. (Linux forbids _t IIRC.)

@mathomp4
Copy link
Collaborator Author

mathomp4 commented Mar 2, 2026

How about if renaming the local variables in remapping_unit_tests()?

When something like this came up for AMD a year or so ago, this is how we decided to resolve the issue.

As for the naming, I'd prefer not to see _t suffix, and instead consider names that describe themselves as types. There is also the fact that _t is reserved for POSIX types, albeit in C. (Linux forbids _t IIRC.)

Ohhh. As a non C-programmer, I didn't think of that.

Well, if you all can figure out a good strategy, I'm willing to try it out.

NOTE: I haven't actually tried to run MOM6 yet. I've been fiddling with compile flags with our boring data-ocean model at the moment trying to get some sort of performance I like. I'm going to try today as I think I figured out a good first set of flags.

@mathomp4
Copy link
Collaborator Author

mathomp4 commented Mar 2, 2026

That is, I can test whatever you'd like as a rename:

  • PCM_r
  • PCM_instance
  • PCM_local_variable

I just wasn't sure if there was an "official" MOM standard of how to name instances of types, etc. 🙂

@mathomp4
Copy link
Collaborator Author

mathomp4 commented Mar 2, 2026

NOTE: I haven't actually tried to run MOM6 yet.

Well, I just tried and it failed, but not due to MOM. I think I'm made a mistake using MPICH on discover. With MOM6 I got:

Abort(206659087) on node 4 (rank 4 in comm 336): Fatal error in internal_Comm_dup: Other MPI error, error stack:
internal_Comm_dup(97)...............: MPI_Comm_dup(comm=0xc400000f, newcomm=0x7fff684ed44c) failed
MPIR_Comm_dup_impl(671).............:
MPII_Comm_dup(895)..................:
MPII_Comm_copy(937).................:
MPIR_Get_contextid_sparse_group(575): Too many communicators (2/2048 free on this process; ignore_id=0)

Maybe GEOS + MOM6 does enough comm_dup's to kill off MPICH? Not sure. Time for Open MPI!

@klausler
Copy link

klausler commented Mar 5, 2026

As background, this usage (host-associated type name followed by local declaration of the same name) is intentionally not allowed in flang-new. It's just not because it's non-conforming (there's 100s of things in Fortran that aren't, but are portable and unambiguous), but because it seems inherently unsafe. Fortran allows the name of a type to appear in TYPE()/CLASS() type declaration statements before the type has been defined; this is handy for mutually referential cases with pointers and allocatable components. But when the name is declared as a type in an outer scope, and in a nested scope, should its appearance in an intervening declaration be interpreted as an intervening use of that type name a use of the outer scope or a forward reference to the one to the inner scope? The ISO standard isn't completely clear, and other compilers aren't all in agreement. So I passed on this one out of caution.

@mathomp4
Copy link
Collaborator Author

mathomp4 commented Mar 5, 2026

@klausler It might actually be a Standards Violation. Per Steve Lionel on the ifx forum:

F2023 19.3.1 (Classes of local identifiers) provides four classes of local identifiers, with named variables and nonintrinsic types both being class 1. Paragraph 3 then says:

Within its scope, a local identifier of one class shall not be the same as another local identifier of the same class, except that a generic name may be the same as the name of a procedure as explained in 15.4.3.4 or the same as the name of a derived type (7.5.10). A local identifier of one class may be the same as a local identifier of another class.

In your original example with the define, and my reduced example, "PCM" is both a named variable and a nonintrinsic type, so it's not allowed for these to share a name. Host association and use association are somewhat different in effect, but here the compiler should be looking for any accessible definition of "PCM" to detect a conflict.

So I think flang is doing the right thing! Just some compilers aren't.

@klausler
Copy link

klausler commented Mar 5, 2026

@klausler It might actually be a Standards Violation. Per Steve Lionel on the ifx forum:

F2023 19.3.1 (Classes of local identifiers) provides four classes of local identifiers, with named variables and nonintrinsic types both being class 1. Paragraph 3 then says:

Within its scope, a local identifier of one class shall not be the same as another local identifier of the same class, except that a generic name may be the same as the name of a procedure as explained in 15.4.3.4 or the same as the name of a derived type (7.5.10). A local identifier of one class may be the same as a local identifier of another class.

In your original example with the define, and my reduced example, "PCM" is both a named variable and a nonintrinsic type, so it's not allowed for these to share a name. Host association and use association are somewhat different in effect, but here the compiler should be looking for any accessible definition of "PCM" to detect a conflict.

So I think flang is doing the right thing! Just some compilers aren't.

Well, I'm doing the conforming thing, yes though that's not often the right thing in all cases for Fortran. The language is really defined by what its eight or more surviving compilers actually do than it is by its standard document.

I'm dubious anyway that derived type names really need to be in the same "class" as most other things, since one can always tell from syntax whether a name is being used as one or not (and other kinds of names could also be their own classes for the same reason, like construct names). And even standard Fortran allows some names to coexist in the same scope and mean more than one thing, which is really troublesome for implementors while not being terribly useful for users.

The problem here for me is entirely about a host-associated name conflicting with a local definition. If some compiler allowed a type name and variable name to be the same in some scope without misusing host association, I'd have to handle it. But none does.

@marshallward
Copy link
Collaborator

Thanks for the detailed explanation @klausler. As you say, the value of type(foo) :: foo is dubious, and if the pattern is non-compliant then we can make a case for their removal in the future.

Copy link
Collaborator

@adcroft adcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We like this solution.

@mathomp4 mathomp4 marked this pull request as ready for review March 9, 2026 14:36
@mathomp4 mathomp4 removed the request for review from marshallward March 9, 2026 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fortran Standards violation in MOM_remapping.F90

4 participants