- 
                Notifications
    You must be signed in to change notification settings 
- Fork 928
fortran/use-mpi-f08: add support for Fortran 2018 ISO_Fortran_binding.h #6569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @jsquyres @hppritcha @bosilca this is a first cut on a topic we discussed long time ago. stuff in  I ran a simple pingpong test with and use  | 
9a05dd7    to
    2f47ed6      
    Compare
  
    | This is really cool stuff, it was about time someone address this issue. Let me take a look at how we can build these types in a better way. Can you point me to some good documentation about the Fortran cdesc ? | 
| @bosilca I guess the best documentation is in the standard. I am not sure it is available at no cost, but you can check the recent and freely available draft at https://j3-fortran.org/doc/year/18/18-007r1.pdf As a side note, we should really fix how non blocking collectives handle datatypes/operators refcounts | 
| Refs. #2154 | 
| last = i; goto fn_exit; | ||
| } | ||
|  | ||
| mpi_errno = PMPI_Type_commit(&types[i+1]); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we only have to commit the last datatype?
I.e., isn't "commit" potentially an expensive operation? If so, can't we use uncommitted datatypes to build up new datatypes, and therefore only have to commit the last datatype?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bosilca confirmed to me in an email that I was correct -- you only need to commit the final datatype. Depending on the complexity of the array slice, this could be a nice optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, I will do that (and issue a PR to mpich since this is basically their code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to move this type_commit outside of the loop and only commit the last/final datatype.
| I did not meant to resolve the conversation, I just clicked on the wrong button. What I wanted to say is that I do not like copying the code from MPICH here. This entire piece of code seems to be a call to darray, we might want to go that route. | 
2f47ed6    to
    f0170b5      
    Compare
  
    d7e3600    to
    f5bad5e      
    Compare
  
    | Can one of the admins verify this patch? | 
f5bad5e    to
    8e418b6      
    Compare
  
    | These new Fortran 2008 bindings bring new opportunities (performance improvements, correct handling of sub-arrays in non blocking subroutines) but also some challenges I will try to list here 
 integer :: a(0:3,0:3)
call mpi_send(a(1:2,1:2), 3, MPI_INT, ...)this is currently unsupported. Under the hood, a derived datatype is created for the subarray (4 elements) and the  integer :: a(0:3,0:3)
call mpi_scatter(a(1:2,1:2), 2, MPI_INT, ...)if the communicator size is  integer :: a(0:3,0:3)
call mpi_pack(..., outbuf=a(0:2,0:2), outsize=4, ...)this is arguably a dumb thing to do, and this is currently not supported. note the current  Bottom line, some decisions have to be made about what we support and how. As a reminder, I opened pending #2154 to correctly handle derived datatypes freed in the middle of a collective operation (can be done at the component level e.g.  @jsquyres @bosilca @kawashima-fj could you please comment on that ? | 
| bot:lanl:retest | 
| last = i; goto fn_exit; | ||
| } | ||
|  | ||
| mpi_errno = PMPI_Type_commit(&types[i+1]); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to move this type_commit outside of the loop and only commit the last/final datatype.
| bindings.h \ | ||
| cdesc.h \ | ||
| cdesc.c \ | ||
| \ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: remove the extra blank lines.  cdesc.c should be the last entry in the list, and not have a \ continuation character.
8e418b6    to
    385f8fa      
    Compare
  
    | @jsquyres I addressed all your comments and updated the PR. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR represents a ton of work -- thank you for doing this!
| #include "ompi/mpi/fortran/base/constants.h" | ||
|  | ||
| void ompi_pack_cdesc(CFI_cdesc_t* x, MPI_Fint *incount, MPI_Fint *datatype, | ||
| char *outbuf, MPI_Fint *outsize, MPI_Fint *position, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't both the inbuf and outbuf of all these pack/unpack routines be descriptors?
I.e., don't we have to combine the inbuf, the MPI datatype, and the outbuf to do the underlying memcpy(ies) correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is where I asked for guidance in a previous comment.
IMHO, that falls into the category "the enduser wants to do something pretty dumb but that is allowed by the standard".
Anyway, if we make outbuf a descriptor, then we cannot simply invoke PMPI_Pack() anymore (since it expects a contiguous outbuf), so it would be much easier to let the Fortran runtime handle this ... unless the compiler does not support ignore_tkr.
If simply returning an error when outbuf is not contiguous is not an option and we want to support compilers that cannot do ignore_tkr, then the only option is to do it by ourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that there is a restriction on the outbuf to be contiguous -- i.e., it can be a descriptor. More comments below in the main thread...
| !DEC$ ATTRIBUTES NO_ARG_CHECK :: inbuf, outbuf | ||
| !GCC$ ATTRIBUTES NO_ARG_CHECK :: inbuf, outbuf | ||
| OMPI_F08_GCC_ATTRIBUTES(inbuf) | ||
| !GCC$ ATTRIBUTES NO_ARG_CHECK :: outbuf | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per above/comment in the C file, the outbuf can be a descriptor, too.
| Going back to the discussion @ggouaillardet started last week, more specifically on his 3 array shape challenges. On all these points the issue seems to be related to the expected extent of the resulting datatype. Honestly, I am not sure I see what the outcome of  | 
| In MPI-3.1 17.1.2 (see here), it says: 
 BTW, this behavior is only supposed to occur if the compile-time constant  | 
Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: Gilles Gouaillardet <[email protected]>
…tines Signed-off-by: Gilles Gouaillardet <[email protected]>
385f8fa    to
    86f05cf      
    Compare
  
    | bot:ibm:xl:retest | 
| 
 The behavior is described on MPI-3.1 p634. Line 18 starts a particularly interesting section: 
 I.e., it is useful to couch all of these conversations in the context of this "virtual contiguous scratch buffer". Specifically: for all of Gille's shape issues: I unfortunately think they all must be supported. I do not find any text in MPI-3.1 that says that they can be unsupported by the implementation. They may not be performant (e.g., if someone does a pack from a [non-contig] descriptor to a [non-contig] descriptor -- the implementation, for simplicity, can pack from the [non-contig] descriptor to a contiguous buffer and then from that contiguous buffer to the [non-contig] descriptor), but we're talking correctness here, not performance. I think there's other places in the MPI spec that allow users to do non-performant things, too. Also, I noticed this morning that per MPI-3.1 table 17.1, we have to change the name of the back-end Fortran symbols if descriptors are supported: they need to end in  The spec unfortunately says that we have to support descriptors in all of mpif.h, the mpi module, and the mpi_f08 module. I don't think we can support it in just the mpi_f08 module. 😦 Finally: per compilers that support  | 
| Thanks Jeff, 
 FWIW, I have made some progresses with this PR, and my dev branch is at https://github.com/ggouaillardet/ompi/tree/dev/f08_cdesc | 
| 
 Is there a way to return something more descriptive than  
 I think that when we wrote this text for MPI-3.0, we were envisioning one possible implementation where a "union" datatype was created: i.e., take the "union" of the shape of the data in memory and the supplied MPI datatype, and create a new datatype representing the resulting shape. Then simply pass that datatype through the MPI engine. Hypothetically, one would therefore not need to modify the internal MPI engine at all -- because the result of this operation is just another MPI datatype, like any other MPI datatype. Is that possible here? 
 It is optional for  Re-reading MPI-3.1 a little better than I did yesterday, I notice this (starting on p613:47): 
 This section -- and sections 17.1.2/3/4 -- says to me that each of  I guess that makes sense for exactly the case you are asking about: we can have  ...but how do we implement that?  The name  Regardless, this opens a whole new can of worms: 
 😲 | 
| would  I do not fully understand what you mean by "union" datatype. and run this on a communicator with two MPI ranks. Note I chose to have  I'd rather start with  
 the definition is currently in a single place (e.g.  We do need a configure option to manually disable  We should decide asap if we want to have both  | 
| The IBM CI (GNU/Scale) build failed! Please review the log, linked below. Gist: https://gist.github.com/01f8be4f3c3350eff5399cfba60efac9 | 
| This issue kinda fell off the table for a few months.   @ggouaillardet I don't see a reply to your issue on the GCC mailing list. Did that get resolved? Given the lateness in the game here, I'm kinda guessing that this is going to miss the boat for Open MPI v5.0.0. | 
| Discussed on call today. If this can make it in by v5.0.x branch-date (5/14/2020), great, but probably next opportunity will be v6.0.0 (currently unscheduled). | 
| The IBM CI (PGI) build failed! Please review the log, linked below. Gist: https://gist.github.com/09792ae4918db9664b9dc6145ebfe8e2 | 
| It's been a while since there has been any activity on this PR. Is there any interest in getting this feature set merged at some point? | 
| this work has been superseded by PR #12226 which was merged in to main a while ago now. closing this PR. | 
add the infrastructuce and update MPI_Send() and MPI_Recv()
Signed-off-by: Gilles Gouaillardet [email protected]