pythonizing
As of MPI-4.0, the language-independent specification ("LIS"), C, Fortran 90 ("F90"), and Fortran 08 ("F08") bindings are no longer being hard-coded in LaTeX. Instead, a procedural method is used to define what the bindings are (e.g., the procedure name, the parameter names/types/directions/descriptions, etc.), and then the LIS, C, F90, and F08 bindings are rendered in LaTeX automatically.
This is best shown through an example. Here is a snippet of the point-to-point .tex file, with a little surrounding LaTeX for context:
The syntax of the blocking send operation is given below.
\cdeclmainindex{MPI\_Comm}
\begin{mpi-binding}
functionname('MPI_Send')
parameter('buf', 'BUFFER', desc='initial address of send buffer', constant=True)
parameter('count', 'POLYXFER_NUM_ELEM', desc='number of elements in send buffer')
parameter('datatype', 'DATATYPE', desc='datatype of each send buffer element')
parameter('dest', 'RANK', desc='rank of destination')
parameter('tag', 'TAG', desc='message tag')
parameter('comm', 'COMMUNICATOR')
\end{mpi-binding}
The blocking semantics of this call are described in Section~\ref{sec:pt2pt-modes}.
Notice the new {mpi-binding} section. This section wholly replaces the hard-coded LIS/C/F90/F08 LaTeX bindings.
This section is actually Python code (i.e., when you invoke make, a Python interpreter runs the contents of each {mpi-binding} section to render the bindings in LaTeX). The intent is that you call Python functions to define an MPI routine, such as:
- functionname(NAME): define the name of this function. This function is straightforward.
- parameter(NAME, TYPE, ...): define a parameter and attributes about this parameter. Several examples of calling parameter(...) are shown above; the parameters that this function accepts will be described in detail below.
With just these two Python functions, LaTeX for the LIS, C, F90, and F08 can be generated for the vast majority of MPI procedures. This rendering is equivalent to what was used in the MPI-3.1 version of the standard.
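To make the mechanism concrete, here is a minimal sketch of how the contents of one block could be handed to a Python interpreter. It is not the actual binding-tool/binding_tool.py implementation: the helper name run_binding_block and the shape of the recorded dictionary are assumptions made purely for illustration.

```python
# Hypothetical sketch: NOT the real binding-tool/binding_tool.py.
def run_binding_block(source):
    """Execute the Python text of one {mpi-binding} block and
    return what it declared."""
    binding = {"name": None, "parameters": []}

    def functionname(name):
        binding["name"] = name

    def parameter(name, kind, **attrs):
        # Appending to a list keeps the declaration order.
        binding["parameters"].append({"name": name, "kind": kind, **attrs})

    # A fresh namespace per call, so nothing leaks between blocks.
    namespace = {"functionname": functionname, "parameter": parameter}
    exec(source, namespace)
    return binding


block = """
functionname('MPI_Send')
parameter('buf', 'BUFFER', desc='initial address of send buffer', constant=True)
parameter('dest', 'RANK', desc='rank of destination')
"""
send = run_binding_block(block)
print(send["name"], [p["name"] for p in send["parameters"]])
```

A renderer for each of the LIS, C, F90, and F08 outputs could then walk the same recorded structure, which is what keeps the four bindings in agreement.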
Although the above definition of MPI_Send is a relatively straightforward example, it is actually a good representation of how the vast majority of MPI functions are now written.
Of course, there are more complicated cases that require a few more Python functions and several more parameters to the parameter() function; those will be described below (an easy example to cite is MPI_WTICK, which has to return a double precision value, not an integer). But for the most part, the Pythonized version of the MPI routines is as simple as pictured above.
Using this scheme provides the following benefits:
- Only define an MPI procedure once (vs. effectively defining it four times in LaTeX: LIS, C, F90, F08). Meaning: significantly less typing and fewer chances for error by chapter authors (yay!).
- Ensure that the LIS, C, F90, and F08 bindings agree in function name, parameter names and types, etc.
- Ensure more consistent style of language bindings throughout the entire document.
- Allow the possibility of easily making global changes to how bindings are rendered throughout the entire document.
- Nearly all Fortran ierror parameter handling is automatic.
- Enable the generation and publication of:
  - Large portions of mpi.h, mpif.h, the mpi module, and the mpi_f08 module (i.e., all the procedures) that can be used as reference.
  - Machine-readable files containing the contents of all the {mpi-binding} blocks (i.e., all procedures, their parameters, etc.).
- Decrease the time needed for humans to verify bindings in the text.
- Incorporate continuous integration-style checking of the bindings. For example:
  - Note any changes to bindings in pull requests for human review before merging.
  - Compare any two versions of the .tex source to deterministically and repeatably show the differences between them.
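The comparison benefit can be sketched as follows. The two records below are hypothetical machine-readable forms of one binding (MPI_Foo is the placeholder routine name used elsewhere on this page); the actual file format produced by the tooling is not described here, so JSON is an assumption.

```python
import difflib
import json

# Hypothetical machine-readable records for two versions of one binding.
old = {"name": "MPI_Foo",
       "parameters": [{"name": "count", "kind": "DEFERRED_INT"}]}
new = {"name": "MPI_Foo",
       "parameters": [{"name": "count", "kind": "POLYXFER_NUM_ELEM"}]}


def binding_diff(a, b):
    """Return a unified diff of two binding records as a list of lines."""
    # sort_keys and a fixed indent make the text form deterministic,
    # so the same inputs always produce the same diff.
    a_text = json.dumps(a, indent=2, sort_keys=True).splitlines()
    b_text = json.dumps(b, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(a_text, b_text, "old", "new", lineterm=""))


changes = binding_diff(old, new)
print("\n".join(changes))
```

An empty result means the two versions define the binding identically, which is the kind of check a continuous integration job could run on a pull request.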
So how do you go about writing / editing / maintaining MPI bindings in this Python style?
First thing to note: we are replacing all the LaTeX bindings with Pythonized bindings. Meaning: delete the old LaTeX bindings after you create and verify new Pythonized bindings. Absolutely do not just comment out the old LaTeX bindings.
- Leaving the old LaTeX bindings in the text just clutters up the document and makes it harder to maintain over time.
- The old LaTeX bindings are available in git history if we ever need them.
Let's be 100% clear: the goal is to delete the old LaTeX bindings. Do not just comment them out.
Here are a few general notes about the {mpi-binding} sections:
- When you make (either a full document or an individual chapter), this section will automatically be replaced by generated LaTeX for you.
- The generated LaTeX will include appropriate index references, etc. (just like the old hard-coded LaTeX bindings).
- {mpi-binding} blocks are only for MPI bindings. They are not for constants, typedefs, examples, or any other type of code blocks. All of those must still be hard-coded in LaTeX with the appropriate macros.
- Everything between \begin{mpi-binding} and \end{mpi-binding} is Python.
  - The contents of this section are actually given to a Python interpreter to execute. As a direct consequence, you must obey Python syntax inside {mpi-binding} sections! This includes whitespace, line breaks, etc. For example:
    - Blank lines are fine.
    - You must consistently whitespace-indent all your code lines to the same level.
  - You may actually use any valid Python code in this block (but this probably isn't too useful).
    - No Python state is shared between different {mpi-binding} sections; each {mpi-binding} section is interpreted in its own, unique Python interpreter.
    - You may only define one MPI routine per {mpi-binding} block.
  - Comments can begin with # (you can even use the """-style Python "comments", if desired). Just like in LaTeX, there are sometimes complicated situations where leaving comments for future authors is helpful.
  - Do not use LaTeX escaping in {mpi-binding} sections; use Python escaping. In particular, do not escape underscores (_); the Python code will escape all of those for you when rendering the final LaTeX.
  - If you do not use correct Python syntax, you will actually get a Python syntax error during make. There is unfortunately not good debugging output to indicate which {mpi-binding} block caused the syntax error; you'll have to rely on context from the Python error output to discover the location of your error.
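Because a syntax error during make does not say which block failed, it can help to check blocks by hand. The following is one possible way to do that, not part of the repository's tooling: the regular expression and the helper names extract_blocks and check_blocks are assumptions, and the real tool may locate blocks differently.

```python
import re

# Hypothetical extraction pattern for {mpi-binding} environments.
BLOCK_RE = re.compile(
    r"\\begin\{mpi-binding\}\n(.*?)\\end\{mpi-binding\}", re.DOTALL)


def extract_blocks(tex):
    """Return the Python source of every {mpi-binding} block in `tex`."""
    return BLOCK_RE.findall(tex)


def check_blocks(tex):
    """Compile (without running) each block; list the ones that fail."""
    problems = []
    for number, source in enumerate(extract_blocks(tex), start=1):
        try:
            compile(source, f"<mpi-binding #{number}>", "exec")
        except SyntaxError as err:  # also catches IndentationError
            problems.append((number, err.msg))
    return problems


tex = r"""
\begin{mpi-binding}
functionname('MPI_Send')
\end{mpi-binding}
Some LaTeX prose.
\begin{mpi-binding}
functionname('MPI_Recv')
  parameter('buf', 'BUFFER')
\end{mpi-binding}
"""
print(check_blocks(tex))
```

Here the second block is reported because its lines are not indented to the same level, which is exactly the whitespace rule described in the notes above.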
As stated above, the intent is that you invoke a few Python functions to define the MPI routine. The two main functions that you will use are functionname() and parameter(), but there are a few other functions that are necessary in some cases.
The definitive listing of these functions, parameters, and other information you may need to know is in the binding-tool/binding_tool.py script in the git repository. This is only mentioned in case this wiki documentation gets stale (gasp!).
The functionname() function is straightforward: pass in the mixed-case name of the MPI routine in question, e.g., MPI_Send (not mpi_send or MPI_send). Incorrect casing will not be fixed for you.
It is assumed that this function will be invoked in every single {mpi-binding} block.
A single invocation of the parameter() function describes a single parameter in an MPI routine.
The order in which parameters are defined via the parameter() function is maintained when the final LaTeX is rendered. Meaning:
functionname('MPI_Foo')
parameter('foo', 'COMMUNICATOR', desc='the communicator')
parameter('bar', 'DATATYPE', desc='the datatype')
will render MPI_Foo(foo, bar), while:
functionname('MPI_Foo')
parameter('bar', 'DATATYPE', desc='the datatype')
parameter('foo', 'COMMUNICATOR', desc='the communicator')
will render MPI_Foo(bar, foo).
The parameter() function can take many parameters. The first two are positional and are mandatory. The remaining ones are either optional or only required in certain cases.
It is highly recommended that you first try to write your bindings with just the name, kind, direction, and desc parameters to parameter(), and use the documentation in this section as a reference for when those four parameters are not sufficient.
name: the string name of the parameter. This parameter is always the first parameter, and is required.
kind: a string representing the type/kind of the parameter. This parameter is always the second parameter, and is required. The allowable kinds are:
- BUFFER: a choice buffer
- C_BUFFER: a C choice buffer (e.g., for MPI_ALLOC_MEM, which specifically takes a C buffer argument)
- EXTRA_STATE: extra state (e.g., for MPI attribute functions)
- FUNCTION: a function pointer. When this type is used, the func_type parameter must be specified to indicate the type of the function pointer.
- STRING: a string
- STRING_ARRAY: an array of strings (e.g., for MPI_COMM_SPAWN)
- STRING_2DARRAY: an array of arrays of strings (e.g., for MPI_COMM_SPAWN_MULTIPLE)
- ARRAY_LENGTH: the integer length of an array
  - JMS: MAY NEED TO REVISIT THIS?
- ATTRIBUTE_VAL_10: the type of MPI attribute values in MPI-1.0. This type only exists because of some deprecated functions that are still listed in the standard. It should not be used for any new bindings.
- ATTRIBUTE_VAL: the type of MPI attribute values starting with MPI-2.0.
- BLOCKLENGTH: integer length of blocks
- COLOR: color for algebraic operations (e.g., for MPI_COMM_SPLIT)
- ENUM: an arbitrary enum-like integer
- FILE_DESCRIPTOR: an integer file descriptor (e.g., for MPI_COMM_JOIN)
- KEY: an integer key (e.g., for MPI_COMM_SPLIT)
- KEYVAL: integer keyvals for MPI attribute functions
- INDEX: integer index into an array
- LOGICAL: Boolean true/false value
- NUM_DIMS: integer number of dimensions
- RANK: integer rank in a communicator or group
- COMM_SIZE: the integer number of processes in a communicator or group
- STRING_LENGTH: the integer length of a string
- STRIDE_BYTES: an integer stride expressed as a number of bytes
- STRIDE_ELEM: an integer stride expressed as a number of elements
- TAG: an integer tag
- VERSION: an integer version
- DEFERRED_INT: an integer that we may need to revisit to figure out if it needs to be embiggened for "big count" purposes in MPI-4.0
  - JMS the intent is that this type will disappear before MPI-4.0 is published
- ALLOC_MEM_NUM_BYTES: this should probably be DEFERRED_INT. See MPI_ALLOC_MEM.
- PACK_EXTERNAL_SIZE: this should probably be DEFERRED_INT. See the external pack routines.
- DISPLACEMENT_BIG: MPI-3.1 "big" parameters in the _X functions.
- XFER_NUM_ELEM_BIG: MPI-3.1 "big" parameters in the _X functions.
- NUM_BYTES_BIG: MPI-3.1 "big" parameters in the _X functions.
- ERROR_CODE: an integer MPI error code
- ERROR_CLASS: an integer MPI error class
- ORDER: an integer MPI enum-like value
- THREAD_LEVEL: an integer MPI enum-like value
- COMBINER: an integer MPI enum-like value
- POLYDISPLACEMENT: a displacement that is currently set up to render as plain integer in MPI-3.1 style and "big" integer in MPI-4.0 style.
  - JMS: MAY NEED TO REVISIT THIS?
- POLYDTYPE_NUM_ELEM: a datatype number of elements that is currently set up to render as plain integer in MPI-3.1 style and "big" integer in MPI-4.0 style.
  - JMS: MAY NEED TO REVISIT THIS?
- POLYNUM_BYTES: a datatype number of bytes that is currently set up to render as plain integer in MPI-3.1 style and "big" integer in MPI-4.0 style.
  - JMS: MAY NEED TO REVISIT THIS?
- POLYXFER_NUM_ELEM: a number of elements that is currently set up to render as plain integer in MPI-3.1 style and "big" integer in MPI-4.0 style.
  - JMS: MAY NEED TO REVISIT THIS?
- COMMUNICATOR: an MPI communicator handle
- DATATYPE: an MPI datatype handle
- ERRHANDLER: an MPI errhandler handle
- FILE: an MPI file handle
- GROUP: an MPI group handle
- INFO: an MPI info handle
- MESSAGE: an MPI message handle
- REQUEST: an MPI request handle
- STATUS: an MPI status
- WINDOW: an MPI window handle
desc: this string parameter is passed by name (i.e., desc="blah"). It is technically not required, but it is strongly recommended. The string value is rendered as part of the LIS.
direction: indicates the direction intent of this parameter. This parameter affects the rendering in most language bindings:
- LIS: determines the IN, INOUT, or OUT label
- C: generally determines whether the parameter is passed by value or reference
- F90: does not affect the rendering
- F08: generally determines the INTENT clause
The allowable values of the direction parameter are:
- in: since the majority of MPI parameters are intent IN, in is the default value for this parameter. Parameters marked as in will be rendered as being passed by value.
- out: the OUT intent. Parameters marked as out will be rendered as being passed by reference.
- inout: the INOUT intent. Parameters marked as inout will be rendered as being passed by reference -- except for one special case. See below.
There is a special case: MPI's definition of INOUT has a peculiar meaning with regard to MPI handles. Specifically: if an MPI handle parameter is marked as INOUT, it may be passed by value or it may be passed by reference depending on the situation.
By default, inout-marked parameters are passed by reference. But for cases where the MPI binding actually requires the MPI handle to be passed by value, you can pass a special value to the direction parameter indicating the disparity. For example:
functionname('MPI_Comm_set_info')
parameter('comm', 'COMMUNICATOR', direction='lis:inout,param:in',
desc='communicator')
parameter('info', 'INFO', desc='info object')
For MPI_COMM_SET_INFO, the comm argument is marked INOUT in the LIS, but it is passed by value in the C/F08 bindings. Hence, we pass lis:inout,param:in to indicate that the LIS should be rendered as INOUT, but the parameter in the C/Fortran bindings should be rendered as IN.
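The direction rules described above can be summarized in a small sketch. This is only an illustration of the documented behavior, not the rendering engine's code: the helper names parse_direction and c_passing are hypothetical, and the by-value/by-reference rule is the simplified default (the text says C rendering is only "generally" determined by direction).

```python
def parse_direction(direction="in"):
    """Split a direction value into (LIS label, binding intent).

    A plain value such as 'out' applies to both; the special form
    'lis:inout,param:in' sets the two separately.
    """
    if ":" not in direction:
        return direction.upper(), direction
    parts = dict(item.split(":") for item in direction.split(","))
    return parts["lis"].upper(), parts["param"]


def c_passing(intent):
    """Simplified default: 'in' by value; 'out'/'inout' by reference."""
    return "value" if intent == "in" else "reference"


for value in ("in", "out", "inout", "lis:inout,param:in"):
    lis_label, intent = parse_direction(value)
    print(f"{value!r}: LIS {lis_label}, C passes by {c_passing(intent)}")
```

The last case is the MPI_COMM_SET_INFO situation: the LIS label stays INOUT while the language bindings treat the handle as IN.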
Thanks, MPI! 😉
length: for single-dimension array parameters, this value is set to a string representing the length of the array (when that length is known). For example:
functionname('MPI_Waitall')
parameter('count', 'ARRAY_LENGTH', desc='lists length')
parameter('array_of_requests', 'REQUEST', desc='array of requests',
direction='inout', length='count')
parameter('array_of_statuses', 'STATUS',
desc='array of status objects',
direction='out', length='*')
Note the array_of_requests parameter lists count as its length, because that array is defined to be the length specified by the count parameter.
Note, too, the length for the array_of_statuses parameter is *. This is because the length of the array is not known (specifically, because it could be MPI_STATUS_IGNORE), and therefore must be rendered as (*) in Fortran (array lengths are not rendered in C; arrays are rendered as [] in C).
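As a rough sketch of the rule just described for one-dimensional arrays, the length string could be turned into a declarator suffix like this. The function names fortran_decl and c_decl are hypothetical, and the real renderer also emits types and attributes that are omitted here.

```python
def fortran_decl(name, length=None):
    """Fortran shows the length: name(count), or name(*) if unknown."""
    return name if length is None else f"{name}({length})"


def c_decl(name, length=None):
    """C never shows the length; arrays get empty brackets."""
    return name if length is None else f"{name}[]"


for name, length in (("count", None),
                     ("array_of_requests", "count"),
                     ("array_of_statuses", "*")):
    print(fortran_decl(name, length), "|", c_decl(name, length))
```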
There are a small number of two-dimension arrays in MPI. 2D string arrays are a special beast and have their own type (STRING_2DARRAY) and do not use the length parameter. But functions like MPI_GROUP_RANGE_INCL require a fixed 2D array of integers. Consider:
functionname('MPI_Group_range_incl')
parameter('group', 'GROUP', desc='group')
parameter('n', 'DEFERRED_INT',
desc='number of triplets in array \mpiarg{ranges}')
parameter('ranges', 'RANK', length=['n', '3'],
desc='a one-dimensional array of integer triplets, of the form (first rank, last rank, stride) indicating ranks in \mpiarg{group} of processes to be included in \mpiarg{newgroup}')
parameter('newgroup', 'GROUP', direction='out',
desc='new group derived from above, in the order defined by \mpiarg{ranges}')
Notice that length is a list containing the length of each dimension.
func_type: if the kind parameter is FUNCTION, this parameter must be specified.
The string value is the type of the function parameter. For example:
functionname('MPI_Comm_create_keyval')
parameter('comm_copy_attr_fn', 'FUNCTION',
func_type='MPI_Comm_copy_attr_function',
desc='copy callback function for \mpiarg{comm_keyval}')
parameter('comm_delete_attr_fn', 'FUNCTION',
func_type='MPI_Comm_delete_attr_function',
desc='delete callback function for \mpiarg{comm_keyval}')
# ...etc.
Specifying the function pointer type allows the C/Fortran bindings to render the correct type.
This is a Boolean value (that defaults to False) that allows you to override the rendering and render passing the parameter by value.
constant: this is a Boolean value (that defaults to False) that indicates whether the parameter is constant or not. In C, this translates to prefixing the type with const.
This is a Boolean value (that defaults to False). If set to True, the string ", significant only at root" is added to the description. It is meant as a shortcut / syntactic sugar for the many rooted MPI routines.
mpi_owned: this is a Boolean value (that defaults to False). When set to True, it indicates that MPI retains ownership of this value after the routine returns. This causes the ASYNCHRONOUS keyword to be rendered in the F08 bindings for this parameter.
suppress: this parameter is used to suppress the rendering of certain properties. These are generally very special cases, and are only needed infrequently. This parameter can take the following values:
- f08_intent: do not emit the F08 INTENT clause. Specifically, the INTENT clause is rendered for most F08 parameters. There are a few cases where INTENT is not rendered, and those are usually automatically detected by the rendering engine. However, there are a few cases where we specifically do not include an INTENT clause in the F08 bindings, but the reasons for omitting the INTENT are obscure and/or do not fit into a general rule that the rendering engine knows. Hence, you can pass suppress='f08_intent' to cause the F08 bindings to not emit an INTENT clause for this parameter.
functionname('MPI_Buffer_attach')
parameter('buffer', 'BUFFER', desc='initial buffer address',
mpi_owned=True, suppress='f08_intent')
parameter('size', 'POLYNUM_BYTES', desc='buffer size, in bytes')
This is a Boolean value (that defaults to False). It is used to indicate optional parameters. The only notable parameter that meets this description is the F08 ierror, which is automatically included in all parameter lists unless the no_ierror() function is invoked.
This function takes the same arguments as parameter(), but these parameters are only used in the LIS+C bindings (not the Fortran bindings). MPI_INIT and MPI_INIT_THREAD are good examples where this is needed.
JMS NOT SURE THIS HAS BEEN TESTED / NOT EXACTLY SURE WHAT THE PARAMS ARE TO THIS FUNCTION
Nearly all MPI routines return an int in C and render a SUBROUTINE in Fortran (i.e., no return value). However, there are a small number of routines that return something else. The returntype() function accepts the following values:
- INT: if not invoked, INT is assumed. Return an int in C and render a SUBROUTINE in Fortran.
- DOUBLE: return a double precision value.
- ADDRESS: return an address type.
functionname('MPI_Wtime')
returntype('DOUBLE')
no_ierror()
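The effect of returntype() on the C binding can be sketched as a simple lookup. The mapping below is an assumption for illustration (in particular, rendering ADDRESS as MPI_Aint is a guess at what the tool emits), and c_prototype is a hypothetical helper, not a function in binding_tool.py.

```python
# Assumed mapping from returntype() values to C return types.
C_RETURN_TYPES = {"INT": "int", "DOUBLE": "double", "ADDRESS": "MPI_Aint"}


def c_prototype(name, returntype="INT", args="void"):
    """Build a C prototype string; INT is the default return type."""
    return f"{C_RETURN_TYPES[returntype]} {name}({args})"


print(c_prototype("MPI_Wtime", "DOUBLE"))
print(c_prototype("MPI_Barrier", args="MPI_Comm comm"))
```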
A small number of MPI routines do not have an ierror parameter in the Fortran bindings. Invoking the no_ierror() function suppresses the ierror parameter in Fortran bindings. For example:
functionname('MPI_Wtick')
returntype('DOUBLE')
no_ierror()
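A minimal sketch of the ierror handling described above: the Fortran argument list gets a trailing ierror unless it has been suppressed. The function fortran_args and its no_ierror flag are hypothetical stand-ins for whatever the rendering engine does internally.

```python
def fortran_args(parameter_names, no_ierror=False):
    """Return the Fortran argument list, appending ierror by default."""
    args = list(parameter_names)
    if not no_ierror:
        args.append("ierror")
    return "(" + ", ".join(args) + ")"


print("MPI_Comm_set_info" + fortran_args(["comm", "info"]))
print("MPI_Wtick" + fortran_args([], no_ierror=True))
```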
Only relevant for MPI-4.0-style rendering, which isn't done yet.
JMS TBD
When this function is invoked, the F08 binding is suppressed.
This function is really only necessary for several MPI-1.0 functions that are still listed in the deprecated chapter that have no F08 bindings. It should probably not be used for new MPI routines.