@ax3l ax3l commented Feb 23, 2019

Same as #6344, but applied to v3.0.x.

cc @edgargabriel @jsquyres

jsquyres and others added 30 commits March 8, 2018 10:31
…contiguous_flag

io/romio314: mark datatypes of size 0 as contiguous
Allowing MPI_PROC_NULL as a neighbor in any topology allows us to add
gaps on the send and recv buffers. This does make the traditional
neighbor collective behave similarly to the V version, but at the
same time it allows users to skip the step where they prepare the
counts and the displacement array.

For more info please take a look at issue open-mpi#4675.

Signed-off-by: George Bosilca <[email protected]>
 * The `MPIR_PROCDESC` structure needs to be visible even in optimized
   builds so that debuggers can attach to `mpirun` and properly read the
   `MPIR_proctable`.
 * In the v2.0.x and v2.x series this structure resided in the `orterun`
   directory and included the `CFLAGS` fix included here. This code
   moved in the v3.x series and the `CFLAGS` did not move causing this
   issue.
   - Instead of applying the debug `CFLAGS` globally to libopen-rte,
     only apply them to the `orted_submit.c` compile which contains the
     MPIR symbols.

Signed-off-by: Joshua Hursey <[email protected]>
Data sieving has to occur for any provided offset that is greater than
or equal to zero for this implementation to work correctly.

Signed-off-by: Edgar Gabriel <[email protected]>
…ta_sieving_fix

fcoll/two_phase: data sieving has to occur at offset 0 as well
Fix MPIR_proctable structure visibility
Signed-off-by: Benoît Legat <[email protected]>
Fix typo in MPI_Cart_shift doc
999de13 accidentally reset opal_cuda_verbose's default value.
This commit puts it back.

Signed-off-by: Jeff Squyres <[email protected]>
…bose-value

opal_datatype_module.c: reset opal_cuda_verbose
Flush out the DVM ready notice on stdout

Signed-off-by: Aurelien Bouteiller <[email protected]>
This commit is a large update to the osc/rdma component. Included in
this commit:

 - Add support for using hardware atomics for fetch-and-op and single
   count accumulate when using the accumulate lock. This will improve
   the performance of these operations even when not setting the
   single intrinsic info key.

 - Rework how large accumulates are done. They now block on the get
   operation to fix some bugs discovered by an IBM one-sided test. I
   may roll back some of the changes if the underlying bug in the
   original design is discovered. There appears to be no real
   difference (on the hardware this was tested with) in performance so
   it's probably a non-issue. References open-mpi#2530.

 - Add support for an additional lock-all algorithm: on-demand. The
   on-demand algorithm will attempt to acquire the peer lock when
   starting an RMA operation. The lock algorithm default has not
   changed. The algorithm can be selected by setting the
   osc_rdma_locking_mode MCA variable. The valid values are two_level
   and on_demand.

 - Make use of the btl_flush function if available. This can improve
   performance with some btls.

 - When using btl_flush do not keep track of the number of put
   operations. This reduces the number of atomic operations in the
   critical path.

 - Make the window buffers more friendly to multi-threaded
   applications. This was done by dropping support for multiple
   buffers per MPI window. I intend to re-add that support once the
   underlying performance bug under the old buffering scheme is
   fixed.

 - Fix a bug in request completion in the accumulate, get, and put
   paths. This also helps with open-mpi#2530.

 - General code cleanup and fixes.

Signed-off-by: Nathan Hjelm <[email protected]>
Scaling.pl: Fix Srun options and wait for DVM launch
Improve the range and accuracy of MPI_Wtime.
dist: Sync 2.1.3 NEWS items into master
This commit fixes the case when a local client asks for a key from a
process on a remote node. The local server does not have the commit count
for remote ranks; it is maintained by another PMIx server, so the commit
count should be ignored for remote requests.

Signed-off-by: Boris Karasev <[email protected]>
mpool/memkind: fix typo in partition page sizes
We have a small number of requirements for contributions (e.g.,
"Signed-off-by"), so let's make sure that people have an easy way of
knowing these things.

Signed-off-by: Jeff Squyres <[email protected]>
…iens

CONTRIBUTING.md: add Github contribution guidelines
plfs components are at this point not utilized by anybody as far as I know.
Easy to bring back if we want to.

Signed-off-by: Edgar Gabriel <[email protected]>
never got to move this sharedfp component into anything
usable. Can easily be restored if necessary.

Signed-off-by: Edgar Gabriel <[email protected]>
Somehow the flag that enables gathering performance data
on collective I/O operations was accidentally changed to 1.
It should be 0 (false) by default.

Signed-off-by: Edgar Gabriel <[email protected]>
bwbarrett and others added 28 commits May 21, 2018 14:18
Remove the MXM MTL, which has been deprecated in preference for
the Yalla PML.  This was discussed at the last developers meeting
and somehow I ended up with the action item to do the removal.

Signed-off-by: Brian Barrett <[email protected]>
- supported 4 or 8 bytes only

Signed-off-by: Sergey Oblomov <[email protected]>
…ng for C11 features to prevent e.g. _Static_assert being treated as an implicitly-defined function.

Signed-off-by: Ben Menadue <[email protected]>
configure: use AC_LINK_IFELSE instead of AC_COMPILE_IFELSE for C11 tests
Fix the logic that decides which aggregator selection algorithm
to use.

Signed-off-by: Edgar Gabriel <[email protected]>
io/ompio: fix an erroneous condition when selecting aggregator selection algorithm
enable_oshmem holds the result of a customer decision and, like
most user options, can have the values "yes" (user wants us to
build feature), "no" (user wants us not to build feature),
"" (user wants us to figure it out), and "<something>" (user
wants us to build feature, with <something> turned on).

This change updates oshmem to not lose this data by not overwriting
enable_oshmem with a yes/no and leaving the original customer
intent in place.  Aside from fixing one bug (below) there are no
customer visible changes in this patch, but it makes it possible
to do the right thing in the upcoming work to allow oshmem to be
disabled based on test results.

There was a cosmetic bug in the existing code where specifying
a feature argument (like --enable-oshmem=awesome) would result
in the "checking if want oshmem" test reporting no, but oshmem
being built anyway.  With this cleanup, the "checking if want
oshmem" test, the final output summary, and what actually happens
will all match.

Signed-off-by: Brian Barrett <[email protected]>
Two related changes to allow projects to not build based on
configure test results, as opposed to only reacting to
user configure options today.  Use case is disabling a project
like oshmem because no communication channels can be built.

First, move PROJECT_* AM_CONDITIONALs from the top of configure to
the bottom, so that we can change the results during configure.
Second, add a DIST_SUBDIRS to Makefile.am (and populate it in
opal_mca) so that "make dist" will work even when a project is
disabled.

Signed-off-by: Brian Barrett <[email protected]>
This patch disables the oshmem layer if there are no SPMLs that
will build.  With the limited set of SPMLs available to support
oshmem, many builds end up installing an oshmem library that we
know will not work.  There has been a bit of customer confusion
over oshmem; hopefully this will lead customers in the right
direction.

Signed-off-by: Brian Barrett <[email protected]>
+ Add a quiet method to SPML, so it can have a different implementation
from fence.
+ Use ucp_worker_fence for spml_fence method of UCX SPML

Signed-off-by: Mikhail Brinskii <[email protected]>
cuda: add option to remove warning about missing libcuda.
oshmem: remove `shmem_put/get` when not the C11 case in accordance with the spec v1.3
Implements butterfly algorithm for MPI_Reduce_scatter_block.
The algorithm can be used both by commutative and non-commutative
operations, for power-of-two and non-power-of-two number of processes.

Signed-off-by: Mikhail Kurnosov <[email protected]>
OSHMEM/SMPL/UCX: Add real fence support
…butterfly

coll: reduce_scatter_block: add butterfly algorithm
MCA/UCX: fixed error messages for incorrect msg size
Per discussion at
open-mpi#2614 (comment),
do not allow for selection of the OSC PT2PT when creating an MPI RMA
window when THREAD_MULTIPLE is active.  Print a helpful message and
return a not-supported error.

Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Jeff Squyres <[email protected]>

(cherry picked from commit d0ffd66)
Signed-off-by: Jeff Squyres <[email protected]>
…or-thread-multiple

osc/pt2pt: disable when THREAD_MULITPLE
This fix was already included in pmix upstream (openpmix/openpmix@fb7af8af2).

Signed-off-by: Jeff Squyres <[email protected]>
- Improve descriptions
- Fix some typos
- Remove MPI-1 functions and replace them with MPI-2 functions

Signed-off-by: Kurita, Takehiro <[email protected]>
@ompiteam-bot
Can one of the admins verify this patch?

@ax3l ax3l closed this Feb 23, 2019