UCX_IB_MLX5_DEVX mystery #13688

@ashterenli

I found by accident that setting UCX_IB_MLX5_DEVX=no gives a ~2x performance improvement for one customer code on Azure Genoa hbv4 VMs (https://learn.microsoft.com/en-us/azure/virtual-machines/hbv4-series-overview). Each VM has a single 400 Gb/s Mellanox ConnectX-7 NDR NIC.
So far, none of the other codes I've tried show any sensitivity to this setting.

A very similar ~2x performance boost is seen for this code on a bare-metal Genoa IB cluster (also 400 Gb/s, 4X NDR).

However, on the same bare-metal cluster, Turin nodes show no sensitivity to this setting.

The code's communication is mostly allreduce and alltoallv. The alltoallv calls are "sparse": 4 ranks are either sending to all other ranks or receiving from all other ranks. The typical scale of my jobs is ~1k ranks, with no OpenMP.

Recently, I observed another behaviour affected by setting UCX_IB_MLX5_DEVX=no.
This is with the NOAA HAFS code (https://github.com/HAFS-community/HAFS).
I have only tried this code on the Azure hbv4 platform,
using recent master branches of PMIx, PRRTE and Open MPI.
I start the job with:

mpiexec --display-map --bind-to core --map-by ppr:88:node:pe=2  \
-n 3072 /usr/bin/env OMP_NUM_THREADS=2 ${exe} :  \
-n   32 /usr/bin/env OMP_NUM_THREADS=2 ${exe} : \
-n  240 /usr/bin/env OMP_NUM_THREADS=2 ${exe}

The job hangs somewhere in the UCX layers.
However, if I set UCX_IB_MLX5_DEVX=no, the job runs to completion, with performance comparable to Intel MPI.
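
For anyone trying to reproduce the comparison, here is a minimal sketch of one way to pass the setting to every rank. It reuses the /usr/bin/env prefix from the launch line above and is not necessarily how the runs reported here were launched. The function name build_cmd and the executable path ./hafs.x are hypothetical placeholders. The sketch only prints the command (a dry run) so the two variants can be inspected before launching.

```shell
#!/bin/sh
# Hypothetical helper: print the 2-thread launch line with UCX_IB_MLX5_DEVX
# set to the given value for every rank in all three app contexts.
#   $1: value for UCX_IB_MLX5_DEVX (yes or no)
#   $2: path to the executable (placeholder)
build_cmd() {
    devx=$1
    exe=$2
    # Per-rank prefix: the variable is set right in front of the executable,
    # so it does not depend on the launcher forwarding the local environment.
    rank_env="/usr/bin/env OMP_NUM_THREADS=2 UCX_IB_MLX5_DEVX=${devx}"
    printf '%s\n' "mpiexec --display-map --bind-to core --map-by ppr:88:node:pe=2 \
-n 3072 ${rank_env} ${exe} : \
-n 32 ${rank_env} ${exe} : \
-n 240 ${rank_env} ${exe}"
}

# Dry run: show both variants of the A/B comparison.
build_cmd yes ./hafs.x
build_cmd no  ./hafs.x
```

Piping one of the printed lines to sh would start the corresponding run. Running ucx_info -c on a compute node should show the value UCX picks up for this variable by default.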

If I try to use 4 OpenMP threads per rank, i.e.:

mpiexec --display-map --bind-to core --map-by ppr:44:node:pe=4 \
-n 3072 /usr/bin/env OMP_NUM_THREADS=4 ${exe} : \
-n   32 /usr/bin/env OMP_NUM_THREADS=4 ${exe} : \
-n  240 /usr/bin/env OMP_NUM_THREADS=4 ${exe}

the job hangs regardless of whether UCX_IB_MLX5_DEVX is set to yes or no.
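
For reference, the layout arithmetic behind the two launch lines can be sketched as follows. The helper name layout is hypothetical, and the node count assumes the ranks of the three app contexts are packed onto nodes without gaps, which I have not verified against the --display-map output.

```shell
#!/bin/sh
# Hypothetical helper: derive the per-node core usage and node count implied
# by a "--map-by ppr:<ppr>:node:pe=<pe>" setting for the three app contexts.
layout() {
    ppr=$1                                 # ranks per node
    pe=$2                                  # cores (processing elements) per rank
    ranks=$((3072 + 32 + 240))             # total ranks over all app contexts
    nodes=$(( (ranks + ppr - 1) / ppr ))   # ceiling division
    echo "ppr=${ppr} pe=${pe}: ${ranks} ranks, $((ppr * pe)) cores/node, ${nodes} nodes"
}

layout 88 2   # prints: ppr=88 pe=2: 3344 ranks, 176 cores/node, 38 nodes
layout 44 4   # prints: ppr=44 pe=4: 3344 ranks, 176 cores/node, 76 nodes
```

Both layouts claim the same 176 cores per node, but under that packing assumption the 4-thread case spans twice as many nodes, so the two runs differ in scale as well as in threads per rank.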

I'm not sure what to make of these observations.
