I am trying to run the gcn_dist_mnmg.py example from the cuGraph-GNN repository on a single node equipped with 4 H100 SXM GPUs.
When I launch it with torchrun, all ranks fail with the following error:
[rank0]: RuntimeError: non-success value returned from cugraph_homogeneous_uniform_neighbor_sample: CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/home/coder/cugraph/cpp/src/sampling/sampling_post_processing_impl.cuh line=233: Invalid input arguments: if seed_vertex_label_offsets is valid, (*seed_vertex_label_offsets).size() (size of the offset array) should be num_labels + 1.
I tested the example with both ogbn-products and ogbn-arxiv, and the same error occurs in each case.
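For reference, here is how I read the invariant named in the error message: the seed label offsets appear to be a CSR-style prefix-sum array, so with num_labels labels it needs num_labels + 1 entries (a leading 0 plus one end offset per label). The sketch below is plain Python showing only that size relationship. It is not cuGraph's implementation, and the helper name build_label_offsets and the sample values are hypothetical.

```python
from itertools import accumulate


def build_label_offsets(seed_labels, num_labels):
    """Hypothetical helper: CSR-style offsets for seeds grouped by label.

    Assumes every label is an integer in [0, num_labels).
    """
    # Count how many seeds carry each label.
    counts = [0] * num_labels
    for label in seed_labels:
        counts[label] += 1
    # Prefix sum with a leading 0, giving num_labels + 1 entries in total.
    return [0] + list(accumulate(counts))


# Illustrative values only: six seeds spread over four labels (label 2 is empty).
seed_labels = [0, 0, 1, 3, 3, 3]
num_labels = 4
offsets = build_label_offsets(seed_labels, num_labels)

# The check from the error message: size of the offset array == num_labels + 1.
assert len(offsets) == num_labels + 1
```

If that reading is right, the failure would mean the offsets array and the label count passed to the sampler disagree in length, but I have not confirmed where either value is produced.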