Skip to content

Conversation

@devreal
Copy link
Contributor

@devreal devreal commented Oct 23, 2025

Utilize the count and disp arrays for type-punning of int and MPI_Count arguments to datatype creation functions. Store and pack integer or size_t depending on what is needed. Adjust places where bigcount support for datatypes was missing.

This is work in progress and not yet properly tested, other than compiling through it. I wanted to put it out to get some eyes on it to see whether people agree with the design and that it's the right direction, before putting more effort into testing.

Utilize the count and disp arrays for type-punning of int and MPI_Count
arguments to datatype creation functions. Pack integer or MPI_Count
depending on what is needed when packing a datatype. Adjust places
where bigcount support for datatpyes was missing.

Signed-off-by: Joseph Schuchart <[email protected]>
@devreal devreal requested review from Copilot and hppritcha October 23, 2025 00:51
@devreal devreal marked this pull request as draft October 23, 2025 00:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements bigcount support for MPI datatypes by introducing type-punned count and displacement arrays that can store either 32-bit integers or 64-bit size_t/ptrdiff_t values. The implementation uses the least significant bit of pointers as a flag to distinguish between the two representations, allowing the datatype framework to handle both standard MPI and bigcount MPI_Count variants transparently.

Key changes:

  • Introduced opal_count_array_t and opal_disp_array_t wrapper types with helper functions for type-punned storage
  • Updated datatype creation functions to accept these wrapper types instead of raw int/ptrdiff_t arrays
  • Modified datatype argument storage to track both 32-bit and 64-bit count arrays separately
  • Removed conditional bigcount compilation blocks and temporary array allocations throughout the codebase

Reviewed Changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
opal/util/count_disp_array.h New header defining type-punned array wrapper types and manipulation functions
opal/util/Makefile.am Added new header to build system
ompi/util/count_disp_array.h Refactored to delegate to opal implementation
ompi/datatype/ompi_datatype.h Updated function signatures to use count_array_t types and size_t for counts
ompi/datatype/ompi_datatype_args.c Rewrote argument packing/unpacking to handle both int and size_t arrays
ompi/datatype/ompi_datatype_create*.c Updated to use count_array_t wrappers and size_t
ompi/mpi/c/type_*.c.in Removed bigcount overflow checks and temp arrays, use wrapper types
ompi/mca/*/**.c Updated datatype creation calls to use wrapper macros
opal/datatype/opal_datatype_*.c Changed parameter types from int32_t to ssize_t/size_t

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.


rc = ompi_datatype_get_args( type, 0, num_integers, NULL, num_addresses, NULL,
num_datatypes, NULL, combiner );
size_t ci, size_t cl, ca, cd;
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax error: missing type specifier for variables ca and cd. Should be size_t ci, cl, ca, cd;

Suggested change
size_t ci, size_t cl, ca, cd;
size_t ci, cl, ca, cd;

Copilot uses AI. Check for mistakes.
rc = ompi_datatype_get_args( type, 0, num_integers, NULL, num_addresses, NULL,
num_datatypes, NULL, combiner );
size_t ci, size_t cl, ca, cd;
rc = ompi_datatype_get_args( type, 0, num_integers, NULL, &cl, NULL &ca, NULL,
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing comma between NULL and &ca in function call arguments.

Suggested change
rc = ompi_datatype_get_args( type, 0, num_integers, NULL, &cl, NULL &ca, NULL,
rc = ompi_datatype_get_args( type, 0, num_integers, NULL, &cl, NULL, &ca, NULL,

Copilot uses AI. Check for mistakes.
for (i = 0 ; i < number_of_length ; ++i) {
array_of_length[i] = opal_swap_bytes4(array_of_length[i]);
if (i_size == 8) {
size_t* arr = (size_t*)opal_count_array_ptr(array_of_length);
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable array_of_length is undefined in this scope. This appears to be using the wrong variable name - should likely be array_of_counts or similar based on the context.

Copilot uses AI. Check for mistakes.
arr[i] = opal_swap_bytes8(arr[i]);
}
} else {
int* arr = (int*)opal_count_array_ptr(array_of_length);
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable array_of_length is undefined in this scope. Should be array_of_counts or the appropriate variable name.

Copilot uses AI. Check for mistakes.
}
} else {
int* arr = (int*)opal_count_array_ptr(array_of_length);
for (i = 0 ; i < number_of_length ; ++i) {
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable number_of_length is undefined in this scope. Should be number_of_counts or number_of_ints based on the updated variable naming.

Copilot uses AI. Check for mistakes.
cl = count+1;
}
ompi_datatype_create_hindexed( count, a_i[1], disp_array, d[0], &datatype );
ompi_datatype_set_args( datatype, cl, ci, a_i, count, disp_array, 1, d, MPI_COMBINER_HINDEXED );
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguments cl and ci are passed in reverse order. Based on the function signature ompi_datatype_set_args(..., size_t ci, size_t cl, ...), they should be ci, cl not cl, ci.

Suggested change
ompi_datatype_set_args( datatype, cl, ci, a_i, count, disp_array, 1, d, MPI_COMBINER_HINDEXED );
ompi_datatype_set_args( datatype, ci, cl, a_i, count, disp_array, 1, d, MPI_COMBINER_HINDEXED );

Copilot uses AI. Check for mistakes.
ompi_datatype_args_t* args = (ompi_datatype_args_t*)datatype->args;
char* next_packed = (char*)*packed_buffer;

if( ompi_datatype_is_predefined(datatype) ) {
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable iposition is uninitialized at this point. It's declared as int *iposition = NULL; on line 446 but never assigned before use. Should be initialized from next_packed like the later code does.

Suggested change
if( ompi_datatype_is_predefined(datatype) ) {
if( ompi_datatype_is_predefined(datatype) ) {
iposition = (int*)next_packed;

Copilot uses AI. Check for mistakes.
ompi_datatype_set_args( *newtype,
(sizeof(count) == sizeof(size_t)) ? 0 : 1,
(sizeof(count) == sizeof(size_t)) ? 1 : 0,
a_i, 0, OMPI_COUNT_ARRAY_NULL, 1, &oldtype, MPI_COMBINER_CONTIGUOUS );
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using OMPI_COUNT_ARRAY_NULL where OMPI_DISP_ARRAY_NULL is expected based on the function signature. The parameter at this position should be of type opal_disp_array_t.

Suggested change
a_i, 0, OMPI_COUNT_ARRAY_NULL, 1, &oldtype, MPI_COMBINER_CONTIGUOUS );
a_i, 0, OMPI_DISP_ARRAY_NULL, 1, &oldtype, MPI_COMBINER_CONTIGUOUS );

Copilot uses AI. Check for mistakes.
array_of_length = (int*)next_buffer;
next_buffer += (number_of_length * sizeof(int));
if (number_of_ints > 0) {
array_of_ints = (int*)next_buffer;
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing increment of next_buffer after assigning array_of_ints. The pointer should be advanced by number_of_ints * sizeof(int) to maintain proper buffer alignment, similar to how other arrays are handled.

Suggested change
array_of_ints = (int*)next_buffer;
array_of_ints = (int*)next_buffer;
next_buffer += number_of_ints * sizeof(int);

Copilot uses AI. Check for mistakes.
@devreal devreal requested a review from bosilca October 23, 2025 15:28
@hppritcha
Copy link
Member

okay this is pretty big! i will probably not be able to get to most of this till Monday.

The signatures for MPI_Type_get_envelope and MPI_Type_get_envelope_c are
different so we cannot generate them. This file seems to be unused.

Signed-off-by: Joseph Schuchart <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants