@ggouaillardet
Contributor

and use these macros to access oshmem-related per-proc data:

  • OSHMEM_PROC_NUM_TRANSPORTS(proc)
  • OSHMEM_PROC_TRANSPORT_IDS(proc)

Fixes #2023

@ggouaillardet
Contributor Author

@jladd-mlnx can you please review this PR (or have it reviewed)?
you can refer to #2023 if you need more context

@jsquyres FYI

@mike-dubman
Member

👍
@igor-ivanov - plz review as well.

OBJ_CONSTRUCT(&oshmem_proc_lock, opal_mutex_t);

-assert(sizeof(ompi_proc_t) >= sizeof(oshmem_proc_t));
+assert(sizeof(ompi_proc_t) >= sizeof(ompi_proc_t));
Member

logical tautology? Instead you might want to check that you have enough room in the padding for what you want to store inside.

Contributor Author

thanks @bosilca
I ran sed, and tested it without thoroughly checking the new code ...
will review it all tomorrow

@hjelmn
Member

hjelmn commented Aug 29, 2016

👍 👍 👍

@ggouaillardet force-pushed the topic/oshmem_proc_t branch 2 times, most recently from e629994 to dac5c0a (August 30, 2016 00:41)
@ggouaillardet
Contributor Author

i made the requested changes, can you please review them?

 int is_member; /* true if my_pe is part of the group, participate in collectives */
-struct oshmem_proc_t **proc_array; /**< list of pointers to ompi_proc_t structures
+struct ompi_proc_t **proc_array; /**< list of pointers to ompi_proc_t structures
 for each process in the group */
Member

alignment was corrupted

Contributor Author

i do not get it.
a pointer to a pointer to a struct was replaced by another pointer to a pointer to a struct, e.g.
sizeof(struct oshmem_proc_t **) == sizeof(struct ompi_proc_t **)
so how can this break alignment?

or are you saying that the padding array of an ompi_proc_t might not be aligned, and hence the oshmem_proc_t will not be aligned either?
i configured with and without --enable-debug, and i could not see any difference.
shall i simply declare the padding of ompi_proc_t as
void *padding[OMPI_PROC_PADDING_SIZE/sizeof(void *)];
instead of
char padding[OMPI_PROC_PADDING_SIZE];
in order to guarantee it will be aligned?
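
For readers following the alignment question, here is a minimal sketch of the difference between the two declarations. It does not use the real ompi_proc_t: the struct names, the leading `flag` member, and `DEMO_PADDING_SIZE` are all hypothetical and exist only to make the effect visible. A char array has no alignment requirement beyond one byte, whereas an array of `void *` forces the compiler to place the member on a pointer boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size, chosen only for illustration. */
#define DEMO_PADDING_SIZE 16

/* A char array needs only 1-byte alignment, so it can start right
 * after the preceding char member. */
struct demo_char_padding {
    char flag;
    char padding[DEMO_PADDING_SIZE];
};

/* An array of void * inherits pointer alignment, so the compiler
 * inserts padding bytes before it when needed. */
struct demo_ptr_padding {
    char flag;
    void *padding[DEMO_PADDING_SIZE / sizeof(void *)];
};

/* Returns nonzero when a member offset is a multiple of the
 * alignment required for a pointer. */
static int demo_offset_is_pointer_aligned(size_t offset)
{
    return 0 == offset % _Alignof(void *);
}

static int demo_char_padding_is_aligned(void)
{
    return demo_offset_is_pointer_aligned(
        offsetof(struct demo_char_padding, padding));
}

static int demo_ptr_padding_is_aligned(void)
{
    return demo_offset_is_pointer_aligned(
        offsetof(struct demo_ptr_padding, padding));
}

/* The void * variant is guaranteed by the language to be aligned. */
static_assert(0 == offsetof(struct demo_ptr_padding, padding) % _Alignof(void *),
              "void * array must start on a pointer boundary");
```

In a real struct the char array may still happen to land on an aligned offset because of the members that precede it, which would explain seeing no difference in practice; the `void *` declaration simply turns that coincidence into a guarantee.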

Member

I meant a coding style issue (could you align the proc_array field with the others?). Sorry for the confusion.

Contributor Author

I see now, will do
(one more drawback of running sed blindly ...)

@jsquyres
Member

I agree with @igor-ivanov's points. Once all these minor things have been cleaned up, 👍

@ggouaillardet
Contributor Author

i made the requested changes.

note i also added an assert to ensure the padding of an ompi_proc_t is aligned on a pointer.

another option is to redefine padding in ompi_proc_t from

char padding[OMPI_PROC_PADDING_SIZE];
to
void *padding[OMPI_PROC_PADDING_SIZE/sizeof(void *)];

yet another option is to do nothing for now and start enforcing this if/when problems start occurring on alignment-sensitive architectures (such as sparc)

thoughts, anyone?


#define OSHMEM_PE_INVALID (-1)

struct oshmem_proc_data_t {
Member

@ggouaillardet could you add a comment for this struct saying that it must fit within the padding size in ompi_proc_t, as I asked? Thanks.

Contributor Author

@igor-ivanov i did the change but i lost it before committing.

here is the comment

/* This struct will be copied into the padding field of an ompi_proc_t
 * so the size of oshmem_proc_data_t must be less than or equal to
 * OMPI_PROC_PADDING_SIZE */
struct oshmem_proc_data_t {
    char * transport_ids;
    int num_transports;
};
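
The size constraint stated in that comment can also be checked by the compiler rather than by a reader. The following is a sketch only: `DEMO_PROC_PADDING_SIZE` and its value of 16 are assumptions standing in for the real OMPI_PROC_PADDING_SIZE, whose value is not given in this conversation, and `demo_padding_bytes_left` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the real OMPI_PROC_PADDING_SIZE
 * is defined by Open MPI and is not shown in this thread. */
#define DEMO_PROC_PADDING_SIZE 16

struct oshmem_proc_data_t {
    char * transport_ids;
    int num_transports;
};

/* Fails the build, instead of corrupting memory at run time, if the
 * struct ever grows beyond the space reserved for it. */
static_assert(sizeof(struct oshmem_proc_data_t) <= DEMO_PROC_PADDING_SIZE,
              "oshmem_proc_data_t does not fit in the proc padding");

/* Hypothetical helper: bytes of padding still unused after the struct. */
static size_t demo_padding_bytes_left(void)
{
    return DEMO_PROC_PADDING_SIZE - sizeof(struct oshmem_proc_data_t);
}
```

A compile-time check of this kind addresses the earlier review remark about verifying that there is enough room in the padding, and it costs nothing at run time.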

@ibm-ompi

Build Failed with GNU compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/a3d6ae30de97955422bc165cf0f4bb5f

@ibm-ompi

Build Failed with XL compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/2147be152bccf7ff487db01432eadc16

@igor-ivanov
Member

👍

@jjhursey
Member

bot:ibm:retest

@ibm-ompi

Build Failed with GNU compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/7fbfa41567971d4e0611b85f94d104a0

@jsquyres
Member

👍

@jjhursey
Member

IBM CI system seems to have hit an intermittent Jenkins and system failure - it looks like it should be resolved now.
bot:ibm:retest

store oshmem related per proc data in an oshmem_proc_data_t struct,
that is stored in the padding section of an ompi_proc_t

this data can be accessed via the OSHMEM_PROC_DATA(proc) macro

Fixes open-mpi#2023
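
As an illustration of the approach this commit message describes, here is a minimal sketch of storing a small struct in another struct's padding. It is not the code from the PR: `demo_proc_t`, `DEMO_PROC_PADDING_SIZE`, the `demo_proc_data_set`/`demo_proc_data_get` helpers, and the `DEMO_` macros are hypothetical stand-ins, and the real OSHMEM_PROC_DATA macro may well be a plain pointer cast. The sketch copies with memcpy so that it does not depend on the alignment of the padding array.

```c
#include <assert.h>
#include <string.h>

/* Assumed value for illustration only. */
#define DEMO_PROC_PADDING_SIZE 16

struct oshmem_proc_data_t {
    char * transport_ids;
    int num_transports;
};

/* Stand-in for ompi_proc_t: only the padding member matters here. */
typedef struct demo_proc_t {
    int placeholder;
    char padding[DEMO_PROC_PADDING_SIZE];
} demo_proc_t;

static_assert(sizeof(struct oshmem_proc_data_t) <= DEMO_PROC_PADDING_SIZE,
              "oshmem_proc_data_t does not fit in the proc padding");

/* Copy the per-proc data into the padding bytes. */
static void demo_proc_data_set(demo_proc_t *proc,
                               const struct oshmem_proc_data_t *data)
{
    memcpy(proc->padding, data, sizeof(*data));
}

/* Copy the per-proc data back out of the padding bytes. */
static struct oshmem_proc_data_t demo_proc_data_get(const demo_proc_t *proc)
{
    struct oshmem_proc_data_t data;
    memcpy(&data, proc->padding, sizeof(data));
    return data;
}

/* Hypothetical accessors in the spirit of OSHMEM_PROC_NUM_TRANSPORTS
 * and OSHMEM_PROC_TRANSPORT_IDS. */
#define DEMO_PROC_NUM_TRANSPORTS(proc) (demo_proc_data_get(proc).num_transports)
#define DEMO_PROC_TRANSPORT_IDS(proc)  (demo_proc_data_get(proc).transport_ids)
```

A cast-based accessor avoids the copy but relies on the padding being suitably aligned, which is the concern raised earlier in the review.
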
previously, the definition was

struct oshmem_proc_data_t {
    int num_transports;
    char * transport_ids;
};

so on a 64-bit arch, the compiler would very likely insert 4 bytes of
padding between the two fields in order to have transport_ids aligned
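
The effect of member order on layout can be inspected with offsetof. This is a sketch with hypothetical struct and function names; the exact numbers depend on the ABI, so no particular output is claimed. On a typical LP64 platform the int-first order leaves a gap before the pointer, while the pointer-first order moves the unused bytes to the end of the struct.

```c
#include <assert.h>
#include <stddef.h>

/* Previous order: the pointer follows an int. */
struct demo_int_first {
    int num_transports;
    char * transport_ids;
};

/* New order: the pointer comes first. */
struct demo_ptr_first {
    char * transport_ids;
    int num_transports;
};

/* Bytes the compiler inserted between the two members, int first. */
static size_t demo_gap_int_first(void)
{
    return offsetof(struct demo_int_first, transport_ids) - sizeof(int);
}

/* Bytes the compiler inserted between the two members, pointer first. */
static size_t demo_gap_ptr_first(void)
{
    return offsetof(struct demo_ptr_first, num_transports) - sizeof(char *);
}
```

Whichever order is used, the compiler keeps `transport_ids` on a pointer boundary; the reordering only changes where the unused bytes end up.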
@ggouaillardet merged commit 184d53a into open-mpi:master Sep 1, 2016