handle errors gracefully to prevent SEGV #13194
oob_allgather_test() does not check whether the isend() call succeeded, so on error oob_req->reqs[] may be used uninitialized, leading to a SEGV. Signed-off-by: Bruno Faccini <[email protected]>
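The failure mode described above can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: `sketch_isend`, `sketch_req_t`, and `sketch_allgather_step` are stand-ins written for this illustration, not the Open MPI or UCC code. It only shows the pattern of checking the return code of the post before the request slot is read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in Open MPI or UCC. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR   (-1)

typedef struct { int complete; } sketch_req_t;

static sketch_req_t sketch_posted = { 0 };

/* Simulated non-blocking send. On failure it leaves *req untouched,
 * which is the situation the commit message warns about. */
static int sketch_isend(int should_fail, sketch_req_t **req)
{
    if (should_fail) {
        return SKETCH_ERROR;
    }
    *req = &sketch_posted;
    return SKETCH_SUCCESS;
}

/* One step of a test loop: post, check the return code, and only
 * then read the request slot. */
static int sketch_allgather_step(int should_fail, sketch_req_t *reqs[2])
{
    int rc = sketch_isend(should_fail, &reqs[0]);
    if (SKETCH_SUCCESS != rc) {
        return rc; /* bail out before reqs[0] is dereferenced */
    }
    assert(NULL != reqs[0]);
    int done = reqs[0]->complete; /* safe: the post succeeded */
    (void) done;
    return SKETCH_SUCCESS;
}
```

Without the early return, the read of `reqs[0]->complete` would run on whatever value happened to be in the slot.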
3db0950 to bc5e821
MCA_PML_BASE_SEND_STANDARD, comm, &oob_req->reqs[0]));
MCA_PML_CALL(irecv(tmprecv, msglen, MPI_BYTE, recvfrom,
if (OMPI_SUCCESS != rc) {
    return rc;
I'm not sure the conversion is safe here: this function returns ucc_status_t, which might not be compatible with OMPI error codes. If rc is a positive number, it is not actually an error from UCC's perspective, since all errors in UCC are negative. To be safe, the function could just return UCC_ERR_NO_MESSAGE in case of error.
I don't think this is correct. The macro ends up calling the .pml_isend field in the PML struct, which in the case of UCX points to mca_pml_ucx_isend, a function that returns OMPI errors.
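The two points in this thread (the call dispatches through a struct field and yields an OMPI-style code, while the caller is declared to return a UCC-style status) can be sketched together. This is a minimal illustration under stated assumptions: every `sketch_`/`SKETCH_` name is hypothetical, and the numeric values are placeholders rather than values copied from the Open MPI or UCC headers.

```c
#include <assert.h>

/* Placeholder codes; the numbers are assumptions made for this sketch. */
enum { SKETCH_OMPI_SUCCESS = 0, SKETCH_OMPI_ERROR = -1 };
typedef enum {
    SKETCH_UCC_OK             = 0,
    SKETCH_UCC_ERR_NO_MESSAGE = -6
} sketch_ucc_status_t;

/* The sketch relies on the error status being negative. */
static_assert(SKETCH_UCC_ERR_NO_MESSAGE < 0, "error status must be negative");

/* A module table with a function pointer, mimicking how a call macro
 * can dispatch through a struct field. */
typedef struct { int (*isend)(int should_fail); } sketch_pml_t;

static int sketch_backend_isend(int should_fail)
{
    return should_fail ? SKETCH_OMPI_ERROR : SKETCH_OMPI_SUCCESS;
}

static const sketch_pml_t sketch_pml = { sketch_backend_isend };

#define SKETCH_PML_CALL(call) sketch_pml.call

/* The caller returns a UCC-style status, so any non-success code from
 * the PML-style call is mapped to one fixed error instead of being
 * returned as-is. */
static sketch_ucc_status_t sketch_test(int should_fail)
{
    int rc = SKETCH_PML_CALL(isend(should_fail));
    if (SKETCH_OMPI_SUCCESS != rc) {
        return SKETCH_UCC_ERR_NO_MESSAGE;
    }
    return SKETCH_UCC_OK;
}
```

Mapping to a single fixed status loses the original error detail, but it keeps the return value inside the status type the caller is declared to return.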
Got the changes I asked for, but I'll still defer to those who are experts in this area of the code to provide a formal review.
We will file a new PR for this change that adheres to our company policy.
oob_req->msglen = msglen;
oob_req->oob_coll_ctx = oob_coll_ctx;
oob_req->iter = 0;
oob_req->reqs[0] = NULL;
Technically I think this should be MPI_REQUEST_NULL.
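A rough sketch of why a null-request sentinel can be preferable to a plain `NULL` pointer: in MPI, testing `MPI_REQUEST_NULL` reports completion, so a sentinel-initialized slot can pass through a generic test loop without special cases. The code below imitates that idea with hypothetical stand-ins (`sketch_request_null`, `sketch_oob_req_t`, `sketch_test_all`); how the real handle is represented inside Open MPI is not shown here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int complete; } sketch_req_t;

/* Stand-in for a null-request sentinel: a real object that always
 * reads as complete, so it can be dereferenced safely. */
static sketch_req_t sketch_request_null = { 1 };
#define SKETCH_REQUEST_NULL (&sketch_request_null)

typedef struct {
    sketch_req_t *reqs[2];
    int           iter;
} sketch_oob_req_t;

/* Initialize both slots to the sentinel rather than to NULL. */
static void sketch_oob_req_init(sketch_oob_req_t *r)
{
    r->iter    = 0;
    r->reqs[0] = SKETCH_REQUEST_NULL;
    r->reqs[1] = SKETCH_REQUEST_NULL;
}

/* Returns 1 when both slots are complete. Never-posted slots hold the
 * sentinel, so no NULL check is needed on the hot path. */
static int sketch_test_all(const sketch_oob_req_t *r)
{
    assert(NULL != r->reqs[0] && NULL != r->reqs[1]);
    return r->reqs[0]->complete && r->reqs[1]->complete;
}
```

With a plain `NULL`, the same loop would need an explicit check before every dereference.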