osc/pt2pt: fix infinite frag allocation loop #3324
Signed-off-by: Nathan Hjelm <[email protected]> (cherry picked from commit 12b52b2) Signed-off-by: Nathan Hjelm <[email protected]>
|
The IBM CI (GNU Compiler) build failed! Please review the log, linked below. Gist: https://gist.github.com/51b6e8e91673586a73efff49b7b3f3de |
|
The IBM CI failure is valid, but not related to this PR. @rhc54 it's a failure in PMIx (stack below):
#0 0x00003fff7c99593c in pmix_server_init () from /home/mpiczar/jenkins/workspace/ompi_public_pr_release_gnu/ompi-install/lib/libopen-rte.so.0
#1 0x00003fff7c296630 in rte_init () from /home/mpiczar/jenkins/workspace/ompi_public_pr_release_gnu/ompi-install/lib/openmpi/mca_ess_hnp.so
#2 0x00003fff7c95121c in orte_init () from /home/mpiczar/jenkins/workspace/ompi_public_pr_release_gnu/ompi-install/lib/libopen-rte.so.0
#3 0x00003fff7c98ed84 in orte_submit_init () from /home/mpiczar/jenkins/workspace/ompi_public_pr_release_gnu/ompi-install/lib/libopen-rte.so.0
#4 0x00000000100012e8 in orterun (argc=7, argv=0x3fffff3a0328) at orterun.c:133
#5 0x0000000010000fc0 in main (argc=7, argv=0x3fffff3a0328) at main.c:13
It looks like this is just on the |
|
@jjhursey I don't know what I can do with that info - is there any way to tell us what failed in that function? |
|
@jjhursey What's with these warning messages in the IBM CI gist: |
|
Those are all removed symbols, and those components should also have been removed. Maybe some mistiming of PRs? |
|
Hmm. I'm investigating. But I'm wondering if Jenkins merged this into v2.x (which has those symbols) instead of v3.x (which doesn't). |
|
bot:ibm:gnu:retest |
|
Ok so I found the problem in the IBM CI setup and it's fixed now. We were picking up an old install which was throwing off our release build. This looks clean now. Sorry for the noise... 😞 |
ompi/mca/osc/pt2pt/osc_pt2pt_frag.h
    }

    do {
        ret = ompi_osc_pt2pt_frag_alloc (module, target, request_len , buffer, ptr, long_send, buffered);
@jsquyres Looks like the typo is indeed here. The other pt2pt commit is needed.
Ah, ok. I thought somehow it didn't matter over here.
Will you add c72fb30 to this PR?
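For readers following the review thread, here is a minimal, generic sketch of a retry-until-allocated loop like the `do { ... }` loop quoted above. It is not the code from this PR: every name in it (`sketch_frag_t`, `sketch_try_alloc`, `sketch_flush`, `sketch_alloc`, the `SKETCH_*` codes) is hypothetical, and the flush-and-retry policy is an assumption made for illustration. It only shows two general ways such a loop can fail to terminate: retrying a request that can never fit, and having the wrapper call itself instead of the single-attempt helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes, standing in for a project's own. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_OUT_OF_RESOURCE = -2 };

/* Hypothetical fixed-size fragment buffer. */
typedef struct {
    size_t capacity; /* total bytes the fragment can hold */
    size_t used;     /* bytes already reserved */
    int flushes;     /* number of times the fragment was flushed */
} sketch_frag_t;

/* Single attempt: succeeds only if the request fits in the remaining space. */
static int sketch_try_alloc(sketch_frag_t *frag, size_t request_len)
{
    if (request_len > frag->capacity - frag->used) {
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    frag->used += request_len;
    return SKETCH_SUCCESS;
}

/* Flushing makes the whole fragment available again. */
static void sketch_flush(sketch_frag_t *frag)
{
    frag->used = 0;
    frag->flushes++;
}

/* Retry wrapper: flush and try again while the fragment is merely full. */
static int sketch_alloc(sketch_frag_t *frag, size_t request_len)
{
    int ret;

    assert(frag != NULL);

    /* Guard: a request larger than the whole fragment can never succeed,
     * so without this check the loop below would spin forever. */
    if (request_len > frag->capacity) {
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }

    do {
        /* Call the single-attempt helper here. Calling sketch_alloc()
         * itself at this point would recurse without end. */
        ret = sketch_try_alloc(frag, request_len);
        if (SKETCH_ERR_OUT_OF_RESOURCE != ret) {
            break;
        }
        sketch_flush(frag);
    } while (true);

    return ret;
}
```

In this sketch, a request that fits after one flush succeeds on the second attempt, while an oversized request is rejected up front instead of being retried.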
Signed-off-by: Nathan Hjelm <[email protected]> (cherry picked from commit c72fb30) Signed-off-by: Nathan Hjelm <[email protected]>
|
@jsquyres Ready to review. |