prov/lnx: lnx_open_core_domains - Correctly track number of open domains #11905
jfillers wants to merge 1 commit into ofiwg:main
Fixing lnx_open_core_domains to correctly track the number of open domains by decrementing by 1 when a domain fails to open. Removed an unnecessary null check in lnx_domain_close.
Signed-off-by: Thomas Fillers <fillersjt@ornl.gov>
40d707f to 18fbc67
                &cd->cd_domain, context);
-       if (rc)
+       if (rc){
+               lnx_domain->ld_num_doms--;
This only works if the failing domain is the last one.
It stops on the first failure, so you'll never get working, not working, working. You'll always get working, not working, then fail and clean up.
but the count would be incorrect.
I'm not sure I follow. The count is incremented earlier in this function. So when there is a failure, the decrement ensures that the count is set to the number of domains that need to be cleaned up.
At L130, ld_num_doms is increased in a loop. If, say, the loop count is 4, then it's increased by 4. Now suppose the domain creation fails at the first one. The count is decreased by 1, leaving 3, even though no domain was opened. That doesn't reflect how many domains are actually valid.
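To make the arithmetic in this comment concrete, here is a minimal standalone sketch. It is not the libfabric code: `toy_domain`, `toy_count`, and `toy_open_decrement` are hypothetical names, and the two-phase shape (count everything first, then open in order and decrement once on failure) is assumed from the description in this thread.

```c
#include <assert.h>

#define TOY_MAX_DOMS 8

/* Hypothetical stand-in for the LNX domain; not the real libfabric struct. */
struct toy_domain {
    int num_doms;               /* what a close path would iterate over */
    int opened[TOY_MAX_DOMS];   /* 1 if slot i was actually opened */
};

/* Phase 1: count every candidate up front, as the earlier loop is
 * described as doing. */
static void toy_count(struct toy_domain *d, int candidates)
{
    assert(candidates <= TOY_MAX_DOMS);
    d->num_doms = 0;
    for (int i = 0; i < candidates; i++) {
        d->opened[i] = 0;
        d->num_doms++;
    }
}

/* Phase 2: open in order, stop at the first failure, decrement by one.
 * fail_at is the index that fails to open, or -1 for no failure.
 * Returns how many domains were actually opened. */
static int toy_open_decrement(struct toy_domain *d, int fail_at)
{
    int total = d->num_doms;
    int ok = 0;

    for (int i = 0; i < total; i++) {
        if (i == fail_at) {
            d->num_doms--;      /* the change under review */
            break;
        }
        d->opened[i] = 1;
        ok++;
    }
    return ok;
}
```

With four candidates, a failure at index 0 leaves `num_doms` at 3 with zero domains opened, while a failure at index 3 leaves `num_doms` at 3 with three opened. The decrement matches reality only in the second case.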
Right. We can do:
lnx_domain->ld_num_doms = inter_dom_start - 1
That should give the exact number of domains that were successfully started.
Then the close function can always assume that it's closing open ones.
This should work whether there are only shm domains or a combination of shm and other domains, and one of them fails.
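A rough sketch of that idea, under stated assumptions: on failure the count is set to the number of domains opened so far, so the close path only ever touches open ones. The names `sketch_domain`, `sketch_open_exact`, and `sketch_close` are hypothetical, and the loop index stands in for the `inter_dom_start - 1` expression above, whose exact meaning in the real code is not shown in this thread.

```c
#include <assert.h>

#define SKETCH_MAX_DOMS 8

/* Hypothetical stand-in for the LNX domain; not the real libfabric struct. */
struct sketch_domain {
    int num_doms;                  /* number of domains the close path walks */
    int opened[SKETCH_MAX_DOMS];   /* 1 if slot i is currently open */
};

/* Open candidates in order. fail_at is the index that fails, or -1.
 * On failure, record exactly how many were opened before it.
 * Returns 0 on success, -1 on failure. */
static int sketch_open_exact(struct sketch_domain *d, int candidates,
                             int fail_at)
{
    assert(candidates <= SKETCH_MAX_DOMS);
    d->num_doms = candidates;
    for (int i = 0; i < candidates; i++)
        d->opened[i] = 0;

    for (int i = 0; i < candidates; i++) {
        if (i == fail_at) {
            d->num_doms = i;       /* indices 0..i-1 are open */
            return -1;
        }
        d->opened[i] = 1;
    }
    return 0;
}

/* Close everything counted in num_doms. Because the count is exact,
 * no per-entry null/open check is needed; the assert documents that. */
static int sketch_close(struct sketch_domain *d)
{
    int closed = 0;

    for (int i = 0; i < d->num_doms; i++) {
        assert(d->opened[i]);
        d->opened[i] = 0;
        closed++;
    }
    d->num_doms = 0;
    return closed;
}
```

In this sketch a failure at the first candidate leaves the count at 0, so the close path has nothing to do, and a failure at the third leaves it at 2.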