DAOS-18368 rebuild: fix bug of ec_agg_boundary and agg peer update #17324
base: master
Ticket title is 'Data corruption observed with master branch under MDonSSD environment.'
122e9e8 to 10fa58e
Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17324/2/display/redirect
10fa58e to 66f44a9
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17324/4/testReport/
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17324/4/execution/node/1313/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17324/4/execution/node/1323/log
just refresh to change a few logs.
32db84f to e1c08ea
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17324/7/testReport/
e1c08ea to c474fe4
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17324/8/testReport/
c474fe4 to 3b13f7d
Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17324/8/execution/node/1075/log
3b13f7d to 1ba9f49
56751dc to f4bc272
kccain
left a comment
rebuild/ source file changes LGTM.
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17324/12/execution/node/1277/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17324/12/execution/node/1318/log
1. Fix a bug where ec_agg_boundary was used before checking that it is valid.
2. Add more logging for the case where a rebuild fetch gets a zero iod_size, to provide hints about the layout information.
Signed-off-by: Xuezhao Liu <[email protected]>
Signed-off-by: Xuezhao Liu <[email protected]>
Signed-off-by: Xuezhao Liu <[email protected]>
Some failures need to be retried. Signed-off-by: Xuezhao Liu <[email protected]>
A reint rank is excluded from rebuild/reclaim if its co_in_ver exceeds the rebuild ver; in that case its completion should be set in the rebuild leader to avoid a possible hang. Refine the dtx_resync wait handling: there is no need to wait if a resync was already done previously. Add some logs. Signed-off-by: Xuezhao Liu <[email protected]>
17ef124
f4bc272 to 17ef124