Merge release/2.6 into google/2.6 #16807
Merged
Contributor
jolivier23
commented
Sep 3, 2025
- DAOS-17737 dtx: handle race between DTX refresh and DTX abort - b26 (#16536)
- SRE-3194 build: Remove redhat-lsb-core from Dockerfile.mockbuild (#16589) (#16614)
- DAOS-17738 client: reset DTX base UUID after fork - b26 (#16540)
- DAOS-17748 control: Add verification to raft-db Add/Remove wrappers (… (#16617)
- DAOS-17780 bio: Fix use-after-free in JSON parsing (#16592) (#16593)
- DAOS-17772 rebuild: fix a race condition between fetch and aggregation (#16645)
- DAOS-17547 rebuild: error on stopped ds_pool_child (#16382) (#16600)
- DAOS-17492 control: Ensure updated members can become voters (#16392) (#16665)
- SRE-3236 LEAP-15 LUA-LMOD hack fix for Leap 15.6 (#16676)
- DAOS-17835 doc: Document how to add/remove MS replica (#16651) (#16683)
- DAOS-17534 dtx: not add cont to batched commit list if being stopped - b26 (#16669)
- DAOS-17872 build: Tag 2.6.4 rc2 (#16705)
- DAOS-17534 dtx: avoid repeatedly adding item into batched commit list - b26 (#16726)
- DAOS-17877 cq: give create_release.yml write permission (#16708) (#16722)
- DAOS-17876 control: Expect lowercase hostname in unit test (#16710) (#16723)
- DAOS-17828 vos: fix a pointer misuse (#16701)
- DAOS-16557 test: Add debug to NvmeEnospace ftest (#15559) (#16728)
- DAOS-17783 test: Suppress NLT false positives in Go (#16615) (#16680)
- DAOS-17591 dtx: handle orphan DTX entries - b26 (#16483)
…16536) If the current transaction is aborted by a race while dtx_refresh() yields, return a non-zero value to the sponsor to trigger a client-side RPC retry. That leaves the related transaction in a cleaner state. Add more checks after dtx_refresh() to avoid re-initializing an aborted DTX. The patch also cleans up the usage of vos_dtx_validation() to handle the various cases where a DTX is aborted (and possibly resent afterwards). Signed-off-by: Fan Yong <[email protected]>
) (#16614) Signed-off-by: Ryon Jensen <[email protected]>
Reset the DTX base UUID after fork so that parent and child processes do not generate the same DTX ID. Also change the vos_dtx logic to avoid an assertion when a client reuses a DTX ID. Signed-off-by: Fan Yong <[email protected]>
#16617) Address an issue where snapshot load fails because of an inconsistency in the Member address-to-UUID map. Avoid duplicate UUID member entries by fixing the removeMember function. Signed-off-by: Tom Nabarro <[email protected]> Signed-off-by: Kris Jacque <[email protected]>
Address a use-after-free in the JSON parsing code that extracts daos_data from the SPDK engine-bootstrap config file. Avoid freeing the JSON context until the relevant objects have been read and stored elsewhere. Signed-off-by: Tom Nabarro <[email protected]>
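The ordering rule in this fix (copy out what you need, then free the parser context) can be sketched in C as follows. This is not the SPDK or DAOS code: `parse_ctx` and its helpers are hypothetical, and a plain `key=value` string stands in for the JSON config.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser context that owns the buffer holding parsed values. */
struct parse_ctx {
	char *buf;
};

static int parse_ctx_init(struct parse_ctx *ctx, const char *text)
{
	size_t len = strlen(text) + 1;

	ctx->buf = malloc(len);
	if (ctx->buf == NULL)
		return -1;
	memcpy(ctx->buf, text, len);
	return 0;
}

static void parse_ctx_fini(struct parse_ctx *ctx)
{
	free(ctx->buf);
	ctx->buf = NULL;
}

/* Returns a pointer INTO ctx->buf: valid only while ctx is alive. */
static const char *parse_ctx_value(const struct parse_ctx *ctx)
{
	const char *sep = strchr(ctx->buf, '=');

	return sep != NULL ? sep + 1 : NULL;
}

/* Copy the value out first, then release the context.
 * The caller frees the returned string. Returns NULL on failure. */
static char *read_value(const char *text)
{
	struct parse_ctx ctx;
	const char *val;
	char *copy = NULL;

	if (parse_ctx_init(&ctx, text) != 0)
		return NULL;

	val = parse_ctx_value(&ctx);
	if (val != NULL) {
		size_t len = strlen(val) + 1;

		copy = malloc(len);
		if (copy != NULL)
			memcpy(copy, val, len);
	}

	parse_ctx_fini(&ctx); /* safe: nothing points into ctx.buf any more */
	return copy;
}
```

Returning `val` directly after `parse_ctx_fini()` would reproduce the class of bug the commit describes, because the pointer would refer to freed memory.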
#16645) Add the ORF_FETCH_EPOCH_EC_AGG_BOUNDARY flag for rebuild fetch. The container's sc_ec_agg_eph_boundary may differ between the initiator and target engines of a rebuild fetch, so the fetch epoch selected by the initiator may be lower than the readable epoch on the target engine if VOS aggregation has merged adjacent extents to a higher epoch. In that case, increase the fetch epoch to sc_ec_agg_eph_boundary. Signed-off-by: Xuezhao Liu <[email protected]>
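The epoch adjustment described above amounts to clamping the requested epoch up to the target's local boundary when the flag is set. A minimal C sketch, assuming a hypothetical flag constant and function name (the real flag is ORF_FETCH_EPOCH_EC_AGG_BOUNDARY; its value and the surrounding fetch path are not shown in this PR text):

```c
#include <stdint.h>

/* Hypothetical stand-in for the flag named in the commit message. */
#define FETCH_EC_AGG_BOUNDARY (1u << 0)

/* Pick the epoch the target should actually fetch at.
 * fetch_epoch:  epoch chosen by the rebuild initiator
 * agg_boundary: the target's local EC aggregation boundary epoch */
static uint64_t effective_fetch_epoch(uint64_t fetch_epoch,
				      uint64_t agg_boundary, uint32_t flags)
{
	if ((flags & FETCH_EC_AGG_BOUNDARY) != 0 && fetch_epoch < agg_boundary)
		return agg_boundary; /* extents below the boundary may be merged */
	return fetch_epoch;
}
```

Requests without the flag, or with an epoch already at or above the boundary, pass through unchanged.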
When a faulty SSD is replaced, reintegration is triggered automatically once local setup completes (ds_pool_child started). However, an admin could manually run "dmg pool reintegrate" before local setup is done; in that case, return a retryable error so that reintegration keeps retrying until the local ds_pool_child has started. Signed-off-by: Niu Yawei <[email protected]>
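The retry behavior can be sketched in C as below. All names are hypothetical, `-EAGAIN` stands in for whatever retryable DAOS error code the patch returns, and a step counter simulates local setup making progress between attempts.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-target pool child state. */
struct pool_child {
	bool started;          /* local setup has completed */
	int  setup_steps_left; /* simulated remaining setup work */
};

/* Simulate local setup making progress between attempts. */
static void pool_child_setup_step(struct pool_child *child)
{
	if (child->setup_steps_left > 0 && --child->setup_steps_left == 0)
		child->started = true;
}

/* One reintegration attempt: retryable error until the child has started. */
static int reintegrate_try(const struct pool_child *child)
{
	if (!child->started)
		return -EAGAIN;
	return 0;
}

/* Keep retrying while the error is retryable, up to max_attempts. */
static int reintegrate_retry(struct pool_child *child, int max_attempts)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_attempts; i++) {
		rc = reintegrate_try(child);
		if (rc != -EAGAIN)
			break;
		pool_child_setup_step(child); /* stand-in for waiting */
	}
	return rc;
}
```

The point of a distinct retryable code is that the caller can tell "not ready yet" apart from a hard failure and keep trying only in the former case.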
…#16665) When adding a new access point to config and restarting, the member is updated, not added, so it was not being considered a voter in the MS leader election. Signed-off-by: Kris Jacque <[email protected]>
Backport of PR-16586 and updated with: ci/provisioning/post_provision_config_nodes_LEAP.sh: Something in Leap-15.6 added an additional dependency of the distro provided lua-lmod that is not removed when lua-lmod is removed and blocks the install of the newer lua-lmod. Signed-off-by: John E. Malmberg <[email protected]>
Add documentation on how to add or remove MS replicas. Also remove unused variable in a unit test. Signed-off-by: Kris Jacque <[email protected]>
…- b26 (#16669) When closing the container, the dtx_flush_on_close logic tries to commit pending committable DTX entries. If that flush fails for some reason, it asks the async-batched-commit logic to do so later. But if the container is stopping, do not re-add it to the async-batched-commit list; otherwise the stop logic may be blocked for a long time (or forever). Similar cases apply when opening/closing the container. Also includes some code cleanup for the DTX logic. Signed-off-by: Fan Yong <[email protected]>
Tag second release candidate for 2.6.4. Signed-off-by: Dalton Bohning <[email protected]>
… - b26 (#16726) The DTX logic maintains a batched commit list. Each opened container has its own 'dtx_batched_cont_args' (dbca) item in that list. If a container is already in the list, do not re-add it; otherwise the list may be corrupted. Signed-off-by: Fan Yong <[email protected]>
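Why a double add corrupts the list is easiest to see with an intrusive doubly linked list. The C sketch below is a generic illustration with hypothetical names (`list_node`, `batched_item`, `batched_add_once`); it is not the DAOS list implementation or the actual dbca structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal intrusive circular doubly linked list. */
struct list_node {
	struct list_node *prev;
	struct list_node *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

/* A node that points at itself is not on any list. */
static bool list_is_unlinked(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static size_t list_length(const struct list_node *head)
{
	size_t count = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		count++;
	return count;
}

/* Hypothetical per-container entry on the batched commit list. */
struct batched_item {
	struct list_node link;
	int              cont_id;
};

/* Queue the item only if it is not already queued.
 * Returns true if it was added, false if it was already on a list. */
static bool batched_add_once(struct list_node *head, struct batched_item *item)
{
	if (!list_is_unlinked(&item->link))
		return false;
	list_add_tail(&item->link, head);
	return true;
}
```

Calling `list_add_tail()` on a node that is already linked overwrites its `prev`/`next` pointers without unlinking it first, which can drop other entries from traversal; the guard in `batched_add_once()` avoids that.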
) Give create_release.yml write permission so it can create tags. Also exit on error. Signed-off-by: Dalton Bohning <[email protected]>
…16723) If the host where the test was run had a capital letter in the name, this test failed. Fault domain code normalizes names to lowercase. Signed-off-by: Kris Jacque <[email protected]>
* DAOS-17828 vos: fix a pointer misuse (#16635) A handle passed to evt_iter_probe() is an EVT context not a VOS iterator. Signed-off-by: Jan Michalski <[email protected]>
Add aggregation debugging information on the state of the pool to allow debugging if ENOSPACE error happens unexpectedly. Signed-off-by: Cedric Koch-Hofer <[email protected]>
An additional case of tsan::TraceRestartMemoryAccess with a slightly different call stack. This is a false positive coming from the Go runtime. Also moved another tsan suppression to be near similar ones, and named them more descriptively. Signed-off-by: Kris Jacque <[email protected]>
Our current DTX resync mechanism performs DTX-leader-sponsored scanning for the specified container. But if the current DTX leader is dead, the new DTX leader switches to another target on which the related entry may not exist or may already have been committed. In that case, DTX resync on the new DTX leader will not handle that DTX entry, so the DTX entries on the other non-leaders may become "orphans". Such orphan DTX entries may affect subsequent rebuild. This patch introduces a DTX orphan cleanup mechanism to handle them before rebuild scans the related container. Signed-off-by: Fan Yong <[email protected]>
…le/2.6 Change-Id: Ia2ca4e64b86cdd8b7641e9c15ad9ada56585b5f9 Signed-off-by: Jeff Olivier <[email protected]>
Errors are: component not formatted correctly, ticket number prefix incorrect, PR title is malformatted (see https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments), unable to load ticket data.
wangdi1
approved these changes
Sep 5, 2025