Merge release/2.6 into google/2.6 #16807
Merged
jolivier23 merged 20 commits into google/2.6 on Sep 5, 2025
jolivier23 (Contributor) commented Sep 3, 2025
- DAOS-17737 dtx: handle race between DTX refresh and DTX abort - b26 (#16536)
- SRE-3194 build: Remove redhat-lsb-core from Dockerfile.mockbuild (#16589) (#16614)
- DAOS-17738 client: reset DTX base UUID after fork - b26 (#16540)
- DAOS-17748 control: Add verification to raft-db Add/Remove wrappers (… (#16617)
- DAOS-17780 bio: Fix use-after-free in JSON parsing (#16592) (#16593)
- DAOS-17772 rebuild: fix a race condition between fetch and aggregation (#16645)
- DAOS-17547 rebuild: error on stopped ds_pool_child (#16382) (#16600)
- DAOS-17492 control: Ensure updated members can become voters (#16392) (#16665)
- SRE-3236 LEAP-15 LUA-LMOD hack fix for Leap 15.6 (#16676)
- DAOS-17835 doc: Document how to add/remove MS replica (#16651) (#16683)
- DAOS-17534 dtx: not add cont to batched commit list if being stopped - b26 (#16669)
- DAOS-17872 build: Tag 2.6.4 rc2 (#16705)
- DAOS-17534 dtx: avoid repeatedly adding item into batched commit list - b26 (#16726)
- DAOS-17877 cq: give create_release.yml write permission (#16708) (#16722)
- DAOS-17876 control: Expect lowercase hostname in unit test (#16710) (#16723)
- DAOS-17828 vos: fix a pointer misuse (#16701)
- DAOS-16557 test: Add debug to NvmeEnospace ftest (#15559) (#16728)
- DAOS-17783 test: Suppress NLT false positives in Go (#16615) (#16680)
- DAOS-17591 dtx: handle orphan DTX entries - b26 (#16483)
…16536) If the current transaction is aborted by a race while dtx_refresh() yields, return a non-zero value to the sponsor to trigger a client-side RPC retry. That leaves the related transaction's status cleaner. Add more checks after dtx_refresh() to avoid re-initializing an aborted DTX. The patch also cleans up the usage of vos_dtx_validation() to handle the various DTX abort (and possible resend afterwards) cases. Signed-off-by: Fan Yong <fan.yong@hpe.com>
) (#16614) Signed-off-by: Ryon Jensen <ryon.jensen@hpe.com>
Reset the DTX base UUID after fork so that parent and child processes do not generate the same DTX ID. It also changes the vos_dtx logic to avoid an assertion when a client reuses a DTX ID. Signed-off-by: Fan Yong <fan.yong@hpe.com>
#16617) Address an issue where snapshot load fails because of an inconsistency in the Member address-to-UUID map. Avoid duplicate UUID member entries by fixing the removeMember function. Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com> Signed-off-by: Kris Jacque <kris.jacque@hpe.com>
Address a use-after-free in the JSON parsing code that extracts daos_data from the SPDK engine-bootstrap config file. Avoid freeing the JSON context until the relevant objects have been read and stored elsewhere. Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>
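The ordering described here (copy out what you need, then free the parser context) can be sketched in isolation. Everything below is a hypothetical stand-in: `struct parse_ctx`, `ctx_value()`, and the `key=value` input replace the real SPDK JSON context and are not the bio code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parse context that owns the buffer all parsed values point into. */
struct parse_ctx {
	char *buf;
};

static struct parse_ctx *
ctx_open(const char *text)
{
	struct parse_ctx *ctx = malloc(sizeof(*ctx));
	size_t            len = strlen(text) + 1;

	if (ctx == NULL)
		return NULL;
	ctx->buf = malloc(len);
	if (ctx->buf == NULL) {
		free(ctx);
		return NULL;
	}
	memcpy(ctx->buf, text, len);
	return ctx;
}

/* Returns a pointer INTO ctx->buf (text after '='); valid only while ctx lives. */
static const char *
ctx_value(const struct parse_ctx *ctx)
{
	const char *eq = strchr(ctx->buf, '=');

	return eq != NULL ? eq + 1 : NULL;
}

static void
ctx_free(struct parse_ctx *ctx)
{
	free(ctx->buf);
	free(ctx);
}

/* Copy the value out first, free the context last. Caller frees the result. */
char *
read_value(const char *text)
{
	struct parse_ctx *ctx;
	const char       *val;
	char             *copy = NULL;

	assert(text != NULL);
	ctx = ctx_open(text);
	if (ctx == NULL)
		return NULL;

	val = ctx_value(ctx);
	if (val != NULL) {
		size_t len = strlen(val) + 1;

		copy = malloc(len);
		if (copy != NULL)
			memcpy(copy, val, len);
	}

	ctx_free(ctx); /* safe: nothing we return points into ctx->buf */
	return copy;
}
```

Returning `val` directly and freeing the context first would reproduce the bug class this commit addresses.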
#16645) Add the ORF_FETCH_EPOCH_EC_AGG_BOUNDARY flag for rebuild fetch. The container's sc_ec_agg_eph_boundary may differ between the initiator and target engines of the rebuild fetch, so the fetch epoch selected by the initiator may be lower than the readable epoch on the target engine if VOS aggregation merged adjacent extents to a higher epoch. In this case, increase the fetch epoch to sc_ec_agg_eph_boundary. Signed-off-by: Xuezhao Liu <xuezhao.liu@hpe.com>
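Read literally, the rule above is a clamp: when the flag is set and the requested epoch is below the target's boundary, fetch at the boundary instead. A minimal sketch, assuming a hypothetical bit value for the flag and a hypothetical helper name `rebuild_fetch_epoch()`; only the flag name and the boundary field come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Flag name from the commit message; the bit value here is hypothetical. */
#define ORF_FETCH_EPOCH_EC_AGG_BOUNDARY (1u << 0)

/*
 * Pick the epoch the target should fetch at. `agg_boundary` stands in for the
 * target-side sc_ec_agg_eph_boundary.
 */
uint64_t
rebuild_fetch_epoch(uint64_t epoch, uint64_t agg_boundary, uint32_t flags)
{
	uint64_t out = epoch;

	if ((flags & ORF_FETCH_EPOCH_EC_AGG_BOUNDARY) && epoch < agg_boundary)
		out = agg_boundary;

	assert(out >= epoch); /* the fetch epoch is only ever raised */
	return out;
}
```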
When a faulty SSD is replaced, reintegration is triggered automatically once local setup completes (ds_pool_child started). However, an admin could manually run "dmg pool reintegrate" before local setup is done; in that case we need to return a retry-able error so that reintegration keeps retrying until the local ds_pool_child has started. Signed-off-by: Niu Yawei <yawei.niu@hpe.com>
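The retry contract can be sketched as follows. This is an assumption-laden illustration: `struct pool_child`, `reint_try()`, and `reint_with_retry()` are hypothetical names, and `-EAGAIN` stands in for whatever retry-able error code the real patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-target state; `started` flips once local setup completes. */
struct pool_child {
	bool started;
};

/* Return -EAGAIN (a stand-in for a retry-able error) until the child is started. */
int
reint_try(const struct pool_child *child)
{
	assert(child != NULL);
	if (!child->started)
		return -EAGAIN;
	return 0;
}

/* Caller side: keep retrying on the retry-able error, up to max_tries attempts. */
int
reint_with_retry(struct pool_child *child, int max_tries,
		 void (*wait_cb)(struct pool_child *))
{
	int rc = -EAGAIN;
	int i;

	for (i = 0; i < max_tries && rc == -EAGAIN; i++) {
		rc = reint_try(child);
		if (rc == -EAGAIN && wait_cb != NULL)
			wait_cb(child); /* e.g. sleep, or let setup make progress */
	}
	return rc;
}
```

Any error other than the retry-able one ends the loop immediately, so genuine failures are still reported to the caller.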
Backport of PR-16586 and updated with: ci/provisioning/post_provision_config_nodes_LEAP.sh: Something in Leap-15.6 added an additional dependency of the distro provided lua-lmod that is not removed when lua-lmod is removed and blocks the install of the newer lua-lmod. Signed-off-by: John E. Malmberg <john.malmberg@hpe.com>
…- b26 (#16669) When closing the container, the dtx_flush_on_close logic tries to commit pending committable DTX entries. If that flush fails for some reason, it asks the async batched-commit logic to do so later. But if the container is stopping, do not re-add it to the async batched-commit list; otherwise the stop logic may be blocked for a long time (or forever). Similar cases apply when opening/closing the container. Also includes some code cleanup for the DTX logic. Signed-off-by: Fan Yong <fan.yong@hpe.com>
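The decision described above comes down to one guard on the failure path. A minimal sketch, assuming a hypothetical `struct cont` with `stopping` and `queued` fields and an injected failure in place of a real flush; none of these names are from the DAOS sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container state used only for this sketch. */
struct cont {
	bool stopping; /* set once the stop path has begun */
	bool queued;   /* true while on the async batched-commit list */
	int  pending;  /* committable entries still waiting */
};

/* Stand-in for the synchronous flush; fails when told to. */
static int
flush_pending(struct cont *c, bool fail)
{
	if (fail)
		return -1;
	c->pending = 0;
	return 0;
}

/* On close: flush; on failure, defer to the batched list unless stopping. */
int
flush_on_close(struct cont *c, bool inject_failure)
{
	int rc;

	assert(c != NULL);
	rc = flush_pending(c, inject_failure);
	if (rc != 0 && !c->stopping)
		c->queued = true; /* let the background commit retry later */
	return rc;
}
```

With `stopping` set, a failed flush is simply reported, so the stop path never waits on work it has just re-queued.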
Tag second release candidate for 2.6.4. Signed-off-by: Dalton Bohning <dalton.bohning@hpe.com>
… - b26 (#16726) The DTX logic maintains a batched commit list. Each opened container has its own 'dtx_batched_cont_args' (dbca) item in that list. If a container is already in the list, do not re-add it; otherwise the list may be broken. Signed-off-by: Fan Yong <fan.yong@hpe.com>
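Why a second add breaks the list is easiest to see with an intrusive list, where linking a node that is already linked overwrites its neighbour pointers. The sketch below uses a hand-rolled circular list and hypothetical names (`struct node`, `struct batch_item`, `batch_add_once()`); it is not the DAOS list API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list; an unlinked node points at itself. */
struct node {
	struct node *prev;
	struct node *next;
};

void
node_init(struct node *n)
{
	n->prev = n;
	n->next = n;
}

static bool
node_unlinked(const struct node *n)
{
	return n->next == n;
}

/* Hypothetical per-container item, standing in for 'dtx_batched_cont_args'. */
struct batch_item {
	struct node link;
};

/* Append item to the list headed by `head`; a second add is a no-op. */
bool
batch_add_once(struct node *head, struct batch_item *item)
{
	struct node *n = &item->link;

	assert(head != NULL && item != NULL);
	if (!node_unlinked(n))
		return false; /* already queued: re-linking would corrupt the list */

	n->prev          = head->prev;
	n->next          = head;
	head->prev->next = n;
	head->prev       = n;
	return true;
}
```

The check relies on every item being initialized with `node_init()` before its first add.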
* DAOS-17828 vos: fix a pointer misuse (#16635) A handle passed to evt_iter_probe() is an EVT context not a VOS iterator. Signed-off-by: Jan Michalski <jan-marian.michalski@hpe.com>
An additional case of tsan::TraceRestartMemoryAccess with a slightly different call stack. This is a false positive coming from the Go runtime. Also moved another tsan suppression to be near similar ones, and named them more descriptively. Signed-off-by: Kris Jacque <kris.jacque@hpe.com>
Our current DTX resync mechanism has the DTX leader scan the specified container. If the current DTX leader is dead, the new DTX leader switches to another target on which the related entry may not exist or may already have been committed. In that case, DTX resync on the new DTX leader will not handle that DTX entry, so the copies of the entry on the other non-leaders may become "orphans". Such orphan DTX entries may affect subsequent rebuild. This patch introduces a DTX orphan cleanup mechanism to handle them before rebuild scans the related container. Signed-off-by: Fan Yong <fan.yong@hpe.com>
…le/2.6 Change-Id: Ia2ca4e64b86cdd8b7641e9c15ad9ada56585b5f9 Signed-off-by: Jeff Olivier <jeffolivier@google.com>
Errors are: component not formatted correctly; ticket number prefix incorrect; PR title is malformatted (see https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments); unable to load ticket data.
wangdi1 approved these changes Sep 5, 2025