Rpadma2/daos 265 p1 #18468

Draft
rpadma2 wants to merge 17 commits into release/2.6 from rpadma2/daos_265_p1

rpadma2 commented Jun 9, 2026

Contributor

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

jgmoore-or and others added 17 commits January 7, 2026 16:04
…gnal. (#17268)

* DAOS-17931 engine: Terminate engine process upon receipt of SIGBUS signal.

Signed-off-by: Joseph Moore <joseph.moore@hpe.com>
…IGBUS signal.

Allow-unstable-test: true

Signed-off-by: Joseph Moore <joseph.moore@hpe.com>
…gnal.

Signed-off-by: Joseph Moore <joseph.moore@hpe.com>
…gnal.

Signed-off-by: Joseph Moore <joseph.moore@hpe.com>
…gnal.

Skip-build-el8-gcc: true

Signed-off-by: Joseph Moore <joseph.moore@hpe.com>
…gnal.

Signed-off-by: Joseph Moore <joseph.moore@hpe.com>
There is a race condition between the IO RPC handler and DTX resync:
resync may commit or abort the DTX while the related DTX leader is
waiting for non-leader participants. To handle this case properly,
whenever an active DTX entry is evicted from the cache, whether for
commit or abort, we need to set dtx_handle::dth_need_validation to
notify the DTX owner of the event.

Signed-off-by: Fan Yong <fan.yong@hpe.com>
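
The eviction rule in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct active_dtx_entry`, `evict_active_entry()`, and `owner_must_revalidate()` are hypothetical names, and the structures are simplified stand-ins rather than the real DAOS definitions. Only the field name `dth_need_validation` comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the real DTX handle. */
struct dtx_handle {
	/* Set when the active entry was evicted while the owner was waiting. */
	bool dth_need_validation;
};

/* Hypothetical cached entry for an active (in-flight) DTX. */
struct active_dtx_entry {
	struct dtx_handle *owner;    /* handle of the waiting owner, or NULL */
	bool               in_cache; /* true while the entry is cached */
};

/*
 * Evict an active entry from the cache. The same path is taken for both
 * commit and abort, so the owner is flagged in either case.
 */
static void
evict_active_entry(struct active_dtx_entry *entry)
{
	entry->in_cache = false;
	if (entry->owner != NULL)
		entry->owner->dth_need_validation = true;
}

/*
 * Checked by the owner once it resumes after waiting for the non-leader
 * participants: true means the DTX state changed underneath it and must
 * be re-validated before the owner continues.
 */
static bool
owner_must_revalidate(const struct dtx_handle *dth)
{
	return dth->dth_need_validation;
}
```

In this sketch the flag is set unconditionally on eviction, which mirrors the "whether for commit or abort" wording; how the real owner reacts to the flag is not described in the commit message and is left out here.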
It is unnecessary to evict the vos object from the cache after the
related DTX is committed; otherwise, other concurrent modifications
against the same object shard may be required to retry.

Signed-off-by: Fan Yong <fan.yong@hpe.com>
The latest available leap 15.5 mercury RPM has version 2.4.1-2.
This version must be used for a proper DAOS build on leap 15.5.

Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Priority: 2
Cancel-prev-build: false
Skip-unit-test: true
Skip-unit-test-memcheck: true
Skip-test-el-9-rpms: true
Skip-test-leap-15-rpms: true
Skip-func-test-el9: true
Skip-func-test-leap15: false
Skip-test-el-8-rpms: true
Skip-func-hw-test: true
Skip-func-test-el8: true
Skip-fault-injection-test: true
Skip-NLT: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Priority: 2
Cancel-prev-build: false
Skip-unit-test: true
Skip-unit-test-memcheck: true
Skip-test-el-9-rpms: true
Skip-test-leap-15-rpms: true
Skip-func-test-el9: true
Skip-func-test-leap15: false
Skip-test-el-8-rpms: true
Skip-func-hw-test: true
Skip-func-test-el8: true
Skip-fault-injection-test: true
Skip-NLT: true
This patch mainly includes the following fixes:

1. When the DTX leader switches, the old DTX leader may have wanted
   to abort a DTX but not completed the abort before being evicted.
   The new DTX leader may then re-execute the related modification
   successfully and try to commit that DTX. Without control, an
   in-flight DTX ABORT RPC from the old DTX leader may abort the
   DTX that the new DTX leader is about to commit, breaking DTX
   semantics.

   The patch adds a @Version parameter to DTX abort: when the new
   DTX leader handles a resent RPC from the client, the related DTX
   version is refreshed if the DTX was prepared by the old DTX
   leader; whenever a DTX is aborted locally, the logic compares
   the version from the ABORT request with the related DTX version
   and skips a stale ABORT RPC.

2. vos_dtx_load_mbs() may be triggered before the related DTX is
   prepared locally. In that case, the related MBS information is
   empty, and we need to handle it to avoid a segmentation fault.

3. Handle the race between DTX resync and the IO handler for a
   resent RPC.
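
The version check described in fix 1 can be sketched as below. This is a minimal illustration, not the patch itself: `struct dtx_record`, `refresh_version()`, and `abort_local()` are hypothetical names, and the assumption that a request is stale when its version is lower than the local one is mine; the commit message only says the versions are compared and stale ABORT RPCs are skipped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified local DTX record; not the real DAOS layout. */
struct dtx_record {
	uint32_t version; /* refreshed when a new leader handles a resent RPC */
	bool     aborted; /* true once the DTX has been aborted locally */
};

/*
 * The new leader handles a resent client RPC for a DTX that the old
 * leader already prepared: bump the local version (never move it back).
 */
static void
refresh_version(struct dtx_record *rec, uint32_t new_version)
{
	if (new_version > rec->version)
		rec->version = new_version;
}

/*
 * Handle an ABORT request locally. Returns true if the DTX was aborted,
 * false if the request was skipped as stale (assumed here to mean that
 * the request carries an older version than the local record).
 */
static bool
abort_local(struct dtx_record *rec, uint32_t req_version)
{
	if (req_version < rec->version)
		return false; /* in-flight ABORT from the old leader */
	rec->aborted = true;
	return true;
}
```

With this shape, an ABORT sent by the old leader before the switch carries the old version and is ignored once the new leader has refreshed the record, while an ABORT issued by the new leader still takes effect.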

Skip-build-leap15-rpm: true
Skip-func-test-leap15: true

Signed-off-by: Fan Yong <fan.yong@hpe.com>
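
Fix 2 in the commit message above (loading MBS before the DTX is prepared locally) amounts to a guard against empty MBS data. A rough sketch follows, assuming a hypothetical `struct dtx_entry` and a hypothetical `load_mbs()` helper; it does not reproduce the signature or behavior of the real `vos_dtx_load_mbs()`, and the choice of `-ENOENT` for the empty case is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified DTX entry; MBS fields are unset until prepare. */
struct dtx_entry {
	const void *mbs_data; /* NULL until the DTX is prepared locally */
	size_t      mbs_len;  /* 0 until the DTX is prepared locally */
};

/*
 * Copy the MBS information out of an entry. If the entry has not been
 * prepared yet, report that to the caller instead of dereferencing the
 * empty MBS pointer. On success the caller owns *out and must free() it.
 */
static int
load_mbs(const struct dtx_entry *entry, void **out, size_t *out_len)
{
	void *copy;

	*out = NULL;
	*out_len = 0;

	if (entry->mbs_data == NULL || entry->mbs_len == 0)
		return -ENOENT; /* assumed error code for "not prepared yet" */

	copy = malloc(entry->mbs_len);
	if (copy == NULL)
		return -ENOMEM;

	memcpy(copy, entry->mbs_data, entry->mbs_len);
	*out = copy;
	*out_len = entry->mbs_len;
	return 0;
}
```

The point of the sketch is only that the empty case is detected and returned as an ordinary error, so callers racing with a local prepare can retry or skip rather than crash.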
github-actions Bot commented Jun 9, 2026


Errors are component not formatted correctly, Ticket number prefix incorrect, PR title is malformatted. See https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments, Unable to load ticket data
https://daosio.atlassian.net/browse/Rpadma2/daos

4 participants