DAOS-19132 rdb: Fix destination ranks in log messages #18481

Open
liw wants to merge 1 commit into master from liw/rdb_raft_rpc_cb

liw commented Jun 10, 2026

Contributor

Rajesh observed log messages like these:

crt_rpc_complete_and_unlock(0x7efc3df4a620) [opc=0x7040000
  (DAOS_RDB_MODULE:) rpcid=0x3637ab4c00003653 rank:tag=102:0]
  failed, DER_OOG(-1019): 'Out of group or member list'
rdb_raft_rpc_cb() 14673705[207]: RPC 0 to rank 0 failed:
  DER_OOG(-1019): 'Out of group or member list'

Apparently, rdb_raft_rpc_cb should log 'to rank 102' instead of 'to rank 0'. The issue is that, depending on when the error happens, the destination rank in the request header may not have been initialized yet. This patch reverts to using cr_ep.ep_rank for request senders.
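
To illustrate the failure mode described above, here is a minimal sketch. It does not reproduce the actual rdb or CaRT code: every `sketch_*` name is hypothetical, and the `dst_rank` header field is an assumed stand-in for whatever the real request header carries. Only the `cr_ep.ep_rank` member name is taken from this description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RPC endpoint; set by the sender at creation. */
struct sketch_endpoint {
	uint32_t ep_rank;
};

/* Hypothetical request header; dst_rank is assumed to be filled in later,
 * so it may still be zero if the RPC fails early. */
struct sketch_req_hdr {
	uint32_t dst_rank;
};

struct sketch_rpc {
	struct sketch_endpoint cr_ep;
	struct sketch_req_hdr  hdr;
};

/* Rank that a sender-side completion callback should log: the endpoint rank,
 * which is valid regardless of how far the send path progressed. */
static uint32_t
sketch_sender_log_rank(const struct sketch_rpc *rpc)
{
	assert(rpc != NULL);
	return rpc->cr_ep.ep_rank;
}

/* What logging from the header would report; shown only for contrast. */
static uint32_t
sketch_header_log_rank(const struct sketch_rpc *rpc)
{
	assert(rpc != NULL);
	return rpc->hdr.dst_rank;
}
```

For an RPC created for rank 102 that fails before the header is populated, `sketch_header_log_rank` returns 0 while `sketch_sender_log_rank` returns 102, which mirrors the 'to rank 0' versus 'to rank 102' discrepancy in the log messages.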

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

Signed-off-by: Li Wei <liwei@hpe.com>

liw commented Jun 10, 2026

Contributor Author

With this patch, sending an RDB RPC to an out-of-group rank now produces:

WARN 2026/06/10 05:51:38.011971 brd-224 DAOS[967749/0/88] rpc  src/cart/crt_context.c:519 crt_rpc_complete_and_unlock(0x7f58bc94d510) [opc=0x7050000 (DAOS_RDB_MODULE:) rpcid=0x1464ae330000006f rank:tag=3:0 orig=NOINFO] failed, DER_OOG(-1019): 'Out of group or member list'
DBUG 2026/06/10 05:51:38.011981 brd-224 DAOS[967749/0/88] rdb  src/rdb/rdb_rpc.c:359 rdb_raft_rpc_cb() 928973ff[0.1]: opc=0 rank=3 rtt=0.000043
INFO 2026/06/10 05:51:38.011987 brd-224 DAOS[967749/0/88] rdb  src/rdb/rdb_rpc.c:364 rdb_raft_rpc_cb() 928973ff[0.1]: RPC 0 to rank 3 failed: DER_OOG(-1019): 'Out of group or member list'

@github-actions

Ticket title is 'Destination ranks in rdb_raft_rpc_cb log messages may be incorrect'
Status is 'In Progress'
https://daosio.atlassian.net/browse/DAOS-19132

liw marked this pull request as ready for review June 11, 2026 01:07
liw requested review from a team as code owners June 11, 2026 01:07
liw requested review from frostedcmos and kccain June 11, 2026 01:07