
pimd: MLAG: skip pim_register_join on non-DR #21920

Merged: Jafaral merged 1 commit into FRRouting:master from hnattamaisub:pim on May 13, 2026

@hnattamaisub
Contributor

Root cause and fix:
In MLAG+PIM, packets are software-forwarded because the mroute is stuck in pimreg state on the non-DR node.
Align with the RFC: avoid REG_JOIN on the MLAG node that lost the DR election. Gate pim_register_join() with pim_upstream_could_register() on the WRVIFWHOLE connected-source path.
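
For readers outside pimd, the effect of the guard can be sketched in a few lines of C. This is a toy model, not the FRR code: the struct and the `sketch_*` functions are hypothetical stand-ins, and the stand-in for pim_upstream_could_register() is reduced to a single DR flag purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for pimd's upstream state (not the FRR struct). */
struct sketch_upstream {
	bool is_dr;         /* this node won the DR election on the source-facing interface */
	bool pimreg_in_oil; /* register tunnel (pimreg) is in the outgoing interface list */
};

/* Stand-in for pim_upstream_could_register(), reduced here to the DR check. */
static bool sketch_could_register(const struct sketch_upstream *up)
{
	return up->is_dr;
}

/* Stand-in for pim_register_join(): puts the register tunnel into the OIL. */
static void sketch_register_join(struct sketch_upstream *up)
{
	up->pimreg_in_oil = true;
}

/* Connected-source handling with the guard described in this PR. */
static void sketch_on_connected_source(struct sketch_upstream *up)
{
	if (sketch_could_register(up))
		sketch_register_join(up);
}
```

In this model, calling `sketch_on_connected_source()` on a node with `is_dr == false` leaves `pimreg_in_oil` unset, which corresponds to the non-DR MLAG peer no longer being pushed into pimreg state.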

@frrbot frrbot Bot added the pim label May 12, 2026
@hnattamaisub hnattamaisub marked this pull request as draft May 12, 2026 04:06
@greptile-apps

greptile-apps Bot commented May 12, 2026

Greptile Summary

This PR fixes a multicast software-forwarding regression in MLAG+PIM deployments where non-DR nodes get stuck in pimreg state because pim_register_join was being called unconditionally on connected-source paths. The fix adds pim_upstream_could_register(up) guards — which internally checks PIM_I_am_DR — to the two remaining unguarded pim_register_join call sites, aligning with RFC behaviour.

  • pim_mroute_msg_wholepkt (CAN_BE_LHR / pimreg-as-FHR path, line 497): adds the DR guard so that a non-DR MLAG peer receiving a WHOLEPKT on pimreg no longer adds the register tunnel to the OIL.
  • pim_mroute_msg_wrvifwhole (connected-source FHR path, line 849): same guard prevents the WRVIFWHOLE handler from initiating PIM registration on a non-DR node.
  • All other pim_register_join call sites in pim_upstream.c and pim_mroute_msg_nocache are already correctly gated (either via pim_upstream_could_register or an early-exit DR check), so the fix is complete.
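
As background for the guard, and assuming the RFC referred to is RFC 7761 (PIM-SM), section 4.4.1 defines CouldRegister(S,G) as the conjunction of three conditions: the router is the DR on the RPF interface toward S, the (S,G) Keepalive Timer is running, and S is directly connected. The sketch below encodes only that predicate. The struct and function names are hypothetical, and it is not claimed to match the body of pim_upstream_could_register() in pimd.

```c
#include <stdbool.h>

/* Hypothetical inputs for the RFC 7761 CouldRegister(S,G) predicate. */
struct could_register_inputs {
	bool i_am_dr;            /* I_am_DR(RPF_interface(S)) */
	bool keepalive_running;  /* KeepaliveTimer(S,G) is running */
	bool directly_connected; /* DirectlyConnected(S) */
};

/* True only when all three RFC conditions hold. */
static bool could_register(const struct could_register_inputs *in)
{
	return in->i_am_dr && in->keepalive_running && in->directly_connected;
}
```

Under this definition, an MLAG peer that lost the DR election fails the first condition even when the source is directly connected and traffic is flowing, so it should not start registering.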

Confidence Score: 5/5

Safe to merge — a minimal two-line change in each affected path, with no modification to data structures, timers, or state-machine transitions.

The change guards exactly the two pim_register_join call sites that lacked a DR check in the MLAG context. All other call sites in the file and in pim_upstream.c are already gated by pim_upstream_could_register or an equivalent early-exit, so the fix is complete.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| pimd/pim_mroute.c | Adds pim_upstream_could_register(up) guard to pim_register_join in both pim_mroute_msg_wholepkt (CAN_BE_LHR/pimreg FHR path) and pim_mroute_msg_wrvifwhole (connected-source FHR path), preventing non-DR MLAG nodes from entering pimreg state. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Multicast packet arrives] --> B{WRVIFWHOLE or WHOLEPKT?}
    B -->|WRVIFWHOLE| C[pim_mroute_msg_wrvifwhole]
    B -->|WHOLEPKT on pimreg| D[pim_mroute_msg_wholepkt]
    C --> E{pim_if_connected_to_source?}
    E -->|Yes| F[pim_upstream_add with FHR flag]
    F --> G{pim_is_group_filtered AND pim_upstream_could_register?}
    G -->|DR node: both pass| H[pim_register_join]
    G -->|non-DR node: could_register=0| I[Skip pim_register_join]
    D --> K{Exact S,G upstream found?}
    K -->|No, but *,G with CAN_BE_LHR| L{src locally connected?}
    L -->|Yes| M[pim_upstream_add with FHR flag]
    M --> N{pim_is_group_filtered AND pim_upstream_could_register?}
    N -->|DR node| O[pim_register_join]
    N -->|non-DR node| P[Skip pim_register_join]

Reviews (4): Last reviewed commit: "pimd: MLAG: skip pim_register_join on no..."

@hnattamaisub
Contributor Author

ci:rerun

@hnattamaisub hnattamaisub marked this pull request as ready for review May 12, 2026 05:58
@Jafaral
Member

Jafaral commented May 12, 2026

@greptile review

@Jafaral
Member

Jafaral commented May 12, 2026

@Mergifyio backport stable/10.6 stable/10.5

@mergify

mergify Bot commented May 12, 2026

backport stable/10.6 stable/10.5

✅ Backports have been created


Comment thread pimd/pim_mroute.c Outdated
Comment on lines 497 to 498
Member


Do we need to do the same here?

@hnattamaisub
Contributor Author

ci:rerun

Rootcause and fix:
In mlag+pim, packets are software forwarded because the mroute
is stuck in pimreg state in non-DR node.
Align with RFC:avoid REG_JOIN on the mlag node that lost DR election.
Gate pim_register_join() with pim_upstream_could_register() on the
WRVIFWHOLE connected-source path.

Signed-off-by: harini <hnattamaisub@nvidia.com>
@Jafaral
Member

Jafaral commented May 13, 2026

@greptile review

@Jafaral
Member

Jafaral commented May 13, 2026

@Mergifyio backport stable/10.4

@Jafaral Jafaral merged commit 3c37442 into FRRouting:master May 13, 2026
24 checks passed
@mergify

mergify Bot commented May 13, 2026

backport stable/10.4

✅ Backports have been created



2 participants