OSPFv3 can get stuck after startup/reconnect: neighbor Full but routes not installed until manual clear process #21010

@florath

Description

We see an OSPFv3 convergence bug in FRR under high churn and startup load, when rolling out a large number of FRR pods (e.g. >1000).

On some routers, OSPFv3 neighbor state becomes Full, but routes are not properly installed/usable for a long time (or stay stuck). In this state, end-to-end connectivity fails for some destinations, although adjacency itself looks healthy.

Manually running clear ipv6 ospf6 process on the affected node restores connectivity immediately.

Version

show version on affected pods reports FRR 10.5.2.
Our build additionally includes the changes from PR #20897.

How to reproduce

  1. Deploy a larger IPv6 OSPFv3 topology (in our case Kubernetes pods, ~500 to 1200 routers).
  2. Start all nodes nearly at once.
  3. Wait until many adjacencies are Full.
  4. Run connectivity checks.
  5. Observe that a subset of the connectivity checks fails.
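
Steps 4 and 5 can be scripted. The following is a minimal sketch, not the harness used for this report. It assumes the iputils `ping` binary is available in the pod and that the caller supplies the target addresses; `ping_once` and `failing_targets` are hypothetical helper names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ping_once(addr, timeout_s=2):
    """Return True if one ICMPv6 echo to addr is answered (assumes iputils ping)."""
    result = subprocess.run(
        ["ping", "-6", "-c", "1", "-W", str(timeout_s), addr],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def failing_targets(targets, probe=ping_once, workers=32):
    """Probe every target concurrently and return the sorted list that failed."""
    targets = list(targets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(probe, targets))
    return sorted(t for t, ok in zip(targets, results) if not ok)
```

The `probe` parameter is injectable so the aggregation logic can be exercised without network access; on a real node, calling `failing_targets(addresses)` uses `ping_once` by default.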

Expected behavior

When neighbor state is Full, OSPFv3 routes should be consistently present and installed into zebra/kernel without manual intervention.
No node should require clear ipv6 ospf6 process to become operational.

Actual behavior

Intermittently, a node gets into a bad state:

  • neighbor shows Full
  • but route programming is incomplete/missing
  • connectivity from/to that node fails for multiple destinations

Example pattern we repeatedly observed:

  • show ipv6 ospf6 neighbor => neighbors are listed in Full state
  • show ipv6 route ospf6 => routes are missing or too few while the node is affected
  • show ipv6 ospf6 route may show entries, but forwarding still not correct on affected node
  • after clear ipv6 ospf6 process, routes are reinstalled and connectivity returns
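
The pattern above (neighbors Full, but few or no OSPFv3 routes in zebra) can be checked automatically on each node. The sketch below is an illustration under assumptions, not part of the original report: it uses the `json` variants of the two show commands via `vtysh -c`, and it assumes the neighbor output has a top-level `neighbors` list with `state` and `neighborId` keys and that the route output maps each prefix to a list of entries carrying an `installed` flag. These key names should be verified against the FRR version in use. All function names are hypothetical.

```python
import json
import subprocess


def vtysh_json(command):
    """Run one vtysh command and parse its JSON output."""
    out = subprocess.run(
        ["vtysh", "-c", command],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def full_neighbors(neigh_json):
    """Neighbor IDs reported as Full (assumed keys: neighbors, state, neighborId)."""
    return [
        n.get("neighborId")
        for n in neigh_json.get("neighbors", [])
        if n.get("state") == "Full"
    ]


def installed_prefixes(route_json):
    """Prefixes with at least one entry flagged installed (assumed zebra JSON layout)."""
    return sorted(
        prefix
        for prefix, entries in route_json.items()
        if any(e.get("installed") for e in entries)
    )


def looks_stuck(neigh_json, route_json, min_routes=1):
    """True when some neighbor is Full but fewer than min_routes are installed."""
    has_full = bool(full_neighbors(neigh_json))
    return has_full and len(installed_prefixes(route_json)) < min_routes


def check_local_node(min_routes=1):
    """Query the local FRR instance and apply the heuristic above."""
    neigh = vtysh_json("show ipv6 ospf6 neighbor json")
    routes = vtysh_json("show ipv6 route ospf6 json")
    return looks_stuck(neigh, routes, min_routes)
```

`min_routes` is a topology-dependent threshold chosen by the operator. The check only flags the symptom described here and says nothing about the root cause.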

Additional context

  • Reproduced multiple times in containerized environment with many concurrent OSPFv3 instances.
  • Issue is much easier to trigger under large-scale startup/restart conditions than in very small topologies.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.

Metadata

Labels: triage (Needs further investigation)