Description
Hello, developers:
I use FRRouting on Linux to connect edge nodes in data centers to BGP networks. During routine operation and maintenance I found an issue in Zebra:
When a physical interface restarts for any reason (a manual operation in bash, the upstream router shutting down and then restoring the interconnect interface, or the physical link flapping), Zebra does not recognize that the interface has recovered, and the IP routing table still shows the connected route of that interface as "inactive".
For example:
SoftRouting# do show ip route connected
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
IPv4 unicast VRF default:
C>* 10.10.17.0/24 is directly connected, vpls-wg0, weight 1, 1d06h22m
C>* 192.168.2.1/32 is directly connected, lo, weight 1, 2d08h34m
C>* 192.168.10.0/24 [0/425] is directly connected, lan, weight 1, 1d08h27m
C 192.168.71.0/24 [0/102] is directly connected, eth0 inactive, weight 1, 1d08h28m
SoftRouting# quit
root@SoftRouting:~# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:65 qdisc mq state UP group default qlen 1000
link/ether 60:be:b4:02:13:60 brd ff:ff:ff:ff:ff:ff
altname enp2s0
altname enx60beb4021360
inet 192.168.71.100/24 brd 192.168.71.255 scope global noprefixroute eth0
valid_lft forever preferred_lft forever
inet6 240e:xxxx:2xxx:xxxx:xxxx:bxxx:fxxx2:1360/64 scope global dynamic noprefixroute
valid_lft 2152792056sec preferred_lft 86136sec
inet6 fe80::62be:b4ff:fe02:1360/64 scope link noprefixroute
valid_lft forever preferred_lft forever
root@SoftRouting:~#
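The mismatch shown above (the kernel reports eth0 as UP,LOWER_UP while Zebra still marks its connected route inactive) can be checked mechanically. The following is only a minimal sketch: the function names are hypothetical, and the regular expressions assume the output formats of `show ip route connected` and `ip addr show` exactly as pasted in this report.

```python
import re

# Matches connected-route lines such as
#   C>* 10.10.17.0/24 is directly connected, vpls-wg0, weight 1, ...
#   C 192.168.71.0/24 [0/102] is directly connected, eth0 inactive, weight 1, ...
ROUTE_RE = re.compile(
    r"^\s*C[>*]*\s+(?P<prefix>\S+)\s+(?:\[\d+/\d+\]\s+)?"
    r"is directly connected,\s+(?P<ifname>[^\s,]+)(?P<inactive>\s+inactive)?,",
    re.MULTILINE,
)

# Matches the header line of `ip addr show`, e.g.
#   2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
LINK_RE = re.compile(
    r"^\s*\d+:\s+(?P<ifname>[^:@\s]+)(?:@\S+)?:\s+<(?P<flags>[^>]*)>",
    re.MULTILINE,
)


def inactive_connected(route_text):
    """Return (prefix, ifname) pairs that Zebra reports as inactive."""
    return [
        (m["prefix"], m["ifname"])
        for m in ROUTE_RE.finditer(route_text)
        if m["inactive"]
    ]


def kernel_up_interfaces(ip_addr_text):
    """Return the interfaces whose kernel flags include both UP and LOWER_UP."""
    up = set()
    for m in LINK_RE.finditer(ip_addr_text):
        flags = set(m["flags"].split(","))
        if {"UP", "LOWER_UP"} <= flags:
            up.add(m["ifname"])
    return up


def stale_routes(route_text, ip_addr_text):
    """Return inactive connected routes whose interface is up in the kernel."""
    up = kernel_up_interfaces(ip_addr_text)
    return [(p, i) for p, i in inactive_connected(route_text) if i in up]
```

Fed with the two outputs from this report, `stale_routes` would flag 192.168.71.0/24 on eth0, which is the symptom described here.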
This can have serious consequences. If the physical interface flaps and its connected route is not restored, routes from BGP, OSPF, and other protocols are affected: they are still learned from neighbors, but because the connected route of the interface stays inactive, they cannot be installed properly in the IP routing table.
The only workaround at present is to restart FRR, which may tear down and re-establish all BGP and OSPF sessions and lead to a prolonged network interruption.
Could you please take the time to fix this bug? Thank you.
Version
root@SoftRouting:~# vtysh
Hello, this is FRRouting (version 10.5.3).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
SoftRouting# show version
FRRouting 10.5.3 (SoftRouting) on Linux(6.18.14-x64v2-xanmod1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
'--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--sbindir=/usr/lib/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--disable-scripting' '--enable-pim6d' '--disable-grpc' '--disable-address-sanitizer' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-pcre2posix' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'
SoftRouting#
How to reproduce
To reproduce this bug, restart the interface from a Linux shell, for example:
nmcli con up eth0
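For repeated testing, the step above followed by the check from the description can be scripted. This is a rough sketch, not a verified reproducer: `repro_commands` and `run_repro` are hypothetical helper names, the connection name eth0 is taken from this report, and the script assumes `nmcli` and `vtysh` are on the PATH and that it runs with sufficient privileges.

```python
import subprocess


def repro_commands(connection="eth0"):
    """Build the command sequence: re-activate the connection, then query Zebra."""
    return [
        ["nmcli", "con", "up", connection],
        ["vtysh", "-c", "show ip route connected"],
    ]


def run_repro(connection="eth0"):
    """Run the commands in order and return the output of the last one."""
    output = ""
    for argv in repro_commands(connection):
        result = subprocess.run(argv, check=True, capture_output=True, text=True)
        output = result.stdout
    return output
```

Calling `print(run_repro("eth0"))` on an affected host should print the connected routes, where the eth0 entry can then be inspected for the "inactive" marker.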
Expected behavior
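After the physical interface comes back up, Zebra should detect the recovery and its connected route should leave the 'inactive' state. Instead, the route stays inactive.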
Actual behavior
SoftRouting# do show ip route connected
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
IPv4 unicast VRF default:
C>* 10.10.17.0/24 is directly connected, vpls-wg0, weight 1, 1d06h22m
C>* 192.168.2.1/32 is directly connected, lo, weight 1, 2d08h34m
C>* 192.168.10.0/24 [0/425] is directly connected, lan, weight 1, 1d08h27m
C 192.168.71.0/24 [0/102] is directly connected, eth0 inactive, weight 1, 1d08h28m
Additional context
No response
Checklist