You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This mitigation script automatically detects and removes stale Load Balancer Direct Server Return (LB DSR) rules from VFP (Virtual Filtering Platform) that reference non-existent backend endpoints. It runs continuously to maintain network health by cleaning up orphaned rules that can cause connectivity issues.
6
+
7
+
## Problem Statement
8
+
9
+
When backend endpoints are removed or become unavailable, the corresponding LB DSR rules in VFP may not be cleaned up properly. These stale rules can:
10
+
- Cause packet routing failures
11
+
- Lead to connection timeouts
12
+
- Create unnecessary overhead in the networking stack
13
+
- Result in traffic being sent to non-existent endpoints
14
+
15
+
## Solution
16
+
17
+
The `cleanup-stale-lb-rules.ps1` script:
18
+
1. Checks and sets the required registry configuration for LB DSR feature management
2. If the key value is 1, set it to 0 and restart the node (this disables PR 13179278 which is causing delete LB RPC calls from KubeProxy to fail with Invalid IP Error - ICM: 719903780)
41
+
3. Start a continuous monitoring loop with 10-second intervals
42
+
4. Clean up any stale LB DSR rules found
43
+
44
+
**Note:** This approach fixes issues on a single node. If the issue is widespread across the cluster, deploy the solution using a DaemonSet:
45
+
46
+
```powershell
47
+
kubectl create -f cleanup-stale-lb-rules.yaml
48
+
```
49
+
50
+
This will run the mitigation script as HPC pods on all affected nodes.
51
+
52
+
### Configuration
53
+
54
+
You can modify these parameters at the top of the script:
55
+
56
+
-**`$groups`**: VFP groups to monitor (default: `LB_DSR_IPv4_OUT`, `LB_DSR_IPv6_OUT`)
57
+
-**`$refreshIntervalSeconds`**: Time between cleanup iterations (default: 10 seconds)
58
+
59
+
## How It Works
60
+
61
+
### 1. Registry Check
62
+
The script first ensures the feature flag registry key (140377743) is set to 0. If not, it sets the value and restarts the node.
63
+
64
+
### 2. Endpoint Collection
65
+
- Retrieves all HNS policies
66
+
- Extracts endpoint references
67
+
- Builds a dictionary of valid endpoint IP addresses
68
+
69
+
### 3. Rule Validation
70
+
For each VFP port and LB DSR group:
71
+
- Lists all rules in the `LB_DSR` layer
72
+
- Extracts DIP (Destination IP) ranges from each rule
73
+
- Compares DIPs against the valid endpoint dictionary
74
+
75
+
### 4. Cleanup
76
+
- Rules with DIPs not found in active endpoints are flagged as stale
77
+
- Stale rules are automatically deleted using `vfpctrl /remove-rule`
0 commit comments