Kubernetes
- Overview
- Architecture Patterns
- DaemonSet Deployment
- StatefulSet Deployment
- Configuration Management
- Service Mesh Integration
- MetalLB Comparison
- Pod Networking with BGP
- Complete Examples
- Operators and Controllers
- Best Practices
- Troubleshooting
- See Also
ExaBGP integrates seamlessly with Kubernetes to provide BGP-based networking, load balancing, and service advertisement. ExaBGP runs as pods (typically DaemonSets) and announces Kubernetes services to external BGP peers.
Key Use Cases:
- Load Balancer IP Advertisement: Announce LoadBalancer service IPs via BGP
- Pod Network Routing: BGP-based pod networking (alternative to overlay networks)
- Service Discovery: BGP-based service advertisement to external networks
- Multi-Cluster Networking: BGP for cluster-to-cluster communication
- Bare Metal Load Balancing: Alternative to cloud provider load balancers
Important: ExaBGP does NOT manipulate the node's routing table. It only announces routes via BGP. Route installation is handled by the network fabric or external routers.
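ExaBGP drives announcements through helper processes that write commands to stdout, one newline-terminated command per line. A minimal sketch of that contract (the VIP shown is illustrative):

```python
import sys

def announce(prefix: str) -> str:
    # ExaBGP reads newline-terminated commands from this process's stdout
    return f"announce route {prefix} next-hop self\n"

# Announce a LoadBalancer VIP; ExaBGP relays it to its configured BGP peers
sys.stdout.write(announce("203.0.113.10/32"))
sys.stdout.flush()
```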
┌─────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌──────────┐ ┌──────────┐ │
│ │ Pod │ │ Pod │ │
│ │ (app) │ │ (app) │ │
│ └────┬─────┘ └────┬─────┘ │
│ │ │ │
│ ┌────▼─────────────▼─────┐ │
│ │ Service (LoadBalancer) │ │
│ │ IP: 203.0.113.10 │ │
│ └────────────┬────────────┘ │
│ │ │
│ ┌────────────▼────────────┐ │
│ │ ExaBGP DaemonSet │ │
│ │ (announces service IP) │ │
│ └────────────┬────────────┘ │
└───────────────┼────────────────────────┘
│ BGP
┌───────▼────────┐
│ Network Fabric │
│ (BGP Routers) │
└─────────────────┘
┌─────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ Pod CIDR: 10.244.0.0/16 │
│ │
│ ┌──────────────────────────────┐ │
│ │ ExaBGP on each node │ │
│ │ Announces node's pod CIDR │ │
│ │ (e.g., 10.244.1.0/24) │ │
│ └──────────────┬───────────────┘ │
└─────────────────┼──────────────────────┘
│ BGP
┌───────▼────────┐
│ Network Fabric │
│ Routes to pods │
└─────────────────┘
┌─────────────┐ BGP ┌─────────────┐
│ Cluster 1 │◄──────────────────►│ Cluster 2 │
│ │ │ │
│ ExaBGP │ │ ExaBGP │
│ Announces │ │ Announces │
│ Services │ │ Services │
└─────────────┘ └─────────────┘
Deploy ExaBGP on every node:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: exabgp
namespace: kube-system
labels:
app: exabgp
component: network
spec:
selector:
matchLabels:
app: exabgp
template:
metadata:
labels:
app: exabgp
spec:
# Use host network for BGP
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
# Tolerate master node taints
tolerations:
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
serviceAccountName: exabgp
containers:
- name: exabgp
image: exabgp/exabgp:5.0.0
args:
- /etc/exabgp/exabgp.conf
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
volumeMounts:
- name: exabgp-config
mountPath: /etc/exabgp
readOnly: true
- name: exabgp-scripts
mountPath: /opt/scripts
readOnly: true
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
securityContext:
capabilities:
add:
- NET_ADMIN
- NET_RAW
volumes:
- name: exabgp-config
configMap:
name: exabgp-config
defaultMode: 0644
- name: exabgp-scripts
configMap:
name: exabgp-scripts
          defaultMode: 0755
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: exabgp
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: exabgp
rules:
- apiGroups: [""]
resources: ["services", "endpoints", "nodes"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: exabgp
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: exabgp
subjects:
- kind: ServiceAccount
name: exabgp
  namespace: kube-system

For dedicated BGP route reflectors:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: exabgp-rr
namespace: kube-system
spec:
serviceName: exabgp-rr
replicas: 2
selector:
matchLabels:
app: exabgp-rr
template:
metadata:
labels:
app: exabgp-rr
spec:
hostNetwork: true
serviceAccountName: exabgp
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- exabgp-rr
topologyKey: kubernetes.io/hostname
containers:
- name: exabgp
image: exabgp/exabgp:5.0.0
args: ["/etc/exabgp/exabgp.conf"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
        # Kubernetes env substitution cannot evaluate "$(POD_ORDINAL + 1)";
        # derive the router-id from the ordinal suffix of POD_NAME at startup
        # (for example in an entrypoint script) instead of via env here.
volumeMounts:
- name: config
mountPath: /etc/exabgp
- name: data
mountPath: /var/lib/exabgp
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
          storage: 1Gi

apiVersion: v1
kind: ConfigMap
metadata:
name: exabgp-config
namespace: kube-system
data:
exabgp.conf: |
process k8s-controller {
run /opt/scripts/k8s_controller.py;
encoder json;
}
neighbor 192.168.1.1 {
router-id 10.0.0.1;
local-address 192.168.1.10;
local-as 65000;
peer-as 65001;
family {
ipv4 unicast;
ipv6 unicast;
}
api {
processes [ k8s-controller ];
}
}
exabgp.env: |
[exabgp.daemon]
user = 'nobody'
[exabgp.log]
all = false
configuration = true
network = true
packets = false
rib = true
message = true
timers = false
[exabgp.api]
    ack = true

apiVersion: v1
kind: ConfigMap
metadata:
name: exabgp-scripts
namespace: kube-system
data:
k8s_controller.py: |
#!/usr/bin/env python3
"""
Kubernetes controller that watches LoadBalancer services
and announces their IPs via ExaBGP
"""
import sys
import time
import json
import os
from kubernetes import client, config, watch
def main():
# Load in-cluster config
config.load_incluster_config()
v1 = client.CoreV1Api()
# Get node name
node_name = os.getenv('NODE_NAME')
# Watch for LoadBalancer services
w = watch.Watch()
# Track announced routes
announced = {}
for event in w.stream(v1.list_service_for_all_namespaces):
service = event['object']
event_type = event['type']
# Only process LoadBalancer services
if service.spec.type != 'LoadBalancer':
continue
# Get service name
svc_name = f"{service.metadata.namespace}/{service.metadata.name}"
# Get ingress IPs
ingress_ips = []
if service.status.load_balancer.ingress:
for ingress in service.status.load_balancer.ingress:
if ingress.ip:
ingress_ips.append(ingress.ip)
# Handle service events
if event_type in ['ADDED', 'MODIFIED']:
for ip in ingress_ips:
route_key = f"{svc_name}/{ip}"
# Skip if already announced
if route_key in announced and announced[route_key]:
continue
# Announce route
msg = f'announce route {ip}/32 next-hop self\n'
sys.stdout.write(msg)
sys.stdout.flush()
announced[route_key] = True
print(f"Announced: {ip}/32 for service {svc_name}", file=sys.stderr)
            elif event_type == 'DELETED':
                # A deleted object often no longer reports ingress IPs,
                # so withdraw every route recorded for this service
                stale = [k for k in announced if k.startswith(f"{svc_name}/")]
                for route_key in stale:
                    ip = route_key.rsplit('/', 1)[-1]
                    msg = f'withdraw route {ip}/32 next-hop self\n'
                    sys.stdout.write(msg)
                    sys.stdout.flush()
                    del announced[route_key]
                    print(f"Withdrew: {ip}/32 for service {svc_name}", file=sys.stderr)
if __name__ == '__main__':
try:
main()
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
            sys.exit(1)

ExaBGP announces Istio ingress gateway IPs:
# Istio ingress gateway with LoadBalancer
apiVersion: v1
kind: Service
metadata:
name: istio-ingressgateway
namespace: istio-system
labels:
app: istio-ingressgateway
spec:
type: LoadBalancer
loadBalancerIP: 203.0.113.100
selector:
app: istio-ingressgateway
ports:
- name: http2
port: 80
targetPort: 8080
- name: https
port: 443
    targetPort: 8443

ExaBGP automatically announces 203.0.113.100 via BGP.
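The DaemonSet controller shown earlier would emit the announce command for this gateway VIP as soon as it appears in the service status. A sketch of that mapping (the dict shape is a simplification of the real watch event, not the Kubernetes object):

```python
def routes_for_service(svc: dict) -> list:
    """Build ExaBGP announce commands for a LoadBalancer service.
    `svc` is a simplified dict standing in for the watched object."""
    if svc.get("type") != "LoadBalancer":
        return []
    return [f"announce route {ip}/32 next-hop self"
            for ip in svc.get("ingress_ips", [])]

cmds = routes_for_service(
    {"type": "LoadBalancer", "ingress_ips": ["203.0.113.100"]})
# cmds == ["announce route 203.0.113.100/32 next-hop self"]
```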
The same pattern applies to Linkerd ingress:
apiVersion: v1
kind: Service
metadata:
name: linkerd-ingress
namespace: linkerd
spec:
type: LoadBalancer
loadBalancerIP: 203.0.113.101
selector:
app: linkerd-ingress
ports:
- port: 80
  - port: 443

| Feature | ExaBGP | MetalLB |
|---|---|---|
| Purpose | General BGP engine | Kubernetes-specific LB |
| Flexibility | Highly flexible | K8s-focused |
| Configuration | Python/API scripting | CRDs/ConfigMaps |
| FlowSpec | Full support | No support |
| Learning Curve | Steeper | Easier for K8s |
| Customization | Unlimited | Limited |
| Protocol Support | 55+ RFCs | Basic BGP |
| Health Checks | Custom Python | Built-in |
Choose ExaBGP when you need:
- FlowSpec support for DDoS mitigation
- Custom logic in route announcements
- Integration with existing ExaBGP deployments
- Advanced BGP features (communities, AS-path manipulation)
- Multi-purpose (not just Kubernetes)
- Fine-grained control over BGP behavior
Choose MetalLB when you need:
- Simple Kubernetes load balancing
- Quick setup with minimal configuration
- Standard BGP without advanced features
- K8s-native CRD-based configuration
- ARP mode (Layer 2) option
You can run ExaBGP and MetalLB together:
- MetalLB: Standard service load balancing
- ExaBGP: FlowSpec for DDoS, advanced routing, custom logic
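For the DDoS-mitigation case, the controller would hand ExaBGP a FlowSpec command. A sketch of building one (the one-line match/then grammar mirrors ExaBGP's flow configuration syntax; verify it against your ExaBGP version's API documentation):

```python
def flowspec_discard(src_cidr: str, dst_port: int) -> str:
    """Build an ExaBGP API command that drops traffic from src_cidr
    to dst_port via a FlowSpec discard action."""
    return ("announce flow route { match { "
            f"source {src_cidr}; destination-port ={dst_port}; "
            "} then { discard; } }")

# Example: drop traffic from an attacking prefix aimed at port 80
cmd = flowspec_discard("198.51.100.0/24", 80)
```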
ExaBGP can announce pod CIDRs for BGP-based pod networking:
# ConfigMap with pod CIDR announcement script
apiVersion: v1
kind: ConfigMap
metadata:
name: exabgp-pod-cidr
namespace: kube-system
data:
announce_pod_cidr.py: |
#!/usr/bin/env python3
"""
Announce node's pod CIDR via BGP
"""
import sys
import os
import time
from kubernetes import client, config
def get_pod_cidr():
config.load_incluster_config()
v1 = client.CoreV1Api()
node_name = os.getenv('NODE_NAME')
node = v1.read_node(node_name)
return node.spec.pod_cidr
def main():
pod_cidr = get_pod_cidr()
print(f"Pod CIDR: {pod_cidr}", file=sys.stderr)
# Announce pod CIDR
msg = f'announce route {pod_cidr} next-hop self\n'
sys.stdout.write(msg)
sys.stdout.flush()
# Keep running
while True:
time.sleep(60)
if __name__ == '__main__':
        main()

apiVersion: v1
kind: ConfigMap
metadata:
name: exabgp-pod-network-config
namespace: kube-system
data:
exabgp.conf: |
process pod-cidr-announcer {
run /opt/scripts/announce_pod_cidr.py;
encoder json;
}
neighbor 192.168.1.1 {
router-id {{ NODE_IP }};
local-address {{ NODE_IP }};
local-as 65000;
peer-as 65001;
family {
ipv4 unicast;
}
api {
processes [ pod-cidr-announcer ];
}
    }

At pod startup, substitute {{ NODE_IP }} with the node's address (for example via an initContainer or entrypoint script) before handing the rendered file to ExaBGP.

1. Deploy ExaBGP DaemonSet:
# exabgp-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: exabgp
namespace: kube-system
spec:
selector:
matchLabels:
app: exabgp
template:
metadata:
labels:
app: exabgp
spec:
hostNetwork: true
serviceAccountName: exabgp
containers:
- name: exabgp
image: exabgp/exabgp:5.0.0
args: ["/etc/exabgp/exabgp.conf"]
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
volumeMounts:
- name: config
mountPath: /etc/exabgp
- name: scripts
mountPath: /opt/scripts
volumes:
- name: config
configMap:
name: exabgp-config
- name: scripts
configMap:
name: exabgp-scripts
          defaultMode: 0755

2. Create LoadBalancer Service:
# app-service.yaml
apiVersion: v1
kind: Service
metadata:
name: my-app
namespace: default
spec:
type: LoadBalancer
loadBalancerIP: 203.0.113.10
selector:
app: my-app
ports:
- port: 80
    targetPort: 80

3. Deploy Application:
# app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
namespace: default
spec:
replicas: 3
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
containers:
- name: app
image: nginx:latest
ports:
        - containerPort: 80

Result: ExaBGP announces 203.0.113.10/32 via BGP. External routers send traffic for that address to the Kubernetes nodes, where kube-proxy forwards it to the application pods.
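One refinement the manifests above do not show: with externalTrafficPolicy: Local, only nodes that host ready endpoints should keep announcing the VIP, otherwise traffic is blackholed. A sketch of that reconcile decision (function name and wiring are illustrative):

```python
def reconcile(vip: str, ready_endpoints: int, announced: bool):
    """Return the ExaBGP command needed to converge announcement state
    with endpoint readiness, or None if already in sync."""
    if ready_endpoints > 0 and not announced:
        return f"announce route {vip}/32 next-hop self"
    if ready_endpoints == 0 and announced:
        return f"withdraw route {vip}/32 next-hop self"
    return None
```

A watch loop would call this on every endpoints change and write any non-None result to ExaBGP.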
Cluster 1 announces a service:
# cluster1-exabgp-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: exabgp-config
namespace: kube-system
data:
exabgp.conf: |
neighbor 10.0.1.1 {
router-id 10.0.1.10;
local-address 10.0.1.10;
local-as 65001;
peer-as 65000;
static {
# Announce service to cluster 2
route 203.0.113.10/32 next-hop self community [65001:100];
}
    }

Cluster 2 receives the announcement and creates an external Service:
# cluster2-external-service.yaml
apiVersion: v1
kind: Service
metadata:
name: cluster1-app
namespace: default
spec:
type: ExternalName
  # ExternalName requires a DNS name, not a raw IP; point this at a
  # record that resolves to 203.0.113.10 (name below is illustrative)
  externalName: cluster1-app.cluster1.example.net

For FlowSpec-based DDoS mitigation, pair a controller with ExaBGP:

# flowspec-controller-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: flowspec-controller
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: flowspec-controller
template:
metadata:
labels:
app: flowspec-controller
spec:
serviceAccountName: exabgp
containers:
- name: controller
image: myorg/flowspec-controller:latest
env:
- name: EXABGP_HOST
value: "localhost"
- name: EXABGP_PORT
value: "5000"
- name: exabgp
image: exabgp/exabgp:5.0.0
args: ["/etc/exabgp/flowspec.conf"]
volumeMounts:
- name: config
mountPath: /etc/exabgp
volumes:
- name: config
configMap:
          name: flowspec-config

Build a Kubernetes operator for ExaBGP:
#!/usr/bin/env python3
# exabgp_operator.py
import kopf
import kubernetes.client as k8s_client
from kubernetes.client.rest import ApiException
@kopf.on.create('', 'v1', 'services')
@kopf.on.update('', 'v1', 'services')
def service_created_or_updated(spec, status, namespace, name, **kwargs):
"""
Watch for LoadBalancer services and announce via ExaBGP
"""
if spec.get('type') != 'LoadBalancer':
return
# Get LoadBalancer IP
if not status.get('loadBalancer') or not status['loadBalancer'].get('ingress'):
return
for ingress in status['loadBalancer']['ingress']:
if ingress.get('ip'):
ip = ingress['ip']
# Send to ExaBGP
announce_route(ip, namespace, name)
@kopf.on.delete('', 'v1', 'services')
def service_deleted(spec, status, namespace, name, **kwargs):
"""
Withdraw route when service is deleted
"""
if spec.get('type') != 'LoadBalancer':
return
if status.get('loadBalancer') and status['loadBalancer'].get('ingress'):
for ingress in status['loadBalancer']['ingress']:
if ingress.get('ip'):
ip = ingress['ip']
withdraw_route(ip, namespace, name)
def announce_route(ip, namespace, name):
    """Send announce command to ExaBGP via its named pipe"""
    _send(f'announce route {ip}/32 next-hop self')
    print(f"Announcing {ip}/32 for {namespace}/{name}")

def withdraw_route(ip, namespace, name):
    """Send withdraw command to ExaBGP via its named pipe"""
    _send(f'withdraw route {ip}/32 next-hop self')
    print(f"Withdrawing {ip}/32 for {namespace}/{name}")

def _send(command):
    # Default pipe used by exabgpcli; adjust the path for your deployment
    with open('/run/exabgp/exabgp.in', 'w') as pipe:
        pipe.write(command + '\n')

apiVersion: apps/v1
kind: Deployment
metadata:
name: exabgp-operator
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: exabgp-operator
template:
metadata:
labels:
app: exabgp-operator
spec:
serviceAccountName: exabgp-operator
containers:
- name: operator
image: myorg/exabgp-operator:latest
env:
- name: PYTHONUNBUFFERED
          value: "1"

Deploy ExaBGP as a DaemonSet when announcing node-specific information (pod CIDRs, node IPs).
Deploy ExaBGP as a Deployment or StatefulSet for centralized route controllers.
Store configs in ConfigMaps for easy updates without rebuilding images.
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
    memory: 512Mi

Grant minimal permissions for Kubernetes API access.
livenessProbe:
exec:
command:
- pgrep
- -f
- exabgp
initialDelaySeconds: 30
  periodSeconds: 30

Spread ExaBGP pods across nodes:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- exabgp
        topologyKey: kubernetes.io/hostname

Enable structured logging:
env:
- name: EXABGP_LOG_FORMAT
value: json
- name: EXABGP_LOG_LEVEL
  value: INFO

Restrict ExaBGP pod network access:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: exabgp-policy
namespace: kube-system
spec:
podSelector:
matchLabels:
app: exabgp
policyTypes:
- Egress
egress:
- to:
- ipBlock:
cidr: 192.168.1.0/24 # BGP peers
ports:
- protocol: TCP
port: 179
- to: # Kubernetes API
- namespaceSelector: {}
ports:
- protocol: TCP
      port: 443

Store all Kubernetes manifests in Git for GitOps workflows.
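Before digging into BGP session state, it helps to confirm plain TCP reachability of the peer from inside a pod, complementing the ping and netstat checks in the troubleshooting steps. A small helper for that (the peer address would be your router's):

```python
import socket

def peer_reachable(peer: str, port: int = 179, timeout: float = 2.0) -> bool:
    """True if a TCP connection to the BGP port succeeds.
    This checks network reachability only, not BGP session state."""
    try:
        with socket.create_connection((peer, port), timeout=timeout):
            return True
    except OSError:
        return False
```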
# Check pod status
kubectl get pods -n kube-system -l app=exabgp
# View logs
kubectl logs -n kube-system -l app=exabgp
# Describe pod
kubectl describe pod -n kube-system -l app=exabgp

# Check if BGP port is accessible from pod
kubectl exec -n kube-system <exabgp-pod> -- netstat -tlnp | grep 179
# Test connectivity to peer
kubectl exec -n kube-system <exabgp-pod> -- ping -c 3 192.168.1.1
# Check BGP session state
kubectl exec -n kube-system <exabgp-pod> -- netstat -tn | grep :179

# Validate config
kubectl exec -n kube-system <exabgp-pod> -- \
exabgp configuration validate /etc/exabgp/exabgp.conf
# Check ConfigMap
kubectl get configmap -n kube-system exabgp-config -o yaml

# Check script exists and is executable
kubectl exec -n kube-system <exabgp-pod> -- ls -la /opt/scripts/
# Test script independently
kubectl exec -n kube-system <exabgp-pod> -- python3 /opt/scripts/k8s_controller.py

# Check service account
kubectl get sa -n kube-system exabgp
# Check permissions
kubectl auth can-i get services --as=system:serviceaccount:kube-system:exabgp
# View role bindings
kubectl get clusterrolebinding -l app=exabgp

# Enable debug logging
kubectl set env daemonset/exabgp -n kube-system EXABGP_LOG_LEVEL=DEBUG
# Watch logs for announcements
kubectl logs -n kube-system -l app=exabgp -f | grep announce
# Check services
kubectl get svc --all-namespaces -o wide | grep LoadBalancer

- Docker Integration - Docker deployment patterns
- Prometheus Integration - Monitoring and metrics
- Cloud Platforms - AWS, GCP, Azure integration
- Service High Availability - HA patterns
- Configuration Syntax - ExaBGP configuration
- API Overview - Control ExaBGP programmatically
- Debugging - Troubleshooting guide
- Kubernetes Documentation
- MetalLB - Alternative BGP load balancer
- Calico - BGP-based pod networking
- kube-vip - Another BGP alternative
- Facebook Katran - L4 load balancer using ExaBGP