Commit 10b7bbf

docs: add architecture documentation for config caching optimization (#3026)
1 parent 8f393d4 commit 10b7bbf

2 files changed

Lines changed: 58 additions & 0 deletions

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -8,6 +8,9 @@ All notable changes to this project will be documented in this file.

### Changes

- Device agents
  - Reduce agent CPU usage by continuing to fetch the full config every 5 seconds but only applying it when it has changed or after a 60-second timeout

## [v0.14.0](https://github.com/malbeclabs/doublezero/compare/client/v0.13.0...client/v0.14.0) - 2026-03-24

### Breaking

controlplane/controller/README.md

Lines changed: 55 additions & 0 deletions
@@ -2,6 +2,61 @@

The controller generates device configurations from Solana smart contract state and serves them to agents running on network devices via gRPC.

## Architecture

### Agent-Controller Communication Flow

The controller exposes a single gRPC endpoint, `GetConfig`, that returns both the rendered configuration and its hash. The agent polls this endpoint every 5 seconds but applies the configuration to the EOS device only when the hash has changed or 60 seconds have passed since the last application.

This design is an optimization to reduce CPU usage on the EOS device:
- Applying configuration to an Arista EOS device causes a CPU spike in the EOS ConfigAgent process
- The agent computes a SHA256 hash of the received config and only applies it when:
  1. The hash differs from that of the last applied configuration, OR
  2. 60 seconds have elapsed since the last application (as a safety measure)

Here's how the agent uses the endpoint:

```
┌─────────┐                  ┌────────────┐                 ┌────────────┐                  ┌─────────┐
│  Agent  │                  │ Controller │                 │ Controller │                  │   EOS   │
│ main()  │                  │ GetConfig()│                 │   Config   │                  │ Device  │
│         │                  │   (gRPC)   │                 │ Generator  │                  │         │
└────┬────┘                  └─────┬──────┘                 └─────┬──────┘                  └────┬────┘
     │                             │                              │                              │
     │ Every 5s:                   │                              │                              │
     │                             │                              │                              │
     │ GetBgpNeighbors()           │                              │                              │
     ├──────────────────────────────────────────────────────────────────────────────────────────►│
     │◄──────────────────────────────────────────────────────────────────────────────────────────┤
     │ [peer IPs]                  │                              │                              │
     │                             │                              │                              │
     │ GetConfigFromServer()       │                              │                              │
     ├────────────────────────────►│                              │                              │
     │                             │ processConfigRequest()       │                              │
     │                             ├─────────────────────────────►│                              │
     │                             │                              │ generateConfig()             │
     │                             │                              │   • deduplicateTunnels()     │
     │                             │                              │   • renderConfig()           │
     │                             │                              │     (~50KB config text)      │
     │                             │                              │   • compute SHA256 hash      │
     │                             │◄─────────────────────────────┤                              │
     │                             │ [config string + hash]       │                              │
     │◄────────────────────────────┤                              │                              │
     │ ConfigResponse              │                              │                              │
     │ {config: "...", hash: "..."}│                              │                              │
     │                             │                              │                              │
     │ Compare hash with cached    │                              │                              │
     │ If changed OR 60s elapsed:  │                              │                              │
     │   AddConfigToDevice(config) │                              │                              │
     ├──────────────────────────────────────────────────────────────────────────────────────────►│
```
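The controller-side step in the diagram (render the config, compute its SHA256 hash, return both) might look roughly like the Go sketch below. `configResponse` and `buildResponse` are hypothetical names used only for illustration; the real message type comes from the gRPC schema, and hex-encoding the digest is an assumption about how the hash is serialized.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// configResponse mirrors the {config, hash} pair shown in the diagram.
// Hypothetical type; the real message is defined by the gRPC schema.
type configResponse struct {
	Config string
	Hash   string
}

// buildResponse hashes the rendered config text and bundles both values.
// The hash depends only on the config bytes, so an unchanged config
// always yields the same hash.
func buildResponse(rendered string) configResponse {
	sum := sha256.Sum256([]byte(rendered))
	return configResponse{
		Config: rendered,
		Hash:   hex.EncodeToString(sum[:]), // assumption: lowercase hex encoding
	}
}
```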

**Key Benefits:**
- **CPU**: The EOS device processes a config only when it has actually changed (or every 60s as a safety measure)
- **Responsiveness**: The agent still checks for changes every 5 seconds
- **Simplicity**: Single endpoint; the agent handles the caching logic
- **Safety**: Re-applying the full config every 60s ensures eventual consistency

## Configuration

### ClickHouse Integration
