
gRPC + CLI runtime [[fib_tables]] CRUD (ADR-0061 FIB operational hardening)#336

Merged
lance0 merged 7 commits into main from feat/fib-tables-grpc-crud
May 31, 2026

Conversation


@lance0 lance0 commented May 31, 2026

Runtime CRUD for [[fib_tables]] (ADR-0061 general unicast FIB export), building on the SIGHUP hot-reload command channel from #335. Closes the "FIB operational hardening → hot-swap [[fib_tables]]" roadmap item.

Surface

  • proto / RibService: SetFibTable (create-or-replace by name — full definition, not a patch), DeleteFibTable, ListFibTables, plus a FibTableConfig message.
  • CLI: rustbgpctl fib-table {list,set,delete}.
  • authz: SetFibTable/DeleteFibTable = mutating, ListFibTables = sensitive_read. Inventory regenerated (78 → 81; JSON + MD).
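The create-or-replace and delete semantics above can be sketched as a pure function over the current table set. This is a minimal illustration, not the PR's code: `FibTable` is a hypothetical trimmed-down stand-in for `FibTableConfig`, and the `Mutation` and `MutationError` types are assumptions; only the name `apply_mutation` and the NOT_FOUND behavior come from the description.

```rust
/// Hypothetical, trimmed-down stand-in for the PR's `FibTableConfig`.
#[derive(Clone, Debug, PartialEq)]
pub struct FibTable {
    pub name: String,
    pub table_id: u32,
    pub metric: u32,
}

/// Hypothetical mutation type; the PR only names the RPCs.
pub enum Mutation {
    Set(FibTable),
    Delete(String),
}

#[derive(Debug, PartialEq)]
pub enum MutationError {
    NotFound(String),
}

/// Build a candidate set from the current set without touching the input.
pub fn apply_mutation(
    current: &[FibTable],
    mutation: Mutation,
) -> Result<Vec<FibTable>, MutationError> {
    let mut candidate = current.to_vec();
    match mutation {
        Mutation::Set(table) => {
            // Create-or-replace by name: the whole definition is swapped,
            // no field-level patching.
            if let Some(i) = candidate.iter().position(|t| t.name == table.name) {
                candidate[i] = table;
            } else {
                candidate.push(table);
            }
        }
        Mutation::Delete(name) => {
            let before = candidate.len();
            candidate.retain(|t| t.name != name);
            if candidate.len() == before {
                return Err(MutationError::NotFound(name));
            }
        }
    }
    Ok(candidate)
}
```

Returning a fresh candidate rather than mutating in place keeps the current set intact until validation and apply have both succeeded.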

Design (mirrors the EVPN ApplyEvpnRuntime control-hook pattern)

RibService gains a FibTableControlFn hook wired in main.rs — the API crate never reaches across the crate boundary into the binary. The hook orchestrates, under a coordinator Mutex shared with the SIGHUP reload FIB step:

  1. read the reconciler's current set (GetTables),
  2. apply the upsert/delete to build the candidate,
  3. validate and stage the candidate into the live config in one atomic peer-manager command (StageFibTables) — closing the TOCTOU against a concurrent peer-group delete,
  4. reserve a persistence permit, apply via ReplaceTables, and persist the exact accepted set (ConfigEvent::FibTablesReplaced) only after the actor acknowledges; roll the staged snapshot back on any post-stage failure.
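The four steps above can be sketched synchronously with in-memory stand-ins. Everything here is an assumption made for illustration: `Daemon`, `Coordinator`, `set_table`, and the `fail_apply` knob are hypothetical, tables are reduced to their names, and the real hook talks to actors over channels rather than holding the state inside the mutex.

```rust
use std::sync::Mutex;

type Tables = Vec<String>; // table names only, for brevity

/// Hypothetical in-memory stand-ins for the reconciler, the peer manager's
/// live config, and the on-disk config.
#[derive(Default)]
pub struct Daemon {
    pub reconciler: Tables,
    pub live_config: Tables,
    pub persisted: Tables,
    /// Test knob: make the reconciler reject the next apply.
    pub fail_apply: bool,
}

/// One lock serializes every writer (CRUD here; SIGHUP reload in the PR).
pub struct Coordinator {
    inner: Mutex<Daemon>,
}

impl Coordinator {
    pub fn new(daemon: Daemon) -> Self {
        Self { inner: Mutex::new(daemon) }
    }

    pub fn set_table(&self, name: &str) -> Result<(), String> {
        let mut d = self
            .inner
            .lock()
            .map_err(|_| "coordinator lock poisoned".to_string())?;

        // 1. Read the reconciler's current set.
        let mut candidate = d.reconciler.clone();
        // 2. Apply the upsert to build the candidate.
        if !candidate.iter().any(|n| n == name) {
            candidate.push(name.to_string());
        }
        // 3. Validate, then stage into the live config, keeping the old
        //    snapshot so it can be restored.
        if name.is_empty() {
            return Err("invalid table name".to_string());
        }
        let previous = std::mem::replace(&mut d.live_config, candidate.clone());
        // 4. Apply; on failure roll the staged snapshot back.
        if d.fail_apply {
            d.live_config = previous;
            return Err("reconciler rejected the table set".to_string());
        }
        d.reconciler = candidate.clone();
        // Persist the exact accepted set only after the apply succeeded.
        d.persisted = candidate;
        Ok(())
    }

    /// Returns (reconciler, live_config, persisted) for inspection.
    pub fn snapshot(&self) -> (Tables, Tables, Tables) {
        let d = self.inner.lock().expect("coordinator lock poisoned");
        (d.reconciler.clone(), d.live_config.clone(), d.persisted.clone())
    }
}
```

On the failure path all three views stay at their previous value, which is the property the rollback is meant to preserve.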

The mutation runs in a detached task so a canceled gRPC call cannot split a successful apply from its persistence. Both snapshot writers (SIGHUP reload + CRUD) ack under the lock, so they can't reorder. A table_id/metric change for an existing name is a real table-key move (old kernel rows withdraw, new table back-fills). Enabling FIB from an empty config stays restart-required (FAILED_PRECONDITION).
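The detached-task idea can be illustrated with a plain thread as an analogue. This is a sketch under stated assumptions, not the PR's async code: `State` and `spawn_mutation` are hypothetical, a `std::thread` stands in for the spawned task, and dropping the receiver stands in for a canceled gRPC call.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

#[derive(Default, Debug)]
pub struct State {
    pub applied: bool,
    pub persisted: bool,
}

/// Start the mutation on its own thread and hand back a receiver for the
/// result. The `JoinHandle` is returned only so a test can wait for the
/// worker; a truly detached caller would simply drop it.
pub fn spawn_mutation(
    state: Arc<Mutex<State>>,
) -> (Receiver<Result<(), String>>, JoinHandle<()>) {
    let (tx, rx) = channel();
    let worker = thread::spawn(move || {
        let mut s = state.lock().expect("state lock poisoned");
        s.applied = true; // stand-in for the reconciler apply
        s.persisted = true; // stand-in for the config persist
        // The caller may already be gone; a failed send must not undo the work.
        let _ = tx.send(Ok(()));
    });
    (rx, worker)
}
```

Because the worker owns both steps, a caller that goes away between them only loses the reply, never the pairing of apply and persist.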

Tests

apply_mutation (upsert/replace/delete-not-found); rib_service read-only + missing-table + no-control-hook gating; CLI dispatch; peer_group_references FIB-table coverage; and an M58 FRR interop that drives SetFibTable/DeleteFibTable/ListFibTables against a real kernel — add a table, table-key move, persist-across-restart, delete withdraws only its rows, NOT_FOUND on a missing name. fmt + clippy -D warnings clean; full workspace suite green.

lance0 added 2 commits May 31, 2026 12:53
…FibTable/ListFibTables)

Add runtime CRUD for FIB tables on RibService, mirroring the EVPN
ApplyEvpnRuntime control-hook pattern so the API crate never reaches across
the crate boundary:

- proto: SetFibTable (upsert by name) / DeleteFibTable / ListFibTables on
  RibService + a FibTableConfig message.
- RibService gains a FibTableControlFn hook (wired in main.rs) + access-mode
  gating on the mutating RPCs; authz tiers SetFibTable/DeleteFibTable=Mutating,
  ListFibTables=SensitiveRead (inventory regenerated, 78->81).
- fib_table_control module orchestrates under a coordinator lock shared with
  the SIGHUP reload FIB step: read (GetTables) -> validate against the live
  config (peer manager) -> apply (ReplaceTables) -> persist the exact accepted
  set, all in a detached task so a canceled call can't split apply from persist.
- FibRuntimeCommand::GetTables for the read side; PeerManagerCommand::
  ValidateFibTables for live-config validation.
- ConfigEvent::FibTablesReplaced persists via the bridge, which now applies
  events on a clone and commits only on success (no poisoned snapshot).
- rustbgpctl fib-table subcommands wrapping the RibService FIB-table RPCs.
- Unit tests: apply_mutation upsert/replace/delete-not-found; rib_service
  read-only + missing-table + no-control-hook gating; CLI dispatch.
- Docs: CONFIGURATION.md (gRPC/CLI runtime CRUD + lifecycle), API.md (new
  RPCs + corrected dynamic-neighbor status), grpc-method-inventory.md
  (RibService 12->15, tier totals), CHANGELOG, ROADMAP (FIB hot-swap done).
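
The "apply on a clone, commit only on success" behavior described for the bridge can be sketched as follows. The types and function names here (`Snapshot`, `Event`, `apply_event`, `commit_event`) are hypothetical, and the duplicate-name check is an invented failure mode used only to show a partial mutation being discarded.

```rust
#[derive(Clone, Debug, PartialEq, Default)]
pub struct Snapshot {
    pub fib_tables: Vec<String>,
}

pub enum Event {
    FibTablesReplaced(Vec<String>),
}

/// Mutates `snapshot` in place and may fail halfway through.
fn apply_event(snapshot: &mut Snapshot, event: Event) -> Result<(), String> {
    match event {
        Event::FibTablesReplaced(tables) => {
            snapshot.fib_tables.clear();
            for t in tables {
                if snapshot.fib_tables.contains(&t) {
                    return Err(format!("duplicate table name: {t}"));
                }
                snapshot.fib_tables.push(t);
            }
            Ok(())
        }
    }
}

/// Apply the event to a working copy and swap it in only on success, so a
/// failed event never leaves the live snapshot half-updated.
pub fn commit_event(live: &mut Snapshot, event: Event) -> Result<(), String> {
    let mut working = live.clone();
    apply_event(&mut working, event)?;
    *live = working;
    Ok(())
}
```

The cost is one clone of the snapshot per event, which buys an all-or-nothing update without any undo logic.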

Copilot AI left a comment


Pull request overview

Adds runtime CRUD for [[fib_tables]] through gRPC and rustbgpctl, integrating the ADR-0061 FIB reconciler with validation, hot-apply, and config persistence.

Changes:

  • Adds SetFibTable, DeleteFibTable, and ListFibTables proto/API/authz surface.
  • Introduces daemon-side FIB table control logic with reconciler commands and persistence events.
  • Adds CLI commands and documentation/inventory updates for runtime FIB table management.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/reload.rs | Makes config bridge apply events on a clone before committing. |
| src/policy_admin.rs | Adds FIB table persistence event application. |
| src/peer_manager/reconcile.rs | Adds live-config candidate validation for FIB tables. |
| src/peer_manager/mod.rs | Handles FIB table validation command. |
| src/peer_manager/events.rs | Adds audit/event classification for FIB table replacement. |
| src/main.rs | Wires FIB table control hook and coordinator lock. |
| src/fib_table_control.rs | Implements runtime FIB table list/set/delete orchestration. |
| src/fib_runtime.rs | Adds command to read current reconciler table set. |
| proto/rustbgpd.proto | Adds FIB table CRUD RPCs and messages. |
| crates/api/src/rib_service.rs | Exposes and dispatches FIB table RPCs. |
| crates/api/src/server.rs | Plumbs FIB table control into listeners. |
| crates/api/src/peer_types.rs | Adds FIB table snapshots/events and validation command. |
| crates/api/src/authz.rs | Classifies new RPCs in authz matrix. |
| crates/api/src/lib.rs | Exposes rib_service module for binary wiring. |
| crates/cli/src/main.rs | Adds fib-table CLI subcommands. |
| crates/cli/src/commands/fib_table.rs | Implements CLI list/set/delete behavior. |
| crates/cli/src/commands/mod.rs | Registers CLI command module. |
| crates/cli/src/test_support.rs | Extends mock RIB service for FIB table RPCs. |
| docs/API.md | Documents new API surface. |
| docs/CONFIGURATION.md | Documents runtime FIB table CRUD lifecycle. |
| docs/grpc-method-inventory.md | Updates human-readable RPC inventory. |
| docs/grpc-method-inventory.json | Updates machine-readable RPC inventory. |
| CHANGELOG.md | Adds release note for runtime FIB table CRUD. |
| ROADMAP.md | Marks FIB table hot-swap/CRUD roadmap item done. |


Comment thread src/fib_table_control.rs
Comment on lines +282 to +295
fn proto_to_config(table: proto::FibTableConfig) -> FibTableConfig {
    FibTableConfig {
        name: table.name,
        table_id: table.table_id,
        metric: table.metric,
        families: table.families,
        allowed_peer_groups: table.allowed_peer_groups,
        allowed_neighbors: table.allowed_neighbors,
        max_routes: table.max_routes,
        maximum_paths: table.maximum_paths,
        maximum_paths_ebgp: table.maximum_paths_ebgp,
        maximum_paths_ibgp: table.maximum_paths_ibgp,
    }
}
Comment thread src/fib_table_control.rs Outdated
Comment on lines +141 to +142
// Persist exactly the accepted set, only after the ack.
permit.send(ConfigEvent::FibTablesReplaced(snapshots));
…readback)

- SIGHUP serialization: hold the FIB coordinator lock across BOTH reload and
  apply_reload_outcome (run outcome application inside the locked task), so a
  concurrent gRPC CRUD can't slip into the gap and have its applied/persisted
  set clobbered by the stale reload snapshot.
- SetFibTable empty families now defaults to both unicast families (matching
  TOML and the proto/CLI docs) instead of failing validation.
- Keep the peer manager's current_config.fib_tables in sync after a CRUD
  mutation (PeerManagerCommand::SetFibTablesSnapshot) so DiffRuntimeConfig
  readback doesn't report the just-applied set as pending; peer_group_references
  now also counts fib_tables.allowed_peer_groups so peer-group deletion reports
  FIB-table references cleanly.
- ListFibTables falls back to the startup table set when the reconciler isn't
  running, so a non-Linux / netlink-failure daemon still shows configured tables.
- run_config_bridge applies events on a clone (no poisoned snapshot).
- Fix cargo doc: backtick [[fib_tables]] in proto comments (broken intra-doc link).
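
The empty-`families` default mentioned in this commit could look roughly like the sketch below. The `Family` enum and `effective_families` are hypothetical names, and reading "both unicast families" as IPv4 and IPv6 unicast is an assumption; the PR's actual representation of families is not shown on this page.

```rust
/// Hypothetical address-family enum standing in for the real type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Family {
    Ipv4Unicast,
    Ipv6Unicast,
}

/// An empty request means "both unicast families" rather than "none",
/// so an omitted field behaves the same over gRPC as it does in TOML.
pub fn effective_families(requested: &[Family]) -> Vec<Family> {
    if requested.is_empty() {
        vec![Family::Ipv4Unicast, Family::Ipv6Unicast]
    } else {
        requested.to_vec()
    }
}
```

Resolving the default before validation lets the validator keep rejecting an empty set without special-casing the gRPC path.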

Copilot AI left a comment


Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

Comment thread src/fib_table_control.rs Outdated
Comment on lines +135 to +146
// Validate against the live config before touching the reconciler.
validate_candidate(&peer_mgr_tx, snapshots.clone()).await?;

// Reserve persistence capacity before applying, so we never apply a
// change we then can't record (which would drift runtime vs disk).
let permit = reserve_persist_permit(&config_tx).await?;

// Apply to the reconciler and wait for its acknowledgement.
replace_tables(&fib_cmd_tx, candidate.clone()).await?;

// Persist exactly the accepted set, only after the ack.
permit.send(ConfigEvent::FibTablesReplaced(snapshots.clone()));
}

#[tokio::test]
async fn set_sends_full_table() {
}

#[tokio::test]
async fn delete_sends_name() {
@lance0 lance0 changed the title from "gRPC + CLI runtime [[fib_tables]] CRUD (FIB operational hardening, PR2)" to "gRPC + CLI runtime [[fib_tables]] CRUD (ADR-0061 FIB operational hardening)" May 31, 2026
Add an FRR containerlab interop (M58) driving the runtime [[fib_tables]] CRUD
surface end-to-end against a real kernel: SetFibTable adds a table at runtime,
a table_id/metric change is a key-move (old kernel rows withdraw, new install),
the post-CRUD set persists across a restart, DeleteFibTable withdraws only its
own rows, and a missing-name delete returns NOT_FOUND. Companion to M42, which
covers the startup FIB path; wired into the kernel-dataplane CI matrix and
docs/milestones.md. Verified locally: 24/24 assertions pass.

The config is mounted as a read-only template and copied to a writable
container-local path at startup so the atomic config persist (temp + rename)
works — a single-file read-write bind mount would EBUSY on the rename.
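
The temp + rename persist pattern referred to here can be sketched with the standard library alone. `persist_atomically` and the `.tmp` suffix are hypothetical choices for illustration; the daemon's real persistence code is not shown on this page. The rename replaces the target path itself, which is why the target cannot be a bind-mounted file.

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Write `contents` to a sibling temp file, flush it, then rename it over
/// `path`, so readers see either the old file or the new one in full.
pub fn persist_atomically(path: &Path, contents: &str) -> std::io::Result<()> {
    // Keep the temp file in the same directory so the rename stays on one
    // filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents.as_bytes())?;
        file.sync_all()?; // push the data to disk before the rename
    }

    // Replaces `path` in one step; this is the call that would fail if
    // `path` were itself a mount point.
    fs::rename(&tmp, path)
}
```

Copying the read-only template to a container-local directory gives the rename an ordinary file to replace.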

Also drop the un-mergeable PR-ordinal references from the fib_table_control
module doc (name by function, not by the PR that introduced it).
@lance0 lance0 merged commit b3bd9c1 into main May 31, 2026
46 checks passed


2 participants