feat(kad-dht): add RoutingTableDiagnostics for routing health inspection by bhuvan-somisetty · Pull Request #1332 · libp2p/py-libp2p

bhuvan-somisetty · 2026-05-16T16:25:00Z

What problem does this solve?

When a KadDHT node starts misbehaving - slow lookups, failed bootstraps, keys that can never be found - operators are flying blind. The routing table is a black box. The only way to debug it today is to sprinkle print statements through the Kademlia internals or stare at raw bucket lists.

This PR adds a first-class diagnostic surface so you can answer the real questions in seconds, not hours:

Which k-buckets are under-populated or empty?
Where are the keyspace coverage gaps that explain why certain keys can't be found?
How fresh are the known peers? Are half of them stale?
What is the overall routing-table health as a single score I can log or alert on?

What was added

`libp2p/kad_dht/diagnostics.py` - core engine

A new RoutingTableDiagnostics class that analyses a live routing table and produces a RoutingTableReport. The analyser is read-only - it never modifies routing table state.

from libp2p.kad_dht import KadDHT

# `host` and `mode` are created elsewhere, as for any KadDHT node
dht = KadDHT(host, mode)

report = dht.get_diagnostics().analyse()
print(report.summary())
print(f"Health score: {report.health_score}/100  ({report.verdict})")

Sample output:

=== KadDHT Routing Table Report ===
Timestamp       : 2026-05-15 23:29:54
Local peer      : 12D3KooWAbcd1234…
Health score    : 73.4/100  (good)

Peers           : 18
Buckets         : 4/6 populated
Keyspace cover  : 61.2%
Coverage gaps   : 2

Peer freshness
  Fresh  (<1 h) : 14
  Aging (1–12 h): 3
  Stale (12–24h): 1
  Very stale    : 0

Top coverage gaps (first 3):
  bucket #5: 00000000…–1fffffff… (0 peers)
  bucket #4: 20000000…–3fffffff… (1 peers)

Health score breakdown (0–100, composite):

| Component | Weight | Notes |
| --- | --- | --- |
| Fill score | 40 pts | Weighted by proximity - nearest buckets count more |
| Freshness score | 35 pts | Ratio of peers seen in the last hour |
| Coverage score | 25 pts | Fraction of the 256-bit keyspace covered |

The weighting reflects Kademlia reality: a full bucket closest to the local node is worth far more than a full bucket at the far end of the keyspace.
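
To make the breakdown concrete, here is a minimal sketch of how a 40/35/25 composite could be computed. It is not the code in `diagnostics.py`: the `BucketFill` type, the linear proximity weights, the bucket size of 20, and the convention that index 0 is the nearest bucket are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketFill:
    """Hypothetical per-bucket input; index 0 is assumed to be nearest the local node."""

    peers: int
    capacity: int = 20  # assumed k-bucket size


def fill_score(buckets: list[BucketFill]) -> float:
    """Proximity-weighted fill ratio in [0, 1]; nearer buckets get larger weights."""
    if not buckets:
        return 0.0
    n = len(buckets)
    # Assumed linear weights: the nearest bucket weighs n, the farthest weighs 1.
    weights = [n - i for i in range(n)]
    filled = sum(
        w * min(b.peers / b.capacity, 1.0) for w, b in zip(weights, buckets)
    )
    return filled / sum(weights)


def health_score(fill: float, freshness: float, coverage: float) -> float:
    """Combine three ratios in [0, 1] using the 40/35/25 point weights."""
    return round(40 * fill + 35 * freshness + 25 * coverage, 1)
```

With these assumed weights, a table whose only full bucket is the nearest one gets a higher fill score than a table whose only full bucket is the farthest one, which is the property the weighting is meant to capture.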

Convenience entry points

# Via KadDHT instance
report = dht.get_diagnostics().analyse()

# Via RoutingTable directly
report = dht.routing_table.get_diagnostics().analyse()

# Partial queries (no full report allocation)
score  = dht.get_diagnostics().get_health_score()
gaps   = dht.get_diagnostics().get_coverage_gaps()
fresh  = dht.get_diagnostics().get_freshness_distribution()
stats  = dht.get_diagnostics().get_bucket_stats()
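
Since the score is meant to be something you can log or alert on, a caller-side check might look roughly like the sketch below. Only `get_health_score()` comes from this PR; `check_routing_health`, the threshold of 50, the assumption that the score is a float, and the `FakeDiagnostics` stand-in are hypothetical and exist so the snippet runs without a live DHT.

```python
import logging
from typing import Protocol

logger = logging.getLogger("kad_dht.health")


class HealthSource(Protocol):
    """Anything exposing get_health_score(); the return type is assumed to be float."""

    def get_health_score(self) -> float: ...


def check_routing_health(diagnostics: HealthSource, threshold: float = 50.0) -> bool:
    """Log the score and return False when it falls below the (assumed) threshold."""
    score = diagnostics.get_health_score()
    if score < threshold:
        logger.warning(
            "routing table health %.1f below threshold %.1f", score, threshold
        )
        return False
    logger.info("routing table health %.1f", score)
    return True


class FakeDiagnostics:
    """Stand-in for dht.get_diagnostics() so the sketch runs offline."""

    def __init__(self, score: float) -> None:
        self._score = score

    def get_health_score(self) -> float:
        return self._score
```

In a real node the call would presumably be `check_routing_health(dht.get_diagnostics())`, run on whatever schedule the caller chooses.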

Public types exported from `libp2p.kad_dht`

from libp2p.kad_dht import (
    RoutingTableDiagnostics,
    RoutingTableReport,
    BucketStat,
    CoverageGap,
    FreshnessDistribution,
)
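
As a rough illustration of what a coverage-gap record and the coverage figure could involve, the sketch below works on plain `(lower, upper, peer_count)` tuples. `GapSketch`, `find_gaps`, `coverage_fraction`, and the `min_peers=2` cut-off are hypothetical and are not the PR's `CoverageGap` or its detection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapSketch:
    """Hypothetical stand-in for a coverage-gap record; not the PR's CoverageGap."""

    bucket_index: int
    lower: int  # inclusive lower bound of the bucket's ID range
    upper: int  # inclusive upper bound of the bucket's ID range
    peer_count: int


def find_gaps(
    buckets: list[tuple[int, int, int]], min_peers: int = 2
) -> list[GapSketch]:
    """Return buckets holding fewer than min_peers peers, emptiest first."""
    gaps = [
        GapSketch(i, lower, upper, count)
        for i, (lower, upper, count) in enumerate(buckets)
        if count < min_peers
    ]
    # sorted() is stable, so buckets with equal counts keep their original order.
    return sorted(gaps, key=lambda g: g.peer_count)


def coverage_fraction(buckets: list[tuple[int, int, int]], bits: int = 256) -> float:
    """Fraction of the ID space that falls in buckets with at least one peer."""
    covered = sum(upper - lower + 1 for lower, upper, count in buckets if count > 0)
    return covered / (1 << bits)
```

Passing a small `bits` value, such as 8, makes the arithmetic easy to check by hand before applying it to the full 256-bit keyspace.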

Tests

tests/core/kad_dht/test_routing_table_diagnostics.py - 27 unit tests, fully offline (mock host, no network required). Covers:

BucketStat properties (is_full, is_empty, health)
FreshnessDistribution ratios and totals
RoutingTableDiagnostics with empty tables, single-peer tables, and multi-bucket tables
Coverage gap detection and ordering (emptiest first)
Health score components individually and composite
RoutingTableReport.summary() output
Freshness time-band boundaries (exactly at 1 h, 12 h, 24 h)
Stale peer counting in bucket stats
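
The time-band boundaries are where off-by-one mistakes tend to hide, so a small sketch of the banding may help. The band names follow the sample report; the half-open intervals (an age of exactly 1 h counts as aging) and the function names are assumptions, not necessarily what the PR's tests assert.

```python
from collections import Counter

HOUR = 3600.0


def freshness_band(age_seconds: float) -> str:
    """Classify a peer by age; bands are assumed half-open: [0, 1h), [1h, 12h), [12h, 24h)."""
    if age_seconds < 1 * HOUR:
        return "fresh"
    if age_seconds < 12 * HOUR:
        return "aging"
    if age_seconds < 24 * HOUR:
        return "stale"
    return "very_stale"


def freshness_distribution(last_seen: list[float], now: float) -> dict[str, int]:
    """Count peers per band from last-seen timestamps given in seconds."""
    return dict(Counter(freshness_band(now - seen) for seen in last_seen))
```

Passing `now` explicitly, instead of calling `time.time()` inside the function, keeps the boundary cases deterministic in tests.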

Example

examples/kademlia/routing_table_diagnostics.py - two-node runnable demo:

# Terminal 1 - bootstrap node
python examples/kademlia/routing_table_diagnostics.py --port 8888 --mode server

# Terminal 2 - client (connects, warms up 5s, prints full report)
python examples/kademlia/routing_table_diagnostics.py \
    --port 9999 --mode server \
    --bootstrap /ip4/127.0.0.1/tcp/8888/p2p/<PeerID>

Non-goals / out of scope

No metrics export (Prometheus, OpenTelemetry) - that belongs in a separate PR
No periodic background task - callers decide when to run diagnostics
No routing table modification - purely observational

Checklist

New module libp2p/kad_dht/diagnostics.py with full docstrings
RoutingTable.get_diagnostics() convenience factory
KadDHT.get_diagnostics() convenience factory
All new public types exported in libp2p/kad_dht/__init__.py
27 offline unit tests
Runnable example in examples/kademlia/
No changes to routing table logic — read-only analyser
TYPE_CHECKING guard to avoid circular imports

When a KadDHT node misbehaves - slow lookups, unreachable keys, failed bootstraps - operators previously had no visibility into why. This commit adds a first-class diagnostic surface that answers the core questions:

• Which k-buckets are under-populated or empty?
• Where are the keyspace coverage gaps?
• How fresh are my known peers? (fresh / aging / stale / very stale)
• What is the overall routing-table health as a single 0–100 score?

Changes:

libp2p/kad_dht/diagnostics.py
    New RoutingTableDiagnostics class (read-only analyser). Produces a RoutingTableReport with BucketStat list, CoverageGap list, FreshnessDistribution, composite health score, and human-readable summary.

libp2p/kad_dht/routing_table.py
    Add RoutingTable.get_diagnostics() convenience factory.

libp2p/kad_dht/kad_dht.py
    Add KadDHT.get_diagnostics() convenience factory.

libp2p/kad_dht/__init__.py
    Export all new public types.

tests/core/kad_dht/test_routing_table_diagnostics.py
    27 unit tests; fully offline (mock host, no network required).

examples/kademlia/routing_table_diagnostics.py
    Two-node demo that prints a full report after bootstrapping.

Usage:

    report = dht.get_diagnostics().analyse()
    print(report.summary())
    print(f"Health score: {report.health_score}/100")

bhuvan-somisetty wants to merge 1 commit into libp2p:main from bhuvan-somisetty:feat/kad-dht-routing-table-diagnostics-clean
