KV Cache Phase 1: instrument cache-hit metrics, prefix divergence, and warm/cold latency #46

@shuyhere

Summary

Implement request-level cache observability so BB-Agent can measure cache reuse directly and make cache-hit rate the primary optimization target.

Why

The KV cache refactor should be driven by measurable outcomes, not architecture changes alone. We already store provider cache_read_tokens / cache_write_tokens, but we are missing the request-level telemetry needed to answer:

  • what changed between two consecutive requests,
  • how much stable prefix was reused,
  • whether a turn was warm or cold,
  • whether compaction or tool waits caused divergence,
  • whether follow-up TTFT improves after cache-friendly refactors.
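The first two questions can be approximated by comparing the serialized bytes of consecutive requests. The following is a minimal sketch, not the project's implementation: `Divergence`, `diff_requests`, and the 4-bytes-per-token heuristic are all hypothetical names and assumptions introduced here for illustration.

```rust
/// Assumed heuristic: roughly 4 bytes per token. A real tokenizer will differ.
const BYTES_PER_TOKEN_ESTIMATE: usize = 4;

/// Result of comparing two consecutive serialized requests (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Divergence {
    /// Offset of the first differing byte, or `None` if the requests are identical.
    pub first_divergence_byte: Option<usize>,
    /// Number of leading bytes shared by both requests.
    pub reused_prefix_bytes: usize,
    /// Token estimate for the shared prefix, using the heuristic above.
    pub reused_prefix_tokens_est: usize,
}

/// Length in bytes of the common prefix of two byte slices.
pub fn common_prefix_len(prev: &[u8], curr: &[u8]) -> usize {
    prev.iter()
        .zip(curr.iter())
        .take_while(|(a, b)| a == b)
        .count()
}

/// Compare the previous and current request bodies.
pub fn diff_requests(prev: &[u8], curr: &[u8]) -> Divergence {
    let shared = common_prefix_len(prev, curr);
    let identical = shared == prev.len() && shared == curr.len();
    Divergence {
        first_divergence_byte: if identical { None } else { Some(shared) },
        reused_prefix_bytes: shared,
        reused_prefix_tokens_est: shared / BYTES_PER_TOKEN_ESTIMATE,
    }
}
```

When the current request only appends to the previous one, `first_divergence_byte` equals the previous request's length, which is the cache-friendly case. An earlier offset suggests that something in the prefix (system prompt, tool defs, compacted history) was rewritten.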

This is the highest-priority issue in the KV-cache plan.

Scope

Add request instrumentation for every provider call.

Required metrics

  • request/session/turn identifiers
  • provider/model
  • full request hash
  • stable prefix hash
  • system prompt hash
  • tool defs hash
  • previous request hash
  • first divergence offset (byte offset and token estimate)
  • reused prefix length estimate
  • cache read tokens
  • cache write tokens
  • input/output tokens
  • request start time
  • first stream event time
  • first text delta time
  • request finished time
  • TTFT
  • total latency
  • tool wait time
  • resume latency
  • post-compaction flag
  • mutation flags for system/context/request rewrite
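As a rough illustration of how the hash and timing fields above might fit together, here is a sketch using only the standard library. Everything in it is an assumption rather than the planned design: the type and function names are hypothetical, the "stable prefix" is taken to be system prompt plus tool defs, TTFT is taken to be the time from request start to the first text delta, and FNV-1a stands in for whatever stable hash the project chooses.

```rust
use std::time::{Duration, Instant};

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a (64-bit) over several parts. A 0x00 separator is fed after each
/// part so that ("ab", "c") and ("a", "bc") produce different hashes.
pub fn fnv1a64(parts: &[&str]) -> u64 {
    let mut h = FNV_OFFSET;
    for part in parts {
        for &b in part.as_bytes().iter().chain(std::iter::once(&0u8)) {
            h = (h ^ u64::from(b)).wrapping_mul(FNV_PRIME);
        }
    }
    h
}

/// Hypothetical per-request fingerprint covering the hash metrics.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RequestFingerprint {
    pub full_request_hash: u64,
    pub stable_prefix_hash: u64,
    pub system_prompt_hash: u64,
    pub tool_defs_hash: u64,
    pub previous_request_hash: Option<u64>,
}

impl RequestFingerprint {
    /// Inputs are assumed to be already-serialized request sections.
    pub fn new(system: &str, tools: &str, messages: &str, previous: Option<u64>) -> Self {
        Self {
            full_request_hash: fnv1a64(&[system, tools, messages]),
            // Assumption: the stable prefix is system prompt + tool defs.
            stable_prefix_hash: fnv1a64(&[system, tools]),
            system_prompt_hash: fnv1a64(&[system]),
            tool_defs_hash: fnv1a64(&[tools]),
            previous_request_hash: previous,
        }
    }
}

/// Hypothetical timing record covering the timestamp metrics.
#[derive(Debug, Clone, Copy)]
pub struct RequestTimings {
    pub request_start: Instant,
    pub first_stream_event: Option<Instant>,
    pub first_text_delta: Option<Instant>,
    pub request_finished: Option<Instant>,
}

impl RequestTimings {
    /// TTFT, assumed here to mean request start to first text delta.
    pub fn ttft(&self) -> Option<Duration> {
        self.first_text_delta
            .map(|t| t.saturating_duration_since(self.request_start))
    }

    /// Total latency from request start to request finished.
    pub fn total_latency(&self) -> Option<Duration> {
        self.request_finished
            .map(|t| t.saturating_duration_since(self.request_start))
    }
}
```

Two turns that share a system prompt and tool defs would then report the same `stable_prefix_hash` but different `full_request_hash` values, which is the comparison the fingerprints are meant to enable.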

Code touchpoints

  • crates/cli/src/turn_runner/runner.rs
  • crates/provider/src/types.rs
  • crates/provider/src/streaming.rs
  • crates/cli/src/session_info.rs
  • new instrumentation module(s)

Acceptance criteria

  • repeated turns emit comparable request fingerprints
  • metrics clearly show warm vs cold behavior
  • cache-hit proxy can be derived from cache_read_tokens and prefix reuse estimate
  • TTFT is recorded and inspectable for follow-up turns
  • compaction and hook-driven mutations are visible in telemetry
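One possible reading of the cache-hit proxy criterion is the share of the estimated reusable prefix that the provider reports as read from cache. The sketch below follows that reading only. `cache_hit_proxy`, `classify`, `CacheTemperature`, and the warm threshold are hypothetical and would need tuning against real telemetry.

```rust
/// Warm/cold label for a turn (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheTemperature {
    Cold,
    Warm,
}

/// Fraction of the estimated reusable prefix that was served from cache.
/// Returns `None` when there is no reusable prefix to compare against
/// (for example, the first request of a session).
pub fn cache_hit_proxy(cache_read_tokens: u64, reused_prefix_tokens_est: u64) -> Option<f64> {
    if reused_prefix_tokens_est == 0 {
        return None;
    }
    let ratio = cache_read_tokens as f64 / reused_prefix_tokens_est as f64;
    // The prefix length is only an estimate, so clamp to at most 1.0.
    Some(ratio.min(1.0))
}

/// Label a turn as warm when the proxy meets the threshold, cold otherwise.
/// A missing proxy is treated as cold.
pub fn classify(proxy: Option<f64>, warm_threshold: f64) -> CacheTemperature {
    match proxy {
        Some(p) if p >= warm_threshold => CacheTemperature::Warm,
        _ => CacheTemperature::Cold,
    }
}
```

For example, a turn with 800 cache read tokens against an estimated 1000 reusable prefix tokens gives a proxy of about 0.8, which `classify` would label warm at a threshold of 0.5.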

Reference

  • knowledge/internal/KV_CACHE_REFACTOR_MASTER_PLAN.md
