Show cache hit rate percentage #2

@WynnD

Feature Request: Show cache hit rate percentage

Currently, cache read tokens are tracked, but the cache hit rate isn't displayed. Showing it would help users understand how effective prefix caching is.

Proposed Implementation

Add cache hit rate to the /savings output:

Cache Hit Rate: 26.3%
  Cached tokens: 1,234,567
  Fresh tokens: 3,456,789

Why This Is Useful

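  1. Diagnose cache effectiveness - Is prefix caching working as expected?
  2. Spot issues - A low hit rate might indicate a problem, e.g. a prompt prefix that changes between requests and so cannot be reused
  3. Encourage longer conversations - The hit rate tends to rise as more of the existing context is reused from turn to turn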

Implementation

Calculate:

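cacheHitRate = cacheReadTokens / totalInputTokens × 100

Here totalInputTokens is taken to mean cached tokens plus fresh tokens, so the rate stays between 0% and 100%. If no input tokens have been recorded yet, the rate is undefined and should be shown as "n/a" rather than computed.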
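
A minimal sketch of the calculation and the proposed output lines, in TypeScript. The `TokenUsage` shape, the `freshInputTokens` field, and both function names are hypothetical and not taken from the project. It assumes the total input is the sum of cached and fresh tokens.

```typescript
// Hypothetical shape: these field names are assumptions, not the project's actual types.
interface TokenUsage {
  cacheReadTokens: number;  // input tokens served from the prefix cache
  freshInputTokens: number; // input tokens that were not served from the cache
}

// Returns the hit rate as a percentage, or null when no input has been recorded.
function cacheHitRate(usage: TokenUsage): number | null {
  const totalInputTokens = usage.cacheReadTokens + usage.freshInputTokens;
  if (totalInputTokens === 0) return null; // avoid dividing by zero
  return (usage.cacheReadTokens / totalInputTokens) * 100;
}

// Renders the three lines proposed for the /savings output.
function formatCacheHitRate(usage: TokenUsage): string {
  const rate = cacheHitRate(usage);
  const rateText = rate === null ? "n/a" : `${rate.toFixed(1)}%`;
  const withCommas = (n: number): string => n.toLocaleString("en-US");
  return [
    `Cache Hit Rate: ${rateText}`,
    `  Cached tokens: ${withCommas(usage.cacheReadTokens)}`,
    `  Fresh tokens: ${withCommas(usage.freshInputTokens)}`,
  ].join("\n");
}

console.log(formatCacheHitRate({ cacheReadTokens: 1234567, freshInputTokens: 3456789 }));
```

For 1,234,567 cached and 3,456,789 fresh tokens, this yields a rate of 26.3%. Returning `null` for an empty session keeps the display from showing `NaN%` before any request has been made.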

Priority

Low (nice to have)

Labels: enhancement (New feature or request)