Feature Request: Show cache hit rate percentage
Currently, cache read tokens are tracked, but the cache hit rate isn't displayed. Showing it would help users understand how effective prefix caching is.
Proposed Implementation
Add cache hit rate to the /savings output:
Cache Hit Rate: 26.3%
Cached tokens: 1,234,567
Fresh tokens: 3,456,789
Why This Is Useful
- Diagnose cache effectiveness - Is prefix caching working as expected?
- Spot issues - A low hit rate might indicate a problem, such as a prompt prefix that changes between requests
- Encourage longer conversations - The hit rate tends to rise as more context is reused across turns
Implementation
Calculate:
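cacheHitRate = cacheReadTokens / totalInputTokens × 100
(totalInputTokens is assumed here to be cached tokens plus fresh tokens; report 0% when it is zero)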
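A minimal sketch of the calculation, in Python for illustration only. The function names `cache_hit_rate` and `format_cache_stats` are hypothetical and not part of the existing codebase. It assumes total input tokens equal cached tokens plus fresh tokens, and it guards against division by zero when no tokens have been recorded yet.

```python
def cache_hit_rate(cache_read_tokens: int, fresh_tokens: int) -> float:
    """Return the cache hit rate as a percentage of all input tokens.

    Assumption: total input tokens = cached (read) tokens + fresh tokens.
    """
    total_input_tokens = cache_read_tokens + fresh_tokens
    if total_input_tokens == 0:
        # No input recorded yet; avoid dividing by zero.
        return 0.0
    return cache_read_tokens / total_input_tokens * 100


def format_cache_stats(cache_read_tokens: int, fresh_tokens: int) -> str:
    """Build the three lines proposed for the /savings output."""
    rate = cache_hit_rate(cache_read_tokens, fresh_tokens)
    return "\n".join([
        f"Cache Hit Rate: {rate:.1f}%",
        f"Cached tokens: {cache_read_tokens:,}",
        f"Fresh tokens: {fresh_tokens:,}",
    ])


if __name__ == "__main__":
    print(format_cache_stats(1_234_567, 3_456_789))
```

If the tool already tracks a separate total that includes other token types (for example cache write tokens), the denominator would need to change accordingly.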
Priority
Low (nice to have)