kvcache: skip multi-turn cache reads in decode-only mode #284

Merged
FileSystemGuy merged 1 commit into mlcommons:main from LouisDDN:ld/skip-multiturn-decode-only
Mar 24, 2026

@LouisDDN
Contributor

Skip multi-turn conversation cache reads when running in decode-only mode, since previous turn cache entries are never written in this mode.

This change:

  • Prevents wasteful cache lookups that always miss
  • Keeps the `multi_turn_cache_misses` metric free of these guaranteed misses
  • Improves code correctness by not checking a cache that was never written

The multi-turn cache read (Step 2) is now guarded by the same `if not self.decode_only` check as the prefill write (Step 3), since both operations are meaningless in decode-only mode.

Performance impact is negligible (<0.01%); the main benefit is code clarity.
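
As a rough illustration of the guard described above, here is a minimal sketch. It is not the benchmark's actual code: the class `KVCacheSketch`, the method `process_turn`, the `store` dictionary, and the hit counter are hypothetical names introduced for this example. Only `decode_only`, `multi_turn_cache_misses`, and the Step 2 / Step 3 ordering come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class KVCacheSketch:
    """Toy stand-in for the request path; every name here is hypothetical."""

    decode_only: bool = False
    store: dict = field(default_factory=dict)
    multi_turn_cache_hits: int = 0
    multi_turn_cache_misses: int = 0

    def process_turn(self, conversation_id: str, turn: int) -> None:
        # Step 2 (assumed shape): read the previous turn's cache entry.
        # In decode-only mode no entry was ever written, so the read is
        # skipped instead of being counted as a guaranteed miss.
        if not self.decode_only and turn > 0:
            if (conversation_id, turn - 1) in self.store:
                self.multi_turn_cache_hits += 1
            else:
                self.multi_turn_cache_misses += 1

        # Step 3 (assumed shape): prefill write, guarded by the same check.
        if not self.decode_only:
            self.store[(conversation_id, turn)] = b"kv-bytes"


def run(decode_only: bool, turns: int = 3) -> KVCacheSketch:
    """Drive one hypothetical conversation through the sketch."""
    sketch = KVCacheSketch(decode_only=decode_only)
    for turn in range(turns):
        sketch.process_turn("conv-0", turn)
    return sketch
```

With the guard in place, `run(decode_only=True)` leaves both counters at zero and the store empty, so the miss metric reflects only lookups that could have hit.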

@LouisDDN LouisDDN requested a review from a team March 20, 2026 15:52
@github-actions

github-actions bot commented Mar 20, 2026

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

@LouisDDN LouisDDN force-pushed the ld/skip-multiturn-decode-only branch from 984ee49 to f45de66 Compare March 20, 2026 15:55
Contributor

@hazemawadalla hazemawadalla left a comment


looks good. thank you!

@dslik
Contributor

dslik commented Mar 24, 2026

@FileSystemGuy Waiting for your approval.

@FileSystemGuy FileSystemGuy merged commit d7d52e0 into mlcommons:main Mar 24, 2026
1 check passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 24, 2026