A simulation game for testing and validating the tools_at_end feature in the qvac LLM addon. It simulates the KV cache behavior of dynamic tool placement without requiring a model — pure Python, no dependencies.
The tools_at_end feature places tool definitions at the end of prompts and trims them from the KV cache after each completion. This prevents stale tool token accumulation across multi-turn conversations.
The game ports the C++ cache logic from `qvac-lib-infer-llamacpp-llm` into Python:

- `DynamicToolsState` (tool boundary tracking)
- `removeLastNTokens` (cache trimming)
- `applyContextDiscard` (context sliding during generation)
- `CacheManager` save/load (session persistence)
- BOS tokens, generation prompt tokens, Qwen3 reasoning injection
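The core trim mechanic can be sketched in a few lines of Python. This is a minimal illustration, not the real qvac API: the names mirror the C++ identifiers above, but the shapes (a list as the KV cache, a dataclass for state) are assumptions made for clarity.

```python
from dataclasses import dataclass

@dataclass
class DynamicToolsState:
    n_past: int = 0                # tokens currently in the KV cache
    n_past_before_tools: int = -1  # first tool token; -1 means no tools present

    def mark_tool_boundary(self):
        # Call just before appending tool-definition tokens.
        self.n_past_before_tools = self.n_past

    def remove_last_n_tokens(self, cache, n):
        # Trim the last n tokens (tool defs + generated text) from the cache.
        del cache[len(cache) - n:]
        self.n_past -= n

cache = list(range(10))              # 10 conversation tokens
state = DynamicToolsState(n_past=10)
state.mark_tool_boundary()           # tools will start at position 10
cache += ["tool"] * 4                # 4 tool-definition tokens appended
state.n_past += 4
# After generation, trim everything from the tool boundary onward:
state.remove_last_n_tokens(cache, state.n_past - state.n_past_before_tools)
print(len(cache), state.n_past)  # 10 10
```

The key invariant is that trimming always restores `n_past` to the value recorded at the tool boundary, so no tool tokens survive into the next turn.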
```
python3 dynamic-tools-game.py
```
```
python3 dynamic-tools-game.py test
```

Runs 21 test scenarios with 124 assertions covering:
- Single/multi-turn with changing tools
- Session save/reload/switch
- Context sliding with tools
- Qwen3 reasoning token injection
- Stop with EOT token
- Stateless (no-session) prompts
- BOS and generation prompt token handling
- Sliding that would eat into tool tokens (regression)
```
python3 dynamic-tools-game.py play
```

Configure context parameters at startup, then play turn by turn:
```
=== Dynamic Tools KV Cache Game ===
Configure context parameters (press Enter for defaults):
  Context size (n_ctx) [2048]:
  Tokens to discard on slide (n_discarded) [256]:
  BOS tokens [1]:
  Gen prompt tokens [3]:
```
Actions (single-letter shortcuts):

- `u` - add user message
- `a` - add assistant message
- `s` - add system message
- `t` - add tool response (result from tool execution)
- `T` - add Tool definitions (weather/search/email/calc/custom)
- `g` - generate (eval + generate + trim)
- `S` - save session
- `l` - load/switch session
- `d` - show detailed cache contents
- `r` - reset state
- `q` - quit
Add multiple tools at once with first-letter shortcuts:
```
Tools: w s e    # adds weather + search + email
```
```
# Turn 1: user asks question
s 10    # system message, 10 tokens
u 15    # user message, 15 tokens
T w s   # add weather + search tools
g 20    # generate 20 tokens -> trims tools + generated

# Turn 2: tool response (model called a tool)
t 30    # tool response, 30 tokens
T w s   # same tools (model might call another)
g 15    # generate -> trim

# Turn 3: final answer (no tool call), then new user question
a 10    # assistant's final answer
u 20    # new user question
T e c   # new tools for new question
g 25    # generate -> trim
```
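Turn 1 of the session above can be traced by hand. This sketch ignores BOS and generation-prompt tokens, and the 40-token cost for the two tool definitions is made up for illustration:

```python
# Tracing Turn 1: s 10, u 15, T w s, g 20.
n_past = 10 + 15        # system (10) + user (15) conversation tokens
before_tools = n_past   # `T w s` marks the trim boundary at 25
n_past += 40            # tool-definition tokens appended at the end (illustrative size)
n_past += 20            # g 20: generated tokens
n_past = before_tools   # trim removes tools + generation together
print(n_past)           # 25 conversation tokens remain in the cache
```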
After each generation, the game:
- Shows a color-coded KV cache diagram
- Reports `nPast`, `firstMsgTokens`, `nPastBeforeTools`, `nSlides`
- Validates no stale tool tokens leaked into the cache
- Tracks your score (turns without violations)
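The staleness check can be sketched as a whitelist over cache spans. The `(kind, count)` span representation here is an assumption made for illustration, not the game's internal format:

```python
def validate_no_stale_tools(cache_spans):
    # After trimming, only conversation and bookkeeping tokens may remain;
    # any surviving tool-definition span is a violation.
    allowed = {"system", "user", "assistant", "tool_response", "bos", "gen_prompt"}
    return all(kind in allowed for kind, _ in cache_spans)

clean = validate_no_stale_tools([("system", 10), ("user", 15)])        # True
leaked = validate_no_stale_tools([("user", 15), ("tool_defs", 40)])    # False
print(clean, leaked)
```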
```
python3 dynamic-tools-game.py auto
```

Runs an infinite agentic loop built on first principles:
- Tools are always at the end, always trimmed after generation
- After tool call: add tool response + same tools
- After final answer: add assistant response + user message + new tools
Randomly samples token counts, tool sets, and tool-call vs final-answer decisions. Validates after every turn that the cache contains only expected tokens (system, user, assistant, tool responses). Runs until context overflow or a bug is found.
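The shape of that loop can be sketched as follows. The function name and token ranges are illustrative, not the game's real API; the point is the per-turn invariant that the cache holds only conversation tokens after each trim:

```python
import random

def auto_loop(seed=42, turns=20):
    rng = random.Random(seed)  # seeded for reproducibility, as in auto mode
    n_past = 0                 # tokens in the simulated KV cache
    conversation = 0           # tokens that should survive every trim
    for _ in range(turns):
        msg = rng.randint(5, 40)          # user message or tool response
        n_past += msg
        conversation += msg
        before_tools = n_past             # tool boundary recorded here
        n_past += rng.randint(20, 80)     # tool-definition tokens
        n_past += rng.randint(5, 30)      # generated tokens
        n_past = before_tools             # trim tools + generation
        assert n_past == conversation     # invariant: no stale tool tokens
    return n_past

final = auto_loop()
```

A real run also handles context sliding and overflow, which this sketch omits.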
```
=== Dynamic Tools — Infinite Agentic Loop ===
Configure (press Enter for defaults):
  Context size (n_ctx) [2048]:
  Tokens to discard on slide (n_discarded) [256]:
  Random seed (0=random) [0]: 42
```
The simulation found two real bugs in the C++ implementation:
- Context sliding during generation doesn't adjust `nPastBeforeTools`: when sliding shifts tokens down, the tool trim boundary becomes stale, leaving tool tokens in the cache after trim.
- Sliding can eat into tool tokens: when `nDiscarded` exceeds the number of conversation tokens between `firstMsgTokens` and the tool boundary, the discard removes tool tokens instead of just conversation tokens.
Both bugs have been fixed in the game and in the C++ code (`TextLlmContext.cpp`, `MtmdLlmContext.cpp`).
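Both fixes can be expressed in one small function. This is a simplified Python sketch of the logic described above, not the actual C++ from `TextLlmContext.cpp`: the discard is clamped so it never reaches tool tokens (bug 2), and the tool boundary shifts down with the rest of the cache (bug 1).

```python
def apply_context_discard(n_past, n_past_before_tools, first_msg_tokens, n_discarded):
    if n_past_before_tools >= 0:
        # Bug 2 fix: only conversation tokens between firstMsgTokens and the
        # tool boundary may be discarded; clamp so tools are never eaten.
        n_discarded = min(n_discarded, n_past_before_tools - first_msg_tokens)
    n_past -= n_discarded
    if n_past_before_tools >= 0:
        # Bug 1 fix: the trim boundary moves down with the slid tokens,
        # so the post-generation trim still lands exactly on the tools.
        n_past_before_tools -= n_discarded
    return n_past, n_past_before_tools

# 100 tokens in cache, tools start at 80, first message ends at 10:
# a requested discard of 256 is clamped to 70, and the boundary follows.
with_tools = apply_context_discard(100, 80, 10, 256)   # (30, 10)
# With no tools present (-1), the discard applies unclamped.
no_tools = apply_context_discard(100, -1, 10, 30)      # (70, -1)
print(with_tools, no_tools)
```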