Background
cache_response is a stop-event builtin that persists an agent's final assistant message into the agent's response cache. Unlike most builtins it does not live in pkg/hooks/builtins — it's defined in pkg/runtime/cache.go as a method on *LocalRuntime and registered in NewLocalRuntime via:
hooksRegistry.RegisterBuiltin(BuiltinCacheResponse, r.cacheResponseBuiltin)
The same file also contains a replay path (tryReplayCachedResponse) that is hard-wired into the run loop and is not a hook.
Current coupling
- The hook captures *LocalRuntime so it can call r.team.Agent(in.AgentName).Cache().
- A separate, runtime-private applyCacheDefault function appends a cache_response stop hook to every agent that has a cache configured, parallel to builtins.ApplyAgentDefaults.
- The replay leg (tryReplayCachedResponse) is not expressible as a hook today: there's no event whose output can short-circuit a turn with a synthetic assistant response. So the runtime hard-wires it inline.
The result: caching is half a hook (the write leg) and half runtime plumbing (the replay leg, the auto-injection helper, the closure on *LocalRuntime). The builtin can't move to pkg/hooks/builtins without solving both.
Isolation grade
D — captures *LocalRuntime directly, cannot move out of pkg/runtime as-is.
Proposed fixes
1. Inject a narrow AgentLookup at registration time
Both legs of the asymmetry come from the same root cause: builtins have no way to resolve an agent (or its cache, or its models) from Input.AgentName. Today the only escape hatch is "capture *LocalRuntime in a closure", which forces the builtin to live in the runtime package.
Introduce a minimal interface that the runtime hands to the registry at construction time:
type AgentLookup interface {
    Agent(name string) (AgentView, error)
}

type AgentView interface {
    Cache() cache.Cache                    // for cache_response
    ConfiguredModels() []provider.Provider // for the sibling unload builtin (not yet tracked)
    // ... only what builtins legitimately need
}
builtins.Register(r, AgentLookup{...}) then closes over the lookup, not over *LocalRuntime. cache_response moves to pkg/hooks/builtins/cache_response.go alongside the other builtins. The same seam fixes the sibling unload builtin (not yet tracked) and would let strip_unsupported_modalities collapse into a regular before_llm_call builtin.
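As a rough illustration of the seam, the sketch below shows a stop-event handler that closes over an AgentLookup rather than the runtime. Everything beyond the AgentLookup and AgentView names is an assumption made for the example: the Cache methods (Store, Lookup), the Input fields, the Handler type, the NewCacheResponse constructor, and the in-memory fakes are hypothetical stand-ins, not the project's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// Cache is a stand-in for cache.Cache; these two methods are assumed.
type Cache interface {
	Store(key, value string)
	Lookup(key string) (string, bool)
}

// AgentView exposes only what the builtin needs (model accessors omitted).
type AgentView interface {
	Cache() Cache // nil when the agent has no cache configured
}

// AgentLookup resolves an agent by name, as in the proposal.
type AgentLookup interface {
	Agent(name string) (AgentView, error)
}

// Input is a hypothetical subset of the hook input.
type Input struct {
	AgentName        string
	UserPrompt       string
	AssistantMessage string
}

// Handler is a hypothetical hook handler signature.
type Handler func(in Input) error

// NewCacheResponse builds the stop handler. It closes over the lookup only,
// so nothing here depends on the runtime package.
func NewCacheResponse(lookup AgentLookup) Handler {
	return func(in Input) error {
		agent, err := lookup.Agent(in.AgentName)
		if err != nil {
			return err
		}
		c := agent.Cache()
		if c == nil {
			return nil // agent has no cache: nothing to persist
		}
		c.Store(in.UserPrompt, in.AssistantMessage)
		return nil
	}
}

// In-memory fakes standing in for the team and its agents.
type mapCache map[string]string

func (m mapCache) Store(k, v string)              { m[k] = v }
func (m mapCache) Lookup(k string) (string, bool) { v, ok := m[k]; return v, ok }

type fakeAgent struct{ cache Cache }

func (a fakeAgent) Cache() Cache { return a.cache }

type fakeTeam map[string]fakeAgent

func (t fakeTeam) Agent(name string) (AgentView, error) {
	a, ok := t[name]
	if !ok {
		return nil, errors.New("unknown agent: " + name)
	}
	return a, nil
}

func main() {
	team := fakeTeam{"root": {cache: mapCache{}}}
	hook := NewCacheResponse(team)
	err := hook(Input{AgentName: "root", UserPrompt: "hi", AssistantMessage: "hello"})
	got, _ := team["root"].cache.Lookup("hi")
	fmt.Println(err, got)
}
```

Because the handler depends only on the two small interfaces, it can be tested with fakes like the ones above, which is the practical payoff of moving it out of pkg/runtime.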
2. Move auto-injection to the builtin
applyCacheDefault currently lives in the runtime and bolts itself onto buildHooksExecutors after ApplyAgentDefaults. Once the cache builtin is in pkg/hooks/builtins, its auto-injection rule moves there too — preferably via the per-builtin AutoInject mechanism proposed in #2701 so it's not yet another central switch.
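The design in #2701 is not described here, so the following is only a guess at what a per-builtin auto-injection rule could look like: each builtin carries its own predicate, and a generic loop replaces the cache-specific applyCacheDefault. The Builtin struct, its AutoInject field signature, AgentInfo, and DefaultHooks are all hypothetical names invented for this sketch.

```go
package main

import "fmt"

// AgentInfo is a hypothetical, minimal view of an agent's configuration.
type AgentInfo struct {
	Name     string
	HasCache bool
}

// Builtin pairs a builtin's name and event with an optional auto-injection
// rule (assumed shape; not taken from #2701).
type Builtin struct {
	Name       string
	Event      string
	AutoInject func(AgentInfo) bool // nil means "never auto-injected"
}

// DefaultHooks returns, per event, the builtins whose rule matches the agent.
// No builtin-specific branch is needed in this central loop.
func DefaultHooks(builtins []Builtin, a AgentInfo) map[string][]string {
	out := map[string][]string{}
	for _, b := range builtins {
		if b.AutoInject != nil && b.AutoInject(a) {
			out[b.Event] = append(out[b.Event], b.Name)
		}
	}
	return out
}

func main() {
	builtins := []Builtin{
		{
			Name:       "cache_response",
			Event:      "stop",
			AutoInject: func(a AgentInfo) bool { return a.HasCache },
		},
		{Name: "manual_only", Event: "stop"}, // no rule: explicit config only
	}
	fmt.Println(DefaultHooks(builtins, AgentInfo{Name: "root", HasCache: true}))
	fmt.Println(DefaultHooks(builtins, AgentInfo{Name: "plain"}))
}
```

The point of the shape is that the rule "inject when the agent has a cache" sits next to the cache builtin itself instead of in a runtime-side helper.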
3. Add a pre_user_turn (or equivalent) hook event for the replay leg
The replay path returns a cached assistant message instead of calling the model. No existing hook event can do that. Two options:
a. Synthetic-response output on user_prompt_submit. Extend HookSpecificOutput with SyntheticResponse string. When set, the runtime records that string as the assistant message, fires stop, and skips the model call entirely for that turn. cache_response then has both legs in one builtin: cache write on stop, cache replay on user_prompt_submit.
b. Dedicated pre_user_turn event. New event that fires after the user message is added to the session and before any model call, whose Output.SyntheticResponse short-circuits the turn. Slightly cleaner separation than overloading user_prompt_submit, but adds a new event.
Either way, the runtime's tryReplayCachedResponse becomes generic — it just honors whatever a hook returns — and cache_response is the only code that knows about caches.
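To make the short-circuit concrete, here is a minimal sketch of a turn loop that honors a synthetic response, independent of whether the event ends up being user_prompt_submit or pre_user_turn. Only the SyntheticResponse field name comes from the proposal; HookOutput, PreTurnHook, StopHook, and RunTurn are hypothetical simplifications, and the cache is a plain map keyed by prompt.

```go
package main

import "fmt"

// HookOutput is a simplified stand-in for the hook output type.
type HookOutput struct {
	SyntheticResponse string // non-empty means "use this instead of the model"
}

// PreTurnHook runs before any model call; StopHook runs once the turn ends.
type PreTurnHook func(agent, prompt string) HookOutput
type StopHook func(agent, prompt, reply string)

// RunTurn knows nothing about caching: it takes the first synthetic response
// a hook returns, otherwise calls the model, then fires stop hooks either way.
func RunTurn(agent, prompt string, pre []PreTurnHook, stop []StopHook,
	callModel func(prompt string) string) string {
	reply := ""
	for _, h := range pre {
		if out := h(agent, prompt); out.SyntheticResponse != "" {
			reply = out.SyntheticResponse
			break
		}
	}
	if reply == "" {
		reply = callModel(prompt)
	}
	for _, h := range stop {
		h(agent, prompt, reply)
	}
	return reply
}

func main() {
	cache := map[string]string{}
	// Both legs of the cache feature, expressed purely as hooks.
	replay := func(_, prompt string) HookOutput {
		return HookOutput{SyntheticResponse: cache[prompt]}
	}
	write := func(_, prompt, reply string) { cache[prompt] = reply }

	calls := 0
	model := func(p string) string { calls++; return "echo: " + p }

	for i := 0; i < 2; i++ {
		fmt.Println(RunTurn("root", "hi", []PreTurnHook{replay}, []StopHook{write}, model))
	}
	fmt.Println("model calls:", calls)
}
```

Two simplifications are worth noting. Because stop still fires on a replayed turn, the write hook re-stores the same value, which is harmless here but may matter for a real cache. Using the empty string as "no synthetic response" also means an empty cached reply cannot be replayed; a real design might prefer a pointer or an explicit flag.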
Recommendation
Land (1) first — it's the minimum to move cache_response (and unload) into pkg/hooks/builtins. Then (2) to retire applyCacheDefault. Then (3) when there's appetite to also retire tryReplayCachedResponse.
After all three, the runtime has no code that mentions caching. The cache builtin is the single source of truth for both directions of the feature.
Acceptance criteria
- cache_response lives in pkg/hooks/builtins, not in pkg/runtime.
- An AgentLookup-style seam at registration time; cache_response consumes it.
- applyCacheDefault is removed (or moved next to the builtin).
- tryReplayCachedResponse is removed in favour of a synthetic-response hook output.
- The builtin no longer captures *LocalRuntime.