refactor(hooks): move cache_response out of pkg/runtime via an AgentLookup injection seam #2702

@dgageot

Background

cache_response is a stop-event builtin that persists an agent's final assistant message into the agent's response cache. Unlike most builtins it does not live in pkg/hooks/builtins — it's defined in pkg/runtime/cache.go as a method on *LocalRuntime and registered in NewLocalRuntime via:

hooksRegistry.RegisterBuiltin(BuiltinCacheResponse, r.cacheResponseBuiltin)

The same file also contains a replay path (tryReplayCachedResponse) that is hard-wired into the run loop and is not a hook.

Current coupling

  • The hook captures *LocalRuntime so it can call r.team.Agent(in.AgentName).Cache().
  • A separate, runtime-private applyCacheDefault function appends a cache_response stop hook to every agent that has a cache configured — parallel to builtins.ApplyAgentDefaults.
  • The replay leg (tryReplayCachedResponse) is not expressible as a hook today: there's no event whose output can short-circuit a turn with a synthetic assistant response. So the runtime hard-wires it inline.

The result: caching is half a hook (the write leg) and half runtime plumbing (the replay leg, the auto-injection helper, the closure on *LocalRuntime). The builtin can't move to pkg/hooks/builtins without solving both.

Isolation grade

D — captures *LocalRuntime directly, cannot move out of pkg/runtime as-is.

Proposed fixes

1. Inject a narrow AgentLookup at registration time

Both legs of the asymmetry come from the same root cause: builtins have no way to resolve an agent (or its cache, or its models) from Input.AgentName. Today the only escape hatch is "capture *LocalRuntime in a closure", which forces the builtin to live in the runtime package.

Introduce a minimal interface that the runtime hands to the registry at construction time:

type AgentLookup interface {
    Agent(name string) (AgentView, error)
}

type AgentView interface {
    Cache() cache.Cache                    // for cache_response
    ConfiguredModels() []provider.Provider // for the sibling unload builtin (not yet tracked)
    // ... only what builtins legitimately need
}

builtins.Register(r, AgentLookup{...}) then closes over the lookup, not over *LocalRuntime. cache_response moves to pkg/hooks/builtins/cache_response.go alongside the other builtins. The same seam fixes the sibling unload builtin (not yet tracked) and would let strip_unsupported_modalities collapse into a regular before_llm_call builtin.
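A minimal, self-contained sketch of how the write leg might look once it closes over a lookup instead of the runtime. Everything except the AgentLookup / AgentView shape is an assumption made for illustration: the Cache method set (Store, Lookup), the Input fields Prompt and LastMessage, the BuiltinFunc signature, and the NewCacheResponse constructor are hypothetical stand-ins, not the real pkg/hooks or cache APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// Cache is a stand-in for cache.Cache; its method set is assumed.
type Cache interface {
	Store(key, value string)
	Lookup(key string) (string, bool)
}

// AgentView and AgentLookup mirror the proposed seam, trimmed to
// what cache_response needs.
type AgentView interface {
	Cache() Cache
}

type AgentLookup interface {
	Agent(name string) (AgentView, error)
}

// Input is a hypothetical subset of the hook input.
type Input struct {
	AgentName   string
	Prompt      string
	LastMessage string
}

// BuiltinFunc is a hypothetical builtin signature.
type BuiltinFunc func(in Input) error

// NewCacheResponse builds the stop-event builtin. It captures only the
// lookup, so it has no dependency on the runtime type.
func NewCacheResponse(lookup AgentLookup) BuiltinFunc {
	return func(in Input) error {
		agent, err := lookup.Agent(in.AgentName)
		if err != nil {
			return err
		}
		c := agent.Cache()
		if c == nil {
			return nil // no cache configured for this agent
		}
		c.Store(in.Prompt, in.LastMessage)
		return nil
	}
}

// In-memory fakes, enough to exercise the builtin without a runtime.
type mapCache map[string]string

func (m mapCache) Store(k, v string) { m[k] = v }

func (m mapCache) Lookup(k string) (string, bool) {
	v, ok := m[k]
	return v, ok
}

type fakeAgent struct{ cache Cache }

func (a fakeAgent) Cache() Cache { return a.cache }

type fakeLookup map[string]AgentView

func (l fakeLookup) Agent(name string) (AgentView, error) {
	a, ok := l[name]
	if !ok {
		return nil, errors.New("unknown agent: " + name)
	}
	return a, nil
}

func main() {
	c := mapCache{}
	hook := NewCacheResponse(fakeLookup{"root": fakeAgent{cache: c}})
	if err := hook(Input{AgentName: "root", Prompt: "hi", LastMessage: "hello"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	v, _ := c.Lookup("hi")
	fmt.Println(v)
}
```

The fakes double as the shape of the acceptance test: the builtin is driven purely through AgentLookup, with no *LocalRuntime in sight.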

2. Move auto-injection to the builtin

applyCacheDefault currently lives in the runtime and bolts itself onto buildHooksExecutors after ApplyAgentDefaults. Once the cache builtin is in pkg/hooks/builtins, its auto-injection rule moves there too — preferably via the per-builtin AutoInject mechanism proposed in #2701 so it's not yet another central switch.
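As a rough illustration of the rule that would move, the sketch below appends a cache_response stop hook to any agent that has a cache configured. The shape of the AutoInject mechanism from #2701 is not described here, so this is not that API: AgentConfig, its fields, AutoInjectCache, and the string value of BuiltinCacheResponse are all hypothetical, and the "skip if already present" check is an added assumption rather than documented behaviour of applyCacheDefault.

```go
package main

import "fmt"

// BuiltinCacheResponse names the builtin; the string value is assumed.
const BuiltinCacheResponse = "cache_response"

// AgentConfig is a hypothetical, flattened view of an agent's hook config.
type AgentConfig struct {
	Name      string
	HasCache  bool
	StopHooks []string
}

// AutoInjectCache returns a copy of the config with a cache_response stop
// hook appended when the agent has a cache and the hook is not yet listed.
func AutoInjectCache(a AgentConfig) AgentConfig {
	if !a.HasCache {
		return a
	}
	for _, h := range a.StopHooks {
		if h == BuiltinCacheResponse {
			return a // already present; avoid registering it twice
		}
	}
	// Copy before appending so the caller's slice is never mutated.
	hooks := append([]string(nil), a.StopHooks...)
	a.StopHooks = append(hooks, BuiltinCacheResponse)
	return a
}

func main() {
	agents := []AgentConfig{
		{Name: "cached", HasCache: true, StopHooks: []string{"notify"}},
		{Name: "plain", HasCache: false},
	}
	for _, a := range agents {
		out := AutoInjectCache(a)
		fmt.Println(out.Name, out.StopHooks)
	}
}
```

Keeping the rule as a small function next to the builtin means the runtime only has to apply whatever rules the builtins package exposes, rather than knowing about caching itself.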

3. Add a pre_user_turn (or equivalent) hook event for the replay leg

The replay path returns a cached assistant message instead of calling the model. No existing hook event can do that. Two options:

a. Synthetic-response output on user_prompt_submit. Extend HookSpecificOutput with SyntheticResponse string. When set, the runtime records that string as the assistant message, fires stop, and skips the model call entirely for that turn. cache_response then has both legs in one builtin: cache write on stop, cache replay on user_prompt_submit.

b. Dedicated pre_user_turn event. New event that fires after the user message is added to the session and before any model call, whose Output.SyntheticResponse short-circuits the turn. Slightly cleaner separation than overloading user_prompt_submit, but adds a new event.

Either way, the runtime's tryReplayCachedResponse becomes generic — it just honors whatever a hook returns — and cache_response is the only code that knows about caches.
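Both options reduce to the same runtime-side contract, which could be sketched as follows. This is an assumption-laden illustration, not the existing run loop: HookOutput stands in for HookSpecificOutput with the proposed SyntheticResponse field, and PreTurnHook, ModelFunc, and RunTurn are hypothetical names. Firing stop after a synthetic response is omitted for brevity.

```go
package main

import "fmt"

// HookOutput is a hypothetical subset of HookSpecificOutput carrying the
// proposed SyntheticResponse field.
type HookOutput struct {
	SyntheticResponse string
}

// PreTurnHook runs after the user message is recorded and before any
// model call (user_prompt_submit or a dedicated pre_user_turn event).
type PreTurnHook func(agentName, prompt string) HookOutput

// ModelFunc stands in for the real model call.
type ModelFunc func(prompt string) string

// RunTurn returns the assistant message and whether the model was called.
// The first hook that sets SyntheticResponse short-circuits the turn.
func RunTurn(agentName, prompt string, hooks []PreTurnHook, model ModelFunc) (string, bool) {
	for _, h := range hooks {
		if out := h(agentName, prompt); out.SyntheticResponse != "" {
			return out.SyntheticResponse, false
		}
	}
	return model(prompt), true
}

func main() {
	cached := map[string]string{"ping": "pong"}

	// The replay leg as a hook: a cache miss yields an empty output.
	replay := func(_, prompt string) HookOutput {
		return HookOutput{SyntheticResponse: cached[prompt]}
	}
	model := func(prompt string) string { return "model: " + prompt }

	for _, p := range []string{"ping", "hello"} {
		msg, called := RunTurn("root", p, []PreTurnHook{replay}, model)
		fmt.Printf("%q -> %q (model called: %v)\n", p, msg, called)
	}
}
```

One design wrinkle this surfaces: with a plain string field, an empty cached response is indistinguishable from "no synthetic response". If that case matters, a pointer or an explicit boolean alongside the string would be needed.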

Recommendation

Land (1) first — it's the minimum to move cache_response (and unload) into pkg/hooks/builtins. Then (2) to retire applyCacheDefault. Then (3) when there's appetite to also retire tryReplayCachedResponse.

After all three, the runtime has no code that mentions caching. The cache builtin is the single source of truth for both directions of the feature.

Acceptance criteria

  • cache_response lives in pkg/hooks/builtins, not in pkg/runtime.
  • The runtime exposes a narrow AgentLookup-style seam at registration time; cache_response consumes it.
  • applyCacheDefault is removed (or moved next to the builtin).
  • (Stretch) tryReplayCachedResponse is removed in favour of a synthetic-response hook output.
  • Existing cache tests pass; a new test confirms the cache builtin works without depending on *LocalRuntime.

Metadata

Labels

  • area/agent: For work that has to do with the general agent loop/agentic features of the app
  • effort:medium: Multiple files or components, some design decisions needed
  • priority:low: Nice-to-have, can be deferred indefinitely
