Skip to content

fix: make max_iterations builtin stateless (#2698)#2708

Open
dgageot wants to merge 3 commits intodocker:mainfrom
dgageot:fix/2698-max-iterations-stateless
Open

fix: make max_iterations builtin stateless (#2698)#2708
dgageot wants to merge 3 commits intodocker:mainfrom
dgageot:fix/2698-max-iterations-stateless

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented May 7, 2026

Fixes #2698.

The max_iterations builtin needed a per-session counter map, a mutex, and a session_end cleanup hook to do its job. The runtime already tracks the loop iteration counter — there's no reason the builtin needs its own.

What changed

Surfaces the loop iteration counter to hooks via a new hooks.Input.Iteration field, then rewrites the builtin as a pure stateless function. With no state to clean up, the *builtins.State wrapper around the snapshot tracker becomes pure indirection too — promoted *Snapshots to the package's public type.

Split into two separately-buildable commits so the cleanup can be reverted independently if undesired:

Commit Scope
make max_iterations builtin stateless The actual fix for #2698 — adds Input.Iteration, plumbs it through runStreamLooprunTurnexecuteBeforeLLMCallHooks, rewrites the builtin as a pure function, drops the per-session counter / mutex / ClearSession plumbing
drop builtins.State wrapper, expose Snapshots directly With max_iterations gone stateless, State was a single-field struct with three pure pass-throughs around *Snapshots. Exposed *Snapshots directly; Register returns it.
docs: add godoc on Snapshots type Doc-only follow-up.

Behaviour

Identical to the old stateful builtin: with max_iterations: ["N"], exactly N model calls succeed before the gate trips on call N+1. Lenient on missing/invalid args (no-op rather than instant trip).

This is still a hard stop — distinct from the agent.MaxIterations flag, which has its own UX (MaxIterationsReachedEvent, the resume dialog, the on_max_iterations hook).

Tests

  • Unit tests in pkg/hooks/builtins/max_iterations_test.go exercise the pure function across the iteration range and lenient-args contract.
  • Real e2e tests in pkg/runtime/before_llm_call_test.go (TestMaxIterationsBuiltin_TripsAfterConfiguredLimit, TestMaxIterationsBuiltin_NoOpOnInvalidLimit) stand up an agent whose model issues a tool call on every iteration, wire the builtin in via YAML config, and assert the loop hard-stops at exactly the configured budget.
  • Extended TestBeforeLLMCallHookFiresOncePerLoopIteration to also pin Input.Iteration == 1 on the first model call.

dgageot added 3 commits May 7, 2026 17:46
The runtime now passes the loop iteration via hooks.Input.Iteration,
so the builtin no longer needs a per-session map, mutex, or
session_end cleanup. Fixes docker#2698.
With max_iterations gone stateless, State was a single-field struct
with three pure pass-through methods around the snapshot tracker.
Promote *Snapshots to the package's public type and have Register
return it directly — one fewer layer between the runtime and the
checkpoint history.
@dgageot dgageot requested a review from a team as a code owner May 7, 2026 18:12
Copy link
Copy Markdown
Member

@rumpl rumpl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more state, yay

Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

PR makes max_iterations stateless by plumbing Input.Iteration through the run loop. The stateless rewrite is clean and the tests are comprehensive. One medium-severity API-safety concern is flagged below; two low-severity observations are noted in the summary.

Low (not posted as inline comments):

  • pkg/runtime/loop.go: The existing enforceMaxIterations flag (0-based, pre-increment) and the new Input.Iteration field (1-based, post-increment) use different counting conventions. Users who configure both simultaneously may see surprising off-by-one behavior — worth a comment near the call sites.
  • pkg/runtime/before_llm_call_test.go (TestMaxIterationsBuiltin_TripsAfterConfiguredLimit): The residual-stream assertion callsMade := 5 - len(prov.streams) assumes queueProvider shrinks prov.streams in place. If queueProvider uses an index instead of shrinking the slice, len(prov.streams) is always 5 and the assertion silently passes for any call count. Worth verifying the provider's pop strategy.

r.RegisterBuiltin(Snapshot, snapshots.Hook),
r.RegisterBuiltin(RedactSecrets, redactSecrets),
); err != nil {
return nil, err
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*[MEDIUM] Register returns a nil Snapshots on error — callers that use the pointer without a nil-guard risk a panic

Register now returns *Snapshots (previously *State). Both old and new code correctly return nil, err on failure, and the only production caller (runtime.go) checks the error before use:

snapshots, err := builtins.Register(hooksRegistry)
if err != nil {
    return nil, fmt.Errorf("register builtin hooks: %w", err)
}

However, the public API contract has subtly changed: *State was an opaque wrapper whose nil-receiver methods were all explicitly nil-safe (e.g. ClearSession had if s == nil { return }). *Snapshots is now a promoted public type whose methods do not carry nil-receiver guards. Any future caller that stores snapshots before checking err — or any test helper that calls Register and ignores the error — will receive a nil pointer that panics on the first method call rather than degrading gracefully.

Consider adding a nil-guard at the top of Snapshots' public methods (e.g. UndoLastSnapshot, ListSnapshots, ResetSnapshot), or documenting that a nil *Snapshots must never be used, to preserve the safety property the old *State wrapper provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

max_iterations builtin hook: drop the per-session counter map

3 participants