Python: [BREAKING] Integrate looping into HarnessAgent #6607

Open
westey-m wants to merge 2 commits into microsoft:main from westey-m:python-loop-harness-integrate

Conversation

@westey-m

Contributor

Motivation & Context

The core AgentLoopMiddleware makes it possible to re-invoke an agent until configurable criteria are met. This PR builds on that to make looping a first-class, opt-in capability of the harness, delivering the "loop until completion" enforcement scenario for harnesses.

It also adds shared todos_remaining / todos_remaining_message helpers so a harness can keep working autonomously until every todo item is complete — optionally scoped to specific agent modes — which is exactly the kind of completion-enforcement behaviour the harness experience needs. This is the Python counterpart of the .NET work in #6544.

Description & Review Guide

  • What are the major changes?

    • create_harness_agent gains loop_should_continue, loop_next_message, and loop_max_iterations keyword arguments. When loop_should_continue is supplied, the harness is wrapped in an AgentLoopMiddleware wired as the outermost middleware (each iteration is a full agent run, including tool approval); when it is None, behaviour is unchanged.
    • Approval escape hatch: if a loop iteration returns a pending tool-approval request, the loop stops and returns it to the caller before evaluating should_continue or injecting next_message, so looping is HITL-safe even when wrapped around ToolApprovalMiddleware.
    • New shared helpers in _loop.py, exported from the package: todos_remaining(*, modes=None) resolves the TodoProvider (and AgentModeProvider when modes is set) from agent.context_providers and loops while incomplete todos remain (modes=None applies in all modes; a non-empty sequence gates by mode, case-insensitively; an empty sequence raises ValueError). todos_remaining_message is a next_message callable that lists the still-open todos. These merge and replace the former provider-argument helper.
    • The harness_research.py sample demonstrates the loop scoped to "execute" mode with a loop_max_iterations safety cap; the two agent_loop_middleware_* samples switch to the no-argument todos_remaining().
    • Tests and AGENTS.md updated; conforms to the new split type-checker setup (pyright on source; pyright/mypy/pyrefly/ty/zuban on tests/samples).
  • What is the impact of these changes?

    • Looping is opt-in and additive for create_harness_agent (no loop unless loop_should_continue is set).
    • Breaking: the public todos_remaining(provider) signature is removed and replaced by todos_remaining(*, modes=None), which resolves the provider from the running agent's context_providers instead of taking it as an argument. Callers passing a provider must update to the new form.
  • What do you want reviewers to focus on?

    • The middleware ordering (loop outermost vs. tool approval) and the approval escape-hatch semantics.
    • The merged todos_remaining API shape and its mode-gating behaviour.
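The loop and approval escape-hatch semantics described above can be sketched in plain Python. This is a minimal illustration only, not the actual AgentLoopMiddleware code: `RunResult`, `run_with_loop`, and the choice to reuse the previous message when no `next_message` is given are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RunResult:
    """Hypothetical stand-in for the response of one full agent run."""

    messages: list[str] = field(default_factory=list)
    pending_approval: bool = False  # True when a tool call awaits human approval


def run_with_loop(
    run_once: Callable[[str], RunResult],
    first_message: str,
    should_continue: Callable[[RunResult], bool],
    next_message: Optional[Callable[[RunResult], str]] = None,
    max_iterations: int = 10,
) -> RunResult:
    """Re-invoke run_once until should_continue says stop or the cap is hit."""
    aggregated = RunResult()
    message = first_message
    for _ in range(max_iterations):
        result = run_once(message)
        aggregated.messages.extend(result.messages)
        # Escape hatch: a pending approval is returned to the caller before
        # should_continue is evaluated or a follow-up message is injected.
        if result.pending_approval:
            aggregated.pending_approval = True
            break
        if not should_continue(result):
            break
        # Assumption of this sketch: reuse the last message if no callable is set.
        if next_message is not None:
            message = next_message(result)
    return aggregated
```

Because the approval check runs first, a caller that handles the approval and calls again resumes with a fresh loop; `max_iterations` bounds the work even if `should_continue` never returns False.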

Related Issue

Fixes #6478

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

Copilot AI review requested due to automatic review settings June 18, 2026 17:55
@moonbox3 added the documentation, python, and breaking change labels Jun 18, 2026
@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

Python Test Coverage

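Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| packages/core/agent_framework/_harness/_agent.py | 107 | 4 | 96% | 161, 482–483, 485 |
| packages/core/agent_framework/_harness/_loop.py | 267 | 7 | 97% | 480, 488, 563, 637, 680, 755, 869 |
| TOTAL | 39904 | 4503 | 88% | |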

Python Unit Test Overview

Tests Skipped Failures Errors Time
8004 34 💤 0 ❌ 0 🔥 2m 9s ⏱️

Copilot AI left a comment

Contributor

Pull request overview

This PR makes looping an opt-in, first-class capability of the Python harness by wiring AgentLoopMiddleware as the outermost harness middleware when configured, and adds new provider-resolving todo-loop helpers to support “work until all todos are complete” scenarios (including optional mode-gating). It also adds an approval “escape hatch” so looping stops immediately on pending tool-approval requests and returns control to the caller.

Changes:

  • Add loop_should_continue, loop_next_message, and loop_max_iterations to create_harness_agent, wiring the loop outermost (ahead of tool approval) when enabled.
  • Replace the old provider-argument todo loop helper with todos_remaining(*, modes=None) and add todos_remaining_message, both resolving providers from agent.context_providers.
  • Update samples/docs and add/adjust tests for harness loop wiring, mode-gating behavior, and approval escape hatch semantics.
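As a rough illustration of the mode-gating rules for `todos_remaining(*, modes=None)`, the sketch below builds a predicate from an optional list of modes. It is not the helper from this PR: `make_todos_remaining`, the dict-based todo shape, and passing the current mode as an argument are hypothetical simplifications (the real helper resolves providers from `agent.context_providers`).

```python
from typing import Callable, Optional, Sequence


def make_todos_remaining(
    modes: Optional[Sequence[str]] = None,
) -> Callable[[Sequence[dict], Optional[str]], bool]:
    """Build a should-continue predicate, optionally gated by agent mode."""
    if modes is not None and len(modes) == 0:
        # An empty sequence is rejected rather than treated as "never loop".
        raise ValueError("modes must be None or a non-empty sequence")
    # The case-insensitive lookup set is built once, at factory time.
    allowed = None if modes is None else frozenset(m.casefold() for m in modes)

    def should_continue(todos: Sequence[dict], current_mode: Optional[str]) -> bool:
        if allowed is not None:
            # Outside the listed modes the loop never continues.
            if current_mode is None or current_mode.casefold() not in allowed:
                return False
        # Continue only while at least one todo is still incomplete.
        return any(not todo["done"] for todo in todos)

    return should_continue
```

With `modes=None` the predicate depends only on the todo list; with `modes=["execute"]` it also requires the current mode to match, ignoring case.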

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| python/samples/02-agents/middleware/agent_loop_middleware_todos.py | Updates todo loop sample to use the new no-argument todos_remaining() helper. |
| python/samples/02-agents/middleware/agent_loop_middleware_report.py | Updates composed loop sample to use the new no-argument todos_remaining() helper. |
| python/samples/02-agents/harness/README.md | Documents harness looping as an available harness feature and clarifies the harness research sample description. |
| python/samples/02-agents/harness/harness_research.py | Demonstrates harness looping with mode-gated todo completion and a max-iteration safety cap. |
| python/packages/core/tests/core/test_harness_loop.py | Replaces old todo helper tests with new provider-resolving todo helper and approval escape hatch tests. |
| python/packages/core/tests/core/test_harness_agent.py | Adds tests asserting loop middleware is wired only when configured and that it is outermost relative to tool approval and user middleware. |
| python/packages/core/AGENTS.md | Updates public docs to describe new todo-loop helpers and harness integration. |
| python/packages/core/agent_framework/_harness/_loop.py | Adds approval escape hatch detection and implements the new todos_remaining / todos_remaining_message helpers. |
| python/packages/core/agent_framework/_harness/_agent.py | Extends create_harness_agent to accept loop params and wire AgentLoopMiddleware outermost when enabled. |
| python/packages/core/agent_framework/__init__.py | Exports todos_remaining_message at the package level. |

Comment thread python/packages/core/agent_framework/_harness/_loop.py Outdated
Comment thread python/packages/core/AGENTS.md Outdated

@github-actions github-actions Bot left a comment

Contributor

Automated Code Review

Reviewers: 4 | Confidence: 90% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes


Automated review by westey-m's agents

@westey-m westey-m marked this pull request as ready for review June 18, 2026 18:09

@github-actions github-actions Bot left a comment

Contributor

Automated Code Review

Reviewers: 5 | Confidence: 90%

✓ Correctness

The PR correctly integrates looping into HarnessAgent with proper escape-hatch semantics, middleware ordering, and a well-designed todos_remaining/todos_remaining_message API. The approval escape hatch correctly fires before should_continue evaluation in both streaming and non-streaming paths. The _resolve_context_provider helper safely resolves via getattr(agent, 'context_providers', []). Mode gating uses case-insensitive set matching built at factory time. Middleware insertion with insert(0, ...) correctly positions the loop outermost. All paths are covered by comprehensive tests.
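To make the `insert(0, ...)` ordering point concrete, here is a small self-contained sketch of a middleware list where index 0 ends up outermost. The `tag` and `build_pipeline` helpers are hypothetical and only model the ordering; they assume, as the review states, that the first entry in the list wraps all the others.

```python
from typing import Callable

Handler = Callable[[str], str]
Middleware = Callable[[Handler], Handler]


def tag(name: str) -> Middleware:
    """Hypothetical middleware that records its name around the inner call."""

    def middleware(inner: Handler) -> Handler:
        def handler(request: str) -> str:
            return f"{name}({inner(request)})"

        return handler

    return middleware


def build_pipeline(middlewares: list[Middleware], core: Handler) -> Handler:
    """Compose middlewares so that index 0 is the outermost wrapper."""
    handler = core
    # Wrap from the last entry inward, which leaves the first entry outermost.
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


# Existing middleware: tool approval first, then user-supplied middleware.
middlewares = [tag("approval"), tag("user")]
# Inserting the loop at index 0 makes it wrap everything else.
middlewares.insert(0, tag("loop"))

pipeline = build_pipeline(middlewares, lambda request: request)
trace = pipeline("run")  # "loop(approval(user(run)))"
```

Under this composition each loop iteration passes through approval and user middleware again, which matches the "each iteration is a full agent run" description.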

✓ Security Reliability

This PR integrates looping into the harness agent with a well-designed approval escape hatch, provider resolution via context_providers, and mode-gating. The implementation is security-sound: the escape hatch correctly stops before evaluating should_continue/max_iterations (preventing HITL bypass), the loop has a safe default cap (DEFAULT_MAX_ITERATIONS=10), input validation rejects empty modes, and provider resolution is defensive (getattr with default). No injection risks, resource leaks, or unhandled failure modes were identified.

✓ Test Coverage

Test coverage for the new loop integration is thorough: harness wiring (7 tests), todos_remaining/todos_remaining_message helpers (6 tests), and the approval escape hatch (4 tests) are all well-covered with meaningful assertions. One branch in todos_remaining — the fallback when modes is specified but no AgentModeProvider is registered — is not exercised by any test.

✓ Failure Modes

The PR is well-implemented. The approval escape hatch correctly preserves the response in both streaming (holder['final'] set before the check) and non-streaming (aggregated includes the approval messages before break) paths. The todos_remaining/todos_remaining_message helpers properly guard against missing session/agent/provider by returning False/None. Provider resolution via _resolve_context_provider safely uses getattr with default [] and next() with default None. No silent failures, lost errors, or stale state issues were found.

✗ Design Approach

I found one design-level correctness issue in the new todo-loop helper API: the provider lookup strategy does not match the newly documented "works with any agent that registers a TodoProvider via context_providers" contract when more than one matching provider is present.

Flagged Issues

  • todos_remaining() can silently ignore a caller-supplied TodoProvider when used with create_harness_agent(..., context_providers=[...]), because the harness appends extra providers after its built-ins (_harness/_agent.py:150-175) but _resolve_context_provider() always returns the first match (_harness/_loop.py:816-818). The helper needs to disambiguate duplicate providers or reject them instead of always taking the first match.
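The difference between first-match resolution and rejecting duplicates can be sketched as follows. This is an illustration of the flagged behaviour and one possible fix, not the code in `_loop.py`: `FakeTodoProvider`, `FakeAgent`, `resolve_first`, and `resolve_unique` are hypothetical names, and only the `getattr(..., 'context_providers', [])` plus `next(..., None)` pattern is taken from the review text.

```python
from typing import Any, Optional, TypeVar

T = TypeVar("T")


class FakeTodoProvider:
    """Hypothetical provider; label tells built-in and caller-supplied apart."""

    def __init__(self, label: str) -> None:
        self.label = label


class FakeAgent:
    """Hypothetical agent exposing a context_providers list."""

    def __init__(self, context_providers: list[Any]) -> None:
        self.context_providers = context_providers


def resolve_first(agent: Any, provider_type: type[T]) -> Optional[T]:
    """First-match lookup: later providers of the same type are ignored."""
    providers = getattr(agent, "context_providers", [])
    return next((p for p in providers if isinstance(p, provider_type)), None)


def resolve_unique(agent: Any, provider_type: type[T]) -> Optional[T]:
    """Alternative lookup that rejects ambiguous registrations."""
    providers = getattr(agent, "context_providers", [])
    matches = [p for p in providers if isinstance(p, provider_type)]
    if len(matches) > 1:
        raise ValueError(
            f"Expected at most one {provider_type.__name__}, found {len(matches)}"
        )
    return matches[0] if matches else None


# Built-in provider first, caller-supplied provider appended after it.
agent = FakeAgent([FakeTodoProvider("built-in"), FakeTodoProvider("caller")])
picked = resolve_first(agent, FakeTodoProvider)  # silently picks the built-in
```

Raising on duplicates is only one option; preferring the last match or accepting an explicit provider argument would also remove the ambiguity, each with different trade-offs for callers.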

Suggestions

  • Consider adding a test for todos_remaining(modes=[...]) when the agent has a TodoProvider but no AgentModeProvider — this exercises the else-branch fallback at _loop.py:869 (get_agent_mode(session) without provider-specific config).

Automated review by westey-m's agents

@github-actions

Contributor

Flagged issue

todos_remaining() can silently ignore a caller-supplied TodoProvider when used with create_harness_agent(..., context_providers=[...]), because the harness appends extra providers after its built-ins (_harness/_agent.py:150-175) but _resolve_context_provider() always returns the first match (_harness/_loop.py:816-818). The helper needs to disambiguate duplicate providers or reject them instead of always taking the first match.


Source: automated DevFlow PR review

Labels

breaking change, documentation, python

Development

Successfully merging this pull request may close these issues.

.NET: Integrate Looping into HarnessAgent

3 participants