Python: [BREAKING] Integrate looping into HarnessAgent by westey-m · Pull Request #6607 · microsoft/agent-framework

westey-m · 2026-06-18T17:55:37Z

Motivation & Context

The core AgentLoopMiddleware makes it possible to re-invoke an agent until configurable criteria are met. This PR builds on that to make looping a first-class, opt-in capability of the harness, delivering the "loop until completion" enforcement scenario for harnesses.

It also adds shared todos_remaining / todos_remaining_message helpers so a harness can keep working autonomously until every todo item is complete — optionally scoped to specific agent modes — which is exactly the kind of completion-enforcement behaviour the harness experience needs. This is the Python counterpart of the .NET work in #6544.

Description & Review Guide

What are the major changes?
- create_harness_agent gains loop_should_continue, loop_next_message, and loop_max_iterations keyword arguments. When loop_should_continue is supplied, the harness is wrapped in an AgentLoopMiddleware wired as the outermost middleware (each iteration is a full agent run, including tool approval); when it is None, behaviour is unchanged.
- Approval escape hatch: if a loop iteration returns a pending tool-approval request, the loop stops and returns it to the caller before evaluating should_continue or injecting next_message, so looping is HITL-safe even when wrapped around ToolApprovalMiddleware.
- New shared helpers in _loop.py, exported from the package: todos_remaining(*, modes=None) resolves the TodoProvider (and AgentModeProvider when modes is set) from agent.context_providers and loops while incomplete todos remain (modes=None applies in all modes; a non-empty sequence gates by mode, case-insensitively; an empty sequence raises ValueError). todos_remaining_message is a next_message callable that lists the still-open todos. These merge and replace the former provider-argument helper.
- The harness_research.py sample demonstrates the loop scoped to "execute" mode with a loop_max_iterations safety cap; the two agent_loop_middleware_* samples switch to the no-argument todos_remaining().
- Tests and AGENTS.md updated; conforms to the new split type-checker setup (pyright on source; pyright/mypy/pyrefly/ty/zuban on tests/samples).
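
The loop and approval escape hatch described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the actual AgentLoopMiddleware code: `Response`, `run_with_loop`, and the callback signatures are hypothetical stand-ins, and the real middleware is presumably asynchronous and operates on the framework's own message and response types.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    """Hypothetical stand-in for the result of one full agent run."""
    text: str
    pending_approval: bool = False  # True when a tool call awaits human approval


def run_with_loop(
    run_once: Callable[[str], Response],
    first_message: str,
    should_continue: Callable[[Response], bool],
    next_message: Optional[Callable[[Response], str]] = None,
    max_iterations: int = 10,
) -> list[Response]:
    """Re-invoke run_once until should_continue is False or the cap is reached."""
    responses: list[Response] = []
    message = first_message
    for _ in range(max_iterations):
        response = run_once(message)
        responses.append(response)
        # Escape hatch: return a pending approval to the caller first,
        # before should_continue or next_message are consulted.
        if response.pending_approval:
            break
        if not should_continue(response):
            break
        # Assumption: without next_message the previous message is reused.
        if next_message is not None:
            message = next_message(response)
    return responses
```

In this sketch the approval check comes first inside the loop body, which is what keeps a human-in-the-loop request from being swallowed by a further iteration.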
What is the impact of these changes?
- Looping is opt-in and additive for create_harness_agent (no loop unless loop_should_continue is set).
- Breaking: the public todos_remaining(provider) signature is removed and replaced by todos_remaining(*, modes=None), which resolves the provider from the running agent's context_providers instead of taking it as an argument. Callers passing a provider must update to the new form.
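
To illustrate the new API shape, here is a hedged sketch of a keyword-only factory that resolves its provider from the agent instead of taking it as an argument. `FakeTodoProvider`, `FakeAgent`, and `todos_remaining_sketch` are hypothetical names for illustration only. The real helper reads the current mode through an AgentModeProvider and receives the framework's own callback arguments, both of which are simplified away here.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence


@dataclass
class FakeTodoProvider:
    """Hypothetical stand-in for TodoProvider: maps todo title to a done flag."""
    todos: dict[str, bool] = field(default_factory=dict)


@dataclass
class FakeAgent:
    """Hypothetical agent exposing context_providers and a current mode."""
    context_providers: list = field(default_factory=list)
    mode: str = "plan"


def todos_remaining_sketch(
    *, modes: Optional[Sequence[str]] = None
) -> Callable[[FakeAgent], bool]:
    """Build a should-continue predicate, optionally gated by agent mode."""
    if modes is not None and len(modes) == 0:
        raise ValueError("modes must be None or a non-empty sequence")
    # The lower-cased set is built once, when the factory is called.
    allowed = None if modes is None else {m.lower() for m in modes}

    def should_continue(agent: FakeAgent) -> bool:
        provider = next(
            (p for p in getattr(agent, "context_providers", [])
             if isinstance(p, FakeTodoProvider)),
            None,
        )
        if provider is None:
            return False  # no todo provider registered: nothing to loop on
        if allowed is not None and agent.mode.lower() not in allowed:
            return False  # mode gate: loop only in the listed modes
        return any(not done for done in provider.todos.values())

    return should_continue
```

Under this shape, a caller that previously wrote `todos_remaining(provider)` would switch to `todos_remaining()` or `todos_remaining(modes=[...])` and rely on the provider being registered on the agent.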
What do you want reviewers to focus on?
- The middleware ordering (loop outermost vs. tool approval) and the approval escape-hatch semantics.
- The merged todos_remaining API shape and its mode-gating behaviour.

Related Issue

Fixes #6478

Contribution Checklist

The code builds clean without any errors or warnings
All unit tests pass, and I have added new tests where possible
The PR follows the Contribution Guidelines
This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

github-actions · 2026-06-18T17:59:54Z

Python Test Coverage Report

File	Stmts	Miss	Cover	Missing
packages/core/agent_framework/_harness
_agent.py	107	4	96%	161, 482–483, 485
_loop.py	267	7	97%	480, 488, 563, 637, 680, 755, 869
TOTAL	39904	4503	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
8004	34 💤	0 ❌	0 🔥	2m 9s ⏱️

Copilot

Pull request overview

This PR makes looping an opt-in, first-class capability of the Python harness by wiring AgentLoopMiddleware as the outermost harness middleware when configured, and adds new provider-resolving todo-loop helpers to support “work until all todos are complete” scenarios (including optional mode-gating). It also adds an approval “escape hatch” so looping stops immediately on pending tool-approval requests and returns control to the caller.

Changes:

- Add loop_should_continue, loop_next_message, and loop_max_iterations to create_harness_agent, wiring the loop outermost (ahead of tool approval) when enabled.
- Replace the old provider-argument todo loop helper with todos_remaining(*, modes=None) and add todos_remaining_message, both resolving providers from agent.context_providers.
- Update samples/docs and add/adjust tests for harness loop wiring, mode-gating behavior, and approval escape hatch semantics.
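
As a rough illustration of what a next_message callable that lists open todos might produce, consider the sketch below. The function name `open_todos_message`, the dict-based todo representation, and the message wording are all assumptions for illustration. Returning None when nothing is open is also an assumption, not confirmed behaviour of todos_remaining_message.

```python
from typing import Optional


def open_todos_message(todos: dict[str, bool]) -> Optional[str]:
    """Return a follow-up prompt listing unfinished todos, or None if all are done."""
    open_items = [title for title, done in todos.items() if not done]
    if not open_items:
        return None
    bullet_list = "\n".join(f"- {title}" for title in open_items)
    return f"These todos are still open:\n{bullet_list}\nPlease continue working on them."
```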

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

File	Description
python/samples/02-agents/middleware/agent_loop_middleware_todos.py	Updates todo loop sample to use the new no-argument `todos_remaining()` helper.
python/samples/02-agents/middleware/agent_loop_middleware_report.py	Updates composed loop sample to use the new no-argument `todos_remaining()` helper.
python/samples/02-agents/harness/README.md	Documents harness looping as an available harness feature and clarifies the harness research sample description.
python/samples/02-agents/harness/harness_research.py	Demonstrates harness looping with mode-gated todo completion and a max-iteration safety cap.
python/packages/core/tests/core/test_harness_loop.py	Replaces old todo helper tests with new provider-resolving todo helper and approval escape hatch tests.
python/packages/core/tests/core/test_harness_agent.py	Adds tests asserting loop middleware is wired only when configured and that it is outermost relative to tool approval and user middleware.
python/packages/core/AGENTS.md	Updates public docs to describe new todo-loop helpers and harness integration.
python/packages/core/agent_framework/_harness/_loop.py	Adds approval escape hatch detection and implements the new `todos_remaining` / `todos_remaining_message` helpers.
python/packages/core/agent_framework/_harness/_agent.py	Extends `create_harness_agent` to accept loop params and wire `AgentLoopMiddleware` outermost when enabled.
python/packages/core/agent_framework/__init__.py	Exports `todos_remaining_message` at the package level.

github-actions

Automated Code Review

Reviewers: 4 | Confidence: 90% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes

Automated review by westey-m's agents

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 90%

✓ Correctness

The PR correctly integrates looping into HarnessAgent with proper escape-hatch semantics, middleware ordering, and a well-designed todos_remaining/todos_remaining_message API. The approval escape hatch correctly fires before should_continue evaluation in both streaming and non-streaming paths. The _resolve_context_provider helper safely resolves via getattr(agent, 'context_providers', []). Mode gating uses case-insensitive set matching built at factory time. Middleware insertion with insert(0, ...) correctly positions the loop outermost. All paths are covered by comprehensive tests.
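
The effect of `insert(0, ...)` on ordering can be shown with a small, framework-independent sketch. It assumes the common convention that the first entry in a middleware list is the outermost wrapper. `build_pipeline` and `make_tracer` are hypothetical helpers, not agent_framework functions.

```python
from typing import Callable

Handler = Callable[[str], str]
Middleware = Callable[[str, Handler], str]


def build_pipeline(middleware: list[Middleware], terminal: Handler) -> Handler:
    """Compose handlers so that middleware[0] ends up as the outermost wrapper."""
    handler = terminal
    for mw in reversed(middleware):
        # Bind mw and the current handler now to avoid late-binding surprises.
        handler = (lambda m, nxt: (lambda msg: m(msg, nxt)))(mw, handler)
    return handler


def make_tracer(name: str, trace: list[str], repeats: int = 1) -> Middleware:
    """Middleware that records entry and exit, calling the inner handler `repeats` times."""
    def mw(msg: str, nxt: Handler) -> str:
        result = ""
        for _ in range(repeats):
            trace.append(f"{name}:enter")
            result = nxt(msg)
            trace.append(f"{name}:exit")
        return result
    return mw


trace: list[str] = []
stack = [make_tracer("approval", trace)]
stack.insert(0, make_tracer("loop", trace, repeats=2))  # loop becomes outermost
pipeline = build_pipeline(stack, lambda msg: msg.upper())
result = pipeline("hi")
```

Because the loop sits outside the approval layer in this sketch, every iteration passes through approval again, which mirrors the "each iteration is a full agent run, including tool approval" description.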

✓ Security Reliability

This PR integrates looping into the harness agent with a well-designed approval escape hatch, provider resolution via context_providers, and mode-gating. The implementation is security-sound: the escape hatch correctly stops before evaluating should_continue/max_iterations (preventing HITL bypass), the loop has a safe default cap (DEFAULT_MAX_ITERATIONS=10), input validation rejects empty modes, and provider resolution is defensive (getattr with default). No injection risks, resource leaks, or unhandled failure modes were identified.

✓ Test Coverage

Test coverage for the new loop integration is thorough: harness wiring (7 tests), todos_remaining/todos_remaining_message helpers (6 tests), and the approval escape hatch (4 tests) are all well-covered with meaningful assertions. One branch in todos_remaining — the fallback when modes is specified but no AgentModeProvider is registered — is not exercised by any test.

✓ Failure Modes

The PR is well-implemented. The approval escape hatch correctly preserves the response in both streaming (holder['final'] set before the check) and non-streaming (aggregated includes the approval messages before break) paths. The todos_remaining/todos_remaining_message helpers properly guard against missing session/agent/provider by returning False/None. Provider resolution via _resolve_context_provider safely uses getattr with default [] and next() with default None. No silent failures, lost errors, or stale state issues were found.

✗ Design Approach

I found one design-level correctness issue in the new todo-loop helper API: the provider lookup strategy does not match the newly documented "works with any agent that registers a TodoProvider via context_providers" contract when more than one matching provider is present.

Flagged Issues

todos_remaining() can silently ignore a caller-supplied TodoProvider when used with create_harness_agent(..., context_providers=[...]), because the harness appends extra providers after its built-ins (_harness/_agent.py:150-175) but _resolve_context_provider() always returns the first match (_harness/_loop.py:816-818). The helper needs to disambiguate duplicate providers or reject them instead of always taking the first match.
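
The shadowing problem described here can be reproduced in isolation. The sketch below contrasts a first-match resolver with a stricter variant that rejects duplicates. `resolve_first`, `resolve_unique`, and `TodoProviderStub` are hypothetical names, and rejecting duplicates is only one of the remedies the review mentions, not necessarily the fix adopted in the PR.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_first(providers: list[object], kind: type[T]) -> Optional[T]:
    """First-match lookup: later providers of the same type are silently ignored."""
    return next((p for p in providers if isinstance(p, kind)), None)


def resolve_unique(providers: list[object], kind: type[T]) -> Optional[T]:
    """Stricter lookup: refuse to guess when more than one provider matches."""
    matches = [p for p in providers if isinstance(p, kind)]
    if len(matches) > 1:
        raise ValueError(
            f"expected at most one {kind.__name__}, found {len(matches)}"
        )
    return matches[0] if matches else None


class TodoProviderStub:
    """Hypothetical stand-in for a TodoProvider."""
    def __init__(self, label: str) -> None:
        self.label = label


built_in = TodoProviderStub("built-in")
custom = TodoProviderStub("caller-supplied")
# Harness built-ins come first; caller-supplied providers are appended after them.
providers: list[object] = [built_in, custom]
```

With this list, `resolve_first` returns the built-in provider and never sees the caller-supplied one, while `resolve_unique` surfaces the ambiguity as an error.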

Suggestions

Consider adding a test for todos_remaining(modes=[...]) when the agent has a TodoProvider but no AgentModeProvider — this exercises the else-branch fallback at _loop.py:869 (get_agent_mode(session) without provider-specific config).

Automated review by westey-m's agents

github-actions · 2026-06-18T18:16:29Z

Flagged issue

todos_remaining() can silently ignore a caller-supplied TodoProvider when used with create_harness_agent(..., context_providers=[...]), because the harness appends extra providers after its built-ins (_harness/_agent.py:150-175) but _resolve_context_provider() always returns the first match (_harness/_loop.py:816-818). The helper needs to disambiguate duplicate providers or reject them instead of always taking the first match.

Source: automated DevFlow PR review

Integrate looping into harness

2a13790

Copilot AI review requested due to automatic review settings June 18, 2026 17:55

Copilot started reviewing on behalf of westey-m June 18, 2026 17:56

moonbox3 added the documentation, python, and breaking change labels Jun 18, 2026

Copilot AI reviewed Jun 18, 2026

Comment thread python/packages/core/agent_framework/_harness/_loop.py Outdated

Comment thread python/packages/core/AGENTS.md Outdated

github-actions Bot reviewed Jun 18, 2026

Address PR comments

eb9de4e

westey-m marked this pull request as ready for review June 18, 2026 18:09

github-actions Bot reviewed Jun 18, 2026

westey-m wants to merge 2 commits into microsoft:main from westey-m:python-loop-harness-integrate
