
Python: Add hosting core and Responses channel #6580

Open
eavanvalkenburg wants to merge 6 commits into microsoft:main from eavanvalkenburg:eavan/python-hosting-core-responses

Conversation

@eavanvalkenburg eavanvalkenburg commented Jun 18, 2026

Member

Motivation & Context

Adds the initial Python hosting implementation for exposing one Agent Framework agent or workflow through a shared host and the OpenAI Responses-shaped channel.

This is the first implementation slice after the hosting/channel ADRs: it introduces the channel-neutral host package, the Responses channel package, and local samples that show the run-hook and workflow-checkpoint seams without pulling in later channel packages.

Description & Review Guide

  • What are the major changes?
    • Adds agent-framework-hosting with AgentFrameworkHost, channel contribution primitives, host-owned invocation hooks, explicit session continuity via ChannelSession(isolation_key=...), workflow checkpoint wiring, and Foundry isolation middleware.
    • Adds agent-framework-hosting-responses, a channel package that maps Responses-style requests/streams onto the host and renders Responses-compatible output.
    • Adds two local samples under samples/04-hosting/af-hosting: a minimal Responses agent sample and a Responses-hosted workflow sample with structured intake/checkpoints.
  • What is the impact of these changes?
    • Applications can expose an agent or workflow through the Responses protocol using one shared host instead of hand-composing Starlette routes and lifecycle handling.
    • The base host remains channel-neutral; later protocol channels can land in separate PRs.
  • What do you want reviewers to focus on?
    • Whether the base host and Responses channel boundary is clear enough for future channel packages.
    • Whether the local samples are useful as the first learning path for run hooks, workflow targets, and checkpointing.

Related Issue

Fixes #6585
Refs #6265

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 18, 2026 05:56
@moonbox3 moonbox3 added documentation Improvements or additions to documentation python Issues related to the Python codebase labels Jun 18, 2026

Copilot AI left a comment

Contributor

Pull request overview

Adds the first Python “hosting core + Responses channel” slice to the Agent Framework Python monorepo, enabling a channel-neutral Starlette host (agent-framework-hosting) and an OpenAI Responses-shaped HTTP/SSE channel (agent-framework-hosting-responses), plus local samples demonstrating run hooks and workflow checkpointing.

Changes:

  • Introduces AgentFrameworkHost + channel contribution primitives, session continuity, optional disk-backed session-alias persistence, and workflow checkpoint wiring.
  • Adds ResponsesChannel that parses Responses requests, invokes the host, and renders Responses-compatible JSON/SSE outputs.
  • Adds samples and a comprehensive unit test suite covering isolation middleware, host behavior (agent/workflow), checkpointing, and Responses parsing/channel behavior.

Reviewed changes

Copilot reviewed 35 out of 39 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| python/uv.lock | Adds new workspace packages + deps |
| python/pyproject.toml | Registers hosting packages in workspace + pyright envs |
| python/PACKAGE_STATUS.md | Marks new packages as alpha |
| python/samples/04-hosting/af-hosting/README.md | Top-level hosting samples guide |
| python/samples/04-hosting/af-hosting/local_responses/README.md | Minimal Responses sample instructions |
| python/samples/04-hosting/af-hosting/local_responses/pyproject.toml | Sample deps + uv sources wiring |
| python/samples/04-hosting/af-hosting/local_responses/call_server.py | Local OpenAI SDK client for sample |
| python/samples/04-hosting/af-hosting/local_responses/app.py | Responses-hosted agent sample app |
| python/samples/04-hosting/af-hosting/local_responses_workflow/storage/checkpoints/.gitkeep | Keeps checkpoints dir in repo |
| python/samples/04-hosting/af-hosting/local_responses_workflow/README.md | Workflow + checkpoints sample guide |
| python/samples/04-hosting/af-hosting/local_responses_workflow/pyproject.toml | Workflow sample deps + uv sources |
| python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.rest | REST examples for Responses endpoint |
| python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.py | OpenAI SDK client for workflow sample |
| python/samples/04-hosting/af-hosting/local_responses_workflow/app.py | Responses-hosted workflow sample app |
| python/packages/hosting/README.md | Hosting package documentation + quickstart |
| python/packages/hosting/LICENSE | MIT license for hosting package |
| python/packages/hosting/pyproject.toml | Package metadata, deps, tooling config |
| python/packages/hosting/agent_framework_hosting/__init__.py | Public exports for hosting package |
| python/packages/hosting/agent_framework_hosting/_types.py | Channel-neutral envelope + protocols |
| python/packages/hosting/agent_framework_hosting/_isolation.py | Isolation contextvar + header constants |
| python/packages/hosting/agent_framework_hosting/_persistence.py | State-dir normalization + lock helpers |
| python/packages/hosting/agent_framework_hosting/_state_store.py | Diskcache-backed session alias store |
| python/packages/hosting/agent_framework_hosting/_host.py | Starlette host + session/checkpoint logic |
| python/packages/hosting/tests/__init__.py | Test package marker |
| python/packages/hosting/tests/conftest.py | Loads workflow fixtures for tests |
| python/packages/hosting/tests/_workflow_fixtures.py | Simple workflows for host tests |
| python/packages/hosting/tests/test_types.py | Tests for envelope types |
| python/packages/hosting/tests/test_isolation.py | Tests for isolation middleware/contextvars |
| python/packages/hosting/tests/test_host.py | Extensive host behavior tests |
| python/packages/hosting/tests/test_host_disk.py | Disk-backed session-alias persistence tests |
| python/packages/hosting-responses/README.md | Responses channel package documentation |
| python/packages/hosting-responses/LICENSE | MIT license for responses channel |
| python/packages/hosting-responses/pyproject.toml | Package metadata + deps (openai, hosting) |
| python/packages/hosting-responses/agent_framework_hosting_responses/__init__.py | Public exports for responses channel |
| python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py | Parses Responses input/options/identity |
| python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py | ResponsesChannel HTTP/SSE implementation |
| python/packages/hosting-responses/tests/__init__.py | Test package marker |
| python/packages/hosting-responses/tests/test_parsing.py | Unit tests for request parsing helpers |
| python/packages/hosting-responses/tests/test_channel.py | End-to-end Starlette TestClient channel tests |

Comment thread python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py Outdated
Comment thread python/packages/hosting/agent_framework_hosting/_host.py Outdated

@github-actions github-actions Bot left a comment

Contributor

Automated Code Review

Reviewers: 5 | Confidence: 84%

✓ Correctness

The hosting core and types are well-structured with solid path-traversal defense, proper contextvar lifecycle, and correct ExitStack management across streaming boundaries. One correctness bug found: _wrap_input silently drops user content when the input is a Sequence[str | Content] (a valid AgentRunInputs variant) because it wraps the entire sequence as a single contents item rather than spreading it.
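The `_wrap_input` finding can be pictured with a minimal sketch. `Content` and `wrap_input` below are hypothetical stand-ins written for illustration, not the package's actual types or implementation; the point is only that a sequence input should be spread into individual items rather than nested as a single item.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class Content:
    """Hypothetical stand-in for a framework content item."""

    text: str


def wrap_input(value):
    """Normalize a str, a Content, or a sequence of either into a flat list."""
    # Check str first: a str is itself a Sequence and must not be iterated.
    if isinstance(value, str):
        return [Content(value)]
    if isinstance(value, Content):
        return [value]
    if isinstance(value, Sequence):
        # Spread the sequence so every element becomes its own item,
        # instead of wrapping the whole sequence as one item.
        return [item if isinstance(item, Content) else Content(item) for item in value]
    raise TypeError(f"unsupported input type: {type(value).__name__}")
```

With this shape, a mixed input such as `["hello", Content("world")]` yields two items, so no user content is lost.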

✓ Security Reliability

The new hosting packages demonstrate solid security practices overall: thorough input validation in _parsing.py, proper contextvar scoping in the isolation middleware with reset-in-finally, and strong path-traversal protection for checkpoint directories. The one notable concern is information disclosure in the SSE streaming error path, where raw Python exception messages are forwarded to the client. The non-streaming path correctly delegates error handling to Starlette (which returns generic 500s in non-debug mode), but the streaming path catches exceptions and embeds str(exc) into the SSE response.failed event, creating an asymmetry that could leak internal details (file paths, connection strings, downstream service URLs) in production.

The hosting core is well-defended against the most important security risk (CWE-22 path traversal in checkpoint paths) with a layered denylist + is_relative_to check. The isolation middleware is properly gated behind the FOUNDRY_HOSTING_ENVIRONMENT env var and correctly resets the contextvar in a finally block. The main reliability concern is the unbounded in-memory _sessions dict that grows with each unique isolation_key and is never evicted, which could lead to memory exhaustion in long-lived hosts with many users.
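For the SSE error-path concern, one possible mitigation is sketched below. The function name `failed_event`, the payload shape, and the `server_error` code are assumptions made for illustration and are not taken from the PR; the idea is to log the full exception server-side and send the client only a generic message with an opaque reference.

```python
import json
import logging
import uuid

logger = logging.getLogger("hosting.sketch")


def failed_event(exc: Exception) -> str:
    """Render an SSE `response.failed` frame without exposing str(exc)."""
    error_id = uuid.uuid4().hex
    # Full detail (message and traceback) stays in server logs, keyed by the id.
    logger.error("stream failed [%s]", error_id, exc_info=exc)
    payload = {
        "type": "response.failed",
        # Hypothetical error shape: a fixed code plus a generic message.
        "error": {
            "code": "server_error",
            "message": f"Internal error (ref {error_id})",
        },
    }
    # An SSE frame is an event line, a data line, and a blank-line terminator.
    return f"event: response.failed\ndata: {json.dumps(payload)}\n\n"
```

An operator can then look up the reference in the logs, while the streaming path matches the generic behavior of the non-streaming 500 response.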

✓ Test Coverage

The test suites for the new hosting and hosting-responses packages are generally well-structured with good end-to-end coverage of the ResponsesChannel and thorough unit tests for the parsing module. However, there are notable gaps: the input_file parsing branch (3 sub-paths including error handling) has zero test coverage, the response_id_factory constructor parameter (with security-relevant documentation) is untested, and the validation for non-Mapping list items in messages_from_responses_input is uncovered.

The PR introduces substantial new hosting infrastructure with good test coverage for the core happy paths (host invocation, session caching, workflow dispatch, checkpoint path traversal, isolation middleware, bind-request-context lifecycle). However, there are notable gaps: the disk-backed _PersistedDict mutation methods (pop, clear, update, __delitem__) have no direct tests verifying that disk state is correctly mirrored, and the _suppress_already_consumed context manager's three distinct exception-handling branches are untested. These are important for durability and operational correctness respectively.

The test suite for the new hosting packages is comprehensive, covering the core host wiring, invocation paths, session caching, workflow targets, checkpointing with path-traversal hardening, streaming lifecycle, context binding, isolation middleware, and the Responses channel end-to-end. Two production modules (_persistence.py and _state_store.py) have no direct unit tests; they are exercised only indirectly through test_host_disk.py. The input_file parsing branch in _parsing.py has no test coverage. The serve() method is untested (marked pragma: no cover). Overall, the coverage for core behavior is strong; the gaps are in lower-level persistence primitives and one parsing branch.

✓ Failure Modes

The new hosting and hosting-responses packages are well-structured with good error handling in most paths. The streaming path has proper try/except with structured error responses. The disk-persistence layer handles write failures gracefully with logging fallbacks. I found one concrete operational failure mode: the response_id_factory comment promises zero-arg factory support but the call site always passes a positional argument, causing a TypeError for users who follow that guidance. The non-streaming error path omits structured Responses-API error envelopes (returning raw Starlette 500s instead), but this is standard HTTP behavior and consistent with the streaming path necessarily handling errors differently since the 200 status has already been sent.

The hosting core is well-structured with thorough error aggregation in lifespan callbacks and solid path-traversal defence on checkpoint paths. Two concrete failure-mode issues stand out: (1) _HostResponseStream.get_final_response() wraps the result in HostedRunResult(result) without forwarding the session, creating an asymmetry with the non-streaming path (_invoke at line 1112 passes session=run_kwargs.get("session")), so any streaming response_hook that reads result.session silently receives None; (2) in _invoke_stream, the ExitStack returned by _bind_request_context enters context-manager bindings immediately, but if self.target.run(stream=True, ...) raises synchronously before the stack is handed to _BoundResponseStream, those bindings are never exited, which is a resource/state leak.
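The ExitStack leak in point (2) can be avoided with the standard `pop_all()` ownership-transfer idiom from `contextlib`. The sketch below uses hypothetical names (`binding`, `start_stream`, `events`) and does not reproduce the host's real code; it only shows bindings being unwound when the run call raises synchronously, and ownership moving to the caller on success.

```python
from contextlib import ExitStack, contextmanager

events = []  # records enter/exit order so the behavior is observable


@contextmanager
def binding(name):
    """Hypothetical stand-in for a request-context binding."""
    events.append(f"enter {name}")
    try:
        yield
    finally:
        events.append(f"exit {name}")


def start_stream(run):
    """Enter bindings, start the stream, and hand the open stack to the caller."""
    with ExitStack() as stack:
        stack.enter_context(binding("request"))
        # If run() raises here, the with-block exits the bindings for us.
        stream = run()
        # Success: move the registered exits into a new stack that the
        # stream wrapper owns and closes when the stream finishes.
        return stream, stack.pop_all()
```

On the success path the caller is responsible for calling `close()` on the returned stack once streaming ends.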

✗ Design Approach

I found two design-level issues in the new Responses channel. The non-streaming path does not actually support workflow targets even though the host contract does, because it unconditionally reads .text from the result. Separately, the request parser silently drops malformed nested message.content items instead of rejecting the request, so bad Responses payloads can reach the target with content removed.

I found one design issue in the new hosting core: the single-path state_dir convenience couples workflow checkpoint persistence to the optional disk-backed session-alias store. As written, a documented workflow configuration can fail at host construction unless diskcache is installed, even though checkpointing itself does not require that dependency.

The new local hosting samples both advertise previous_response_id-based conversation continuity, but outside Foundry they never seed a stable first-turn anchor. As written, the plain-agent sample stores turn 1 under a random history session and the workflow sample skips checkpointing on turn 1 entirely, so the follow-up call using the returned response.id does not actually resume the earlier local conversation.

Flagged Issues

  • python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py:225 breaks non-streaming workflow hosting: ResponsesChannel assumes result.result.text, but the host's workflow path returns HostedRunResult[WorkflowRunResult] and explicitly leaves output projection to the channel (python/packages/hosting/agent_framework_hosting/_host.py:1165-118).
  • python/samples/04-hosting/af-hosting/local_responses/app.py:99 wires a FileHistoryProvider, but the sample's documented --previous-response-id flow does not continue the first local turn because no host session is created until a later request supplies previous_response_id.
  • python/samples/04-hosting/af-hosting/local_responses_workflow/app.py:211 enables checkpointing, but the first local request has no ChannelRequest.session, so no checkpoint is written; the next request using the prior response.id starts a fresh workflow instead of resuming.

Suggestions

  • Validate every nested message.content item in python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py:115-120 instead of filtering out non-mapping entries, so malformed Responses payloads return 422 rather than silently losing content.
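The suggestion above amounts to rejecting rather than filtering. A minimal sketch follows, with `InvalidRequest` and `validate_content_items` as hypothetical names (the real parser's error type and its mapping to HTTP 422 are not shown in this conversation):

```python
from collections.abc import Mapping


class InvalidRequest(ValueError):
    """Hypothetical error that a channel could translate into an HTTP 422."""


def validate_content_items(content):
    """Return content unchanged, or raise if any nested item is malformed."""
    if not isinstance(content, list):
        raise InvalidRequest("message.content must be a list")
    for index, item in enumerate(content):
        # Reject the whole request instead of silently dropping the item.
        if not isinstance(item, Mapping):
            raise InvalidRequest(
                f"message.content[{index}] must be an object, "
                f"got {type(item).__name__}"
            )
    return content
```

Failing fast here keeps the target from ever seeing a message whose content was partially removed.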

Automated review by eavanvalkenburg's agents

Comment thread python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py Outdated
Comment thread python/samples/04-hosting/af-hosting/local_responses/app.py
@github-actions

Contributor

Flagged issue

python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py:225 breaks non-streaming workflow hosting: ResponsesChannel assumes result.result.text, but the host's workflow path returns HostedRunResult[WorkflowRunResult] and explicitly leaves output projection to the channel (python/packages/hosting/agent_framework_hosting/_host.py:1165-118).


Source: automated DevFlow PR review

@github-actions

Contributor

Flagged issue

python/samples/04-hosting/af-hosting/local_responses/app.py:99 wires a FileHistoryProvider, but the sample's documented --previous-response-id flow does not continue the first local turn because no host session is created until a later request supplies previous_response_id.


Source: automated DevFlow PR review

@github-actions

Contributor

Flagged issue

python/samples/04-hosting/af-hosting/local_responses_workflow/app.py:211 enables checkpointing, but the first local request has no ChannelRequest.session, so no checkpoint is written; the next request using the prior response.id starts a fresh workflow instead of resuming.


Source: automated DevFlow PR review

Comment thread python/packages/hosting/agent_framework_hosting/_host.py
Comment thread python/packages/hosting/agent_framework_hosting/_host.py
Comment thread python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py Outdated
eavanvalkenburg and others added 5 commits June 18, 2026 09:00
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Contributor

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/hosting-responses/agent_framework_hosting_responses | | | | |
| _channel.py | 116 | 1 | 99% | 374 |
| _parsing.py | 80 | 3 | 96% | 106, 122, 129 |
| packages/hosting/agent_framework_hosting | | | | |
| _host.py | 435 | 50 | 88% | 88, 144, 159, 187, 243, 253–255, 257–258, 264, 360, 368–369, 372, 374, 381, 387, 390–392, 395, 413, 739–740, 743, 750–754, 760–761, 764, 770–771, 918–919, 926, 1036, 1073, 1238–1241, 1323–1324, 1329–1331 |
| _isolation.py | 19 | 0 | 100% | |
| _persistence.py | 64 | 17 | 73% | 51, 53–57, 66–68, 72–76, 83, 123, 126 |
| _state_store.py | 85 | 34 | 60% | 40–43, 78, 81–83, 87–88, 98–102, 108–112, 115, 119–126, 137–141 |
| _types.py | 69 | 5 | 92% | 95–97, 108–109 |
| TOTAL | 40718 | 4607 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 8134 | 34 💤 | 0 ❌ | 0 🔥 | 2m 12s ⏱️ |


Labels

documentation Improvements or additions to documentation python Issues related to the Python codebase

Development

Successfully merging this pull request may close these issues.

Hosting: core host and Responses channel

3 participants