Python: Add hosting core and Responses channel #6580
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds the first Python “hosting core + Responses channel” slice to the Agent Framework Python monorepo, enabling a channel-neutral Starlette host (agent-framework-hosting) and an OpenAI Responses-shaped HTTP/SSE channel (agent-framework-hosting-responses), plus local samples demonstrating run hooks and workflow checkpointing.
Changes:
- Introduces `AgentFrameworkHost` + channel contribution primitives, session continuity, optional disk-backed session-alias persistence, and workflow checkpoint wiring.
- Adds `ResponsesChannel` that parses Responses requests, invokes the host, and renders Responses-compatible JSON/SSE outputs.
- Adds samples and a comprehensive unit test suite covering isolation middleware, host behavior (agent/workflow), checkpointing, and Responses parsing/channel behavior.
Reviewed changes
Copilot reviewed 35 out of 39 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/uv.lock | Adds new workspace packages + deps |
| python/pyproject.toml | Registers hosting packages in workspace + pyright envs |
| python/PACKAGE_STATUS.md | Marks new packages as alpha |
| python/samples/04-hosting/af-hosting/README.md | Top-level hosting samples guide |
| python/samples/04-hosting/af-hosting/local_responses/README.md | Minimal Responses sample instructions |
| python/samples/04-hosting/af-hosting/local_responses/pyproject.toml | Sample deps + uv sources wiring |
| python/samples/04-hosting/af-hosting/local_responses/call_server.py | Local OpenAI SDK client for sample |
| python/samples/04-hosting/af-hosting/local_responses/app.py | Responses-hosted agent sample app |
| python/samples/04-hosting/af-hosting/local_responses_workflow/storage/checkpoints/.gitkeep | Keeps checkpoints dir in repo |
| python/samples/04-hosting/af-hosting/local_responses_workflow/README.md | Workflow + checkpoints sample guide |
| python/samples/04-hosting/af-hosting/local_responses_workflow/pyproject.toml | Workflow sample deps + uv sources |
| python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.rest | REST examples for Responses endpoint |
| python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.py | OpenAI SDK client for workflow sample |
| python/samples/04-hosting/af-hosting/local_responses_workflow/app.py | Responses-hosted workflow sample app |
| python/packages/hosting/README.md | Hosting package documentation + quickstart |
| python/packages/hosting/LICENSE | MIT license for hosting package |
| python/packages/hosting/pyproject.toml | Package metadata, deps, tooling config |
| python/packages/hosting/agent_framework_hosting/__init__.py | Public exports for hosting package |
| python/packages/hosting/agent_framework_hosting/_types.py | Channel-neutral envelope + protocols |
| python/packages/hosting/agent_framework_hosting/_isolation.py | Isolation contextvar + header constants |
| python/packages/hosting/agent_framework_hosting/_persistence.py | State-dir normalization + lock helpers |
| python/packages/hosting/agent_framework_hosting/_state_store.py | Diskcache-backed session alias store |
| python/packages/hosting/agent_framework_hosting/_host.py | Starlette host + session/checkpoint logic |
| python/packages/hosting/tests/__init__.py | Test package marker |
| python/packages/hosting/tests/conftest.py | Loads workflow fixtures for tests |
| python/packages/hosting/tests/_workflow_fixtures.py | Simple workflows for host tests |
| python/packages/hosting/tests/test_types.py | Tests for envelope types |
| python/packages/hosting/tests/test_isolation.py | Tests for isolation middleware/contextvars |
| python/packages/hosting/tests/test_host.py | Extensive host behavior tests |
| python/packages/hosting/tests/test_host_disk.py | Disk-backed session-alias persistence tests |
| python/packages/hosting-responses/README.md | Responses channel package documentation |
| python/packages/hosting-responses/LICENSE | MIT license for responses channel |
| python/packages/hosting-responses/pyproject.toml | Package metadata + deps (openai, hosting) |
| python/packages/hosting-responses/agent_framework_hosting_responses/__init__.py | Public exports for responses channel |
| python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py | Parses Responses input/options/identity |
| python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py | ResponsesChannel HTTP/SSE implementation |
| python/packages/hosting-responses/tests/__init__.py | Test package marker |
| python/packages/hosting-responses/tests/test_parsing.py | Unit tests for request parsing helpers |
| python/packages/hosting-responses/tests/test_channel.py | End-to-end Starlette TestClient channel tests |
Automated Code Review
Reviewers: 5 | Confidence: 84%
✓ Correctness
The hosting core and types are well-structured with solid path-traversal defense, proper contextvar lifecycle, and correct ExitStack management across streaming boundaries. One correctness bug found: `_wrap_input` silently drops user content when the input is a `Sequence[str | Content]` (a valid `AgentRunInputs` variant) because it wraps the entire sequence as a single `contents` item rather than spreading it.
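The difference between wrapping and spreading a sequence input can be sketched as follows. This is a minimal illustration of the bug class, not the PR's code: `Content`, `Message`, `wrap_nested`, and `wrap_spread` are hypothetical stand-ins, and the real `_wrap_input` may have a different signature and return type.

```python
from collections.abc import Sequence
from dataclasses import dataclass, field


@dataclass
class Content:
    """Hypothetical stand-in for a framework content item."""

    text: str


@dataclass
class Message:
    """Hypothetical stand-in for a message holding a flat list of contents."""

    role: str
    contents: list = field(default_factory=list)


def wrap_nested(run_input) -> Message:
    # Problematic shape: the whole sequence becomes ONE contents item.
    return Message(role="user", contents=[run_input])


def wrap_spread(run_input) -> Message:
    # A str is itself a Sequence, so scalars are checked before the generic branch.
    if isinstance(run_input, (str, Content)):
        items = [run_input]
    elif isinstance(run_input, Sequence):
        items = list(run_input)
    else:
        raise TypeError(f"unsupported input type: {type(run_input).__name__}")
    # Promote bare strings to Content so every item has the same shape.
    contents = [Content(text=i) if isinstance(i, str) else i for i in items]
    return Message(role="user", contents=contents)
```

With `["hi", Content(text="there")]` as input, `wrap_nested` yields a message with one nested item, while `wrap_spread` yields two flat `Content` items.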
✓ Security Reliability
The new hosting packages demonstrate solid security practices overall: thorough input validation in `_parsing.py`, proper contextvar scoping in the isolation middleware with reset-in-finally, and strong path-traversal protection for checkpoint directories. The one notable concern is information disclosure in the SSE streaming error path, where raw Python exception messages are forwarded to the client. The non-streaming path correctly delegates error handling to Starlette (which returns generic 500s in non-debug mode), but the streaming path catches exceptions and embeds `str(exc)` into the SSE `response.failed` event, creating an asymmetry that could leak internal details (file paths, connection strings, downstream service URLs) in production.

The hosting core is well-defended against the most important security risk (CWE-22 path traversal in checkpoint paths) with a layered denylist + `is_relative_to` check. The isolation middleware is properly gated behind the `FOUNDRY_HOSTING_ENVIRONMENT` env var and correctly resets the contextvar in a finally block. The main reliability concern is the unbounded in-memory `_sessions` dict that grows with each unique `isolation_key` and is never evicted, which could lead to memory exhaustion in long-lived hosts with many users.
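One way to close the disclosure gap described above is to log the exception server-side and send the client only a generic message with a correlation id. The sketch below assumes a simplified payload shape; `failed_event`, the `server_error` code, and the logger name are hypothetical and are not taken from the PR or from the Responses API schema.

```python
import json
import logging
import uuid

logger = logging.getLogger("hosting.sketch")  # hypothetical logger name


def failed_event(exc: Exception) -> str:
    """Render an SSE `response.failed` frame without echoing str(exc)."""
    error_id = uuid.uuid4().hex
    # Full details (message and traceback) stay in server logs, keyed by the id.
    logger.error("stream failed [%s]", error_id, exc_info=exc)
    payload = {
        "type": "response.failed",
        # Assumed, simplified error shape; the client sees only a reference id.
        "error": {
            "code": "server_error",
            "message": f"Internal error (ref {error_id})",
        },
    }
    return f"event: response.failed\ndata: {json.dumps(payload)}\n\n"
```

An operator can then look up the reference id in the logs, while file paths or connection strings in the exception text never reach the client.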
✓ Test Coverage
The test suites for the new hosting and hosting-responses packages are generally well-structured with good end-to-end coverage of the `ResponsesChannel` and thorough unit tests for the parsing module. However, there are notable gaps: the `input_file` parsing branch (3 sub-paths including error handling) has zero test coverage, the `response_id_factory` constructor parameter (with security-relevant documentation) is untested, and the validation for non-Mapping list items in `messages_from_responses_input` is uncovered.

The PR introduces substantial new hosting infrastructure with good test coverage for the core happy paths (host invocation, session caching, workflow dispatch, checkpoint path traversal, isolation middleware, bind-request-context lifecycle). However, there are notable gaps: the disk-backed `_PersistedDict` mutation methods (`pop`, `clear`, `update`, `__delitem__`) have no direct tests verifying that disk state is correctly mirrored, and the `_suppress_already_consumed` context manager's three distinct exception-handling branches are untested. These are important for durability and operational correctness respectively.

The test suite for the new hosting packages is comprehensive, covering the core host wiring, invocation paths, session caching, workflow targets, checkpointing with path-traversal hardening, streaming lifecycle, context binding, isolation middleware, and the Responses channel end-to-end. Two production modules (`_persistence.py` and `_state_store.py`) have no direct unit tests; they are exercised only indirectly through `test_host_disk.py`. The `input_file` parsing branch in `_parsing.py` has no test coverage. The `serve()` method is untested (marked `pragma: no cover`). Overall, the coverage for core behavior is strong; the gaps are in lower-level persistence primitives and one parsing branch.
✓ Failure Modes
The new hosting and hosting-responses packages are well-structured with good error handling in most paths. The streaming path has proper try/except with structured error responses. The disk-persistence layer handles write failures gracefully with logging fallbacks. I found one concrete operational failure mode: the `response_id_factory` comment promises zero-arg factory support but the call site always passes a positional argument, causing a `TypeError` for users who follow that guidance. The non-streaming error path omits structured Responses-API error envelopes (returning raw Starlette 500s instead), but this is standard HTTP behavior and consistent with the streaming path necessarily handling errors differently since the 200 status has already been sent.

The hosting core is well-structured with thorough error aggregation in lifespan callbacks and solid path-traversal defence on checkpoint paths. Two concrete failure-mode issues stand out: (1) `_HostResponseStream.get_final_response()` wraps the result in `HostedRunResult(result)` without forwarding the session, creating an asymmetry with the non-streaming path (`_invoke` at line 1112 passes `session=run_kwargs.get("session")`), so any streaming `response_hook` that reads `result.session` silently receives `None`; (2) in `_invoke_stream`, the `ExitStack` returned by `_bind_request_context` enters context-manager bindings immediately, but if `self.target.run(stream=True, ...)` raises synchronously before the stack is handed to `_BoundResponseStream`, those bindings are never exited, which is a resource/state leak.
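The second failure mode is a standard `contextlib.ExitStack` hand-off problem, and the usual remedy is `ExitStack.pop_all()`. The sketch below is a generic illustration under stated assumptions: `binding`, `start_stream`, and the `events` list are hypothetical and only model the pattern, not the host's actual `_invoke_stream` code.

```python
from contextlib import ExitStack, contextmanager

events: list[str] = []  # records enter/exit order for illustration


@contextmanager
def binding(name: str):
    """Hypothetical request-scoped binding that records enter/exit."""
    events.append(f"enter {name}")
    try:
        yield
    finally:
        events.append(f"exit {name}")


def start_stream(run):
    """Enter bindings, start the run, and hand the open stack to the caller.

    If `run()` raises synchronously, the `with` block unwinds the bindings.
    On success, `pop_all()` moves them to a new stack that the stream
    wrapper owns and must close when the stream finishes.
    """
    with ExitStack() as stack:
        stack.enter_context(binding("request"))
        stream = run()
        return stream, stack.pop_all()
```

Because `pop_all()` empties the original stack before the `with` block exits, the bindings stay open on the success path and are unwound only on the failure path.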
✗ Design Approach
I found two design-level issues in the new Responses channel. The non-streaming path does not actually support workflow targets even though the host contract does, because it unconditionally reads `.text` from the result. Separately, the request parser silently drops malformed nested `message.content` items instead of rejecting the request, so bad Responses payloads can reach the target with content removed.

I found one design issue in the new hosting core: the single-path `state_dir` convenience couples workflow checkpoint persistence to the optional disk-backed session-alias store. As written, a documented workflow configuration can fail at host construction unless `diskcache` is installed, even though checkpointing itself does not require that dependency.

The new local hosting samples both advertise `previous_response_id`-based conversation continuity, but outside Foundry they never seed a stable first-turn anchor. As written, the plain-agent sample stores turn 1 under a random history session and the workflow sample skips checkpointing on turn 1 entirely, so the follow-up call using the returned `response.id` does not actually resume the earlier local conversation.
Flagged Issues
- `python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py:225` breaks non-streaming workflow hosting: `ResponsesChannel` assumes `result.result.text`, but the host's workflow path returns `HostedRunResult[WorkflowRunResult]` and explicitly leaves output projection to the channel (`python/packages/hosting/agent_framework_hosting/_host.py:1165-118`).
- `python/samples/04-hosting/af-hosting/local_responses/app.py:99` wires a `FileHistoryProvider`, but the sample's documented `--previous-response-id` flow does not continue the first local turn because no host session is created until a later request supplies `previous_response_id`.
- `python/samples/04-hosting/af-hosting/local_responses_workflow/app.py:211` enables checkpointing, but the first local request has no `ChannelRequest.session`, so no checkpoint is written; the next request using the prior `response.id` starts a fresh workflow instead of resuming.
Suggestions
- Validate every nested `message.content` item in `python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py:115-120` instead of filtering out non-mapping entries, so malformed Responses payloads return 422 rather than silently losing content.
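The suggestion amounts to replacing a filter with a validation pass. A minimal sketch, assuming hypothetical helper names (`content_items_lenient`, `content_items_strict`) and a hypothetical `InvalidRequest` error that the channel would translate into an HTTP 422; the actual `_parsing.py` code is not reproduced here:

```python
from collections.abc import Mapping
from typing import Any


class InvalidRequest(ValueError):
    """Hypothetical error a channel could map to an HTTP 422 response."""


def content_items_lenient(content: list[Any]) -> list[Mapping[str, Any]]:
    # Shape described in the review: non-mapping entries vanish silently.
    return [item for item in content if isinstance(item, Mapping)]


def content_items_strict(content: list[Any]) -> list[Mapping[str, Any]]:
    # Reject the whole request on the first malformed entry, naming its index.
    for index, item in enumerate(content):
        if not isinstance(item, Mapping):
            raise InvalidRequest(
                f"message.content[{index}] must be an object, "
                f"got {type(item).__name__}"
            )
    return list(content)
```

Reporting the offending index in the error message lets a client see which entry was malformed instead of receiving a response computed from partial input.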
Automated review by eavanvalkenburg's agents
Motivation & Context
Adds the initial Python hosting implementation for exposing one Agent Framework agent or workflow through a shared host and the OpenAI Responses-shaped channel.
This is the first implementation slice after the hosting/channel ADRs: it introduces the channel-neutral host package, the Responses channel package, and local samples that show the run-hook and workflow-checkpoint seams without pulling in later channel packages.
Description & Review Guide
- `agent-framework-hosting` with `AgentFrameworkHost`, channel contribution primitives, host-owned invocation hooks, explicit session continuity via `ChannelSession(isolation_key=...)`, workflow checkpoint wiring, and Foundry isolation middleware.
- `agent-framework-hosting-responses`, a channel package that maps Responses-style requests/streams onto the host and renders Responses-compatible output.
- `samples/04-hosting/af-hosting`: a minimal Responses agent sample and a Responses-hosted workflow sample with structured intake/checkpoints.
Related Issue
Fixes #6585
Refs #6265
Contribution Checklist
`breaking change` label (or add "[BREAKING]" to the title prefix, before or after any language prefix); a workflow keeps the label and title prefix in sync automatically.