Summary
The Python AG-UI FastAPI endpoint (add_agent_framework_fastapi_endpoint) serves the event stream as a bare StreamingResponse(media_type="text/event-stream") with no application-level SSE keepalive/heartbeat. When an agent run has a long silent gap (a server-side tool that emits no AG-UI events for tens of seconds — OCR, long retrieval, slow first-token), the SSE connection sits idle, and any idle-timeout-bearing hop in front of the client (Azure Static Web Apps / Container Apps ingress, Vercel, nginx, API gateways) drops the connection. The client then sees a spurious HTTP 500 / backend call failure even though the server is healthy and still working.
Please add a configurable SSE ping/keepalive interval so the endpoint emits periodic comment lines during idle periods. A comment line starts with a colon (for example : ping) and is ignored by conforming SSE parsers, so it acts as a no-op on the client.
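To illustrate why a comment heartbeat is safe for clients, here is a minimal sketch of an SSE reader that handles only data fields and comment lines. The function name parse_sse and the event payloads are illustrative assumptions, not taken from the AG-UI client or from CopilotKit; a real parser also handles event:, id: and retry: fields.

```python
def parse_sse(stream: str) -> list[str]:
    """Return the data payload of each event in an SSE text stream.

    Simplified on purpose: only `data:` fields and comment lines are handled.
    """
    events: list[str] = []
    data: list[str] = []
    for line in stream.split("\n"):
        if line == "":
            # A blank line dispatches the event collected so far, if any.
            if data:
                events.append("\n".join(data))
                data = []
        elif line.startswith(":"):
            # Comment line (for example ": ping"): ignored entirely.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data.append(value[1:] if value.startswith(" ") else value)
    return events


if __name__ == "__main__":
    wire = (
        'data: {"type":"RUN_STARTED"}\n\n'
        ": ping\n\n"
        ": ping\n\n"
        'data: {"type":"RUN_FINISHED"}\n\n'
    )
    print(parse_sse(wire))
```

The two ping comments keep bytes flowing on the connection but produce no events, so the client sees exactly the same event sequence as it would without them.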
Where
python/packages/ag-ui/agent_framework_ag_ui/_endpoint.py returns a plain StreamingResponse with no ping/heartbeat:
    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no",
        },
    )
Note: the Connection: keep-alive header only asks for the underlying connection to be kept open and reused; it sends no bytes during an idle stream, so it provides no application-level heartbeat. (agent-framework-ag-ui==1.0.0rc5.)
Why this is structural, not app-specific
Any AG-UI agent whose tools do non-trivial silent work will produce idle gaps. Deployed behind a serverless/proxy front door with an idle timeout, that gap terminates the stream. The application can't reliably avoid it: an in-app keepalive (yielding extra AG-UI events from the agent generator) is buffered/coalesced by the run/emitter pipeline and does not reliably flush during the gap. The heartbeat therefore has to live at the transport layer (the StreamingResponse itself), which only the framework controls.
For comparison, the Pydantic AI AG-UI server appears to use a plain Starlette StreamingResponse as well, so the same gap likely exists there. That suggests this is a general AG-UI adapter-layer need rather than a Microsoft-specific quirk.
Proposed fix
Emit periodic SSE comment heartbeats during idle, with a configurable interval (default off or a sane value like 15s), e.g. one of:
- Use sse-starlette's EventSourceResponse(..., ping=N) (it auto-emits : ping comments), or
- Keep StreamingResponse but wrap event_generator() so that when no event is produced within N seconds it yields ": ping\n\n".
Expose it via add_agent_framework_fastapi_endpoint(..., sse_keepalive_interval=N) (and/or an env var), defaulting to a safe value.
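The second option can be sketched as a small async wrapper. This is a minimal sketch under stated assumptions, not the framework's implementation: the name with_keepalive is hypothetical, and the wrapper assumes event_generator() yields already-encoded SSE strings.

```python
import asyncio
from collections.abc import AsyncIterator

PING = ": ping\n\n"  # SSE comment line; conforming parsers ignore it


async def with_keepalive(
    events: AsyncIterator[str], interval: float
) -> AsyncIterator[str]:
    """Yield items from `events`, emitting PING whenever `interval` seconds pass idle.

    Hypothetical helper for illustration only.
    """
    iterator = events.__aiter__()
    # Keep one pending __anext__ alive across timeouts, so a slow event is
    # never cancelled by the heartbeat; we simply wait on it again.
    pending = asyncio.ensure_future(iterator.__anext__())
    try:
        while True:
            done, _ = await asyncio.wait({pending}, timeout=interval)
            if not done:
                yield PING  # idle for `interval` seconds: send a heartbeat
                continue
            try:
                item = pending.result()
            except StopAsyncIteration:
                return  # upstream generator is exhausted
            # Schedule the next read before handing the item downstream.
            pending = asyncio.ensure_future(iterator.__anext__())
            yield item
    finally:
        # Covers client disconnects: stop waiting on the upstream generator.
        pending.cancel()
```

Under these assumptions the call site would change only in its first argument, for example StreamingResponse(with_keepalive(event_generator(), sse_keepalive_interval), media_type="text/event-stream", headers=...). Waiting on a long-lived task, rather than wrapping each __anext__() in asyncio.wait_for, avoids cancelling the agent run every time the timeout fires.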
Repro (observed)
A real run with a ~50–90s silent server-side OCR tool, served by the AG-UI endpoint behind Azure Static Web Apps, returned HTTP 500: Backend call failure to the CopilotKit client; packet capture showed a ~58s gap with zero data:/: lines. Adding a transport-level : ping every ~10s (as an ASGI middleware in front of the endpoint) eliminated the failure (max inter-line gap dropped from ~58s to ~11s) — which is exactly the behavior a built-in ping= would provide.
Workaround (today)
An ASGI middleware that wraps the app, decouples the downstream send via a queue, and emits : ping\n\n on text/event-stream responses whenever the queue is idle for N seconds. Works, but every AG-UI-over-serverless deployment re-implementing this is a sign the adapter should own it.
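For readers who need the workaround today, the following is a rough sketch of such a middleware written against the plain ASGI interface. It is not the exact middleware used in the repro; the class name SSEKeepaliveMiddleware, the 10 second default, and the sentinel-based shutdown are assumptions made for illustration.

```python
import asyncio

PING_BODY = b": ping\n\n"  # SSE comment chunk
_DONE = object()  # sentinel: the wrapped app has returned


class SSEKeepaliveMiddleware:
    """Hypothetical ASGI middleware that pings idle text/event-stream responses."""

    def __init__(self, app, interval: float = 10.0):
        self.app = app
        self.interval = interval

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        queue: asyncio.Queue = asyncio.Queue()

        async def run_app():
            try:
                # The app "sends" into the queue instead of the real socket.
                await self.app(scope, receive, queue.put)
            finally:
                queue.put_nowait(_DONE)

        task = asyncio.ensure_future(run_app())
        streaming_sse = False
        try:
            while True:
                try:
                    message = await asyncio.wait_for(queue.get(), self.interval)
                except asyncio.TimeoutError:
                    # Queue idle for `interval` seconds: ping only open SSE streams.
                    if streaming_sse:
                        await send({
                            "type": "http.response.body",
                            "body": PING_BODY,
                            "more_body": True,
                        })
                    continue
                if message is _DONE:
                    break
                if message["type"] == "http.response.start":
                    headers = {k.lower(): v for k, v in message.get("headers", [])}
                    content_type = headers.get(b"content-type", b"")
                    streaming_sse = content_type.startswith(b"text/event-stream")
                elif message["type"] == "http.response.body":
                    if not message.get("more_body", False):
                        streaming_sse = False  # final chunk: never ping after it
                await send(message)
            await task  # re-raise any exception from the wrapped app
        finally:
            task.cancel()  # no-op if finished; stops the app on disconnect
```

With FastAPI this could be registered via app.add_middleware(SSEKeepaliveMiddleware, interval=10). The queue is what decouples the app's send calls from the socket, so the heartbeat can be written while the app is still blocked inside a silent tool call.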
Code reference for the snippet under Where: agent-framework/python/packages/ag-ui/agent_framework_ag_ui/_endpoint.py, lines 211 to 218 in 7f7c88b.