RFC: Agent protocol support for apx apps #142
Open
stuart-gano wants to merge 43 commits into databricks-solutions:main
Conversation
Proposes generating agent protocol endpoints (invocations, A2A discovery, MCP tools, eval bridge) from existing apx routes via pyproject.toml configuration. Routes are tools — no new abstractions. Implemented as a LifespanDependency addon following the same pattern as SQL and Lakebase addons. Co-authored-by: Isaac
Adds addons/agent/ following the same pattern as sql and lakebase:
- addon.toml with Dependencies.Agent type alias
- LifespanDependency that reads [tool.apx.agent] from pyproject.toml
- Builds tool registry from the app's OpenAPI spec (routes are tools)
- Generates /.well-known/agent.json (A2A discovery card)
- Generates /invocations (agent protocol dispatch to routes)
- Generates /health (liveness probe)
Zero application code changes — configure via pyproject.toml; existing routes with an operation_id automatically become agent tools. Co-authored-by: Isaac
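Since the addon is configured entirely from pyproject.toml, a minimal sketch of the scaffolded table may help. The [tool.apx.agent] table name comes from this PR; the key names below mirror the AgentConfig fields mentioned in later commits (instructions, temperature, max_tokens, max_iterations), but the exact keys are illustrative, not confirmed:

```toml
# Hypothetical sketch of the scaffolded [tool.apx.agent] table.
[tool.apx.agent]
instructions = "You are a billing assistant."  # optional system prompt
temperature = 0.2       # optional; absent = model default
max_tokens = 1024       # optional; absent = model default
max_iterations = 10     # tool-calling loop cap (default per AgentConfig)
```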
…ispatches via /invocations
49 checks covering _inspect_tool_fn, _make_input_model, Agent.build_local_tools, _build_fmapi_tool_schemas, build_router signature patching, structured output, protocol models, and A2A card generation. Runs directly with python3 — no APX wheel build required.
- Remove __signature__ from the _ToolFn Protocol; regular Python functions satisfy __name__ + __doc__ but do not expose __signature__ as a direct attribute
- Change the _patch_handler_signature handler param to Any (it does dynamic attribute assignment, not Protocol reads)
- Change the _make_route_handler return type to Any (it returns async coroutines)
- Fix the type: ignore comment from mypy syntax [call-overload] to a bare type: ignore for ty compatibility on the create_model call
- Add the rust-embed include-exclude feature and exclude pyc/__pycache__ from embedded templates to prevent template-not-found errors at scaffold time
- Add httpx>=0.27.0 as a Python dependency in the agent addon.toml
- Add get_root_routers() to LifespanDependency base class + factory
for protocol routes that must live at / not /api/
- Agent get_root_routers(): /.well-known/agent.json, /invocations, /health
- Agent get_routers(): /api/tools/* (api-prefixed tool routes)
- Move addon pyproject.toml.jinja2 to addon root (was in src/base/,
which mapped to src/{app_slug}/ instead of project root)
- Result: [tool.apx.agent] config correctly written at scaffold time,
enabling AgentContext lifespan initialization
Builds the project and deploys to Databricks Apps via the Databricks CLI, then polls until the app reaches RUNNING state.
- apx deploy [APP_PATH] [--skip-build] [--profile P] [--build-path P]
- Reads DATABRICKS_CONFIG_PROFILE from .env if --profile is not given
- Polls databricks apps get every 3s (up to 3 min) for RUNNING state
- Reports ERROR/CRASHED states with a hint to check logs
- Extracts run_build() from build.rs so deploy can reuse it
Adds a self-contained HTML/JS chat interface at /_agent for interactively testing agents during local development, inspired by Google ADK's `adk web` experience:
- Fetches agent name/description/skills from the /.well-known/agent.json context at render time (no round-trip needed)
- Streams responses via SSE: sends an InvocationRequest with stream=true and reads output_text.delta events token by token
- Maintains full conversation history client-side and sends it with each request (stateless agent, stateful client)
- Shows registered skills in a collapsible panel
- Auto-resizing textarea, Enter to send, Shift+Enter for newlines
- Dark theme matching APX style
Also fixes build.rs to skip the UI build when the project has no frontend (pure-API agent projects), guarded by meta.has_ui().
…g capability
- Remove unused JSONResponse import
- Fix skills_json construction: !r produces a Python repr (single-quoted strings), which is invalid JSON. Switch to json.dumps() so the browser can actually parse the skills array without error
- Set A2ACapabilities.streaming = True — /invocations supports stream: true via SSE, so the discovery card should advertise it correctly
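The delta-event framing above is simple enough to consume with a few lines of client code. A minimal Python sketch of an SSE frame parser, assuming events named output_text.delta carrying a JSON body with a delta field (as described in this PR); a real client should use a streaming SSE library rather than buffering the whole response:

```python
import json

def parse_sse_events(raw: str):
    """Parse a buffered SSE stream into (event_name, payload) pairs.

    Sketch only: assumes complete frames separated by blank lines and
    JSON data payloads, per the event shape described in this PR.
    """
    events = []
    for frame in raw.strip().split("\n\n"):
        event, data = "message", ""
        for line in frame.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        events.append((event, json.loads(data) if data else None))
    return events

stream = (
    'event: output_text.delta\ndata: {"delta": "Hel"}\n\n'
    'event: output_text.delta\ndata: {"delta": "lo"}\n\n'
)
# Reassemble the streamed text from the delta events:
text = "".join(
    d["delta"] for e, d in parse_sse_events(stream) if e == "output_text.delta"
)
```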
… test
1. Auto-discover agent_router.py — _AgentDependency.get_routers() now
auto-imports {backend_pkg}.agent_router via importlib if _agent_instance
is None. This removes the need for the addon to overwrite app.py with a
side-effect import, so existing app.py customisations are preserved.
The addon's app.py template is deleted.
2. /_agent setup banner — when AgentContext is None (missing pyproject
config or no Agent() call), the dev UI now shows a clear amber banner
with setup instructions instead of silently sending to a 404.
3. Rename Dependencies alias Agent → AgentContext — avoids a confusing
collision between the Agent builder class (used in agent_router.py)
and the route-parameter dependency type (used in FastAPI handlers).
Updated doc: `ctx: Dependencies.AgentContext`.
4. Integration test for agent addon — test_init_with_agent_addon verifies
that `apx init --addons agent` scaffolds agent_router.py, core/agent.py,
[tool.apx.agent] in pyproject.toml, httpx dep, and no ui/ directory.
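The auto-discovery in point 1 can be sketched in a few lines. This is a hypothetical stand-in for what _AgentDependency.get_routers() might do, not the actual implementation; the module name, the BaseAgent stand-in class, and the `agent` attribute convention follow this PR's description:

```python
import importlib
import sys
import types

class BaseAgent:
    """Stand-in for the addon's BaseAgent, for illustration only."""

def discover_root_agent(module_name: str):
    """Import {backend_pkg}.agent_router and return its module-level
    `agent` if it is a BaseAgent, else None (sketch of the PR's
    auto-discovery; the real logic may differ)."""
    try:
        mod = importlib.import_module(module_name)
    except ModuleNotFoundError:
        return None
    agent = getattr(mod, "agent", None)
    return agent if isinstance(agent, BaseAgent) else None

# Simulate a scaffolded agent_router module so the sketch is runnable:
fake = types.ModuleType("myapp.agent_router")
fake.agent = BaseAgent()
sys.modules["myapp.agent_router"] = fake

root = discover_root_agent("myapp.agent_router")
```

Looking up an explicit module-level `agent` (rather than relying on __init__ side effects) is what lets every agent type, including composites, register as the root.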
Exposes all registered Agent tools as an MCP server over SSE transport,
mounted at /mcp/sse (GET) + /mcp/messages/ (POST).
- _build_mcp_components(ctx, app): builds mcp.server.Server + SseServerTransport
from the AgentContext tool registry. Tool calls dispatch via ASGI to the
existing /api/tools/<name> routes so FastAPI dep injection (auth, workspace
client) applies identically to MCP and REST callers.
- Lifespan wires the MCP server onto app.state; gracefully skips with a warning
if the mcp package isn't installed.
- /_agent dev UI gains an MCP info bar: shows the SSE URL computed from
window.location.origin with a one-click copy button.
- addon.toml: adds mcp>=1.0.0 to Python dependencies.
Claude Desktop config:
{"mcpServers": {"my-agent": {"transport": "sse", "url": "http://localhost:8000/mcp/sse"}}}
…t.json
- Add mcpEndpoint field to AgentCard — populated at request time with
"{base_url}/mcp/sse" when the MCP server is active, null otherwise.
- Populate card.url from request.base_url — was always "" before, which
breaks A2A clients that use the card to self-locate the agent.
- Both fields are filled via model_copy() in the route handler so the
stored ctx.card template stays clean (no request dependency at lifespan).
_dispatch_tool_call posted to /tools/<fn> but the actual routes live at
{api_prefix}/tools/<fn> (e.g. /api/tools/<fn>) because build_router()
returns a router that gets included under api_router which carries the
prefix. The LLM tool-calling loop would 404 on every tool call.
Fix: import api_prefix from ..._metadata (same pattern as _factory.py)
and use f"{api_prefix}/tools/{fn_name}" — matching what the MCP
dispatch already did correctly.
…gent hierarchy
Adds ADK-style agent composition types alongside the existing LlmAgent:
SequentialAgent([planner, writer]) — chains agents, each sees prior output
ParallelAgent([legal, finance]) — runs all concurrently, merges results
Key design changes:
- BaseAgent abstract base: run(), stream(), get_tool_routers(), collect_tools(),
fetch_remote_tools() — any custom orchestration pattern can subclass this.
- LlmAgent replaces Agent (Agent = LlmAgent alias kept for backwards compat).
__init__ no longer sets _agent_instance — sub-agents in a composite no longer
accidentally override the root registration.
- _auto_import_agent_router now looks for a module-level `agent` variable of
type BaseAgent in agent_router.py rather than relying on __init__ side-effects.
Explicit assignment makes intent clear and supports all agent types:
agent = SequentialAgent([LlmAgent(tools=[a]), LlmAgent(tools=[b])])
- AgentContext carries the root agent instance (ctx.agent). _handle_invocation
delegates to ctx.agent.run() / ctx.agent.stream() — no agent-type-specific
code in the protocol layer.
- _run_llm_loop now takes list[Message] instead of InvocationRequest, making
it callable from LlmAgent.run() without constructing a fake request body.
- Lifespan uses collect_tools() + fetch_remote_tools() instead of the
LlmAgent-specific build_local_tools() / fetch_sub_agent_tools().
Usage in agent_router.py:
planner = LlmAgent(tools=[search, outline])
writer = LlmAgent(tools=[draft])
agent = SequentialAgent([planner, writer]) # ← registered as root
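The chaining behaviour described above ("each sees prior output") can be sketched with a toy sub-agent in place of LlmAgent. This is a minimal illustration of the composition contract, not the addon's actual code; the real classes also handle streaming, tool collection, and instructions:

```python
import asyncio

class BaseAgent:
    async def run(self, messages: list[dict]) -> str:
        raise NotImplementedError

class EchoAgent(BaseAgent):
    """Toy stand-in for LlmAgent; real agents call FMAPI."""
    def __init__(self, prefix: str):
        self.prefix = prefix

    async def run(self, messages):
        return f"{self.prefix}: {messages[-1]['content']}"

class SequentialAgent(BaseAgent):
    """Runs sub-agents in order; each one sees the previous agent's
    output appended to the conversation (sketch only)."""
    def __init__(self, agents):
        self.agents = agents

    async def run(self, messages):
        history = list(messages)
        out = ""
        for agent in self.agents:
            out = await agent.run(history)
            history.append({"role": "assistant", "content": out})
        return out

pipeline = SequentialAgent([EchoAgent("plan"), EchoAgent("write")])
result = asyncio.run(pipeline.run([{"role": "user", "content": "topic"}]))
```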
_run_llm_loop now accepts an optional `tools` parameter. LlmAgent.run() and LlmAgent.stream() pass self.collect_tools() so each LlmAgent in a SequentialAgent or ParallelAgent hierarchy only exposes its own tools to FMAPI, preventing cross-agent tool leakage.
- Async tool support: _make_route_handler now checks iscoroutinefunction
and awaits the tool fn when it's a coroutine; sync tools unchanged
- MCP tool dispatch path: _build_mcp_components imported api_prefix from
_metadata so /mcp tool calls hit the correct {api_prefix}/tools/{name}
route instead of hardcoded /api/tools/{name}
- Instructions / system prompt: AgentConfig gains an optional `instructions`
field (maps to pyproject.toml [tool.apx.agent]); LlmAgent.__init__ accepts
an `instructions` kwarg that overrides the config value per-agent.
_run_llm_loop prepends a system message when instructions are non-empty.
Useful for per-agent persona in SequentialAgent/ParallelAgent compositions.
- InvocationRequest.input now accepts list[Message] | str; a plain string is coerced via .messages() so the MLflow eval harness and curl one-liners work without wrapping in a list
- app_predict_fn gains an optional token param that adds Authorization: Bearer <token> to every request — required for OBO-protected Databricks Apps during mlflow.genai.evaluate()
- MLflow tracing: _handle_invocation opens a root CHAIN span per request; each FMAPI call opens a child LLM span; each tool dispatch opens a child TOOL span. All attributes (model, messages, result) are set on the spans. Tracing is a no-op when mlflow is not importable, so the addon remains usable in plain FastAPI dev without a tracking server.
- MLflow span leak on exception: all LLM and TOOL spans now use try/finally to guarantee span.end() on error paths; the root CHAIN span in _handle_invocation is also wrapped in try/finally
- AgentConfig gains temperature, max_tokens (both optional, None = model default), and max_iterations (default 10). All three are documented as comments in the scaffolded pyproject.toml.
- LlmAgent.__init__ accepts the same three params to override per-agent within a SequentialAgent/ParallelAgent composition.
- _run_llm_loop resolves precedence: constructor arg > AgentConfig > model default. It builds the fmapi_extra dict only with non-None values so missing fields are not sent to FMAPI at all.
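The precedence rule described above (constructor arg > AgentConfig > model default, omitting None values entirely) can be sketched as a small resolver. The function name and dict shapes are illustrative, not the PR's actual code:

```python
def resolve_llm_params(constructor: dict, config: dict) -> dict:
    """Resolve per-agent LLM params with the precedence described in
    this PR: constructor arg > AgentConfig > model default. Keys that
    resolve to None are omitted so they are never sent to FMAPI."""
    resolved = {}
    for key in ("temperature", "max_tokens", "max_iterations"):
        value = constructor.get(key)
        if value is None:
            value = config.get(key)
        if value is not None:
            resolved[key] = value
    return resolved

# Constructor overrides config; missing keys fall through or drop out:
extra = resolve_llm_params(
    {"temperature": 0.1},
    {"temperature": 0.7, "max_tokens": 512},
)
```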
…tom_inputs
- Root span set_attribute-after-end: moved set_attribute inside the try/finally block so it runs before end() on all paths, including errors
- result undefined if _dispatch_tool_call raises: initialise result = "" before the try block so messages.append never hits a NameError
- SequentialAgent/ParallelAgent now accept an optional instructions param. When set, a system message is prepended to the conversation before any sub-agent runs — framing the whole pipeline without overriding each LlmAgent's own system prompt.
- custom_inputs wired up: _handle_invocation stashes custom_inputs on request.state; _run_llm_loop reads custom_inputs["instructions"] as the highest-priority system prompt override (> constructor > AgentConfig). custom_inputs is also recorded as a span attribute on the root CHAIN span. An InvocationRequest.instructions_override() helper was added for callers.
- _load_agent_config: prefer __file__-relative pyproject.toml search over
cwd-relative; cwd may be unrelated in deployed Databricks Apps. Falls
back to cwd walk for interactive/test use.
- FMAPI tools payload: omit `tools` key entirely when tool list is empty
instead of sending `"tools": []`; some FMAPI backends reject the empty array.
- SSE error events: wrap ctx.agent.stream() in try/except inside the
generator; on exception yield `event: error\ndata: {...}` and log,
then close the span in finally. Previously the stream silently stopped
and the client UI hung indefinitely.
- MCP auth forwarding: mcp_sse handler captures the incoming Authorization
header onto app.state.mcp_auth_header; _call_tool reads it and forwards
it when making ASGI requests to tool routes, giving MCP clients the same
OBO token context as REST callers.
- app_predict_fn docstring: fix import path from `apx.agent` (wrong) to
`{{app_slug}}.backend.core.agent` (correct rendered module path).
- LlmAgent.stream(): add comment clarifying that streaming is simulated
(full response then chunked) because FMAPI lacks per-token streaming.
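The SSE error-event fix above can be sketched as a wrapper generator. This is an illustrative shape, not the addon's actual code; the real generator also closes the tracing span in a finally block:

```python
import asyncio
import json

async def sse_with_errors(agent_stream):
    """Wrap an agent token stream so failures surface as an SSE
    'error' event instead of a silent disconnect that leaves the
    client UI hanging (sketch of the fix described in this PR)."""
    try:
        async for delta in agent_stream:
            payload = json.dumps({"delta": delta})
            yield f"event: output_text.delta\ndata: {payload}\n\n"
    except Exception as exc:
        # Surface the failure to the client, then let the caller log it.
        yield f"event: error\ndata: {json.dumps({'error': str(exc)})}\n\n"

async def failing_stream():
    yield "partial"
    raise RuntimeError("tool failed")

async def _collect():
    return [frame async for frame in sse_with_errors(failing_stream())]

frames = asyncio.run(_collect())
```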
…xt window
- LoopAgent: runs an LlmAgent in a loop until the finish_loop() tool is called or max_iterations is reached; finish_loop is registered as a real ASGI route so it shares the same dispatch path as all other tools
- Tool hooks: before_tool(name, args) and after_tool(name, args, result) on LlmAgent; sync and async callables both accepted; fire around every tool dispatch in _run_llm_loop
- Guardrails: input_guardrails and output_guardrails on LlmAgent; each is a list of callables returning None (pass) or str (short-circuit with that text); applied in LlmAgent.run() and stream() before/after _run_llm_loop
- custom_outputs: a set_custom_output(request, key, value) helper lets tool functions surface structured data alongside text; _handle_invocation initialises request.state.custom_outputs and includes it in InvocationResponse.custom_outputs; the SSE path emits a custom_outputs event
- Context window management: context_window_tokens on LlmAgent; _maybe_trim_context estimates token usage (4 chars/token), keeps system messages + the last 2 messages intact, and summarises the middle with a single LLM call when the budget is exceeded
- Type aliases: BeforeToolHook, AfterToolHook, InputGuardrailFn, OutputGuardrailFn
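The trimming heuristic described above (rough 4-chars-per-token estimate, keep system messages and the last two messages, summarise the middle) can be sketched as a pure function. This is a simplified stand-in for _maybe_trim_context; the real implementation produces the summary with an extra LLM call, modelled here as a `summarize` callback:

```python
def maybe_trim_context(messages, budget_tokens, summarize):
    """Trim a conversation when the rough token estimate exceeds the
    budget: keep system messages and the last two messages, collapse
    the middle into one summary message (sketch only)."""
    est_tokens = sum(len(m["content"]) for m in messages) // 4
    if est_tokens <= budget_tokens:
        return messages
    system = [m for m in messages if m["role"] == "system"]
    tail = messages[-2:]
    middle = [m for m in messages if m not in system and m not in tail]
    if not middle:
        return messages
    summary = {"role": "assistant", "content": summarize(middle)}
    return system + [summary] + tail

history = (
    [{"role": "system", "content": "be terse"}]
    + [
        {"role": "user", "content": "x" * 400},
        {"role": "assistant", "content": "y" * 400},
        {"role": "user", "content": "latest question"},
        {"role": "assistant", "content": "latest answer"},
    ]
)
trimmed = maybe_trim_context(history, budget_tokens=100,
                             summarize=lambda ms: "<summary>")
```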
- RouterAgent: routes to one of several named sub-agents via a single upfront FMAPI call with synthetic transfer_to_<name> tools; no routes registered — the Python layer intercepts tool_calls directly; falls back to the first agent when the LLM does not call a transfer tool
- HandoffAgent: agents dict + start key; each active agent receives transfer_to_<name> tools for every other agent, injected into its own tool list; transfer routes are registered as real ASGI endpoints (same signal pattern as LoopAgent.finish_loop) so FastAPI dep injection is preserved; handoff_to is set on request.state and checked after each _run_llm_loop call; supports up to max_handoffs transfers before stopping
- _TransferBody Pydantic model shared by HandoffAgent transfer handlers
Both types honour LlmAgent hooks (before_tool/after_tool) and the context_window_tokens of the currently active sub-agent.
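Generating the synthetic transfer_to_<name> tools can be sketched as a schema builder. The OpenAI-style function-tool shape below is an assumption about what FMAPI accepts, and the `reason` parameter is illustrative, not the PR's exact schema:

```python
def build_transfer_tools(agent_names, current=None):
    """Build synthetic transfer_to_<name> tool schemas for RouterAgent /
    HandoffAgent routing, skipping the currently active agent so an
    agent never transfers to itself (sketch only)."""
    tools = []
    for name in agent_names:
        if name == current:
            continue
        tools.append({
            "type": "function",
            "function": {
                "name": f"transfer_to_{name}",
                "description": f"Hand the conversation to the '{name}' agent.",
                "parameters": {
                    "type": "object",
                    "properties": {"reason": {"type": "string"}},
                    "required": [],
                },
            },
        })
    return tools

# The active 'legal' agent only sees a transfer tool for 'finance':
tools = build_transfer_tools(["legal", "finance"], current="legal")
```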
Adds a second page to the APX dev tooling namespace: /_apx/tools. Shows every registered tool exactly as the LLM sees it — dep-injected parameters stripped, FMAPI inputSchema rendered with syntax highlighting. An Invoke tab auto-generates a form from the schema and POSTs to the tool's /api/tools/<name> endpoint (or sub-agent /invocations), displaying the result with timing. Both /_apx/agent and /_apx/tools now have a nav bar linking between them. Also wires the route in get_root_routers() and updates the module docstring.
GET /_apx/probe?url=https://api.example.com makes a server-side GET and returns HTTP status, latency, content-type, server header, redirect count, and structured error details (ConnectError, Timeout, SSLError). Because the request runs from the server process, results reflect the deployed app's actual network path — useful for diagnosing egress restrictions on Databricks Apps before writing tool code. Also adds Probe to the nav bar on /_apx/agent and /_apx/tools.
- Add _build_apx_openapi_spec() — generates an OpenAPI 3.1 spec containing only tool endpoints with dep-stripped schemas (what the LLM sees, not what FastAPI sees with WorkspaceClient etc.)
- Serve the filtered spec at /_apx/openapi.json
- Replace the 350-line hand-rolled tools UI with a Scalar CDN embed (kepler theme) pointing at /_apx/openapi.json — sidebar, schema display, and try-it panel are now best-in-class without maintaining bespoke JS/CSS
- Fixed the APX nav bar overlaying Scalar, using position:fixed + a ResizeObserver
The Rust dev server nested its own control router at /_apx (for /health, /logs, /stop), which consumed all /_apx/* traffic and returned 404 for the Python-side APX dev UI routes. Add /_apx/agent, /_apx/tools, /_apx/probe, /_apx/openapi.json, and /invocations as direct routes in api_utils_router. Axum gives specific .route() registrations priority over .nest() prefix matches, so these paths now reach the Python backend correctly.
Add /health, /.well-known/agent.json, /mcp/sse, /mcp/messages/ to api_utils_router alongside /invocations and the /_apx/* dev UI routes added in the previous commit. All routes registered by the APX agent protocol at app root are now forwarded to the Python backend through the dev proxy.
Each assistant turn now shows a collapsible 'tool calls' row below the response. For each tool: function name, ok/error badge, latency, input args, and result (truncated at 800 chars). Backend: _run_llm_loop writes a tool_trace entry (name, args, result, ms) to request.state after each dispatch. _sse_generator emits a 'tool.trace' SSE event after the text stream completes. Frontend: buildTraceEl() renders the trace as a <details> block. Error results are highlighted red. Multiple errors show a count in the summary line.
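The per-dispatch trace entry described above can be sketched as a helper. The `state` object stands in for FastAPI's request.state, and the entry field names (name, args, result, ok, ms) follow this commit's description; the helper itself is illustrative:

```python
import time
from types import SimpleNamespace

def record_tool_trace(state, name, args, dispatch):
    """Run one tool dispatch and append a trace entry to
    state.tool_trace, truncating the result at 800 chars like the
    chat UI does (sketch of the backend behaviour described here)."""
    start = time.perf_counter()
    ok, result = True, ""
    try:
        result = dispatch(name, args)
    except Exception as exc:
        ok, result = False, str(exc)
    entry = {
        "name": name,
        "args": args,
        "result": str(result)[:800],
        "ok": ok,
        "ms": int((time.perf_counter() - start) * 1000),
    }
    if not hasattr(state, "tool_trace"):
        state.tool_trace = []
    state.tool_trace.append(entry)
    return entry

state = SimpleNamespace()
record_tool_trace(state, "get_billing_summary", {"id": 1}, lambda n, a: "ok")
record_tool_trace(state, "broken_tool", {}, lambda n, a: 1 / 0)
```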
UV_NATIVE_TLS=1 in the project .env was not being seen by the uv subprocess during preflight because .env is not loaded into the process environment at that point. Add resolve_native_tls(app_dir) which checks the shell env first, then reads the project .env as fallback. Pass the result as the native_tls flag to Uv::sync() and Uv::tool_run(), which append --native-tls when true. This fixes apx dev start on corporate networks with SSL inspection where uv-dynamic-versioning fetches were returning 503.
_dispatch_tool_call only forwarded Authorization when dispatching tool calls via ASGI to /api/tools/<fn>. This meant X-Forwarded-Access-Token (injected by the dev proxy) was lost, causing OBO auth to fail on any tool that needs Dependencies.UserClient (e.g. Lakebase queries). Also fixes a JS syntax error in the /_apx/agent chat UI where a Python \n in an f-string produced a literal newline inside a JS string literal. Co-authored-by: Isaac
Replace verbose prose with an ASCII architecture diagram and a feature table. Keeps the code example showing tool definition. Co-authored-by: Isaac
Summary
This RFC adds the agent protocol addon to APX — everything needed to build, run, and deploy Databricks-native AI agents with one command.
What's in this PR
Agent runtime (core/agent.py)

Agent type hierarchy
- LlmAgent / Agent — Agent = LlmAgent alias kept for backwards compat
- SequentialAgent — chains agents; each sees prior output
- ParallelAgent — runs all concurrently, merges results
- LoopAgent — loops until finish_loop() or max_iterations is hit
- RouterAgent — routes to one of several named sub-agents
- HandoffAgent — agents transfer control via transfer_to_* routes

LlmAgent features
- before_tool / after_tool hooks — sync or async callables, called around every tool dispatch
- input_guardrails / output_guardrails — lists of sync/async callables; return None (pass) or str (short-circuit with that message)
- context_window_tokens — budget cap; when exceeded, the middle of the history is summarized with one extra LLM call
- custom_outputs — set_custom_output(request, key, value) helper; surfaced in InvocationResponse.custom_outputs and as event: custom_outputs in SSE

Composition patterns match Google ADK and OpenAI Agents SDK feature-for-feature, plus Databricks-native additions:
- MCP server at /mcp/sse
- A2A discovery card at /.well-known/agent.json (live url + mcpEndpoint)
- WorkspaceClient / UserWorkspaceClient injectable as typed FastAPI deps in tool functions
- apx deploy — one command to production
- app_predict_fn MLflow eval bridge
- Dev UI at /_apx/agent and /_apx/tools

Dev UI namespace: /_apx/*

/_apx/ is the APX platform tooling namespace (underscore prefix signals "platform layer, not app layer").
- /_apx/agent — interactive chat UI
- /_apx/tools — tool inspector and live invocation form: shows inputSchema as the LLM sees it (dep-injected params stripped, FMAPI JSON, syntax highlighted), unlike /docs, which shows all FastAPI params; auto-generates a form from inputSchema, POSTs to /api/tools/<name> or sub-agent /invocations, and shows the result with timing
- A nav bar links between /_apx/agent and /_apx/tools

Protocol endpoints
Deployment (crates/cli/src/deploy.rs)
- apx deploy command — builds wheel, packages app, deploys to Databricks Apps, polls until running

README
Added "Why APX for agents?" section calling out the seven Databricks-native advantages vs Google ADK and OpenAI Agents SDK.
End-to-end validation: Energy billing agent
Validated the full agent flow against a common customer pattern — an energy billing Q&A agent with Lakebase-backed tools:
- Tools: get_customer_profile, query_ami_readings, get_billing_summary, get_rate_schedule, compare_months
- Dependencies.UserClient (OBO auth) → Lakebase Provisioned via generate_database_credential
- Model: databricks-claude-sonnet-4-6 via FMAPI
- /_apx/agent chat UI — confirmed working end-to-end

Bugs found and fixed during validation:
- _dispatch_tool_call only passed Authorization to internal ASGI calls, dropping X-Forwarded-Access-Token. Tools using Dependencies.UserClient failed with ValueError: OBO token is not provided. Fixed by forwarding OBO headers.
- /_apx/agent — a Python \n in an f-string produced a literal newline inside a JS string, breaking the chat UI entirely. Fixed by escaping to \\n.
- /_apx/* namespace conflict (prior commit) — /_apx routes collided with app routes; fixed by separating the router merge.
- /invocations not proxied (prior commit) — the dev proxy only injected OBO tokens for /api/*, not root-level agent routes; fixed by adding api_utils_router.

Test plan
- Agent(tools=[...]) registers routes, /_apx/agent loads
- /_apx/agent chat UI: send message → streaming SSE response renders correctly
- /_apx/agent chat UI: tool calls display in trace panel with args/result/timing
- /_apx/tools shows correct FMAPI schemas (dep-stripped); Invoke tab POSTs and returns result
- Dependencies.UserClient works in tools called via /invocations
- X-Forwarded-Access-Token forwarded for both /api/* and root-level routes (/invocations, /.well-known/agent.json)
- apx init --addon agent scaffolds correctly (existing integration test)
- LoopAgent loops until finish_loop() is called
- RouterAgent routes to the correct sub-agent without registering transfer tools in MCP
- HandoffAgent transfers between agents via transfer_to_*
- custom_outputs appear in InvocationResponse and SSE stream
- context_window_tokens triggers summarization at budget
- apx deploy deploys and polls to RUNNING

This pull request was AI-assisted.