feat(appkit): supervisor API adapter for agents plugin by hubertzub-db · Pull Request #345 · databricks/appkit

hubertzub-db · 2026-05-05T13:30:54Z

Adds a fourth AgentAdapter that targets the Databricks AI Gateway Responses API, so AppKit apps can host server-side-orchestrated agents (Genie, Knowledge Assistants, UC functions, custom apps, UC connections) without managing tool execution locally.

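// Assumption: createAgent is exported from the main @databricks/appkit barrel.
import { createAgent } from "@databricks/appkit";
import { fromSupervisorApi, supervisorTools } from "@databricks/appkit/agents/supervisor-api";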

const supervisor = createAgent({
  instructions: "You are an assistant powered by the Databricks Supervisor API.",
  model: await fromSupervisorApi({
    model: "databricks-claude-sonnet-4-5",
    tools: [
      supervisorTools.genieSpace("01ABCDEF12345678", "NYC taxi trip records and zones"),
      supervisorTools.ucFunction("main.default.add", "Adds two integers and returns the sum."),
    ],
  }),
});

What's new

fromSupervisorApi({...}) — factory that returns an AgentAdapter. Uses the SDK's default credential chain (env, profiles, OAuth, OIDC) just like DatabricksAdapter.fromModelServing.
supervisorTools.* — concise factories for the five hosted tool types (genieSpace, ucFunction, knowledgeAssistant, app, ucConnection). description is required on every one — it's both validation and the routing hint the LLM uses to choose between tools.
@databricks/appkit/agents/supervisor-api — new subpath export so consumers pick only the adapter they want; mirrors the existing ./agents/{databricks,vercel-ai,langchain} pattern.
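
The required `description` can be pictured as a guard inside each factory. This is a minimal sketch only: `HostedToolSpec` and `makeHostedTool` are hypothetical names for illustration, and the adapter's real types and error messages may differ.

```typescript
// Hypothetical shape for a hosted tool entry; not the adapter's actual type.
type HostedToolKind =
  | "genieSpace"
  | "ucFunction"
  | "knowledgeAssistant"
  | "app"
  | "ucConnection";

interface HostedToolSpec {
  kind: HostedToolKind;
  target: string;
  description: string;
}

// Rejects blank descriptions up front, because the description doubles as
// the routing hint the model uses to choose between tools.
function makeHostedTool(
  kind: HostedToolKind,
  target: string,
  description: string,
): HostedToolSpec {
  const trimmed = description.trim();
  if (trimmed.length === 0) {
    throw new Error(`${kind}("${target}") requires a non-empty description`);
  }
  return { kind, target, description: trimmed };
}

const genie = makeHostedTool(
  "genieSpace",
  "01ABCDEF12345678",
  "NYC taxi trip records and zones",
);
```

Failing at construction time keeps a missing description from surfacing later as poor tool routing.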

Reference app

Demo supervisor agent in apps/dev-playground/server/index.ts. Tools are commented out by default — uncomment any supervisorTools.* entry to give the model real powers.

Test plan

Manual tests
39 new tests (13 SSE reader + 26 adapter / factory).
Full appkit vitest suite passes; typecheck clean.

…agents

The main product layer. Turns an AppKit app into an AI-agent host with markdown-driven agent discovery, code-defined agents, sub-agents, and a standalone run-without-HTTP executor.

Agent runtime files land in core/agent/ from day one:
- core/agent/create-agent.ts: createAgent() definition factory
- core/agent/run-agent.ts: standalone adapter loop (no HTTP)
- core/agent/load-agents.ts: markdown agent discovery
- core/agent/system-prompt.ts: base system prompt + composition
- core/agent/types.ts: updated with AgentDefinition, AgentsPluginConfig, RegisteredAgent, etc.

HTTP-facing concerns stay in plugins/agents/: agents.ts, thread-store.ts, tool-approval-gate.ts, event-channel.ts, event-translator.ts, schemas.ts, defaults.ts, manifest.json

Tool-agnostic guidelines instead of SQL/files-specific defaults; accept full PromptContext in buildBaseSystemPrompt for parity with custom callbacks. Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Register DATABRICKS_SERVING_ENDPOINT_NAME as optional CAN_QUERY so apps using Databricks-hosted agent models get resource wiring; optional when agents use only external adapters. Sync template/appkit.plugins.json. Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Align optional serving resource with `DatabricksAdapter.fromModelServing()`, which reads `DATABRICKS_AGENT_ENDPOINT` — not `DATABRICKS_SERVING_ENDPOINT_NAME` (serving plugin). Sync template. Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

BREAKING CHANGE: top-level config/agents/*.md is no longer loaded. Use <agentId>/agent.md. The skills directory name is reserved and skipped. Orphan top-level .md files error at load; subdirs without agent.md error. Export agentIdFromMarkdownPath for path-based id resolution.
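
The folder-layout rules above can be sketched as a small path-to-id resolver. This is an illustration under assumptions, not the exported `agentIdFromMarkdownPath`: the name `agentIdFromRelativePath`, its signature, and the error wording are all hypothetical.

```typescript
// Sketch of the <agentId>/agent.md rule; not the real loader implementation.
const RESERVED_DIRS = new Set(["skills"]);

// Takes a path relative to the agents directory, using "/" separators.
// Returns the agent id, or null when the path sits in a reserved directory.
function agentIdFromRelativePath(relPath: string): string | null {
  const parts = relPath.split("/").filter((part) => part.length > 0);
  const first = parts[0];
  const second = parts[1];
  if (first === undefined) {
    throw new Error("Empty agent path");
  }
  // Orphan top-level markdown files are an error under the new layout.
  if (parts.length === 1 && first.endsWith(".md")) {
    throw new Error(`Orphan top-level file "${relPath}": use <agentId>/agent.md`);
  }
  // The skills directory name is reserved and skipped.
  if (RESERVED_DIRS.has(first)) {
    return null;
  }
  if (parts.length === 2 && second === "agent.md") {
    return first;
  }
  throw new Error(`Unrecognized agent path "${relPath}"`);
}
```

For example, `support/agent.md` resolves to `support`, while a top-level `assistant.md` throws.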

The MCP transport client and host policy aren't agents-specific; they are HTTP + JSON-RPC transport with URL/DNS allowlisting. Move them under packages/appkit/src/connectors/mcp/ so they sit alongside the other transport-layer modules (serving, genie, sql-warehouse, lakebase, …) and stop being reachable only through the agents plugin.

- Move mcp-client.ts -> connectors/mcp/client.ts
- Move mcp-host-policy.ts -> connectors/mcp/host-policy.ts
- Move McpEndpointConfig type -> connectors/mcp/types.ts
- Add connectors/mcp/index.ts barrel; re-export from connectors/index.ts
- Move mcp-client / mcp-host-policy tests to connectors/mcp/tests/
- Agents plugin keeps hosted-tools.ts (HostedTool sugar + resolve) and imports connector types from ../../connectors/mcp.
- tools/ barrel no longer re-exports AppKitMcpClient (never was public).

No behaviour change. All existing tests pass against the new paths.

…dispatchToolCall

Three small helpers pulled out of the AgentsPlugin streaming path to cut duplication and shrink the two large methods.

- normalize-result.ts: void->"", JSON-stringify, 50K truncation with a human-readable marker. Unit-testable (previously covered only via the HTTP path).
- consume-adapter-stream.ts: the 'message_delta' + 'message' accumulation loop shared between _streamAgent and runSubAgent. Accepts an optional signal and per-event side-effect callback (for SSE translation).
- tool-dispatch.ts: one place that fans out toolkit/function/mcp/subagent entries. 'never'-typed default forces exhaustiveness: adding a fifth source is now a compile error at every call site.

_streamAgent: executeTool closure shrinks from ~60 lines of dispatch + normalize to a single dispatchToolCall + normalizeToolResult call. Stream consumption collapses to consumeAdapterStream.

runSubAgent: childExecute shrinks from ~30 lines of if/else dispatch to one dispatchToolCall call. Adapter loop collapses to consumeAdapterStream.

Behaviour change (minor): childExecute previously silently fell through to 'Unsupported sub-agent tool source' when mcpClient or PluginContext was missing; now it throws the same specific error as the main stream. Matches the main-path behaviour.

Tests: 15 new unit tests for normalizeToolResult + consumeAdapterStream. dispatchToolCall is exercised transitively through the full agent suite (288 existing tests still pass, 303 total on this branch).
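
As a rough illustration of the normalization rule, here is a sketch of what such a helper might look like. It is not the code in normalize-result.ts: the function name, the marker text, and the reading of "50K" as 50,000 characters (rather than bytes) are assumptions.

```typescript
// Assumption: "50K" means 50,000 characters; the real unit may be bytes.
const MAX_RESULT_CHARS = 50_000;

// Turns any tool result into a bounded string.
function normalizeToolResultSketch(result: unknown): string {
  // void results become the empty string.
  if (result === undefined) return "";
  // Strings pass through unchanged (an assumption); everything else is JSON-stringified.
  const text =
    typeof result === "string"
      ? result
      : (JSON.stringify(result) as string | undefined) ?? String(result);
  if (text.length <= MAX_RESULT_CHARS) return text;
  // Truncate and append a human-readable marker (hypothetical wording).
  const omitted = text.length - MAX_RESULT_CHARS;
  return `${text.slice(0, MAX_RESULT_CHARS)}\n[truncated ${omitted} characters]`;
}
```

Keeping this as a pure function is what makes it unit-testable without going through the HTTP path.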

… → def

The `annotations` field (notably `destructive: true`) was silently dropped as tools flowed from `tool({...})` into the resolved `AgentToolDefinition`, so user-defined destructive tools never triggered the approval gate.

- `ToolConfig` now accepts `annotations?: ToolAnnotations`.
- `tool()` forwards it to the returned `FunctionTool`.
- `FunctionTool` exposes `annotations` and `functionToolToDefinition` preserves it on the definition it builds.
- `AgentsPlugin` reads the flag via `isDestructiveToolEntry()` (falls back to `functionTool.annotations` so a future divergence between def and function cannot re-introduce the bug) and emits the merged annotations via `combinedToolAnnotations()` on the `approval_pending` SSE payload.

Covered by `tests/tool-approval-gate.test.ts` and `tests/function-tool.test.ts`.

ToolAnnotations.destructive is binary and has started to mislead: "save_view" captures a screenshot and creates a new file, which is nothing like deleting a dashboard, yet both trip the same red "destructive" approval card.

This adds a semantic `effect` enum with four tiers (`read`, `write`, `update`, `destructive`) so tool authors can tell the UI what blast radius they actually have. The approval gate fires for any mutating effect (`write`/`update`/`destructive`) and continues to honour the legacy `destructive: true` flag so existing tools keep their current red treatment without migration. Callers consuming `annotations` over the wire (MCP clients, approval UIs) can now differentiate; the playground will ship a tiered approval card as a follow-up.
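
The gate rule described here reduces to a small predicate. The sketch below is illustrative: `ToolAnnotationsSketch` and `requiresApproval` are hypothetical names, and treating an unannotated tool as not requiring approval is an assumption.

```typescript
type ToolEffect = "read" | "write" | "update" | "destructive";

// Hypothetical subset of the annotations shape.
interface ToolAnnotationsSketch {
  effect?: ToolEffect;
  destructive?: boolean; // legacy binary flag
}

// True when the approval gate should fire for a tool.
function requiresApproval(annotations?: ToolAnnotationsSketch): boolean {
  // Assumption: tools without annotations do not trigger the gate.
  if (annotations === undefined) return false;
  // The legacy flag keeps working without migration.
  if (annotations.destructive === true) return true;
  // Any mutating effect (write, update, destructive) triggers the gate.
  return annotations.effect !== undefined && annotations.effect !== "read";
}
```

A tool like `save_view` could then declare `effect: "write"` and still be gated, while leaving a UI free to render it differently from a `destructive` tool.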

…esolver

DX centerpiece. Introduces the symbol-marker pattern that collapses plugin tool references in code-defined agents from a three-touch dance to a single line, and extracts the shared resolver that the agents plugin, auto-inherit, and standalone runAgent all now go through.

`packages/appkit/src/plugins/agents/from-plugin.ts`. Returns a spread-friendly `{ [Symbol()]: FromPluginMarker }` record. The symbol key is freshly generated per call, so multiple spreads of the same plugin coexist safely. The marker's brand is a globally-interned `Symbol.for("@databricks/appkit.fromPluginMarker")`, stable across module boundaries.

`packages/appkit/src/plugins/agents/toolkit-resolver.ts`. Single source of truth for "turn a ToolProvider into a keyed record of `ToolkitEntry` markers". Prefers `provider.toolkit(opts)` when available (core plugins implement it), falls back to walking `getAgentTools()` and synthesizing namespaced keys (`${pluginName}.${localName}`) for third-party providers, honoring `only` / `except` / `rename` / `prefix` the same way. Used by three call sites, previously all copy-pasted:

1. `AgentsPlugin.buildToolIndex`: fromPlugin marker resolution pass
2. `AgentsPlugin.applyAutoInherit`: markdown auto-inherit path
3. `runAgent`: standalone-mode plugin tool dispatch

Before the existing string-key iteration, `buildToolIndex` now walks `Object.getOwnPropertySymbols(def.tools)`. For each `FromPluginMarker`, it looks up the plugin by name in `PluginContext.getToolProviders()`, calls `resolveToolkitFromProvider`, and merges the resulting entries into the per-agent index. Missing plugins throw at setup time with a clear `Available: ...` listing, so wiring errors surface on boot, not mid-request. `hasExplicitTools` now counts symbol keys too, so a `tools: { ...fromPlugin(x) }` record correctly disables auto-inherit on code-defined agents.

- `AgentTools` type: `{ [key: string]: AgentTool } & { [key: symbol]: FromPluginMarker }`. Preserves string-key autocomplete while accepting marker spreads under strict TS.
- `AgentDefinition.tools` switched to `AgentTools`.

`packages/appkit/src/core/run-agent.ts`. When an agent def contains `fromPlugin` markers, the caller passes plugins via `RunAgentInput.plugins`. A local provider cache constructs each plugin and dispatches tool calls via `provider.executeAgentTool()`. Runs as service principal (no OBO, since there's no HTTP request). If a def contains markers but `plugins` is absent, throws with guidance.

`fromPlugin`, `FromPluginMarker`, `isFromPluginMarker`, `AgentTools` added to the main barrel.

- 14 new tests: marker shape, symbol uniqueness, type guard, factory-without-pluginName error, fromPlugin marker resolution in AgentsPlugin, fallback to getAgentTools for providers without .toolkit(), symbol-only tools disables auto-inherit, runAgent standalone marker resolution via `plugins` arg, guidance error when missing.
- Full appkit vitest suite: 1311 tests passing.
- Typecheck clean.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
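
To make the symbol-marker pattern concrete, here is a minimal standalone sketch. It is not the code in from-plugin.ts: `fromPluginSketch`, `MarkerSketch`, `isMarker`, and the `example.` brand key are hypothetical stand-ins chosen so the example does not collide with the real interned symbol.

```typescript
// Globally interned brand, stable across module copies (example key, not AppKit's).
const MARKER_BRAND = Symbol.for("example.fromPluginMarker");

interface MarkerSketch {
  brand: symbol;
  pluginName: string;
}

// Returns a spread-friendly record keyed by a fresh symbol per call.
function fromPluginSketch(pluginName: string): Record<symbol, MarkerSketch> {
  return {
    [Symbol(`fromPlugin:${pluginName}`)]: { brand: MARKER_BRAND, pluginName },
  };
}

function isMarker(value: unknown): value is MarkerSketch {
  return (
    typeof value === "object" &&
    value !== null &&
    (value as MarkerSketch).brand === MARKER_BRAND
  );
}

// Two spreads of the same plugin coexist because each key is a unique symbol.
const tools = {
  ...fromPluginSketch("analytics"),
  ...fromPluginSketch("analytics"),
  get_weather: "inline tool placeholder",
};

// String keys and symbol keys are enumerated separately.
const stringKeys = Object.keys(tools);
const markers = Object.getOwnPropertySymbols(tools)
  .map((key) => (tools as Record<symbol, unknown>)[key])
  .filter(isMarker);
```

Because `Object.keys` ignores symbol keys, existing string-key iteration is unaffected; a resolver has to walk `Object.getOwnPropertySymbols` explicitly to find the markers.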

runAgent()'s adapter-consumption loop is now the same consumeAdapterStream helper introduced in the agents-plugin layer. One loop covers all three execution paths: HTTP streaming (_streamAgent), sub-agents (runSubAgent), and standalone runAgent. The message_delta + message accumulation rule (with its LangChain on_chain_end quirk) lives in exactly one place.

…on A rewrite normalize-result, consume-adapter-stream, tool-dispatch were extracted to core/agent/ but agents.ts still imported them from plugins/agents/. Update the import paths to match the final file locations.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Flips the layering: agent types, helpers, and the standalone runner now live in core/agent/ instead of plugins/agents/. The HTTP-facing agents() plugin still owns its routes/streaming/threads but no longer re-exports framework primitives that peer plugins depend on.

Moved (with git mv to preserve history):
- plugins/agents/{types,from-plugin,build-toolkit,toolkit-resolver,consume-adapter-stream,normalize-result,tool-dispatch,system-prompt,load-agents}.ts -> core/agent/
- plugins/agents/tools/{tool,define-tool,function-tool,hosted-tools,sql-policy,json-schema,index}.ts -> core/agent/tools/
- core/{run-agent,create-agent-def}.ts -> core/agent/{run-agent,create-agent}.ts
- 14 corresponding test files -> core/agent/tests/

Stayed in plugins/agents/ (HTTP/route concerns):
- agents.ts, event-channel.ts, event-translator.ts, tool-approval-gate.ts, thread-store.ts, schemas.ts, defaults.ts, manifest.json, index.ts

Updated imports across analytics, files, genie, lakebase to source from core/agent/ directly. plugins/agents/index.ts stays as a back-compat barrel that re-exports the moved primitives, so the public package surface (@databricks/appkit) is byte-identical.

Verified: tsc --noEmit clean, 1581/1581 appkit tests pass.

Collapses the two parallel agent loops (`_streamAgent` in the plugin and `runAgent` in core) onto a single AgentRunner that drives the adapter to completion and surfaces events. Tool dispatch policy moves behind a ToolExecutor strategy injected by the caller.

New:
- core/agent/runner.ts (AgentRunner + ToolExecutor interface, ~65 lines)
- core/agent/standalone-tool-executor.ts (in-process dispatch, ~78 lines)
- plugins/agents/http-tool-executor.ts (HTTP-path executor: budget + approval gate + OBO dispatch + sub-agent recursion, ~243 lines)
- plugins/agents/tests/http-tool-executor.test.ts (8 focused tests including sub-agent approval forwarding, which was effectively untestable pre-refactor because the logic lived inside a private nested closure)

Refactored:
- core/agent/run-agent.ts: 348 -> 296 lines; the ~120-line executeTool closure is now a StandaloneToolExecutor + AgentRunner instantiation (~25 lines).
- plugins/agents/agents.ts: 1362 -> 1262 lines; `_streamAgent` shrinks from 233 lines (with a 95-line nested executeTool closure) to ~150 lines that build an HttpToolExecutor + AgentRunner.

Behaviour preserved:
- Top-level budget enforcement (sub-agents pass budget=null, mirroring the original closure that only counted at the outer executeTool)
- Approval gate fires on `effect: write|update|destructive` and the legacy `destructive: true` flag
- Sub-agents reuse the parent's checkApproval + outboundEvents + translator + abortController so destructive sub-agent tools surface approval_pending on the parent's SSE stream
- Sub-agent event forwarding skips `metadata` to avoid clobbering the parent thread state, matching the prior closure exactly

Verified: tsc --noEmit clean, knip clean, 1589/1589 appkit tests pass.
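
The runner-plus-strategy split can be sketched roughly as follows. This is a simplified illustration, not the contents of core/agent/runner.ts: the event types, the `Adapter` signature, and the `RunnerSketch` and `BudgetedExecutor` classes are assumptions made for the example.

```typescript
// Events the sketch adapter can emit; the real event set is larger.
type AgentEvent =
  | { type: "message_delta"; text: string }
  | { type: "tool_result"; name: string; result: string };

// Strategy interface: the caller decides how tools are dispatched.
interface ToolExecutor {
  execute(name: string, args: unknown): Promise<string>;
}

// Hypothetical adapter signature: given an executor, produce a stream of events.
type Adapter = (executor: ToolExecutor) => AsyncIterable<AgentEvent>;

class RunnerSketch {
  private readonly adapter: Adapter;
  private readonly executor: ToolExecutor;

  constructor(adapter: Adapter, executor: ToolExecutor) {
    this.adapter = adapter;
    this.executor = executor;
  }

  // Drives the adapter to completion, surfacing each event; returns the final text.
  async run(onEvent?: (event: AgentEvent) => void): Promise<string> {
    let text = "";
    for await (const event of this.adapter(this.executor)) {
      onEvent?.(event);
      if (event.type === "message_delta") text += event.text;
    }
    return text;
  }
}

// One possible policy: wrap another executor with a tool-call budget.
class BudgetedExecutor implements ToolExecutor {
  private calls = 0;
  private readonly inner: ToolExecutor;
  private readonly maxCalls: number;

  constructor(inner: ToolExecutor, maxCalls: number) {
    this.inner = inner;
    this.maxCalls = maxCalls;
  }

  async execute(name: string, args: unknown): Promise<string> {
    this.calls += 1;
    if (this.calls > this.maxCalls) {
      throw new Error(`Tool call budget of ${this.maxCalls} exceeded`);
    }
    return this.inner.execute(name, args);
  }
}

// Demo wiring: an in-process executor and an adapter that makes one tool call.
const inProcess: ToolExecutor = {
  execute: async (name, args) => `${name}(${JSON.stringify(args)})`,
};

const demoAdapter: Adapter = async function* (
  executor,
): AsyncGenerator<AgentEvent> {
  const result = await executor.execute("add", { a: 1, b: 2 });
  yield { type: "tool_result", name: "add", result };
  yield { type: "message_delta", text: "The sum " };
  yield { type: "message_delta", text: "is 3." };
};

const demoRunner = new RunnerSketch(demoAdapter, new BudgetedExecutor(inProcess, 5));
```

The same runner could be handed a different executor for the HTTP path without touching the loop, which is the point of injecting the dispatch policy.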

Extracts `composePromptForAgent` + `normalizeAutoInherit` into plugins/agents/prompt.ts and `printRegistry` into plugins/agents/registry-printer.ts. These were free-function helpers at the bottom of agents.ts with no dependency on plugin state — pure candidates for extraction. Also opens the door for the bigger split (route handlers and `_streamAgent`/`runSubAgent` extracted into routes/*.ts and tool-execution.ts) by relaxing the access modifier on plugin members those modules will need (`agents`, `activeStreams`, `mcpClient`, `threadStore`, `approvalGate`, `resolvedApprovalPolicy`, `resolvedLimits`, `countUserStreams`). All marked `@internal` to keep the public surface unchanged. Note: the full split into `routes/` and `tool-execution.ts` proposed in plans/agent-architecture-followup.md is deferred. Route handlers and `_streamAgent`/`runSubAgent` remain as methods on AgentsPlugin because they have heavy plugin-state coupling and cross-call patterns (`runSubAgent` recurses, `_handleChat` calls `_streamAgent`, etc.) that don't translate cleanly to free functions without a larger refactor. Tracked as a follow-up. agents.ts: 1262 -> 1212 lines (-50). The plan's aspirational target of <=280 isn't met because the per-route extraction pass is deferred, but the helper extraction + access-modifier relaxation lays the groundwork. Verified: tsc --noEmit clean, 1589/1589 appkit tests pass.

…te manifest) Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

…template

Final layer of the agents feature stack. Everything needed to exercise, demonstrate, and learn the feature.

`apps/agent-app/`: a standalone app purpose-built around the agents feature. Ships with:

- `server.ts`: full example of code-defined agents via `fromPlugin`:

  ```ts
  const support = createAgent({
    instructions: "…",
    tools: {
      ...fromPlugin(analytics),
      ...fromPlugin(files),
      get_weather,
      "mcp.vector-search": mcpServer("vector-search", "https://…"),
    },
  });

  await createApp({
    plugins: [server({ port }), analytics(), files(), agents({ agents: { support } })],
  });
  ```

- `config/agents/assistant.md`: markdown-driven agent alongside the code-defined one, showing the asymmetric auto-inherit default.
- Vite + React 19 + TailwindCSS frontend with a chat UI.
- Databricks deployment config (`databricks.yml`, `app.yaml`) and deploy scripts.

`apps/dev-playground/client/src/routes/agent.route.tsx`: chat UI with inline autocomplete (hits the `autocomplete` markdown agent) and a full threaded conversation panel (hits the default agent).

`apps/dev-playground/server/index.ts`: adds a code-defined `helper` agent using `fromPlugin(analytics)` alongside the markdown-driven `autocomplete` agent in `config/agents/`. Exercises the mixed-style setup (markdown + code) against the same plugin list.

`apps/dev-playground/config/agents/*.md`: both agents defined with valid YAML frontmatter.

`docs/docs/plugins/agents.md`: progressive five-level guide:

1. Drop a markdown file → it just works.
2. Scope tools via `toolkits:` / `tools:` frontmatter.
3. Code-defined agents with `fromPlugin()`.
4. Sub-agents.
5. Standalone `runAgent()` (no `createApp` or HTTP).

Plus a configuration reference, runtime API reference, and frontmatter schema table.

`docs/docs/api/appkit/`: regenerated typedoc for the new public surface (fromPlugin, runAgent, AgentDefinition, AgentsPluginConfig, ToolkitEntry, ToolkitOptions, all adapter types, and the agents plugin factory).

`template/appkit.plugins.json`: adds the `agent` plugin entry so `npx @databricks/appkit init --features agent` scaffolds the plugin correctly.

- Full appkit vitest suite: 1311 tests passing
- Typecheck clean across all 8 workspace projects
- `pnpm docs:build` clean (no broken links)
- `pnpm --filter=@databricks/appkit build:package` clean, publint clean

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Documents the new `mcp` configuration block and the rules it enforces: same-origin-only by default, explicit `trustedHosts` for external MCP servers, plaintext `http://` refused outside localhost-in-dev, and DNS-level blocking of private / link-local IP ranges (covers cloud metadata services). See PR databricks#302 for the policy implementation and PR databricks#304 for the `AgentsPluginConfig.mcp` wiring.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

- `docs/docs/plugins/agents.md`: new "SQL agent tools" subsection covering `analytics.query` readOnly enforcement, `lakebase.query` opt-in via `exposeAsAgentTool`, and the approval flow. New "Human-in-the-loop approval for destructive tools" subsection documents the config, SSE event shape, and `POST /chat/approve` contract.
- `apps/agent-app`: approval-card component rendered inline in the chat stream whenever an `appkit.approval_pending` event arrives. Destructive badge + Approve/Deny buttons POST to `/api/agent/approve` with the carried `streamId`/`approvalId`.
- `apps/dev-playground/client`: matching approval-card on the agent route, using the existing appkit-ui `Button` component and Tailwind utility classes.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Updates `docs/docs/plugins/agents.md` to document the new two-key auto-inherit model introduced in PR databricks#302 (per-tool `autoInheritable` flag) and PR databricks#304 (safe-by-default `autoInheritTools: { file: false, code: false }`). Adds an "Auto-inherit posture" subsection explaining that the developer must opt into `autoInheritTools` AND the plugin author must mark each tool `autoInheritable: true` for a tool to spread without explicit wiring. Includes a table documenting the `autoInheritable` marking on each core plugin tool, plus an example of the setup-time audit log so operators can see exactly what's inherited vs. skipped.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

- **Reference app no longer ships hardcoded dogfood URLs.** The three `https://e2-dogfood.staging.cloud.databricks.com/...` and `https://mario-mcp-hello-*.staging.aws.databricksapps.com/...` MCP URLs in `apps/agent-app/server.ts` are replaced with optional env-driven `VECTOR_SEARCH_MCP_URL` / `CUSTOM_MCP_URL` config. When set, their hostnames are auto-added to `agents({ mcp: { trustedHosts } })`. `.env.example` uses placeholder values the reader can replace instead of another team's workspace.
- **`appkit.agent` → `appkit.agents` in the reference app.** The prior `appkit.agent as { list, getDefault }` cast papered over the plugin-name mismatch fixed in PR databricks#304. The runtime key now matches the docs, the manifest, and the factory name; the cast is gone.
- **Auto-inherit opt-in added to the reference config.** Since the defaults flipped to `{ file: false, code: false }` (PR databricks#304, S-3), the reference now explicitly enables `autoInheritTools: { file: true }` so the markdown agents that ship alongside the code-defined one still pick up the analytics / files read-only tools. This is the pattern a real deployment should follow: opt in deliberately.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

- `apps/dev-playground/config/agents/autocomplete.md` sets `ephemeral: true`. Each debounced autocomplete keystroke no longer leaves an orphan thread in `InMemoryThreadStore`; the server now deletes the thread in the stream's `finally` (PR databricks#304). Closes R1 from the MVP re-review.
- `docs/docs/plugins/agents.md` documents the new `ephemeral` frontmatter key alongside the other AgentDefinition knobs.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Documents the MVP resource caps landed in PR databricks#304: the static request-body caps (enforced by the Zod schemas) and the three configurable runtime limits (`maxConcurrentStreamsPerUser`, `maxToolCalls`, `maxSubAgentDepth`). Includes the config-block shape in the main reference and a new "Resource limits" subsection under the Configuration section explaining the intent and per-user semantics of each cap.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
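
The two-key auto-inherit model described in these commits can be summarized as a conjunction of two opt-ins. The sketch below is an illustration only: `PluginToolSketch`, `inheritedToolNames`, and the example tool names are hypothetical, and the real resolver works on richer tool entries.

```typescript
// Developer-side key: which agent origins may auto-inherit at all.
interface AutoInheritConfig {
  file: boolean; // markdown-defined agents
  code: boolean; // code-defined agents
}

// Plugin-author-side key: a per-tool marking.
interface PluginToolSketch {
  name: string;
  autoInheritable?: boolean;
}

// Safe-by-default posture, per the commit message.
const DEFAULT_AUTO_INHERIT: AutoInheritConfig = { file: false, code: false };

// A tool spreads without explicit wiring only when BOTH keys are turned.
function inheritedToolNames(
  origin: "file" | "code",
  config: AutoInheritConfig,
  tools: PluginToolSketch[],
): string[] {
  if (!config[origin]) return [];
  return tools
    .filter((tool) => tool.autoInheritable === true)
    .map((tool) => tool.name);
}

// Hypothetical tool list for illustration.
const exampleTools: PluginToolSketch[] = [
  { name: "analytics.query", autoInheritable: true },
  { name: "files.delete" },
];
```

With the defaults nothing is inherited; opting in with `{ file: true, code: false }` lets a markdown agent pick up only the tools the plugin author marked.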

The agents plugin's manifest `name` is `agents` (plural), so routes mount at `/api/agents/*` and its client config is keyed as `agents`, but three call sites still referenced the old singular `agent`:

- apps/agent-app/src/App.tsx: /api/agent/{info,chat,approve} returned an Express 404 HTML page, which the client then tried to JSON.parse, producing "Unexpected token '<', <!DOCTYPE ...". Swap to /api/agents/*.
- apps/dev-playground/client/src/routes/agent.route.tsx: same three paths, plus getPluginClientConfig("agent") returned {} so hasAutocomplete was false and the autocomplete hook short-circuited before ever firing a request. Swap the lookup key to "agents".
- template/appkit.plugins.json: the scaffolded plugin descriptor still used the singular name/key, which would have broken fresh apps the same way. Align with the plugin's real manifest name.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
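
The confusing "Unexpected token '<'" symptom comes from parsing an HTML error page as JSON. As a side illustration (not part of this fix), a client could check the status and content type before parsing; `RawResponse` and `parseJsonBody` below are hypothetical names.

```typescript
// Minimal stand-in for the parts of an HTTP response the check needs.
interface RawResponse {
  status: number;
  contentType: string | null;
  body: string;
}

// Parses JSON only after confirming the response actually is JSON.
function parseJsonBody(res: RawResponse): unknown {
  if (res.status < 200 || res.status >= 300) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  if (res.contentType === null || !res.contentType.includes("application/json")) {
    throw new Error(`Expected JSON but got ${res.contentType ?? "no content-type"}`);
  }
  return JSON.parse(res.body);
}
```

A 404 HTML page then fails with a message naming the status code instead of a parser error.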

Move reference apps to config/agents/<id>/agent.md; document migration and reserved skills folder; align generated API snippets and CHANGELOG.

typedoc picked up JSDoc changes from agent/v2/4-agents-plugin: - New public export `agentIdFromMarkdownPath` (helper for path-based id resolution used by `loadAgentFromFile`). - `loadAgentsFromDir` description/body now reflects the folder layout (`<id>/agent.md`, orphan `*.md` rejected, reserved `skills/` dir). Generated by docusaurus-plugin-typedoc during pnpm --filter=docs build.

… retire agent-app

Stage 0 of the smart-dashboard-demo plan. Ports the prototype Smart Dashboard (NYC Taxi analytics) from the p3ju worktree into dev-playground as a new route, migrates its markdown agents to the folder layout, and deletes apps/agent-app, which is superseded by this demo as the integration test of the entire v2 agents stack.

Client:
- New route at client/src/routes/smart-dashboard.route.tsx with its own subdirectory for components/ and hooks/.
- Ported 8 components (ActiveFilters, AgentSidebar, AnomalyCard, FareChart, InsightCard, KPICards, QuerySection, TripChart) and 4 hooks (useActionDispatcher, useAgentStream, useChartColors, useDashboardData) as-is. Relative imports preserved.
- Nav link added in __root.tsx.
- TanStack routeTree.gen.ts auto-regenerated.

Server:
- Ports apply_filter and highlight_period inline tools.
- Adds sql_analyst (code-defined: fromPlugin(analytics)) and dashboard_pilot (code-defined: apply_filter + highlight_period) per the plan's Q2 = option B decision.
- Adds query markdown dispatcher in config/agents/query/agent.md delegating to both specialists via the agents: frontmatter.
- Ports insights and anomaly ephemeral markdown agents.

Config:
- Ports 4 SQL queries into config/queries/dashboard_*.sql.
- Note: shared/appkit-types/analytics.d.ts not regenerated in this commit; useAnalyticsQuery("dashboard_*", ...) uses explicit as casts and works at runtime. Regenerate with 'npx @databricks/appkit generate-types' locally when convenient.

Cleanup:
- apps/agent-app/ removed in full. No references outside pnpm-lock.yaml (regenerated).
- plans/smart-dashboard-demo.md added with the full staged plan.

Verification:
- pnpm --filter=dev-playground client typecheck: clean.
- pnpm --filter=dev-playground client vite build: clean.
- Server typecheck: same pre-existing errors as main (files plugin union type, telemetry CacheManager, playwright DOM lib); no new regressions.

Next stages (1-6, per the plan): dispatcher integration verified, save_view + approval card, dashboard-context injection + focus_chart, Stream Inspector, polish, demo script.

Stage 0 ported the dashboard shell verbatim from the prototype; this commit layers every v2-stack feature on top, moves the feature dir out of routes/ (TanStack was flagging files as stray routes), rewrites the agent -> UI action pipeline for correctness, and adds discoverability for the HITL flow.

Server (apps/dev-playground/server/index.ts)
- Split the polymorphic apply_filter into four narrower tools: filter_by_date_range, filter_by_pickup_zip, filter_by_fare, clear_filters. Each has exactly one client-side effect; removes the whole class of 'agent said it worked but nothing moved' bugs.
- Add clear_highlights, focus_chart, save_view (destructive; triggers the approval gate).
- dashboard_pilot instructions rewritten with a compact verb-per-line reference so the LLM picks the right single tool for each intent.

Client - moved out of routes/
- Feature code relocates to client/src/features/smart-dashboard/ (components/, hooks/, lib/). TanStack Router was warning that every non-route file under routes/ 'does not contain any route piece.'
- smart-dashboard.route.tsx uses @/features/ aliases; the route file is now the only thing under routes/.

Client - correctness fixes in the action dispatcher
- Act only on response.output_item.done (never .added, which fires with partial arguments and caused double-applied highlights plus silent JSON-parse races).
- Dedupe by call_id with a bounded LRU; reset on appkit.metadata (new-run signal).
- Use updater callbacks (onFilterUpdate(prev => ...)) instead of a currentFilters prop to eliminate stale-closure bugs when the agent fires multiple tool calls in one render cycle.
- Validate arg shapes per tool; anything malformed or unrecognized surfaces through onUnknownTool (route renders as a red banner + console.warn). Silent failure was the worst failure mode.
- Emit a human-readable summary for every applied action (onAction).

Client - discoverability / HITL
- New QuickActionsBar with Save view... (inline name input), Clear filters, Clear highlights. Each dispatches through the chat pipeline so the agent still reasons and the approval gate still fires for destructive actions - the bar just saves typing.
- ActionToast (bottom-left) confirms every dispatcher-applied action for ~3s. Answers 'did it work?' without opening the inspector.
- QuerySection refactored into a view: content/isLoading/onSend come from the route. Lifting useAgentStream one level up lets the Quick Actions bar and the chat input share a single agent stream.
- QuerySection example queries refreshed to cover the new tools.

Client - stream-inspector wiring
- SSEEvent extended with approval_pending payload fields.
- use-stream-inspector threaded through so every run's events flow into the inspector's module-level store.
- FocusableChart renamed its 'id' prop to 'chartId' (logical registry key, not a DOM id - biome was right to complain).

Verification
- pnpm --filter=dev-playground client tsc --noEmit: clean.
- pnpm --filter=dev-playground client vite build: clean.
- Server typecheck: same pre-existing errors as main; no new regressions.
- apps/dev-playground/shared/appkit-types/analytics.d.ts regenerated by vite build to register the four dashboard_* queries; kept in the commit so CI and downstream consumers have typed useAnalyticsQuery access out of the box.
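
The "dedupe by call_id with a bounded LRU" fix in the action dispatcher might look something like the following. This is a sketch under assumptions: the class name `SeenCallIds`, its capacity, and its method names are hypothetical and not taken from the playground code.

```typescript
// Bounded set of recently seen call ids; a Set iterates in insertion order,
// so the first entry is always the least recently used one.
class SeenCallIds {
  private readonly seen = new Set<string>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns true the first time an id is seen, false for repeats.
  markIfNew(callId: string): boolean {
    if (this.seen.has(callId)) {
      // Refresh recency by moving the id to the end.
      this.seen.delete(callId);
      this.seen.add(callId);
      return false;
    }
    this.seen.add(callId);
    if (this.seen.size > this.capacity) {
      // Evict the least recently used id.
      const oldest = this.seen.values().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return true;
  }

  // Called on a new-run signal so ids from the previous run are forgotten.
  reset(): void {
    this.seen.clear();
  }
}
```

A dispatcher would call `markIfNew(call_id)` before applying an action and skip the action when it returns false; bounding the set keeps memory flat across long sessions.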

…iews panel + floating chat Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Fresh UC volumes don't have a saved-views/ subdirectory until the first save; the SDK throws FILES_API_DIRECTORY_IS_NOT_FOUND on list. The route was propagating that as a 500 which rendered as a red error banner in the SavedViewsPanel on first load. Catch the error explicitly, return { views: [] }, let the panel render its 'no saved views yet' empty state cleanly. Uploads still work the first time because the SDK auto-creates parent dirs on upload.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

html2canvas 1.x throws on `oklch()` color values, which Tailwind v4 emits everywhere in computed styles. Swap to the maintained html2canvas-pro fork (drop-in API) so dashboard captures render without "Attempting to parse an unsupported color function 'oklch'" errors in the approval card. Keeps html2canvas pinned so types still resolve.

Databricks SDK `volume.download(path)` returns a wrapper `{ contents: ReadableStream, "content-type": string }`, not the stream itself. The previous handler tried to write the wrapper directly, which produced an empty body and broke thumbnails in the saved-views panel. Now we read `.contents`, drain the stream, and respond with the server-reported content-type (falling back to `image/png`). Also drops a couple of noisy console.logs left over from the debugging session.

… click Clicking a saved-view thumbnail was sending a chat prompt like "Load the saved view 'january'" and letting the agent reconstruct filters from the view name. That dropped the highlights (agent had no tool to fetch the stored metadata) so January-with-focus-on-week-1 came back as just January-wide. Since the client already holds the full authoritative metadata for the clicked thumbnail, bypass the agent and apply `meta.filters` and `meta.highlights` directly to local state, with a toast summarising what was restored. Also hardens the `appkit.approval_pending` handler: it now accepts both snake_case and camelCase fields and validates that approval_id/tool_name/stream_id are non-empty strings before enqueuing, so a malformed event can't push a broken approval card.
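
The hardened `appkit.approval_pending` handling can be sketched as a parser that accepts either casing and rejects incomplete payloads. The names `PendingApproval` and `parseApprovalPending` are hypothetical; only the three field names and the non-empty-string rule come from the commit message.

```typescript
interface PendingApproval {
  approvalId: string;
  toolName: string;
  streamId: string;
}

// Returns the value only when it is a non-empty string.
function nonEmptyString(value: unknown): string | null {
  return typeof value === "string" && value.length > 0 ? value : null;
}

// Accepts snake_case or camelCase fields; returns null for malformed payloads
// so a broken approval card is never enqueued.
function parseApprovalPending(payload: unknown): PendingApproval | null {
  if (typeof payload !== "object" || payload === null) return null;
  const fields = payload as Record<string, unknown>;
  const approvalId = nonEmptyString(fields["approval_id"] ?? fields["approvalId"]);
  const toolName = nonEmptyString(fields["tool_name"] ?? fields["toolName"]);
  const streamId = nonEmptyString(fields["stream_id"] ?? fields["streamId"]);
  if (approvalId === null || toolName === null || streamId === null) return null;
  return { approvalId, toolName, streamId };
}
```

Normalizing to one internal shape at the boundary means the rest of the UI never has to care which casing the server used.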

Picks up the new `annotations?: ToolAnnotations` field on `ToolConfig` and `FunctionTool` introduced upstream in the annotations-propagation fix.

…nable agent feed

Reshapes the Smart Dashboard demo from a sparse 2-chart layout into a 2x2 chart grid with a right-rail agent feed, and turns the previously read-only insights/anomaly cards into clickable actions that drive the dashboard directly.

New visualisations:
- HourlyHeatmap: day-of-week × hour-of-day grid, click a cell to ask the agent to investigate that slot.
- TopZonesChart: hand-rolled horizontal bar leaderboard with click-to-filter and a `highlight_zone` ring driven by the agent.
- KPI sparklines: inline 7-day micro-charts with windowed trend deltas baked into each KPI card.

Agent feed becomes interactive:
- `feed-actions.ts` defines a structured action schema (filter_date, filter_zip, filter_fare, highlight_period, highlight_zone, focus_chart, ask) and a parser. The `insights` and `anomaly` ephemeral agents now emit JSON matching that schema.
- `ActionableCard` renders insights/anomalies with action chips that invoke `useActionDispatcher.dispatch` directly, the same code path the SSE function-call handler uses, so UI clicks and agent tool calls behave identically.
- The feed re-runs (debounced) whenever filters or highlights change.

Server-side wiring:
- Adds `highlight_zone` and `clear_zone_highlights` tools.
- Extends the `focus_chart` enum with `hourly_heatmap` and `top_zones`.
- Updates `dashboard_pilot` instructions to prefer `highlight_zone` over `filter_by_pickup_zip` when calling out a single ZIP.
- Adds three SQL queries: `dashboard_hourly_heatmap`, `dashboard_top_zones`, `dashboard_kpi_sparklines`. The top-zones query casts `pickup_zip` (an INT in samples.nyctaxi.trips) to STRING so the client's highlight Map keys, the agent's `highlight_zone` arg, and the filter parameter all speak the same type.

Polish & defensive fixes:
- Defensive `Number()` coercion in `kpi-cards.tsx` for sparkline values so trend math doesn't render `NaN%` or string-concatenated revenue totals if a driver hands back DECIMAL-as-string.
- `Sparkline` reserves vertical space for intentionally-empty series (e.g. the categorical "Top Pickup Zone" KPI) instead of rendering a loading-style placeholder.
- 2x2 chart grid uses `items-start` + `auto-rows-min content-start` so the rail no longer stretches the chart column and creates dead space.
- `ChatDrawer` becomes a controlled component (`open` + `onOpenChange`) so any agent-triggering UI action can auto-open the chat, so the user always sees the agent's response without manual disclosure.
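The action schema and parser described above can be pictured roughly as follows. This is a minimal sketch, not the contents of `feed-actions.ts`: the seven action names come from the commit message, while the `args` payload field, the `FeedAction` type, and the `parseFeedAction` function name are assumptions made for illustration.

```typescript
// Action names are taken from the commit message. The payload shape (`args`)
// and every identifier below are hypothetical, not the real feed-actions.ts API.
const ACTION_TYPES = [
  "filter_date",
  "filter_zip",
  "filter_fare",
  "highlight_period",
  "highlight_zone",
  "focus_chart",
  "ask",
] as const;

type ActionType = (typeof ACTION_TYPES)[number];

interface FeedAction {
  type: ActionType;
  args: Record<string, unknown>;
}

// Type guard: true only for one of the seven known action names.
function isActionType(value: unknown): value is ActionType {
  return (
    typeof value === "string" &&
    (ACTION_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

// Parses one JSON string emitted by an agent. Returns null for malformed
// JSON, non-object payloads, or unknown action names.
function parseFeedAction(raw: string): FeedAction | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return null;
  }
  const record = data as Record<string, unknown>;
  const type = record["type"];
  if (!isActionType(type)) {
    return null;
  }
  const args = record["args"];
  const safeArgs =
    typeof args === "object" && args !== null && !Array.isArray(args)
      ? (args as Record<string, unknown>)
      : {};
  return { type, args: safeArgs };
}
```

Rejecting unknown action names at parse time means a chip click and an agent tool call can share one dispatcher without either side having to re-validate the action.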

The playground header was unscalable: 14 demo links rendered as side-by-side buttons that overflowed on narrow screens, and the home page maintained a parallel hand-curated grid that had already drifted (missing Smart Dashboard, Chart Inference, Vector Search, Policy Matrix, and Serving, ~30% of the catalog).

Introduces `client/src/lib/nav.ts` as the single source of truth: each demo declares its label, one-line description, lucide icon, and category group. Both surfaces now read from the same list, so adding a demo is a one-line change and they can no longer drift.

Header (`__root.tsx`):
- Replaces the button wall with a single "Menu" hamburger dropdown grouping demos by purpose (Data / AI / Platform).
- Active route is highlighted inside the dropdown and shown breadcrumb-style next to the brand, so the user always knows where they are.
- Caps dropdown height at viewport-minus-header with overflow scroll, so adding more demos won't break the layout.

Home page (`index.tsx`):
- Restrained hero with a soft dual-radial gradient wash (~6-8% opacity, primary + accent): depth without saturation.
- Featured card for the Smart Dashboard flagship demo: gradient accent, icon tile, eyebrow badge, animated CTA. The featured demo also appears in its category grid, de-emphasised with a "Featured above" note.
- Three category sections with one-line taglines, rendered as a 1/2/3-col responsive grid of icon + title + description cards. Each card is a real `<Link>` (not a button inside a decorative `<Card>`), so the whole surface is keyboard-accessible.
- Footer shows live demo and category counts driven by the catalog.
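The single-source-of-truth idea can be sketched as one typed list that both the header and the home page derive from. This is an illustration under assumptions, not the actual `nav.ts`: the field names follow the commit message, but `path`, the sample entries, their group assignments, and the helper names `groupNav` and `catalogCounts` are invented here.

```typescript
// Hypothetical catalog shape. Field names follow the commit message
// (label, description, icon, group); `path` and all sample values are assumed.
type NavGroup = "Data" | "AI" | "Platform";

interface NavEntry {
  path: string;
  label: string;
  description: string;
  icon: string; // placeholder string standing in for a lucide icon reference
  group: NavGroup;
}

const NAV: readonly NavEntry[] = [
  { path: "/smart-dashboard", label: "Smart Dashboard", description: "Agent-driven dashboard", icon: "icon-a", group: "AI" },
  { path: "/vector-search", label: "Vector Search", description: "Similarity search demo", icon: "icon-b", group: "Data" },
  { path: "/serving", label: "Serving", description: "Model serving demo", icon: "icon-c", group: "Platform" },
];

const GROUP_ORDER: readonly NavGroup[] = ["Data", "AI", "Platform"];

// Buckets entries by group in a fixed display order. Both the header
// dropdown and the home-page sections would render from this result.
function groupNav(
  entries: readonly NavEntry[],
): Array<{ group: NavGroup; items: NavEntry[] }> {
  return GROUP_ORDER.map((group) => ({
    group,
    items: entries.filter((entry) => entry.group === group),
  }));
}

// Footer-style counts derived from the same list, so they cannot drift.
function catalogCounts(
  entries: readonly NavEntry[],
): { demos: number; categories: number } {
  const categories = groupNav(entries).filter(
    (section) => section.items.length > 0,
  ).length;
  return { demos: entries.length, categories };
}
```

With this layout, adding a demo means appending one object to `NAV`; the dropdown, the category grid, and the footer counts all pick it up without further edits.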

…tive

Retag save_view as effect: "write" (it creates a PNG; it doesn't delete anything) and teach the approval card to render three distinct tiers. Capturing a screenshot no longer masquerades as deletion: writes get a calm blue card with a plus-circle icon, updates get a warning-amber card with a pencil, and real destructive actions retain the red shield-alert. Legacy destructive: true still maps to the red tier, so tools that haven't migrated keep their current look.
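The tier selection described above might look like the sketch below. Only `effect: "write"` and the legacy `destructive: true` flag are confirmed by the commit message; the effect names `"update"` and `"destructive"`, the type names, and the rule that an explicit `effect` takes precedence over the legacy flag are assumptions.

```typescript
// "write" is named in the commit message; "update" and "destructive" are
// assumed names for the other two tiers.
type Effect = "write" | "update" | "destructive";

interface ToolAnnotationsLike {
  effect?: Effect;
  destructive?: boolean; // legacy flag
}

interface ApprovalTier {
  color: "blue" | "amber" | "red";
  icon: "plus-circle" | "pencil" | "shield-alert";
}

// Picks the approval-card tier. Assumption: an explicit `effect` wins over
// the legacy `destructive` flag; tools with neither get no tier (null).
function approvalTier(annotations: ToolAnnotationsLike): ApprovalTier | null {
  switch (annotations.effect) {
    case "write":
      return { color: "blue", icon: "plus-circle" };
    case "update":
      return { color: "amber", icon: "pencil" };
    case "destructive":
      return { color: "red", icon: "shield-alert" };
    default:
      // Unmigrated tools that still set destructive: true keep the red tier.
      return annotations.destructive === true
        ? { color: "red", icon: "shield-alert" }
        : null;
  }
}
```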

Tailwind v4 compiles `bg-blue-50/50` to a two-layer rule: an sRGB hex fallback plus an `@supports (color-mix)` override that mixes the oklch palette token with transparent in oklab. Browsers with color-mix support (recent Chrome/Arc) take the oklab path; older embedded Chromiums (e.g. Cursor's built-in browser) fall through to the sRGB hex. Those two paths produce visibly different tints against the dark `--card` token, which is why the agent-feed cards rendered inconsistently across Chrome, Arc, and Cursor's browser. Pin the four insight/anomaly-tier backgrounds to arbitrary 8-digit hex (`bg-[#eff6ff80]` etc.) so every browser lands on the same sRGB path. Values taken from Tailwind's own fallback output to preserve the intended look on color-mix-capable browsers.

appkit-ui's globals.css already defines dark-theme tokens via two paths — an explicit `.dark` class on <html>, and `@media (prefers-color-scheme: dark)` guarded by `:root:not(.light)` so an explicit `.light` class wins. Tailwind v4's default `dark:` variant, however, is purely media driven. That mismatch shows up when the user forces light via the playground's theme selector while their OS is in dark mode: the bootstrap script sets `<html class="light">`, --card/--background correctly resolve to light, but every `dark:*` utility keeps firing under the media query — cards end up painted with dark-mode backgrounds layered under light-mode chrome. Declare a playground-local `@custom-variant dark` that mirrors the token logic exactly: fire when the element is (or descends from) `.dark`, or when `prefers-color-scheme: dark` matches and no `.light` ancestor is present. This rebinds every `dark:*` utility to respect the theme selector's forced choice, keeping the rest of appkit-ui's consumers — which don't ship the bootstrap script — on the existing media-only behaviour.
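The difference between the two behaviours can be written as a pair of predicates. This sketch models the selector logic only; it is not the CSS itself, and the `ThemeState` shape and function names are hypothetical.

```typescript
// Hypothetical model of the inputs the two dark-variant rules depend on.
interface ThemeState {
  hasDarkAncestor: boolean; // element is, or descends from, `.dark`
  hasLightAncestor: boolean; // element is, or descends from, `.light`
  prefersDark: boolean; // `prefers-color-scheme: dark` matches
}

// Default behaviour described above: purely media driven.
function mediaOnlyDark(state: ThemeState): boolean {
  return state.prefersDark;
}

// Playground-local variant: an explicit `.dark` always fires, and the media
// query fires only when no `.light` ancestor overrides it.
function customVariantDark(state: ThemeState): boolean {
  return state.hasDarkAncestor || (state.prefersDark && !state.hasLightAncestor);
}

// The mismatch case: user forces light while the OS is in dark mode.
const forcedLightOnDarkOs: ThemeState = {
  hasDarkAncestor: false,
  hasLightAncestor: true,
  prefersDark: true,
};
```

For `forcedLightOnDarkOs`, `mediaOnlyDark` returns `true` while `customVariantDark` returns `false`, which is the case where `dark:*` utilities previously painted dark backgrounds under light chrome.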

The streaming-message bubble in the smart-dashboard chat drawer used `animate-pulse` while tokens arrived. The constant fade in/out reads as visual noise when the agent is mid-stream — especially with longer replies where it pulses for many seconds. Drop the animation; the ellipsis placeholder still communicates the loading state for empty streaming bubbles.

`server({ autoStart: false }).then(appkit => appkit.server.extend(...).start())` is gone — `createApp` now orchestrates server start itself, with the post-setup hook surfaced as the `onPluginsReady` config callback. Drop `autoStart: false`, hoist the `extend` block from the trailing `.then` chain into `onPluginsReady`, and replace the dangling promise with `.catch(console.error)` so unhandled rejections still surface. Tracks databricks#280 / databricks#291 (autoStart removal + on-plugins-ready codemod).

Selecting `agents` in `databricks apps init` previously produced an app that booted, logged "No agents registered.", and rendered no UI for the plugin. Fixes that by scaffolding two starter agents (one markdown, one code-defined) and a chat surface, gated on `{{if .plugins.agents}}`.

Added:
- template/config/agents/assistant/agent.md: markdown agent, default, no tools. Demonstrates the declarative form.
- template/server/agents/helper.ts: code-defined agent via createAgent({...}) with two inline tool({...}) definitions: current_time (returns ISO timestamp) and count_words. Tools are pure JS so the demo works regardless of which other plugins were selected at scaffold time.
- template/client/src/pages/agents/AgentChat.tsx: minimal SSE consumer for /api/agents/chat with an agent picker, streaming text bubbles, and inline tool-call rows. Hand-rolled because @databricks/appkit-ui doesn't yet ship a generic agent chat primitive; replace with one when it lands.

Modified:
- template/server/server.ts: when {{if .plugins.agents}}, imports the helper agent and wires it as agents({ agents: { helper } }) instead of bare agents(). The markdown 'assistant' loads automatically from config/agents/.
- template/client/src/App.tsx: conditional NavLink + route entry, mirroring the analytics/files/etc. blocks.

End-to-end shape after init with --features agents:
- GET /api/agents/info returns { agents: ['assistant', 'helper'], defaultAgent: 'assistant' }
- /agents page renders chat with picker
- 'what time is it?' to helper triggers a current_time tool round-trip
- 'count words in: the quick brown fox' triggers count_words → 4

The serving-endpoint resource (DATABRICKS_SERVING_ENDPOINT_NAME) is already declared in template/appkit.plugins.json from PR 4, so the CLI prompts for an endpoint when agents is selected.
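The two starter tools are described as pure JS, so their logic can be sketched without any plugin dependency. The functions below are an assumed implementation, not the template's code: the names `countWords` and `currentTime` and the whitespace-splitting rule are illustrative, and the `tool({...})` wrapper is omitted because its signature is not shown here.

```typescript
// Assumed logic for count_words: words are runs of non-whitespace characters.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Assumed logic for current_time: an ISO 8601 timestamp. The optional
// parameter exists only so the function can be checked deterministically.
function currentTime(now: Date = new Date()): string {
  return now.toISOString();
}

console.log(countWords("the quick brown fox")); // prints 4
```

Keeping the tool bodies free of I/O is what lets the scaffolded demo work regardless of which other plugins were selected.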

agents, createAgent, fromPlugin, tool and all agent-related exports are now under the beta subpath. Update the dev-playground server and the template helper to import from @databricks/appkit/beta.

- Document agents as beta in docs and set stability in app template manifest
- Point Docusaurus Typedoc at typedoc.entry.ts so stable + beta APIs publish together (fixes agent symbol pages being dropped from index-only builds)
- Regenerate api/appkit index and sidebar; knip-ignore docs-only entry file

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Typedoc reference grew when the unified entry started exposing tool authoring primitives (defineTool, AppKitMcpClient, DatabricksAdapter, parseTextToolCalls, ToolEntry, ToolRegistry, etc.) that beta.ts now re-exports. Regenerating brings docs/docs/api/ back in sync so the docs:build CI gate passes. pnpm-lock.yaml gains the get-port@7.2.0 entry that was added to @databricks/appkit on main and merged into v4 during the stack rebase. Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

hubertzub-db requested a review from a team as a code owner May 5, 2026 13:30

hubertzub-db requested a review from pkosiec May 5, 2026 13:30

hubertzub-db changed the title ~~Agent/v2/sa/1 adapter~~ Agents plugin: supervisor API adapter May 6, 2026

MarioCadenas added 27 commits May 7, 2026 11:45

refactor(appkit): generalize default base system prompt

3107741

Tool-agnostic guidelines instead of SQL/files-specific defaults; accept full PromptContext in buildBaseSystemPrompt for parity with custom callbacks. Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

feat(appkit): sub-agent approval gate and HttpToolExecutor path (SDK)

e22db62

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

fix(appkit): forward all sub-agent events except metadata (SDK)

62dc7f3

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

feat(appkit): unify on DATABRICKS_SERVING_ENDPOINT_NAME (SDK + templa…

b739024

…te manifest) Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

docs(agents): folder layout on disk, migrate samples, sync API refs

8300070

Move reference apps to config/agents/<id>/agent.md; document migration and reserved skills folder; align generated API snippets and CHANGELOG.

feat(appkit): sub-agent approval gate + save view to volume + saved v…

60c8ea6

…iews panel + floating chat Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

fix(appkit): forward all sub-agent events except metadata

014e529

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

MarioCadenas and others added 18 commits May 7, 2026 11:46

docs(appkit): regenerate typedoc for tool annotations

912fbda

Picks up the new `annotations?: ToolAnnotations` field on `ToolConfig` and `FunctionTool` introduced upstream in the annotations-propagation fix.

fix(playground, template): import agents from @databricks/appkit/beta

cefe28d

agents, createAgent, fromPlugin, tool and all agent-related exports are now under the beta subpath. Update the dev-playground server and the template helper to import from @databricks/appkit/beta.

chore: remove plans scratch docs from agents stack branch

b8db81b

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

chore(appkit): Biome format load-agents imports

4dbe99e

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

feat(appkit): supervisor api adapter

bdcc529

hubertzub-db force-pushed the agent/v2/sa/1-adapter branch from 888f8ba to bdcc529 Compare May 7, 2026 13:27

hubertzub-db changed the title ~~Agents plugin: supervisor API adapter~~ feat(appkit): supervisor API adapter for agents plugin May 7, 2026

hubertzub-db wants to merge 45 commits into databricks:main from hubertzub-db:agent/v2/sa/1-adapter
