feat(mcp-tool-proxy): private MCP integration via per-tool SLXs #43
Open
theyashl wants to merge 36 commits into
…P integration
Spec covers four approaches (A multi-task SLX, B in-VPC gateway, C SLX-per-server, D SLX-per-tool) and recommends D. Plan scopes to the codecollection mcp-tool-proxy codebundle + the mcp_tools indexer in runwhen-local; papi DB/API/UI work is a separate plan. Defaults locked for §10 open decisions in the plan header.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…p path, split error policy
- Discovery: D1 → D2 (no papi work needed for v1; reads MCP_CONFIG setting from
Helm-provided mcpConfig values, mirroring CLOUD_CONFIG_SETTING pattern).
- SLX template: additionalContext gets path/hierarchy = "mcp/{server}"; access
tag flipped to read-only as safe default until we can classify tools.
- Error policy split: tools/call errors and result.isError surface as task
output (rc=0) so agentfarm can read and react; transport + initialize errors
fail the task (rc=1). Reflected in invoke_tool (returns string vs raises) and
main() exit codes.
- Tests rewritten accordingly; Phase 4 papi HTTP fetch replaced with
Helm-config parsing + validation; Phase 5 E2E drops papi mock.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
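The split error policy described in this commit can be sketched roughly as below. This is a minimal illustration, not the codebundle's actual code: `TransportError`, the `call_tool` callable, and the message wording are hypothetical names introduced here; only the rc=0 / rc=1 split and the "returns string vs raises" shape come from the commit message.

```python
import sys


class TransportError(Exception):
    """Hypothetical: raised when the HTTP transport or `initialize` fails."""


def invoke_tool(call_tool, name, args):
    """Return tool output as text; tools/call errors come back as text too.

    `call_tool` is a hypothetical callable returning a JSON-RPC response dict.
    It may raise TransportError, which is deliberately not caught here.
    """
    response = call_tool(name, args)
    if "error" in response:
        # JSON-RPC error on tools/call: surface it as output, not as a failure.
        return f"MCP error: {response['error'].get('message', 'unknown')}"
    result = response.get("result", {})
    text = "\n".join(
        part.get("text", "")
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        return f"MCP tool reported an error: {text}"
    return text


def main(call_tool, name, args):
    """Exit code 0 when there is output to read, 1 on transport failure."""
    try:
        print(invoke_tool(call_tool, name, args))
    except TransportError as exc:
        print(f"transport failure: {exc}", file=sys.stderr)
        return 1
    return 0
```

Keeping tool-level errors on the rc=0 path means the caller always gets text it can inspect, while an unreachable server still fails the task.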
…list
SLX YAMLs generated by workspace-builder don't carry configProvided (that lives on the Runbook for the runner to read at exec time). additionalContext.hierarchy is a list of tag keys the UI walks to build the tree view, not a slash-path string. Use [source, mcp_server] so MCP SLXs group by source → server → tool, and surface tool_name as its own tag so the rendered alias isn't the only place it shows up.
RuntimeVarEntry in corestate-operator api/v1/common_types.go only declares name/default/description/validation. The Runbook CRD validation will reject envelopes with extra fields, so 'required' and 'type' (carried over from MCP's JSON Schema) had to come out. The Robot wrapper still receives the full input_schema via MCP_INPUT_SCHEMA so per-tool required-arg enforcement happens at MCP call time, not at Runbook level.
Maps MCP JSON Schema property metadata onto RuntimeVarValidation:
- properties[x].enum -> validation.type=enum, values=[...]
- properties[x].pattern -> validation.type=regex, pattern=...
- neither -> validation.type=regex, pattern='.*'
CRD constrains validation.type to {enum, regex} (corestate-operator
common_types.go:53-63), so the catch-all fallback is a permissive regex
rather than 'optional / nothing'. Every emitted runtime var now carries
a validation block, which matches the CRD's expectation in practice.
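As a rough sketch of the mapping above, assuming each property schema arrives as a plain dict (the function name `to_validation` is hypothetical and not taken from the templates):

```python
def to_validation(prop_schema):
    """Map one JSON Schema property onto a RuntimeVarValidation-shaped dict."""
    if "enum" in prop_schema:
        return {"type": "enum", "values": list(prop_schema["enum"])}
    if "pattern" in prop_schema:
        return {"type": "regex", "pattern": prop_schema["pattern"]}
    # The CRD only allows enum or regex, so the fallback is a permissive regex.
    return {"type": "regex", "pattern": ".*"}
```

Checking `enum` before `pattern` is a choice made for this sketch; the commit message does not say which wins when a property carries both.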
…hars
Live MCP servers return tool/property descriptions that may contain
colons, control characters (U+0080 seen on the RunWhen platform MCP
server), embedded quotes, and newlines — all of which break YAML when
emitted as `field: "<raw>"`. Our previous escape-only-quotes approach
caught quotes but missed everything else, leading to render errors:
Unexpected error rendering mcp-tool-proxy-slx.yaml:
mapping values are not allowed here
unacceptable character #x0080: special characters are not allowed
Switching to the `| tojson` filter produces JSON-escaped strings which
are also valid YAML scalars — handles quotes, backslashes, newlines,
and control characters in one go.
Affected fields:
- SLX template: spec.alias, spec.statement
- Runbook template: runtimeVarsProvided[].description / default /
validation.values / validation.pattern
Verified end-to-end against https://mcp.test.runwhen.com/mcp (37 tools
discovered; previous render-stage errors gone).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
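To illustrate why JSON escaping sidesteps these render errors, here is a small sketch that uses the standard library's `json.dumps` as a stand-in for the Jinja filter. The sample description and the helper name `yaml_safe_scalar` are made up for the example.

```python
import json


def yaml_safe_scalar(value):
    """Return a double-quoted, JSON-escaped form of `value`.

    A JSON string is also a valid YAML double-quoted scalar, so the result
    can be placed after `field: ` without further escaping.
    """
    return json.dumps(value)


# Hypothetical description with a colon, quotes, a newline and U+0080.
raw = 'List issues: filter by "me"\nor by team \u0080'
escaped = yaml_safe_scalar(raw)
line = f"description: {escaped}"
```

The escaped form stays on a single line, contains no raw control characters, and decodes back to the original text with `json.loads(escaped)`.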
Other codebundles' SLX templates use:
labels:
{% include "common-labels.yaml" %}
Ours had it on the same line:
labels: {% include "common-labels.yaml" %}
which, when the include's first line starts with content, expands to
`labels: slx: <name>` — YAML reads that as `labels` with a scalar
value that contains a colon, so the parser bails with "mapping values
are not allowed here" on every mcp-tool SLX.
Verified by capturing the raw rendered output in-pod: this line was
the actual breakage, not the alias/statement scalars I fixed in
54198cd. tojson is still the right thing for those — keeping that.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Explains where mcpConfig.servers lives in Helm values and the workspaceinfo ConfigMap, what fields each server entry takes, the expected k8s Secret shape for bearer tokens, and links to the canonical workspaceInfo docs in runwhen-local.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PAPI groups SLXs by resourcePath (the qualified_name minus the leaf
component). With hierarchy: [source, mcp_server] every tool on a given
server collapses to the same resourcePath (e.g. mcp/linear-mcp), which
hits a per-resource cap of 10 SLXs on the platform side — meaning
servers with >10 tools silently lose the surplus on upload.
Adding mcp_tool as a third hierarchy level keeps the qualified_name the
same (mcp/{server}/{tool}) but makes each tool its own resourcePath, so
the per-resource cap no longer applies.
Verified end-to-end against a linear MCP server with 41 tools: with
2-level hierarchy 10/41 stored, with 3-level all 41 stored.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ate defaults
PAPI's taskiq worker clones rw-generic-codecollection at whichever ref
the uploaded Runbook YAML specifies in `codeBundle.ref` to attach a
runbook (and its tasks) to each SLX. With a hardcoded `ref: main`, that
clone fails for any environment where the codebundle has not yet landed
on main:
PathNotFoundError: path codebundles/mcp-tool-proxy not found in local
clone /tmp/rw_upload_*/...rw-generic-codecollection_main
The error fires inside the runbook post-sync hook *after* the SLX row is
committed, so SLXs end up persisted with `runbook: null` and the entire
batch task aborts before remaining SLXs are processed.
Templates now read `match_resource.spec.codecollection_ref` (threaded
from the mcpConfig server entry by the indexer; defaults to "main") so a
workspace can point at a branch / tag while a change is in review.
Also switches Python-style `or ""` fallbacks to Jinja's `| default("")`
filter for `pschema.description`, `pschema.default`, and SLX `statement`
to tolerate MCP tools whose property schemas don't carry those keys.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The workspace-builder generation-rules engine populates `{{ ref }}`
with the codecollection ref the template was loaded from
(generation_rules.py:643). Using it for codeBundle.ref keeps the runner-
side clone pinned to whatever codecollection ref the workspace builder
was already pointed at via codeCollections — no extra knob needed on
mcpConfig.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…t resourcePath
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…+ k8s-prefixed secret
- SLX qualified_name now stops at server level (mcp/<server>); drop tool
from hierarchy so PAPI's resourcePath shows the parent path, not the leaf.
- Drop the "MCP: " alias prefix; alias is just "<server> / <tool>".
- Runbook task name uses ${MCP_TOOL_NAME} so reports show the actual tool.
- Runbook secret read uses the standard k8s:file@secret/<name>:token prefix
(matches kubernetes-auth.yaml / azure-auth.yaml convention). Workspace-vault
support tracked separately in RW-1150.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rchy uses resource_name
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…erver → mcp_tool)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirrors the workspace-builder indexer escape hatch. New MCP_VERIFY_TLS configProvided var (defaults to true) flows from the indexer's verify_tls field → Runbook configProvided → Robot env → mcp_tool_proxy.py. Sets session.verify and passes verify= per-request (REQUESTS_CA_BUNDLE otherwise overrides it).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…re tools/call
Robot's `Import User Variable` always returns a string, but MCP servers schema-check the JSON-RPC payload, so sending `"true"` (string) where the schema says boolean fails with `invalid_type` (e.g. linear's list_teams rejecting `includeArchived: "true"`). The proxy now parses MCP_INPUT_SCHEMA (already passed as configProvided) and casts each arg to the declared JSON-Schema type: boolean / integer / number / array / object / string. Unknown types and coercion failures pass through so the MCP server can surface its own validation error.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
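A minimal sketch of this kind of coercion, assuming string inputs and a schema dict already parsed from MCP_INPUT_SCHEMA. The helper names `coerce_arg` and `coerce_args`, and the exact rule for recognising booleans, are assumptions made for illustration rather than the proxy's real code.

```python
import json


def coerce_arg(value, json_type):
    """Cast a string runtime var to the declared JSON-Schema type.

    Unknown types and failed casts return the original string unchanged,
    leaving validation to the MCP server.
    """
    try:
        if json_type == "boolean":
            lowered = value.strip().lower()
            if lowered in ("true", "false"):
                return lowered == "true"
            return value
        if json_type == "integer":
            return int(value)
        if json_type == "number":
            return float(value)
        if json_type in ("array", "object"):
            parsed = json.loads(value)
            expected = list if json_type == "array" else dict
            return parsed if isinstance(parsed, expected) else value
    except (ValueError, TypeError):
        # Includes json.JSONDecodeError, which subclasses ValueError.
        return value
    return value


def coerce_args(raw_args, input_schema):
    """Coerce every argument according to the schema's `properties` map."""
    props = input_schema.get("properties", {})
    return {
        name: coerce_arg(value, props.get(name, {}).get("type"))
        for name, value in raw_args.items()
    }
```

With a schema declaring `includeArchived` as boolean, `coerce_args({"includeArchived": "true"}, schema)` yields a real `True`, which is what the server's validator expects.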
…fault read-write
Renders the SLX `access` tag from `match_resource.spec.access` so the workspace-builder indexer can classify each MCP tool independently (via `readOnlyHint` + tool-name verb heuristic). Defaults to `read-write` when the spec field is absent: it is safer to over-mark write capability than to silently flag a write tool as read-only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lx, task title prefixed with server
- SLX alias: server/tool joined with " - " instead of " / " so the platform UI shows e.g. "linear-mcp - list_teams" without path-like separators inside the alias text.
- SLX additionalContext.resourcePath: "mcp/<server>", an explicit field alongside qualified_name for downstream consumers that key off resourcePath rather than qualified_name.
- Robot task title now "<server>_<tool>" instead of just "<tool>", so tasks from different MCP servers don't collide on identically named tools (e.g. multiple servers exposing "list_projects"). Plumbed MCP_SERVER_DISPLAY_NAME through configProvided + Suite Initialization so the variable resolves before the task name binds.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
a622144 to 8ac5413
Robot Framework runtime vars are always strings. If the MCP tool's input schema has a numeric/bool/list/dict default, the previous template let YAML parse it as that native type, which the runner rejects on type mismatch. Coerce non-string defaults through tojson first so YAML sees a string.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
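The effect of that coercion can be sketched in plain Python, with `json.dumps` standing in for the Jinja `tojson` filter. The helper name `render_default` is hypothetical, and the two-step shape (stringify first, then quote for YAML) is this sketch's reading of the commit message rather than a copy of the template.

```python
import json


def render_default(default):
    """Render a schema default as a quoted string scalar for the Runbook YAML."""
    # Step 1: non-string defaults become their JSON text (42 -> '42').
    text = default if isinstance(default, str) else json.dumps(default)
    # Step 2: quote and escape the text so YAML reads it as a string.
    return json.dumps(text)
```

After this step, a numeric default such as `42` is emitted as the quoted scalar `"42"`, so YAML parses it as a string instead of an integer.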
…n description
JSON Schema lists required fields at the schema's top level, not as a per-property flag. Read that array and append "(required)" to the description of any parameter listed there so downstream UI/agent surfaces know which inputs are mandatory. If the parameter has no description, fall back to "Required parameter." instead of an awkward "(required)" alone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
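A short sketch of that rule, assuming the input schema is a parsed dict (the function name `describe_param` is hypothetical):

```python
def describe_param(name, input_schema):
    """Build the rendered description for one parameter."""
    required = set(input_schema.get("required", []))
    prop = input_schema.get("properties", {}).get(name, {})
    description = (prop.get("description") or "").strip()
    if name not in required:
        return description
    # Required: add the suffix, or use the fallback when nothing is described.
    return f"{description} (required)" if description else "Required parameter."
```

Reading `required` from the top level of the schema, and not from each property, follows how JSON Schema defines it.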
Robot's `'''${schema_json}'''` interpolates the value as Python source
text, so a `\"` inside any MCP tool description (e.g. Linear's
list_issues description containing 'or "me"') gets re-interpreted by
Python's string-literal parser and the JSON is corrupted before
json.loads ever runs. Reproduced via runner log:
JSONDecodeError: Expecting ',' delimiter: line 1 column 206 (char 205)
Switch to `$schema_json` / `$tool_args` (no curly braces), which binds
the value to the expression's namespace as a Python object — no source
substitution, no escape re-parsing. Same fix applied to the FOR-loop
conditional (Run Keyword If → IF) so it doesn't break on values that
happen to contain triple quotes or backslashes either.
Adds a regression test file that pins both halves: a failing fixture
reproducing the original crash with the Linear schema, a passing
fixture using the object-pass form, and a static guard that fails the
build if any executable line ever reintroduces `'''${var}'''`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
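The failure mode can be reproduced outside Robot with plain `eval`, which is a reasonable stand-in for how the two variable syntaxes behave inside an expression. The schema text, helper names, and guard regex below are illustrative assumptions, not the contents of the regression test file.

```python
import json
import re

# Hypothetical schema whose description contains embedded double quotes.
schema_json = json.dumps({"description": 'Assignee: a user ID or "me"'})


def eval_by_substitution(value):
    """Mimic '''${schema_json}''': the value is pasted into Python source."""
    source = "json.loads('''" + value + "''')"
    return eval(source, {"json": json})


def eval_by_binding(value):
    """Mimic $schema_json: the value is bound as an object in the namespace."""
    return eval("json.loads(schema_json)", {"json": json, "schema_json": value})


# Static guard: flag any non-comment line that uses the unsafe form.
UNSAFE = re.compile(r"'''\$\{[^}]+\}'''")


def find_unsafe_lines(robot_source):
    """Return 1-based line numbers that contain '''${var}'''."""
    return [
        number
        for number, line in enumerate(robot_source.splitlines(), 1)
        if not line.lstrip().startswith("#") and UNSAFE.search(line)
    ]
```

In the substitution path, Python's string-literal parser turns each `\"` into a bare `"` before `json.loads` runs, so the JSON is no longer well formed and decoding raises `json.JSONDecodeError`. The binding path passes the original string through untouched.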
Summary
- Adds `codebundles/mcp-tool-proxy/`, a codebundle that proxies a single MCP tool call: a Python script does `initialize` + `tools/call` (handles both JSON and SSE responses), and a Robot wrapper dynamically imports per-tool parameters from `MCP_INPUT_SCHEMA`.
- … `mcp_tool` resource. Generated SLXs carry `platform: mcp`, `resource_name: {server}`, `resource_type: mcp_server`, `mcp_server: {server}`, `mcp_tool: {tool}` tags, an `additionalContext.hierarchy: [platform, mcp_server, mcp_tool]` for grouping in the platform UI, and an `access` tag rendered from `spec.access` (per-tool, classified by the indexer; defaults to `read-write` when absent).
- SLX `alias` is rendered as `<server> - <tool>` (dash separator); the runbook task name is `<server>_<tool>`. Both surface the server scope alongside the tool name so the UI doesn't show 37 anonymous "list_*" tasks.
- `additionalContext.resourcePath: mcp/<server>` (2 keys), distinct from the 3-key hierarchy. UI grouping uses hierarchy; resource addressing uses resourcePath. Requires the paired `compute_resource_path_from_hierarchy` change in `runwhen-local#798` so the explicit value isn't overwritten.
- Non-string runtime-var defaults are coerced to strings through `tojson` (`42` becomes `"42"`, `true` becomes `"true"`, `[1,2,3]` becomes `'"[1, 2, 3]"'`). Robot Framework treats runtime vars as strings; un-coerced defaults caused type-mismatch failures in the runner.
- Parameters listed in the schema's top-level `required` array get a `(required)` suffix appended to their rendered description (`"Required parameter."` as a fallback when the schema description is empty), so downstream UI/agent surfaces know which inputs are mandatory.
- Runtime args arrive as strings (Robot's `Import User Variable` always returns string); the proxy coerces them to the JSON-Schema types in `MCP_INPUT_SCHEMA` before `tools/call` so boolean/integer/number/array/object parameters don't trip the MCP server's input validator.
- Split error policy: transport and `initialize` failures exit 1 (task fails); `tools/call` errors and `result.isError=true` are surfaced as task output (rc=0) so agentfarm can read and react to them.
- Templates use `| tojson` for any value sourced from upstream MCP descriptions/defaults (which may contain colons, control characters, or newlines), and the standard pattern of `labels:` with `{% include "common-labels.yaml" %}` on the following line, so the SLX YAML always parses cleanly.
- Paired with `runwhen-local#798` (the `mcp_tools` indexer that drives this codebundle from Helm-provided `mcpConfig` values).

Design spec: `docs/superpowers/specs/2026-05-20-private-mcp-integration-design.md`.

Configuring MCP servers
MCP servers are declared on the workspace-builder side, not in this codecollection; see `codebundles/mcp-tool-proxy/README.md` for the full configuration example. Short version: `secret_ref` points to a k8s `Secret` with `data.token: <bearer>`. The same secret must be reachable from runner pods at execution time (the generated Runbook references it via `secretsProvided.workspaceKey`).

Optional per-server `verify_tls: false` skips TLS verification for environments where the pod's CA bundle doesn't yet trust the MCP server's issuer.

Test Plan
- `cd codebundles/mcp-tool-proxy && PYTHONPATH=. .venv/bin/pytest tests/ -v`: 22 pass + 1 skip (JSON-RPC envelope parsing, SSE handling, tool output rendering, error envelopes, transport failures, JSON-Schema arg coercion)
- `from robot.api import get_model; get_model('runbook.robot')` clean
- … `generation-rule-schema.json`
- … `mcp_tool` resource with realistic tool descriptions (colons, newlines, embedded quotes); asserts on tags, `additionalContext`, `runtimeVarsProvided` validation blocks, string-coerced defaults, `(required)` suffix on mandatory params, `configProvided`
- … (`./.test/dry-run.sh`): stub MCP server + script round-trip; bypasses Robot since `RW.Core` ships only in the runner image (documented in `.test/README.md`)
- … `https://mcp.test.runwhen.com/mcp`: 37 MCP tools discovered, 37 SLX + Runbook pairs rendered and uploaded to PAPI

Out of scope (deferred follow-ups)
- … `verify_tls: false` escape hatch (tracked in RW-1146)

🤖 Generated with Claude Code