feat: self-improving agentic workflow — core MCP library by intel352 · Pull Request #402 · GoCodeAlone/workflow

intel352 · 2026-04-13T09:19:37Z

Summary

Add in-process wfctl MCP library (NewInProcessServer) with MCPProvider interface for direct tool invocation without HTTP/subprocess
Add mcp_tool trigger type and mcp workflow handler type for exposing pipelines as MCP tools
Add mcp.registry module with admin API (/admin/mcp/servers, /admin/mcp/tools) for audit/discovery
Add LSP in-process library (DiagnoseContent, CompleteAt, HoverAt) with MCP tool wiring
Add challenge-response override tokens (BIP39 3-word passphrases, 1hr expiry) for guardrail bypasses
Add wfctl ci validate subcommand with immutability enforcement and override support
Add documentation: self-improvement guide, guardrails guide, MCP reference, tutorial

Design

See: docs/plans/2026-04-13-self-improving-agentic-workflow-design.md

Implementation Plan

See: docs/plans/2026-04-13-self-improving-agentic-workflow-plan.md

Related PRs

workflow-plugin-agent: GoCodeAlone/workflow-plugin-agent (feat/self-improvement branch)
workflow-scenarios: GoCodeAlone/workflow-scenarios (feat/self-improving-scenarios branch)

🤖 Generated with Claude Code

Approved design for enabling optional self-improvement loops in Workflow applications. Three-layer architecture: engine MCP library + agent plugin guardrails + validation scenarios. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

11-phase plan covering workflow core MCP library, agent plugin guardrails/blackboard/safety, and 3 validation scenarios (85-87). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds Phase 12 with 14 tasks covering: MCPProvider interface, CLI equivalence tests, trigger runtime, handler registration, admin API, LSP hover, LSP-as-MCP tools, override CLI wiring, PR comment/API header overrides, tutorial doc, blackboard subscribe, additional bypass vectors, multi-agent review pattern, version bumps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Extract wfctl MCP tools as a Go library for direct invocation without HTTP or subprocess overhead. All 25+ tools available. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds ToolHandlerFunc type alias, toolHandlers map to Server struct, and collectToolHandlers() method to support in-process MCP invocation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add DiagnoseContent(), CompleteAt(), HoverAt() functions for in-process LSP feature invocation without an active client connection. Includes Diagnostic, CompletionResult, and HoverResult types with full test coverage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ugin New trigger type exposes pipelines as MCP tools. New handler type groups pipelines under named MCP servers. Registry module provides admin API for server/tool discovery and audit logging. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Address spec-reviewer feedback:
- Add InProcessOption variadic opts to NewInProcessServer with WithInProcessPluginDir, WithInProcessRegistryDir, WithInProcessDocFile, WithInProcessAuditLog, WithInProcessEngine helpers
- InProcessServer now holds pre-populated tools map from s.toolHandlers instead of re-fetching on every CallTool invocation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
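
The option pattern and the pre-populated tool map described in this commit can be sketched roughly as follows. This is a standalone illustration, not the code in mcp/library.go: the `ToolHandlerFunc` signature, the config field names, and the `newSketchServer` constructor (which takes the handler map as an argument) are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// ToolHandlerFunc is a stand-in for the handler type named in the PR.
// The real signature in the mcp package may differ (assumption).
type ToolHandlerFunc func(args map[string]any) (any, error)

// inProcessConfig holds hypothetical settings; the field names are illustrative.
type inProcessConfig struct {
	pluginDir   string
	registryDir string
}

// InProcessOption mutates the config before the server is built.
type InProcessOption func(*inProcessConfig)

// WithInProcessPluginDir sets the plugin directory (parameter type assumed).
func WithInProcessPluginDir(dir string) InProcessOption {
	return func(c *inProcessConfig) { c.pluginDir = dir }
}

// WithInProcessRegistryDir sets the registry directory (parameter type assumed).
func WithInProcessRegistryDir(dir string) InProcessOption {
	return func(c *inProcessConfig) { c.registryDir = dir }
}

// sketchServer copies the handler map once at construction, so CallTool
// is a plain map lookup instead of a re-fetch on every invocation.
type sketchServer struct {
	cfg   inProcessConfig
	tools map[string]ToolHandlerFunc
}

// newSketchServer is a hypothetical constructor used only in this sketch.
func newSketchServer(handlers map[string]ToolHandlerFunc, opts ...InProcessOption) *sketchServer {
	s := &sketchServer{tools: make(map[string]ToolHandlerFunc, len(handlers))}
	for _, opt := range opts {
		opt(&s.cfg)
	}
	for name, h := range handlers {
		s.tools[name] = h
	}
	return s
}

// ListTools returns the registered tool names in sorted order.
func (s *sketchServer) ListTools() []string {
	names := make([]string, 0, len(s.tools))
	for name := range s.tools {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

var errUnknownTool = errors.New("unknown tool")

// CallTool dispatches to the handler captured at construction time.
func (s *sketchServer) CallTool(name string, args map[string]any) (any, error) {
	h, ok := s.tools[name]
	if !ok {
		return nil, fmt.Errorf("%w: %q", errUnknownTool, name)
	}
	return h(args)
}

func main() {
	srv := newSketchServer(map[string]ToolHandlerFunc{
		"echo": func(args map[string]any) (any, error) { return args["msg"], nil },
	}, WithInProcessPluginDir("./plugins"))
	out, err := srv.CallTool("echo", map[string]any{"msg": "hi"})
	fmt.Println(srv.ListTools(), out, err)
}
```

Copying the map at construction trades a little memory for a lock-free lookup on the hot path; tools registered after construction would not be visible in this sketch.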

Add snake_case JSON tags to Diagnostic, CompletionResult, and HoverResult so fields serialize correctly in MCP tool responses. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
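
To illustrate why the tags matter: without them, `encoding/json` emits the Go field names verbatim (for example `StartCol`), which would not match snake_case keys in an MCP tool response. The struct below is a hypothetical shape; the actual fields of `Diagnostic` in lsp/library.go are not shown in this PR page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Diagnostic is an illustrative shape only; the field names are assumptions.
type Diagnostic struct {
	Line     int    `json:"line"`
	StartCol int    `json:"start_col"`
	Severity string `json:"severity"`
	Message  string `json:"message"`
}

// encodeDiagnostic marshals a diagnostic and returns the JSON text,
// or an empty string if encoding fails.
func encodeDiagnostic(d Diagnostic) string {
	b, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	d := Diagnostic{Line: 3, StartCol: 7, Severity: "error", Message: "unknown module type"}
	// prints {"line":3,"start_col":7,"severity":"error","message":"unknown module type"}
	fmt.Println(encodeDiagnostic(d))
}
```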

… validate Deterministic 3-word passphrases for guardrail overrides. CI validate runs full validation suite with immutability enforcement. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…enge Add explicit time.Time parameter for testability. Use binary.BigEndian uint16 word-index calculation (2 bytes per word) per spec. Update all callers to pass time.Now(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
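
A minimal sketch of the scheme this commit describes: an HMAC over a time bucket, with each word chosen from two bytes of the digest read as a big-endian uint16. Several details are assumptions rather than the contents of validation/challenge.go: the one-hour bucket (taken from the "1hr expiry" in the summary), the HMAC input layout, the `-` separator, and the eight-word placeholder list standing in for the BIP39 wordlist.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"strings"
	"time"
)

// wordlist is a tiny placeholder; the PR uses BIP39 words.
var wordlist = []string{"anchor", "basket", "candle", "dragon", "ember", "forest", "garden", "harbor"}

// bucket is the assumed token lifetime window.
const bucket = time.Hour

// fixedNow is a fixed instant so the example is reproducible.
var fixedNow = time.Unix(1_700_000_000, 0)

// generateChallenge derives a deterministic 3-word token from the secret,
// the subject being overridden, and the time bucket containing now.
func generateChallenge(secret, subject string, now time.Time) string {
	mac := hmac.New(sha256.New, []byte(secret))
	// The "subject|bucket" layout is an assumption for this sketch.
	fmt.Fprintf(mac, "%s|%d", subject, now.Unix()/int64(bucket/time.Second))
	sum := mac.Sum(nil)

	words := make([]string, 3)
	for i := range words {
		// Two bytes per word, interpreted as a big-endian uint16 index.
		idx := binary.BigEndian.Uint16(sum[i*2 : i*2+2])
		words[i] = wordlist[int(idx)%len(wordlist)]
	}
	return strings.Join(words, "-")
}

// verifyChallenge recomputes the token for the current bucket and compares
// in constant time. It does not accept tokens from the previous bucket.
func verifyChallenge(secret, subject, token string, now time.Time) bool {
	want := generateChallenge(secret, subject, now)
	return hmac.Equal([]byte(want), []byte(token))
}

func main() {
	tok := generateChallenge("admin-secret", "sha256:abc123", fixedNow)
	fmt.Println(tok, verifyChallenge("admin-secret", "sha256:abc123", tok, fixedNow))
}
```

Passing `now` explicitly, as the commit does, is what makes the determinism testable: two calls inside the same bucket yield the same token without any clock mocking.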

…missing secret
- Add 8 missing words (catalog..celery) to prevent empty token components
- ciResultsHash: use sha256 instead of truncated string concatenation
- Warn to stderr when --override used but WFCTL_ADMIN_SECRET is unset
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…uides
New docs:
- docs/self-improvement.md: feature overview, architecture, deploy strategies, safety model
- docs/guardrails-guide.md: hierarchical scopes, tool access globs, immutable sections, challenge tokens, command safety
- docs/mcp-tools-reference.md: all 25+ wfctl tools, LSP tools, workflow-defined MCP tools, in-process vs external comparison
- docs/self-improvement-tutorial.md: step-by-step from base app to autonomous self-improvement
DOCUMENTATION.md additions:
- agent.provider, agent.guardrails, mcp.registry module types
- mcp_tool trigger type
- mcp workflow handler type
- step.agent_execute, step.blackboard_post, step.blackboard_read, step.self_improve_validate, step.self_improve_diff, step.self_improve_deploy, step.lsp_diagnose step types

…fig, add plugin notes
- wfctl challenge-token generate → wfctl override generate "sha256:$HASH"
- WORKFLOW_ADMIN_SECRET → WFCTL_ADMIN_SECRET throughout all docs
- mcp_tool trigger field name: → tool_name:
- mcp.registry: remove fabricated fields (storage, db_path, require_schema)
- Add workflow-plugin-agent v0.8.0+ requirement notes to DOCUMENTATION.md, self-improvement.md, mcp-tools-reference.md, and tutorial Prerequisites
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…p routes: []
- audit_log → audit_tool_calls (real MCPRegistryConfig field)
- Remove default: "status" from MCPToolParameter example (field doesn't exist)
- Remove routes: [] from tutorial Step 1 (wfctl modernize anti-pattern)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR introduces foundational building blocks for a self-improving/agentic workflow setup in the core engine: in-process MCP tool invocation, MCP-triggered pipelines, an MCP registry, an in-process LSP helper library, and challenge/override + CI validation plumbing in wfctl, along with extensive documentation.

Changes:

Add an in-process MCP library (NewInProcessServer) plus an MCPProvider interface for direct tool invocation.
Add MCP-facing workflow surface area: mcp_tool trigger, mcp workflow handler, and an mcp.registry registry with JSON admin handlers.
Add challenge-response override tokens, wfctl override commands, and a new wfctl ci validate subcommand; document the self-improvement model and references.

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 14 comments.

Show a summary per file

File	Description
validation/override_handlers.go	Adds helpers for extracting override tokens from PR comments, headers, and workflow_dispatch inputs.
validation/override_handlers_test.go	Unit tests for override token parsing helpers.
validation/challenge.go	Implements time-bucketed 3-word HMAC challenge token generation/verification.
validation/challenge_test.go	Unit tests for challenge token behavior (shape, expiry, determinism).
plugins/mcp/plugin.go	Registers MCP plugin types: `mcp.registry`, `mcp_tool` trigger, and `mcp` workflow handler.
module/trigger_mcp_tool.go	Implements `mcp_tool` trigger and runtime for mapping tool calls to pipeline execution.
module/trigger_mcp_tool_test.go	Unit tests for MCP tool config → schema conversion and runtime invocation.
module/mcp_registry.go	Adds MCP registry data model plus HTTP JSON handlers for servers/tools listing.
module/mcp_registry_test.go	Unit tests for registry operations and JSON handler output.
mcp/server.go	Collects tool handlers after registration for in-process dispatch.
mcp/provider_interface.go	Introduces an interface for in-process MCP tool invocation.
mcp/library.go	Adds `NewInProcessServer` wrapper for direct tool calls without HTTP/subprocess.
mcp/library_test.go	Tests tool listing and basic in-process tool invocation behavior.
lsp/library.go	Adds in-process LSP helpers: diagnostics, completions, and hover.
lsp/library_test.go	Unit tests for LSP in-process helpers.
handlers/mcp.go	Adds `mcp` workflow handler type scaffold and route→tool definition helper.
handlers/mcp_test.go	Unit tests for MCP handler config helpers and `CanHandle`.
DOCUMENTATION.md	Documents new workflow handler/trigger type entries and self-improvement section.
docs/self-improvement.md	New feature guide describing architecture, safety model, and quick start.
docs/self-improvement-tutorial.md	Step-by-step tutorial for enabling self-improvement in an app config.
docs/plans/2026-04-13-self-improving-agentic-workflow-plan.md	Implementation plan document added.
docs/plans/2026-04-13-self-improving-agentic-workflow-design.md	Design document added.
docs/mcp-tools-reference.md	New MCP tools reference document.
docs/guardrails-guide.md	New guardrails configuration guide.
cmd/wfctl/override.go	Adds `wfctl override generate
cmd/wfctl/override_test.go	Unit tests for override CLI subcommands.
cmd/wfctl/main.go	Registers the new `override` command.
cmd/wfctl/ci.go	Wires `wfctl ci validate` subcommand.
cmd/wfctl/ci_validate.go	Implements `wfctl ci validate` logic (schema + refs + immutability-related checks + override).
cmd/wfctl/ci_validate_test.go	Unit tests for `wfctl ci validate`.

Copilot · 2026-04-13T09:28:29Z

+	switch *format {
+	case "json":
+		enc := json.NewEncoder(os.Stdout)
+		enc.SetIndent("", "  ")
+		return enc.Encode(map[string]any{"results": results, "passed": allPassed})
+	default:
+		for _, r := range results {
+			if r.Passed {
+				fmt.Printf("  PASS %s\n", r.File)
+			} else {
+				fmt.Printf("  FAIL %s\n", r.File)
+				for _, e := range r.Errors {
+					fmt.Printf("       %s\n", e)
+				}
+			}
+		}
+		if !allPassed {
+			return fmt.Errorf("%d file(s) failed ci validate", ciCountFailed(results))
+		}
+	}


When --format=json is used, the command returns the result of enc.Encode directly, so it returns an error only if encoding fails and never when allPassed is false. CI runs therefore succeed despite validation failures whenever JSON output is requested. After writing the JSON, return an error (or set a failing exit code) when passed is false so that the behavior matches the text output mode.
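
One way to get consistent behavior is to move the failure check out of the `switch` so that both formats share it. The sketch below is standalone and hypothetical: `fileResult` and `render` are illustrative names, and the function returns its output as a string only to keep the example easy to test.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fileResult is an illustrative stand-in for the per-file result type.
type fileResult struct {
	File   string   `json:"file"`
	Passed bool     `json:"passed"`
	Errors []string `json:"errors,omitempty"`
}

// render writes the report in the requested format and returns a non-nil
// error whenever any file failed, regardless of format.
func render(format string, results []fileResult) (string, error) {
	failed := 0
	for _, r := range results {
		if !r.Passed {
			failed++
		}
	}

	var out strings.Builder
	switch format {
	case "json":
		enc := json.NewEncoder(&out)
		enc.SetIndent("", "  ")
		if err := enc.Encode(map[string]any{"results": results, "passed": failed == 0}); err != nil {
			return "", err
		}
	default:
		for _, r := range results {
			if r.Passed {
				fmt.Fprintf(&out, "  PASS %s\n", r.File)
				continue
			}
			fmt.Fprintf(&out, "  FAIL %s\n", r.File)
			for _, e := range r.Errors {
				fmt.Fprintf(&out, "       %s\n", e)
			}
		}
	}

	// Shared by both formats: the report is written first, then the failure
	// is surfaced so the caller can exit non-zero.
	if failed > 0 {
		return out.String(), fmt.Errorf("%d file(s) failed ci validate", failed)
	}
	return out.String(), nil
}

func main() {
	out, err := render("json", []fileResult{
		{File: "app.yaml", Passed: false, Errors: []string{"unknown module type"}},
	})
	fmt.Print(out)
	fmt.Println("error:", err)
}
```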

Copilot · 2026-04-13T09:28:29Z

+			return fmt.Errorf("immutable-sections: triggers section is empty")
+		}
+	default:
+		// Unknown section — silently skip.


checkImmutableSection silently ignores unknown section names. For a CI guardrail, this is risky because typos in --immutable-sections will silently disable the intended check. Consider returning an error for unknown section values (and/or listing allowed values in the usage text).

Suggested change

-		// Unknown section — silently skip.
+		return fmt.Errorf(
+			"immutable-sections: unknown section %q (allowed: modules, workflows, pipelines, triggers)",
+			section,
+		)

Copilot · 2026-04-13T09:28:29Z

+	// Parse parameters from config.
+	var params []MCPToolParameter
+	if rawParams, ok := cfg["parameters"].([]any); ok {
+		for _, rp := range rawParams {
+			pm, ok := rp.(map[string]any)
+			if !ok {
+				continue
+			}
+			p := MCPToolParameter{}
+			if n, ok := pm["name"].(string); ok {
+				p.Name = n
+			}
+			if typ, ok := pm["type"].(string); ok {
+				p.Type = typ
+			}
+			if req, ok := pm["required"].(bool); ok {
+				p.Required = req
+			}
+			if desc, ok := pm["description"].(string); ok {
+				p.Description = desc
+			}
+			params = append(params, p)
+		}


Parameter parsing drops the enum field: MCPToolParameter.Enum is never populated from the trigger config, so ToToolDefinition() always emits an empty enum even when the YAML config supplies one. Parse enum (and validate its element types) when building MCPToolParameter.
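
A minimal sketch of the missing step, assuming `MCPToolParameter.Enum` is a `[]string` (the field's actual type is not visible in this hunk) and that the YAML decoder yields `[]any` for lists, as it does for `parameters` above. `parseEnum` is a hypothetical helper name.

```go
package main

import "fmt"

// parseEnum extracts an optional "enum" list from one parameter map.
// It returns nil when the key is absent and an error when the value is
// not a list of strings, so bad config fails at load time.
func parseEnum(pm map[string]any) ([]string, error) {
	raw, present := pm["enum"]
	if !present {
		return nil, nil
	}
	items, ok := raw.([]any)
	if !ok {
		return nil, fmt.Errorf("enum must be a list, got %T", raw)
	}
	out := make([]string, 0, len(items))
	for i, item := range items {
		s, ok := item.(string)
		if !ok {
			return nil, fmt.Errorf("enum[%d] must be a string, got %T", i, item)
		}
		out = append(out, s)
	}
	return out, nil
}

func main() {
	pm := map[string]any{
		"name": "status",
		"type": "string",
		"enum": []any{"open", "closed"},
	}
	enum, err := parseEnum(pm)
	fmt.Println(enum, err)
}
```

Inside the loop above, the result would be assigned to `p.Enum` before `params = append(params, p)`, with the error propagated instead of skipped.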

Copilot · 2026-04-13T09:28:30Z

+		Description: description,
+		Parameters:  params,
+	}
+


Registering a tool overwrites any existing entry in t.tools with the same tool_name, which can lead to ambiguous or nondeterministic routing if multiple pipelines declare the same MCP tool. Prefer rejecting duplicate tool registrations unless the mapping is identical (e.g., same workflowType) to make configuration errors fail fast.

Suggested change

+	if _, exists := t.tools[toolName]; exists {
+		return fmt.Errorf("mcp_tool trigger: duplicate tool_name %q is already registered", toolName)
+	}

Copilot · 2026-04-13T09:28:30Z

+// ModuleFactories returns the factory for the mcp.registry module type.
+func (p *Plugin) ModuleFactories() map[string]plugin.ModuleFactory {
+	return map[string]plugin.ModuleFactory{
+		"mcp.registry": func(name string, _ map[string]any) modular.Module {
+			return newRegistryModule(name)
+		},
+	}


The mcp.registry module factory ignores the provided config map, so options like log_on_init, expose_admin_api, and audit_tool_calls can never take effect. Parse and apply the module config (or remove the config type/fields if not supported yet) so YAML configuration matches runtime behavior.
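
Applying the config could look roughly like this. The three keys come from the comment above; the Go field names, the `false` defaults, and the `parseRegistryConfig` helper are assumptions for illustration, not the plugin's actual code.

```go
package main

import "fmt"

// MCPRegistryConfig mirrors the three options named in the review comment.
// The Go field names are assumptions.
type MCPRegistryConfig struct {
	LogOnInit      bool
	ExposeAdminAPI bool
	AuditToolCalls bool
}

// boolOption reads an optional boolean key and rejects non-boolean values.
// Absent keys default to false (assumed default).
func boolOption(cfg map[string]any, key string) (bool, error) {
	raw, present := cfg[key]
	if !present {
		return false, nil
	}
	v, ok := raw.(bool)
	if !ok {
		return false, fmt.Errorf("mcp.registry: %q must be a boolean, got %T", key, raw)
	}
	return v, nil
}

// parseRegistryConfig converts the raw module config map into a typed struct.
func parseRegistryConfig(cfg map[string]any) (MCPRegistryConfig, error) {
	var out MCPRegistryConfig
	var err error
	if out.LogOnInit, err = boolOption(cfg, "log_on_init"); err != nil {
		return out, err
	}
	if out.ExposeAdminAPI, err = boolOption(cfg, "expose_admin_api"); err != nil {
		return out, err
	}
	if out.AuditToolCalls, err = boolOption(cfg, "audit_tool_calls"); err != nil {
		return out, err
	}
	return out, nil
}

func main() {
	cfg, err := parseRegistryConfig(map[string]any{
		"expose_admin_api": true,
		"audit_tool_calls": true,
	})
	fmt.Printf("%+v %v\n", cfg, err)
}
```

Because the factory signature shown in the hunk returns only a `modular.Module`, a type error here would need to be surfaced elsewhere, for example during module initialization.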

Copilot · 2026-04-13T09:28:31Z

+	// ListTools returns the names of all registered tools.
+	ListTools() []string
+	// CallTool invokes the named tool with the given arguments.
+	// Returns the text content of the result as a string, or an error.


The interface comment says CallTool “Returns the text content of the result as a string”, but the signature returns (any, error) and callers may receive non-string results. Update the comment to match the actual behavior (or change the return type if a string result is intended).

Suggested change

-	// Returns the text content of the result as a string, or an error.
+	// Returns the tool result, which may be of any type, or an error.
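
If the `(any, error)` signature stays, callers that want text need to normalize the result themselves. A hedged sketch of one way to do that follows; `resultText` is a hypothetical helper, and the set of result types a real tool returns is not documented in this PR page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resultText converts an arbitrary tool result into display text.
// Strings pass through, structured values are JSON-encoded, and anything
// that cannot be encoded falls back to fmt.Sprint.
func resultText(v any) string {
	switch r := v.(type) {
	case nil:
		return ""
	case string:
		return r
	case []byte:
		return string(r)
	case fmt.Stringer:
		return r.String()
	default:
		b, err := json.Marshal(r)
		if err != nil {
			return fmt.Sprint(r)
		}
		return string(b)
	}
}

func main() {
	fmt.Println(resultText("ok"))
	fmt.Println(resultText(map[string]any{"passed": true}))
}
```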

Copilot · 2026-04-13T09:28:32Z

+#### `validate_config`
+
+Validate a workflow YAML configuration string.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `config` | string | yes | YAML config content to validate |
+| `skip_unknown_types` | bool | no | Skip unknown module/step type errors |
+| `strict` | bool | no | Fail on any warning |
+
+**Example:**
+```json
+{
+  "tool": "validate_config",
+  "arguments": {
+    "config": "modules:\n  - name: server\n    type: http.server\n    config:\n      address: ':8080'\n"
+  }
+}
+```
+
+**Returns:** Validation result with errors and warnings.
+
+---
+
+#### `template_validate_config`
+
+Validate a workflow config that uses template expressions.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `config` | string | yes | YAML config with template expressions |
+| `context` | object | no | Template context variables for expression evaluation |
+
+---
+
+#### `inspect_config`
+
+Inspect a config and get a structured summary.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `config` | string | yes | YAML config content |
+
+**Returns:** Structured summary with module names/types, workflow definitions, pipeline triggers, step counts.
+
+---
+
+#### `diff_configs`
+
+Compute a semantic diff between two workflow configs.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `base` | string | yes | Base config YAML |
+| `proposed` | string | yes | Proposed config YAML |
+


This reference lists MCP tool parameter names that don’t match the actual server implementation (e.g., validate_config/inspect_config use yaml_content, and diff_configs uses old_yaml/new_yaml). Please update the docs to reflect the real argument keys so consumers can call tools successfully.
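
For reference, a call built with the argument keys named in this comment might look like the sketch below. The `tool`/`arguments` envelope follows the example in the doc hunk above; the key names (`yaml_content`, `old_yaml`, `new_yaml`) are taken from this review comment and have not been checked against the server code here. `buildCall` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the envelope used in the reference doc's example.
type toolCall struct {
	Tool      string         `json:"tool"`
	Arguments map[string]any `json:"arguments"`
}

// buildCall encodes a tool invocation as JSON.
func buildCall(tool string, args map[string]any) (string, error) {
	b, err := json.Marshal(toolCall{Tool: tool, Arguments: args})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Key names per the review comment, not verified against mcp/server.go.
	validate, _ := buildCall("validate_config", map[string]any{
		"yaml_content": "modules:\n  - name: server\n    type: http.server\n",
	})
	diff, _ := buildCall("diff_configs", map[string]any{
		"old_yaml": "modules: []\n",
		"new_yaml": "modules:\n  - name: server\n    type: http.server\n",
	})
	fmt.Println(validate)
	fmt.Println(diff)
}
```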

Copilot · 2026-04-13T09:28:32Z

+| `mcp.registry` | Registry and audit log for MCP tool registrations and invocations | agent |
+


The module type table attributes mcp.registry to workflow-plugin-agent and says it’s not part of the core engine, but this PR adds mcp.registry via the core workflow-plugin-mcp plugin. Please fix the plugin attribution/requirement note so it matches the implementation (or adjust the implementation if the intent is for agent plugin to own it).

Suggested change

-| `mcp.registry` | Registry and audit log for MCP tool registrations and invocations | agent |
+### MCP
+
+> **Requires workflow-plugin-mcp.** Types in this section are provided by the MCP plugin.
+> Install with: `wfctl plugin install workflow-plugin-mcp`
+
+| Type | Description | Plugin |
+|------|-------------|--------|
+| `mcp.registry` | Registry and audit log for MCP tool registrations and invocations | mcp |

Copilot · 2026-04-13T09:28:32Z

+        M[Scenario 69<br/>Self-Improving API + Custom Go] --> N[Scenario 70<br/>Self-Extending MCP Tooling]
+        N --> O[Scenario 71<br/>Autonomous Agile Agent]


The design doc’s mermaid diagram references scenarios 69–71, but the PR plan/metadata references scenarios 85–87. Align the scenario numbers to avoid confusion when following the design/plan documents.

Suggested change

-        M[Scenario 69<br/>Self-Improving API + Custom Go] --> N[Scenario 70<br/>Self-Extending MCP Tooling]
-        N --> O[Scenario 71<br/>Autonomous Agile Agent]
+        M[Scenario 85<br/>Self-Improving API + Custom Go] --> N[Scenario 86<br/>Self-Extending MCP Tooling]
+        N --> O[Scenario 87<br/>Autonomous Agile Agent]

Copilot · 2026-04-13T09:28:33Z

+// ConfigureWorkflow sets up the MCP workflow from configuration.
+func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
+	_, ok := workflowConfig.(map[string]any)
+	if !ok {
+		return fmt.Errorf("invalid MCP workflow configuration format")


ConfigureWorkflow currently only type-checks that the config is a map and otherwise accepts any content. This means invalid/misspelled keys (e.g., missing server_name/routes or wrong route shapes) won’t be caught at startup. Consider parsing into MCPHandlerConfig and validating required fields so configuration errors fail fast and tool routing can be set up reliably.

Suggested change

-// ConfigureWorkflow sets up the MCP workflow from configuration.
-func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
-	_, ok := workflowConfig.(map[string]any)
-	if !ok {
-		return fmt.Errorf("invalid MCP workflow configuration format")
+func parseMCPWorkflowConfig(workflowConfig any) (*MCPHandlerConfig, error) {
+	rawConfig, ok := workflowConfig.(map[string]any)
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration format")
+	}
+	for key := range rawConfig {
+		switch key {
+		case "server_name", "log_tool_calls", "routes":
+		default:
+			return nil, fmt.Errorf("invalid MCP workflow configuration: unknown key %q", key)
+		}
+	}
+	serverNameValue, ok := rawConfig["server_name"]
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: missing required key %q", "server_name")
+	}
+	serverName, ok := serverNameValue.(string)
+	if !ok || strings.TrimSpace(serverName) == "" {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: %q must be a non-empty string", "server_name")
+	}
+	cfg := &MCPHandlerConfig{
+		ServerName: serverName,
+		Routes:     make(map[string]MCPHandlerRoute),
+	}
+	if logToolCallsValue, exists := rawConfig["log_tool_calls"]; exists {
+		logToolCalls, ok := logToolCallsValue.(bool)
+		if !ok {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: %q must be a boolean", "log_tool_calls")
+		}
+		cfg.LogToolCalls = logToolCalls
+	}
+	routesValue, ok := rawConfig["routes"]
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: missing required key %q", "routes")
+	}
+	rawRoutes, ok := routesValue.(map[string]any)
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: %q must be a map of route definitions", "routes")
+	}
+	if len(rawRoutes) == 0 {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: %q must define at least one route", "routes")
+	}
+	for toolName, routeValue := range rawRoutes {
+		if strings.TrimSpace(toolName) == "" {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route name must be non-empty")
+		}
+		rawRoute, ok := routeValue.(map[string]any)
+		if !ok {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route %q must be a map", toolName)
+		}
+		for key := range rawRoute {
+			switch key {
+			case "pipeline", "description":
+			default:
+				return nil, fmt.Errorf("invalid MCP workflow configuration: route %q has unknown key %q", toolName, key)
+			}
+		}
+		pipelineValue, ok := rawRoute["pipeline"]
+		if !ok {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route %q is missing required key %q", toolName, "pipeline")
+		}
+		pipeline, ok := pipelineValue.(string)
+		if !ok || strings.TrimSpace(pipeline) == "" {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route %q field %q must be a non-empty string", toolName, "pipeline")
+		}
+		route := MCPHandlerRoute{
+			Pipeline: pipeline,
+		}
+		if descriptionValue, exists := rawRoute["description"]; exists {
+			description, ok := descriptionValue.(string)
+			if !ok {
+				return nil, fmt.Errorf("invalid MCP workflow configuration: route %q field %q must be a string", toolName, "description")
+			}
+			route.Description = description
+		}
+		cfg.Routes[toolName] = route
+	}
+	return cfg, nil
+}
+
+// ConfigureWorkflow sets up the MCP workflow from configuration.
+func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
+	_, err := parseMCPWorkflowConfig(workflowConfig)
+	if err != nil {
+		return err

github-actions · 2026-04-13T09:31:34Z

⏱ Benchmark Results

✅ No significant performance regressions detected.

benchstat comparison (baseline → PR)

## benchstat: baseline → PR
baseline-bench.txt:245: parsing iteration count: invalid syntax
baseline-bench.txt:331695: parsing iteration count: invalid syntax
baseline-bench.txt:679735: parsing iteration count: invalid syntax
baseline-bench.txt:1000421: parsing iteration count: invalid syntax
baseline-bench.txt:1342368: parsing iteration count: invalid syntax
baseline-bench.txt:1653298: parsing iteration count: invalid syntax
benchmark-results.txt:245: parsing iteration count: invalid syntax
benchmark-results.txt:336618: parsing iteration count: invalid syntax
benchmark-results.txt:678508: parsing iteration count: invalid syntax
benchmark-results.txt:972152: parsing iteration count: invalid syntax
benchmark-results.txt:1299006: parsing iteration count: invalid syntax
benchmark-results.txt:1608349: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/GoCodeAlone/workflow/dynamic
cpu: AMD EPYC 9V74 80-Core Processor                
                            │ baseline-bench.txt │        benchmark-results.txt        │
                            │       sec/op       │    sec/op      vs base              │
InterpreterCreation-4              3.048m ± 101%   3.456m ± 213%       ~ (p=0.818 n=6)
ComponentLoad-4                    3.561m ±   2%   3.479m ±   1%  -2.30% (p=0.002 n=6)
ComponentExecute-4                 1.833µ ±   2%   1.801µ ±   1%  -1.72% (p=0.002 n=6)
PoolContention/workers-1-4         1.028µ ±   5%   1.014µ ±   1%  -1.46% (p=0.006 n=6)
PoolContention/workers-2-4         1.024µ ±   4%   1.018µ ±   1%       ~ (p=0.374 n=6)
PoolContention/workers-4-4         1.022µ ±   1%   1.026µ ±   1%       ~ (p=0.483 n=6)
PoolContention/workers-8-4         1.020µ ±   3%   1.019µ ±   1%       ~ (p=0.556 n=6)
PoolContention/workers-16-4        1.029µ ±   3%   1.030µ ±   2%       ~ (p=0.920 n=6)
ComponentLifecycle-4               3.676m ±   2%   3.539m ±   1%  -3.73% (p=0.002 n=6)
SourceValidation-4                 2.106µ ±   2%   2.133µ ±   1%  +1.28% (p=0.004 n=6)
RegistryConcurrent-4               781.2n ±   6%   792.5n ±   3%       ~ (p=0.394 n=6)
LoaderLoadFromString-4             3.613m ±   1%   3.547m ±   0%  -1.82% (p=0.002 n=6)
geomean                            16.76µ          16.81µ         +0.33%

                            │ baseline-bench.txt │        benchmark-results.txt         │
                            │        B/op        │     B/op      vs base                │
InterpreterCreation-4               2.027Mi ± 0%   2.027Mi ± 0%       ~ (p=0.370 n=6)
ComponentLoad-4                     2.180Mi ± 0%   2.180Mi ± 0%       ~ (p=0.461 n=6)
ComponentExecute-4                  1.203Ki ± 0%   1.203Ki ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-1-4          1.203Ki ± 0%   1.203Ki ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-2-4          1.203Ki ± 0%   1.203Ki ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-4-4          1.203Ki ± 0%   1.203Ki ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-8-4          1.203Ki ± 0%   1.203Ki ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-16-4         1.203Ki ± 0%   1.203Ki ± 0%       ~ (p=1.000 n=6) ¹
ComponentLifecycle-4                2.183Mi ± 0%   2.183Mi ± 0%       ~ (p=0.253 n=6)
SourceValidation-4                  1.984Ki ± 0%   1.984Ki ± 0%       ~ (p=1.000 n=6) ¹
RegistryConcurrent-4                1.133Ki ± 0%   1.133Ki ± 0%       ~ (p=1.000 n=6) ¹
LoaderLoadFromString-4              2.182Mi ± 0%   2.182Mi ± 0%       ~ (p=0.584 n=6)
geomean                             15.25Ki        15.25Ki       +0.00%
¹ all samples are equal

                            │ baseline-bench.txt │        benchmark-results.txt        │
                            │     allocs/op      │  allocs/op   vs base                │
InterpreterCreation-4                15.68k ± 0%   15.68k ± 0%       ~ (p=1.000 n=6) ¹
ComponentLoad-4                      18.02k ± 0%   18.02k ± 0%       ~ (p=1.000 n=6)
ComponentExecute-4                    25.00 ± 0%    25.00 ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-1-4            25.00 ± 0%    25.00 ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-2-4            25.00 ± 0%    25.00 ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-4-4            25.00 ± 0%    25.00 ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-8-4            25.00 ± 0%    25.00 ± 0%       ~ (p=1.000 n=6) ¹
PoolContention/workers-16-4           25.00 ± 0%    25.00 ± 0%       ~ (p=1.000 n=6) ¹
ComponentLifecycle-4                 18.07k ± 0%   18.07k ± 0%       ~ (p=1.000 n=6) ¹
SourceValidation-4                    32.00 ± 0%    32.00 ± 0%       ~ (p=1.000 n=6) ¹
RegistryConcurrent-4                  2.000 ± 0%    2.000 ± 0%       ~ (p=1.000 n=6) ¹
LoaderLoadFromString-4               18.06k ± 0%   18.06k ± 0%       ~ (p=1.000 n=6) ¹
geomean                               183.3         183.3       +0.00%
¹ all samples are equal

pkg: github.com/GoCodeAlone/workflow/middleware
                                  │ baseline-bench.txt │       benchmark-results.txt        │
                                  │       sec/op       │    sec/op     vs base              │
CircuitBreakerDetection-4                  297.6n ± 4%   295.7n ± 14%       ~ (p=0.288 n=6)
CircuitBreakerExecution_Success-4          22.67n ± 1%   22.68n ±  1%       ~ (p=0.797 n=6)
CircuitBreakerExecution_Failure-4          71.00n ± 1%   71.14n ±  0%       ~ (p=0.370 n=6)
geomean                                    78.24n        78.13n        -0.14%

                                  │ baseline-bench.txt │       benchmark-results.txt        │
                                  │        B/op        │    B/op     vs base                │
CircuitBreakerDetection-4                 144.0 ± 0%     144.0 ± 0%       ~ (p=1.000 n=6) ¹
CircuitBreakerExecution_Success-4         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CircuitBreakerExecution_Failure-4         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                              ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                  │ baseline-bench.txt │       benchmark-results.txt        │
                                  │     allocs/op      │ allocs/op   vs base                │
CircuitBreakerDetection-4                 1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=6) ¹
CircuitBreakerExecution_Success-4         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
CircuitBreakerExecution_Failure-4         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                              ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/module
                                 │ baseline-bench.txt │       benchmark-results.txt        │
                                 │       sec/op       │    sec/op     vs base              │
JQTransform_Simple-4                     838.2n ± 30%   823.3n ± 28%       ~ (p=0.193 n=6)
JQTransform_ObjectConstruction-4         1.427µ ±  3%   1.411µ ±  1%  -1.12% (p=0.002 n=6)
JQTransform_ArraySelect-4                3.509µ ±  4%   3.394µ ±  1%  -3.29% (p=0.002 n=6)
JQTransform_Complex-4                    42.13µ ±  1%   41.29µ ±  2%  -2.01% (p=0.015 n=6)
JQTransform_Throughput-4                 1.766µ ±  1%   1.718µ ±  1%  -2.72% (p=0.002 n=6)
SSEPublishDelivery-4                     62.98n ±  1%   63.16n ±  2%       ~ (p=0.394 n=6)
geomean                                  1.643µ         1.614µ        -1.78%

                                 │ baseline-bench.txt │        benchmark-results.txt         │
                                 │        B/op        │     B/op      vs base                │
JQTransform_Simple-4                   1.273Ki ± 0%     1.273Ki ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_ObjectConstruction-4       1.773Ki ± 0%     1.773Ki ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_ArraySelect-4              2.625Ki ± 0%     2.625Ki ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_Complex-4                  16.22Ki ± 0%     16.22Ki ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_Throughput-4               1.984Ki ± 0%     1.984Ki ± 0%       ~ (p=1.000 n=6) ¹
SSEPublishDelivery-4                     0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                             ²                 +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                 │ baseline-bench.txt │       benchmark-results.txt        │
                                 │     allocs/op      │ allocs/op   vs base                │
JQTransform_Simple-4                     10.00 ± 0%     10.00 ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_ObjectConstruction-4         15.00 ± 0%     15.00 ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_ArraySelect-4                30.00 ± 0%     30.00 ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_Complex-4                    324.0 ± 0%     324.0 ± 0%       ~ (p=1.000 n=6) ¹
JQTransform_Throughput-4                 17.00 ± 0%     17.00 ± 0%       ~ (p=1.000 n=6) ¹
SSEPublishDelivery-4                     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                             ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/schema
                                    │ baseline-bench.txt │       benchmark-results.txt       │
                                    │       sec/op       │   sec/op     vs base              │
SchemaValidation_Simple-4                    1.085µ ± 8%   1.069µ ± 5%       ~ (p=0.180 n=6)
SchemaValidation_AllFields-4                 1.627µ ± 6%   1.613µ ± 1%       ~ (p=0.190 n=6)
SchemaValidation_FormatValidation-4          1.567µ ± 1%   1.566µ ± 2%       ~ (p=0.970 n=6)
SchemaValidation_ManySchemas-4               1.585µ ± 1%   1.566µ ± 2%       ~ (p=0.121 n=6)
geomean                                      1.447µ        1.434µ       -0.89%

                                    │ baseline-bench.txt │       benchmark-results.txt        │
                                    │        B/op        │    B/op     vs base                │
SchemaValidation_Simple-4                   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
SchemaValidation_AllFields-4                0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
SchemaValidation_FormatValidation-4         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
SchemaValidation_ManySchemas-4              0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │ baseline-bench.txt │       benchmark-results.txt        │
                                    │     allocs/op      │ allocs/op   vs base                │
SchemaValidation_Simple-4                   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
SchemaValidation_AllFields-4                0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
SchemaValidation_FormatValidation-4         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
SchemaValidation_ManySchemas-4              0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/store
                                   │ baseline-bench.txt │       benchmark-results.txt        │
                                   │       sec/op       │    sec/op     vs base              │
EventStoreAppend_InMemory-4                1.132µ ± 13%   1.183µ ± 14%       ~ (p=0.394 n=6)
EventStoreAppend_SQLite-4                 1067.8µ ±  5%   990.3µ ±  4%  -7.26% (p=0.002 n=6)
GetTimeline_InMemory/events-10-4           12.62µ ±  4%   12.32µ ±  1%  -2.31% (p=0.002 n=6)
GetTimeline_InMemory/events-50-4           70.84µ ± 22%   60.43µ ± 16%       ~ (p=0.065 n=6)
GetTimeline_InMemory/events-100-4          109.8µ ±  1%   105.7µ ±  1%  -3.71% (p=0.002 n=6)
GetTimeline_InMemory/events-500-4          558.6µ ±  1%   543.3µ ±  1%  -2.74% (p=0.002 n=6)
GetTimeline_InMemory/events-1000-4         1.149m ±  1%   1.111m ±  5%       ~ (p=0.310 n=6)
GetTimeline_SQLite/events-10-4             87.66µ ±  1%   85.03µ ±  1%  -2.99% (p=0.002 n=6)
GetTimeline_SQLite/events-50-4             226.6µ ±  2%   218.7µ ±  0%  -3.45% (p=0.002 n=6)
GetTimeline_SQLite/events-100-4            405.4µ ±  1%   382.3µ ±  1%  -5.70% (p=0.002 n=6)
GetTimeline_SQLite/events-500-4            1.697m ±  3%   1.656m ±  2%  -2.42% (p=0.002 n=6)
GetTimeline_SQLite/events-1000-4           3.341m ±  4%   3.282m ±  6%  -1.77% (p=0.041 n=6)
geomean                                    197.3µ         189.5µ        -3.92%

                                   │ baseline-bench.txt │         benchmark-results.txt         │
                                   │        B/op        │     B/op       vs base                │
EventStoreAppend_InMemory-4                  815.5 ± 6%     746.0 ± 14%       ~ (p=0.128 n=6)
EventStoreAppend_SQLite-4                  1.983Ki ± 1%   1.985Ki ±  2%       ~ (p=0.413 n=6)
GetTimeline_InMemory/events-10-4           7.953Ki ± 0%   7.953Ki ±  0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-50-4           46.62Ki ± 0%   46.62Ki ±  0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-100-4          94.48Ki ± 0%   94.48Ki ±  0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-500-4          472.8Ki ± 0%   472.8Ki ±  0%       ~ (p=1.000 n=6)
GetTimeline_InMemory/events-1000-4         944.3Ki ± 0%   944.3Ki ±  0%       ~ (p=1.000 n=6)
GetTimeline_SQLite/events-10-4             16.74Ki ± 0%   16.74Ki ±  0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-50-4             87.14Ki ± 0%   87.14Ki ±  0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-100-4            175.4Ki ± 0%   175.4Ki ±  0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-500-4            846.1Ki ± 0%   846.1Ki ±  0%       ~ (p=0.242 n=6)
GetTimeline_SQLite/events-1000-4           1.639Mi ± 0%   1.639Mi ±  0%       ~ (p=0.489 n=6)
geomean                                    67.52Ki        67.02Ki        -0.73%
¹ all samples are equal

                                   │ baseline-bench.txt │        benchmark-results.txt        │
                                   │     allocs/op      │  allocs/op   vs base                │
EventStoreAppend_InMemory-4                  7.000 ± 0%    7.000 ± 0%       ~ (p=1.000 n=6) ¹
EventStoreAppend_SQLite-4                    53.00 ± 0%    53.00 ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-10-4             125.0 ± 0%    125.0 ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-50-4             653.0 ± 0%    653.0 ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-100-4           1.306k ± 0%   1.306k ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-500-4           6.514k ± 0%   6.514k ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_InMemory/events-1000-4          13.02k ± 0%   13.02k ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-10-4               382.0 ± 0%    382.0 ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-50-4              1.852k ± 0%   1.852k ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-100-4             3.681k ± 0%   3.681k ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-500-4             18.54k ± 0%   18.54k ± 0%       ~ (p=1.000 n=6) ¹
GetTimeline_SQLite/events-1000-4            37.29k ± 0%   37.29k ± 0%       ~ (p=1.000 n=6) ¹
geomean                                     1.162k        1.162k       +0.00%
¹ all samples are equal

Benchmarks run with go test -bench=. -benchmem -count=6.
Regressions ≥ 20% are flagged. Results compared via benchstat.
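
The "vs base" column and the 20% flagging rule can be illustrated with a small sketch. It assumes the delta is the relative change between the two reported centers; the significance test that benchstat also applies (the p-values in the tables) is left out. The names `pctChange`, `isRegression`, and `describe` are hypothetical helpers, not part of the CI tooling.

```go
package main

import "fmt"

// regressionThresholdPct mirrors the "Regressions ≥ 20% are flagged" rule.
const regressionThresholdPct = 20.0

// pctChange returns the relative change from base to current, in percent.
// Negative values mean the current run is faster (or smaller) than the base.
func pctChange(base, current float64) float64 {
	return (current - base) / base * 100
}

// isRegression reports whether the change meets or exceeds the threshold.
func isRegression(base, current float64) bool {
	return pctChange(base, current) >= regressionThresholdPct
}

// describe formats one comparison line for a benchmark.
func describe(name string, base, current float64) string {
	return fmt.Sprintf("%s: %+.2f%% (regression=%t)",
		name, pctChange(base, current), isRegression(base, current))
}

func main() {
	// Values taken from the EventStoreAppend_SQLite sec/op row (microseconds).
	fmt.Println(describe("EventStoreAppend_SQLite", 1067.8, 990.3))
}
```

With the row's values this reproduces the -7.26% delta shown in the table, well below the flagging threshold.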

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 31 out of 33 changed files in this pull request and generated 6 comments.

Copilot · 2026-04-13T13:22:14Z

+// wordCount is the number of words in bip39Subset.
+const wordCount = 256
+
+// bip39Subset is a fixed 256-word subset of the BIP-39 English word list used
+// for generating readable challenge tokens. Words are chosen to be short,
+// unambiguous, and easy to type.
+var bip39Subset = [wordCount]string{
+	"able", "about", "above", "absent", "absorb", "abstract", "absurd", "abuse",
+	"access", "account", "accuse", "achieve", "acid", "acoustic", "acquire", "across",
+	"act", "action", "actor", "actual", "adapt", "add", "addict", "address",
+	"adjust", "admit", "adult", "advance", "advice", "aerobic", "afford", "afraid",
+	"again", "agent", "agree", "ahead", "aim", "airport", "aisle", "alarm",
+	"album", "alcohol", "alert", "alien", "all", "alley", "allow", "almost",
+	"alone", "alpha", "already", "alter", "always", "amateur", "amazing", "among",
+	"amount", "amused", "analyst", "anchor", "ancient", "anger", "angle", "angry",
+	"animal", "ankle", "announce", "annual", "another", "answer", "antenna", "antique",
+	"anxiety", "apart", "apple", "approve", "april", "arch", "arctic", "argue",
+	"arm", "armed", "armor", "army", "around", "arrange", "arrest", "arrive",
+	"arrow", "art", "article", "artist", "aspect", "assault", "asset", "assist",
+	"assume", "athlete", "atom", "attack", "attend", "attitude", "attract", "auction",
+	"august", "aunt", "author", "auto", "autumn", "average", "avocado", "avoid",
+	"awake", "aware", "away", "awesome", "awful", "awkward", "axis", "baby",
+	"balance", "bamboo", "banana", "banner", "barely", "bargain", "barrel", "base",
+	"basic", "basket", "battle", "beach", "beauty", "because", "become", "beef",
+	"before", "begin", "behave", "behind", "believe", "below", "belt", "bench",
+	"benefit", "best", "betray", "better", "between", "beyond", "bicycle", "bind",
+	"biology", "bird", "birth", "bitter", "black", "blade", "blame", "blanket",
+	"blast", "bleak", "bless", "blind", "blood", "blossom", "blouse", "blue",
+	"blur", "blush", "board", "boat", "body", "boil", "bomb", "bone",
+	"bonus", "book", "boost", "border", "boring", "borrow", "boss", "bottom",
+	"bounce", "boy", "brain", "brand", "brave", "breeze", "brick", "bridge",
+	"brief", "bright", "bring", "brisk", "broccoli", "broken", "bronze", "broom",
+	"brother", "brown", "brush", "bubble", "buddy", "budget", "buffalo", "build",
+	"bulk", "bullet", "bundle", "bunker", "burden", "burger", "burst", "bus",
+	"business", "busy", "butter", "buyer", "buzz", "cabbage", "cabin", "cable",
+	"cactus", "cage", "cake", "call", "calm", "camera", "camp", "canal",
+	"cancel", "candy", "cannon", "canvas", "canyon", "capable", "capital", "captain",
+	"carbon", "card", "cargo", "carpet", "carry", "cart", "case", "castle",
+	"catalog", "catch", "category", "cause", "caution", "cave", "ceiling", "celery",
+}


bip39Subset is declared as a 256-word array (wordCount = 256), but only ~50 words are initialized here, leaving the remaining entries as empty strings. computeToken can therefore generate tokens with empty word segments, breaking the intended 3-word passphrase format and making tests/token verification flaky. Populate all 256 words or switch to a slice and compute modulo len(bip39Subset) to guarantee non-empty tokens.

Copilot · 2026-04-13T13:22:15Z

+// HandleToolCall executes the bound pipeline with the given tool arguments.
+func (r *MCPToolTriggerRuntime) HandleToolCall(ctx context.Context, args map[string]any) (any, error) {
+	if r.executor == nil {
+		return nil, fmt.Errorf("mcp_tool: no pipeline executor available")
+	}
+	result, err := r.executor.ExecutePipeline(ctx, r.pipeline, args)
+	if err != nil {
+		return nil, err
+	}
+	return result, nil
+}


workflowType injected by the engine uses the pipeline:<name> prefix (see engine.wrapPipelineTriggerConfig). HandleToolCall passes this value directly to PipelineExecutor.ExecutePipeline, but ExecutePipeline expects the bare pipeline name (it prefixes with pipeline: internally). This will cause tool calls to fail to locate the pipeline. Trim the pipeline: prefix before calling ExecutePipeline, or store the bare pipeline name in the runtime.
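
One way the prefix trimming could look, as a sketch. `barePipelineName` and `pipelinePrefix` are hypothetical names; the only behavior taken from the review comment is that the executor expects the name without the `pipeline:` prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// pipelinePrefix is the prefix the review comment says the engine injects.
const pipelinePrefix = "pipeline:"

// barePipelineName strips one leading "pipeline:" prefix, if present, and
// rejects values that leave no usable name behind.
func barePipelineName(workflowType string) (string, error) {
	name := strings.TrimPrefix(workflowType, pipelinePrefix)
	if strings.TrimSpace(name) == "" {
		return "", fmt.Errorf("mcp_tool: empty pipeline name in %q", workflowType)
	}
	return name, nil
}

func main() {
	for _, in := range []string{"pipeline:analyze-logs", "analyze-logs"} {
		name, err := barePipelineName(in)
		fmt.Println(name, err)
	}
}
```

Normalizing once when the runtime is constructed, rather than on every tool call, keeps the stored value unambiguous for later callers.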

Copilot · 2026-04-13T13:22:15Z

+	runtime := NewMCPToolTriggerRuntime(cfg, "pipeline:analyze-logs", executor)
+
+	args := map[string]any{"timeframe": "1h"}
+	result, err := runtime.HandleToolCall(context.Background(), args)
+	if err != nil {
+		t.Fatalf("unexpected error: %v", err)
+	}
+	if !called {
+		t.Error("expected executor to be called")
+	}
+	if capturedPipeline != "pipeline:analyze-logs" {
+		t.Errorf("expected pipeline:analyze-logs, got %s", capturedPipeline)
+	}


This test asserts that ExecutePipeline is called with a pipeline:<name>-prefixed identifier, but interfaces.PipelineExecutor.ExecutePipeline expects the bare pipeline name (the engine adds the pipeline: prefix internally when dispatching). Update the test to expect the trimmed pipeline name and add coverage for the prefix-trimming behavior in the trigger runtime.

Copilot · 2026-04-13T13:22:16Z

+// parseRegistryConfig converts a raw config map to MCPRegistryConfig.
+func parseRegistryConfig(cfg map[string]any) module.MCPRegistryConfig {
+	if cfg == nil {
+		return module.MCPRegistryConfig{}
+	}
+	var out module.MCPRegistryConfig
+	if v, ok := cfg["log_on_init"].(bool); ok {
+		out.LogOnInit = v
+	}
+	if v, ok := cfg["expose_admin_api"].(bool); ok {
+		out.ExposeAdminAPI = v
+	}
+	if v, ok := cfg["audit_tool_calls"].(bool); ok {
+		out.AuditToolCalls = v
+	}
+	return out
+}
+
+// TriggerFactories returns trigger constructors for the mcp_tool trigger type.
+func (p *Plugin) TriggerFactories() map[string]plugin.TriggerFactory {
+	return map[string]plugin.TriggerFactory{
+		"mcp_tool": func() any {
+			return module.NewMCPToolTrigger()
+		},
+	}
+}
+
+// WorkflowHandlers returns workflow handler factories for the mcp workflow type.
+func (p *Plugin) WorkflowHandlers() map[string]plugin.WorkflowHandlerFactory {
+	return map[string]plugin.WorkflowHandlerFactory{
+		"mcp": func() any {
+			return handlers.NewMCPWorkflowHandler()
+		},
+	}
+}
+
+// registryModule wraps MCPRegistry as a modular.Module.
+type registryModule struct {
+	name     string
+	cfg      module.MCPRegistryConfig
+	registry *module.MCPRegistry
+}
+
+func newRegistryModule(name string, cfg module.MCPRegistryConfig) *registryModule {
+	return &registryModule{
+		name:     name,
+		cfg:      cfg,
+		registry: module.NewMCPRegistry(),
+	}
+}
+
+func (m *registryModule) Name() string           { return m.name }
+func (m *registryModule) Dependencies() []string { return nil }
+func (m *registryModule) Init(_ modular.Application) error {
+	if m.cfg.LogOnInit {
+		m.registry.Logger().Info("mcp.registry module initialized", "name", m.name)
+	}
+	return nil
+}


MCPRegistryConfig fields expose_admin_api and audit_tool_calls are parsed, but the module wrapper never uses them (only LogOnInit is referenced). As-is, configuring expose_admin_api: true won't actually register /admin/mcp/servers or /admin/mcp/tools, and audit_tool_calls doesn't affect logging. Either wire these behaviors via a plugin wiring hook / router registration and tool-call instrumentation, or remove the config options to avoid a misleading API.

Copilot · 2026-04-13T13:22:16Z

+// ConfigureWorkflow sets up the MCP workflow from configuration.
+func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
+	_, ok := workflowConfig.(map[string]any)
+	if !ok {
+		return fmt.Errorf("invalid MCP workflow configuration format")
+	}
+	return nil
+}
+
+// ExecuteWorkflow executes an MCP tool call by routing the action to a pipeline.
+func (h *MCPWorkflowHandler) ExecuteWorkflow(_ context.Context, _ string, _ string, _ map[string]any) (map[string]any, error) {
+	// MCP tool calls are dispatched through the MCPToolTrigger, not this handler.
+	// This handler's ConfigureWorkflow is invoked at startup to validate the workflow config.
+	return map[string]any{}, nil
+}


The mcp workflow handler currently doesn't parse/validate server_name/routes and doesn't actually expose the configured pipelines as MCP tools (it returns an empty result and states tool calls are dispatched elsewhere). This makes type: mcp configurations appear to work at startup but provide no functional behavior. Either implement route→pipeline registration (e.g., by configuring mcp_tool mappings or integrating with an MCP server/registry) or remove/rename the handler to avoid a non-functional workflow type.

Copilot · 2026-04-13T13:22:16Z

+> **Requires workflow-plugin-agent v0.8.0+.** The `mcp_tool` trigger, `mcp.registry`
+> module, and all `step.agent_execute` / `step.self_improve_*` / `step.blackboard_*`
+> types are provided by this plugin, not the workflow core engine.


This section says mcp_tool / mcp.registry are provided by workflow-plugin-agent, but in this PR they are introduced via workflow-plugin-mcp (and also added to plugins/all). Please update the requirement note to reference the MCP plugin, and keep the agent plugin requirement limited to agent/blackboard/self_improve step types.

Suggested change

-> **Requires workflow-plugin-agent v0.8.0+.** The `mcp_tool` trigger, `mcp.registry`
-> module, and all `step.agent_execute` / `step.self_improve_*` / `step.blackboard_*`
-> types are provided by this plugin, not the workflow core engine.
+> **Requires workflow-plugin-mcp.** The `mcp_tool` trigger and `mcp.registry`
+> module are provided by this plugin, not the workflow core engine.
+> Install with: `wfctl plugin install workflow-plugin-mcp`
+> **Agent step types require workflow-plugin-agent v0.8.0+.** The
+> `step.agent_execute`, `step.self_improve_*`, and `step.blackboard_*` types are
+> provided by this plugin, not the workflow core engine.

Copilot

Pull request overview

Copilot reviewed 32 out of 35 changed files in this pull request and generated 8 comments.

Copilot · 2026-04-13T13:39:11Z

+func (r *MCPToolTriggerRuntime) HandleToolCall(ctx context.Context, args map[string]any) (any, error) {
+	if r.executor == nil {
+		return nil, fmt.Errorf("mcp_tool: no pipeline executor available")
+	}
+	result, err := r.executor.ExecutePipeline(ctx, r.pipeline, args)
+	if err != nil {


interfaces.PipelineExecutor.ExecutePipeline expects the pipeline name (e.g. "analyze-logs"), but HandleToolCall passes r.pipeline which is currently populated from workflowType (e.g. "pipeline:analyze-logs"). With the default engine executor (StdEngine.ExecutePipeline), this will become pipeline:pipeline:analyze-logs and fail to dispatch. Store only the bare pipeline name in the runtime (strip the pipeline: prefix) or call TriggerWorkflow directly with the full workflowType.

Copilot · 2026-04-13T13:39:12Z

+	result, callErr := handler(ctx, req)
+	if p.auditLogger != nil {
+		p.auditLogger.Info("mcp tool call",
+			"tool", name,
+			"error", callErr,
+		)
+	}
+	if callErr != nil {
+		return nil, callErr
+	}
+
+	for _, c := range result.Content {
+		if tc, ok := c.(mcp.TextContent); ok {
+			return tc.Text, nil
+		}
+	}
+	return result, nil


CallTool ignores result.IsError. Many MCP handlers report tool failures by returning a CallToolResult with IsError=true and error=nil; with the current implementation those failures are treated as success and returned as plain text. Consider converting result.IsError into a Go error (e.g., using the text content as the error message) so callers can reliably detect tool failures.
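
The conversion described above might be sketched as follows. `textContent` and `callToolResult` are simplified local stand-ins for the MCP library's types (the real definitions may differ), and `resultToValue` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
)

// textContent and callToolResult are simplified stand-ins for the MCP
// library's content and result types, defined here to keep the sketch
// self-contained.
type textContent struct{ Text string }

type callToolResult struct {
	Content []any
	IsError bool
}

// resultToValue returns the first text content on success, and turns a
// result flagged with IsError into a Go error carrying that text.
func resultToValue(res *callToolResult) (any, error) {
	if res == nil {
		return nil, errors.New("mcp: nil tool result")
	}
	text, found := "", false
	for _, c := range res.Content {
		if tc, ok := c.(textContent); ok {
			text, found = tc.Text, true
			break
		}
	}
	if res.IsError {
		if !found {
			text = "tool reported an error without text content"
		}
		return nil, fmt.Errorf("mcp tool error: %s", text)
	}
	if found {
		return text, nil
	}
	return res, nil
}

func main() {
	_, err := resultToValue(&callToolResult{
		Content: []any{textContent{Text: "unknown field"}},
		IsError: true,
	})
	fmt.Println(err)
}
```

Checking `IsError` before returning the text keeps a failed tool call from being mistaken for a successful one whose output happens to be an error message.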

Copilot · 2026-04-13T13:39:12Z

+	argsJSON, err := json.Marshal(args)
+	if err != nil {
+		return nil, fmt.Errorf("marshal args: %w", err)
+	}
+
+	var req mcp.CallToolRequest
+	req.Params.Name = name
+	req.Params.Arguments = make(map[string]any)
+	if err := json.Unmarshal(argsJSON, &req.Params.Arguments); err != nil {
+		return nil, fmt.Errorf("unmarshal args: %w", err)
+	}


CallTool marshals args to JSON and unmarshals into req.Params.Arguments. If args is nil, this becomes null and json.Unmarshal into a map fails. Either treat nil as an empty argument object, or skip the marshal/unmarshal step and assign args directly (copying the map if you need isolation).
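
A sketch of the "treat nil as an empty argument object" option, skipping the JSON round trip. `normalizeArgs` and `decodeNull` are hypothetical helpers. One detail worth checking when reproducing the issue: with the standard library's `encoding/json`, decoding `null` into a map sets the map to nil and returns no error, so the visible symptom may be a nil arguments map rather than a decode failure.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeArgs returns a non-nil shallow copy of args. A nil input yields an
// empty map, and the copy isolates the caller's map from later mutation of
// top-level keys.
func normalizeArgs(args map[string]any) map[string]any {
	out := make(map[string]any, len(args))
	for k, v := range args {
		out[k] = v
	}
	return out
}

// decodeNull shows what encoding/json does when "null" is decoded into a
// pre-allocated map: the map is reset to nil and no error is returned.
func decodeNull() (map[string]any, error) {
	m := make(map[string]any)
	if err := json.Unmarshal([]byte("null"), &m); err != nil {
		return nil, fmt.Errorf("decode: %w", err)
	}
	return m, nil
}

func main() {
	m, err := decodeNull()
	fmt.Println(m == nil, err)
	fmt.Println(len(normalizeArgs(nil)))
}
```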

Copilot · 2026-04-13T13:39:12Z

+	if v, ok := cfg["log_on_init"].(bool); ok {
+		out.LogOnInit = v
+	}
+	if v, ok := cfg["expose_admin_api"].(bool); ok {
+		out.ExposeAdminAPI = v
+	}
+	if v, ok := cfg["audit_tool_calls"].(bool); ok {
+		out.AuditToolCalls = v
+	}
+	return out


parseRegistryConfig populates ExposeAdminAPI and AuditToolCalls, but nothing in this plugin/module currently uses those fields (only LogOnInit is referenced). Either wire these options into actual HTTP route registration / audit logging, or remove them to avoid shipping config knobs that are no-ops.

Copilot · 2026-04-13T13:39:12Z

+func TestRunCIValidate_ImmutableSections(t *testing.T) {
+	f := writeTempConfig(t, "modules:\n  - name: server\n    type: http.server\n    config:\n      port: 8080\n")
+	// Require workflows section — should fail since it's missing
+	err := runCIValidate([]string{"--immutable-sections=workflows", f})
+	if err == nil {
+		t.Fatal("expected failure when required workflows section is missing")
+	}


This test intends to validate the --immutable-sections behavior, but the YAML uses http.server config key port, which fails schema validation first (the module requires address). Use a schema-valid config here so the failure is attributable to the immutable-sections check and the test remains robust.

Copilot · 2026-04-13T13:39:13Z

+> **Requires workflow-plugin-agent v0.8.0+.** The `mcp_tool` trigger, `mcp.registry`
+> module, and all `step.agent_execute` / `step.self_improve_*` / `step.blackboard_*`
+> types are provided by this plugin, not the workflow core engine.
+> Install with: `wfctl plugin install workflow-plugin-agent`


This section says mcp_tool trigger and mcp.registry are provided by workflow-plugin-agent, but in this PR they’re introduced by workflow-plugin-mcp (see plugins/mcp/plugin.go). The requirement note should be updated so readers install the correct plugin.

Suggested change

-> **Requires workflow-plugin-agent v0.8.0+.** The `mcp_tool` trigger, `mcp.registry`
-> module, and all `step.agent_execute` / `step.self_improve_*` / `step.blackboard_*`
-> types are provided by this plugin, not the workflow core engine.
-> Install with: `wfctl plugin install workflow-plugin-agent`
+> **Requires workflow-plugin-mcp v0.8.0+.** The `mcp_tool` trigger, `mcp`
+> workflow handler, and `mcp.registry` module are provided by this plugin, not
+> the workflow core engine.
+> Install with: `wfctl plugin install workflow-plugin-mcp`
+>
+> Agent-related step types such as `step.agent_execute`,
+> `step.self_improve_*`, and `step.blackboard_*` are provided separately by
+> `workflow-plugin-agent`.

Copilot · 2026-04-13T13:39:13Z

+)
+
+result, err := mcp.CallTool(ctx, "validate_config", map[string]any{
+    "config": proposedYAML,


The in-process usage example calls validate_config with argument key config, but the tool definition uses yaml_content. As written, the example will produce an MCP tool error (or silently fail if the caller ignores result.IsError). Update the snippet to pass yaml_content.

Suggested change

-    "config": proposedYAML,
+    "yaml_content": proposedYAML,

Copilot · 2026-04-13T13:39:13Z

+// ConfigureWorkflow sets up the MCP workflow from configuration.
+func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
+	_, ok := workflowConfig.(map[string]any)
+	if !ok {
+		return fmt.Errorf("invalid MCP workflow configuration format")
+	}
+	return nil


ConfigureWorkflow currently only type-checks workflowConfig and otherwise accepts any map, without parsing/validating expected fields (e.g., server_name, routes) or registering any services. This makes misconfigurations easy to miss at startup compared to other handlers (e.g., CLIWorkflowHandler parses and validates its config). Consider decoding into MCPHandlerConfig and validating required fields/route entries here (or remove this handler if it’s intentionally a no-op).

intel352 and others added 15 commits April 13, 2026 02:28

docs: add self-improving agentic workflow implementation plan

ca29395

11-phase plan covering workflow core MCP library, agent plugin guardrails/blackboard/safety, and 3 validation scenarios (85-87). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(mcp): add in-process MCP server library with MCPProvider interface

8f04e4b

Extract wfctl MCP tools as a Go library for direct invocation without HTTP or subprocess overhead. All 25+ tools available. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(mcp): add toolHandlers field and collectToolHandlers to Server

2f2c294

Adds ToolHandlerFunc type alias, toolHandlers map to Server struct, and collectToolHandlers() method to support in-process MCP invocation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(lsp): add JSON struct tags to library types

44bf81b

Add snake_case JSON tags to Diagnostic, CompletionResult, and HoverResult so fields serialize correctly in MCP tool responses. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(validation): add challenge-response override tokens and wfctl ci validate

50f0e8a

Deterministic 3-word passphrases for guardrail overrides. CI validate runs full validation suite with immutability enforcement. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 13, 2026 09:19

Copilot started reviewing on behalf of intel352 April 13, 2026 09:20

This was referenced Apr 13, 2026

feat: self-improving agentic workflow — agent plugin GoCodeAlone/workflow-plugin-agent#6

Merged

feat: self-improving agentic workflow — scenarios 85-87 GoCodeAlone/workflow-scenarios#5

Merged

Copilot AI reviewed Apr 13, 2026


intel352 and others added 2 commits April 13, 2026 09:10

fix: register MCP plugin in plugins/all for engine discovery

7024f29

fix: go mod tidy + address Copilot review comments

68650d0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 13, 2026 13:15

Copilot started reviewing on behalf of intel352 April 13, 2026 13:16

fix: update otel deps to v1.43.0 and run go mod tidy

6d76e96

Copilot AI reviewed Apr 13, 2026


intel352 added 2 commits April 13, 2026 09:24

fix: update all otel/x deps to latest for CI go mod tidy consistency

5df3167

fix: update example module deps for CI go mod tidy consistency

8eb7deb

Copilot AI review requested due to automatic review settings April 13, 2026 13:30

Copilot started reviewing on behalf of intel352 April 13, 2026 13:31

intel352 merged commit 49d7b64 into main Apr 13, 2026
20 of 22 checks passed

intel352 deleted the feat/mcp-library branch April 13, 2026 13:38

Copilot AI reviewed Apr 13, 2026


+	if _, exists := t.tools[toolName]; exists {
+		return fmt.Errorf("mcp_tool trigger: duplicate tool_name %q is already registered", toolName)
+	}

	// Returns the text content of the result as a string, or an error.
	// Returns the tool result, which may be of any type, or an error.

		\| `mcp.registry` \| Registry and audit log for MCP tool registrations and invocations \| agent \|

-| `mcp.registry` | Registry and audit log for MCP tool registrations and invocations | agent |
+### MCP
+> **Requires workflow-plugin-mcp.** Types in this section are provided by the MCP plugin.
+> Install with: `wfctl plugin install workflow-plugin-mcp`
+| Type | Description | Plugin |
+|------|-------------|--------|
+| `mcp.registry` | Registry and audit log for MCP tool registrations and invocations | mcp |

		M[Scenario 69<br/>Self-Improving API + Custom Go] --> N[Scenario 70<br/>Self-Extending MCP Tooling]
		N --> O[Scenario 71<br/>Autonomous Agile Agent]

-// ConfigureWorkflow sets up the MCP workflow from configuration.
-func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
-	_, ok := workflowConfig.(map[string]any)
-	if !ok {
-		return fmt.Errorf("invalid MCP workflow configuration format")
+func parseMCPWorkflowConfig(workflowConfig any) (*MCPHandlerConfig, error) {
+	rawConfig, ok := workflowConfig.(map[string]any)
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration format")
+	}
+	for key := range rawConfig {
+		switch key {
+		case "server_name", "log_tool_calls", "routes":
+		default:
+			return nil, fmt.Errorf("invalid MCP workflow configuration: unknown key %q", key)
+		}
+	}
+	serverNameValue, ok := rawConfig["server_name"]
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: missing required key %q", "server_name")
+	}
+	serverName, ok := serverNameValue.(string)
+	if !ok || strings.TrimSpace(serverName) == "" {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: %q must be a non-empty string", "server_name")
+	}
+	cfg := &MCPHandlerConfig{
+		ServerName: serverName,
+		Routes:     make(map[string]MCPHandlerRoute),
+	}
+	if logToolCallsValue, exists := rawConfig["log_tool_calls"]; exists {
+		logToolCalls, ok := logToolCallsValue.(bool)
+		if !ok {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: %q must be a boolean", "log_tool_calls")
+		}
+		cfg.LogToolCalls = logToolCalls
+	}
+	routesValue, ok := rawConfig["routes"]
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: missing required key %q", "routes")
+	}
+	rawRoutes, ok := routesValue.(map[string]any)
+	if !ok {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: %q must be a map of route definitions", "routes")
+	}
+	if len(rawRoutes) == 0 {
+		return nil, fmt.Errorf("invalid MCP workflow configuration: %q must define at least one route", "routes")
+	}
+	for toolName, routeValue := range rawRoutes {
+		if strings.TrimSpace(toolName) == "" {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route name must be non-empty")
+		}
+		rawRoute, ok := routeValue.(map[string]any)
+		if !ok {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route %q must be a map", toolName)
+		}
+		for key := range rawRoute {
+			switch key {
+			case "pipeline", "description":
+			default:
+				return nil, fmt.Errorf("invalid MCP workflow configuration: route %q has unknown key %q", toolName, key)
+			}
+		}
+		pipelineValue, ok := rawRoute["pipeline"]
+		if !ok {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route %q is missing required key %q", toolName, "pipeline")
+		}
+		pipeline, ok := pipelineValue.(string)
+		if !ok || strings.TrimSpace(pipeline) == "" {
+			return nil, fmt.Errorf("invalid MCP workflow configuration: route %q field %q must be a non-empty string", toolName, "pipeline")
+		}
+		route := MCPHandlerRoute{
+			Pipeline: pipeline,
+		}
+		if descriptionValue, exists := rawRoute["description"]; exists {
+			description, ok := descriptionValue.(string)
+			if !ok {
+				return nil, fmt.Errorf("invalid MCP workflow configuration: route %q field %q must be a string", toolName, "description")
+			}
+			route.Description = description
+		}
+		cfg.Routes[toolName] = route
+	}
+	return cfg, nil
+}
+// ConfigureWorkflow sets up the MCP workflow from configuration.
+func (h *MCPWorkflowHandler) ConfigureWorkflow(_ modular.Application, workflowConfig any) error {
+	_, err := parseMCPWorkflowConfig(workflowConfig)
+	if err != nil {
+		return err

-> **Requires workflow-plugin-agent v0.8.0+.** The `mcp_tool` trigger, `mcp.registry`
-> module, and all `step.agent_execute` / `step.self_improve_*` / `step.blackboard_*`
-> types are provided by this plugin, not the workflow core engine.
+> **Requires workflow-plugin-mcp.** The `mcp_tool` trigger and `mcp.registry`
+> module are provided by this plugin, not the workflow core engine.
+> Install with: `wfctl plugin install workflow-plugin-mcp`
+> **Agent step types require workflow-plugin-agent v0.8.0+.** The
+> `step.agent_execute`, `step.self_improve_*`, and `step.blackboard_*` types are
+> provided by this plugin, not the workflow core engine.

-> **Requires workflow-plugin-agent v0.8.0+.** The `mcp_tool` trigger, `mcp.registry`
-> module, and all `step.agent_execute` / `step.self_improve_*` / `step.blackboard_*`
-> types are provided by this plugin, not the workflow core engine.
-> Install with: `wfctl plugin install workflow-plugin-agent`
+> **Requires workflow-plugin-mcp v0.8.0+.** The `mcp_tool` trigger, `mcp`
+> workflow handler, and `mcp.registry` module are provided by this plugin, not
+> the workflow core engine.
+> Install with: `wfctl plugin install workflow-plugin-mcp`
+>
+> Agent-related step types such as `step.agent_execute`,
+> `step.self_improve_*`, and `step.blackboard_*` are provided separately by
+> `workflow-plugin-agent`.
