83 changes: 71 additions & 12 deletions doc/mcp_interface.md
@@ -10,7 +10,7 @@ This document describes the Model Context Protocol (MCP) tools exposed by the De

## Tools

DeepWork exposes eleven MCP tools:
DeepWork exposes thirteen MCP tools:

### 1. `get_workflows`

@@ -54,7 +54,64 @@ interface WorkflowInfo {

---

### 2. `start_workflow`
### 2. `get_active_workflow`

Return the currently active workflow for a session, if one exists. This is useful after compaction, reset, or any host-specific session restore flow.

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `session_id` | `string` | Yes | The persistent DeepWork session ID for the current host session. In Claude Code this is `CLAUDE_CODE_SESSION_ID`. |
| `agent_id` | `string \| null` | No | Optional host-specific agent identifier for agent-scoped workflow state. In Claude Code this is `CLAUDE_CODE_AGENT_ID`. |

#### Returns

```typescript
{
  has_active_workflow: boolean;
  stack: StackEntry[];
  active_workflow?: {
    job_name: string;
    workflow_name: string;
    goal: string;
    started_at: string;
    step_number: number;
    total_steps: number;
    completed_steps: string[];
    current_step: ActiveStepInfo;
  } | null;
}
```
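
As a rough illustration of the restore flow described above, a host could inspect the response and decide whether there is anything to resume. The sketch below is not part of DeepWork: `describe_resume_action` and the example payload values are hypothetical, and only the field names are taken from the Returns block.

```python
from typing import Any


def describe_resume_action(response: dict[str, Any]) -> str:
    """Summarize what a host could do with a get_active_workflow response."""
    workflow = response.get("active_workflow")
    if not response.get("has_active_workflow") or workflow is None:
        return "no active workflow; nothing to resume"
    return (
        f"resume {workflow['job_name']}/{workflow['workflow_name']} "
        f"at step {workflow['step_number']} of {workflow['total_steps']}"
    )


# Illustrative payload; the values are made up, and the StackEntry and
# ActiveStepInfo shapes are not shown here, so they are left empty.
example = {
    "has_active_workflow": True,
    "stack": [],
    "active_workflow": {
        "job_name": "example_job",
        "workflow_name": "example_workflow",
        "goal": "illustrate the response shape",
        "started_at": "2024-01-01T00:00:00Z",
        "step_number": 2,
        "total_steps": 5,
        "completed_steps": ["first_step"],
        "current_step": {},
    },
}

print(describe_resume_action(example))
```

Checking both `has_active_workflow` and `active_workflow` is deliberate: the field is optional and nullable, so a defensive caller should not assume one implies the other.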

---

### 3. `validate_step_outputs`

Validate a planned `finished_step` payload against the active step without advancing the workflow or running quality reviews. Use this as a dry run when you want to catch wrong output names, missing required outputs, bad types, or missing files before calling `finished_step`.

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `outputs` | `Record<string, string \| string[]>` | Yes | Map of planned step output names to values. Validation uses the active step's declared output contract without advancing the workflow. |
| `session_id` | `string` | Yes | The persistent DeepWork session ID for the current host session. In Claude Code this is `CLAUDE_CODE_SESSION_ID`. |
| `agent_id` | `string \| null` | No | Optional host-specific agent identifier for agent-scoped workflow state. In Claude Code this is `CLAUDE_CODE_AGENT_ID`. |

#### Returns

```typescript
{
  valid: boolean;
  errors: string[];
  current_step: ActiveStepInfo;
  stack: StackEntry[];
}
```
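
To make the dry-run idea concrete, here is a minimal sketch of the kinds of checks listed above (wrong names, missing required outputs, bad types, missing files). It is not DeepWork's implementation: `dry_run_validate`, the contract shape, and the assumption that every value is a file path relative to a root are all hypothetical.

```python
import tempfile
from pathlib import Path


def dry_run_validate(outputs: dict, contract: dict, root: Path) -> list[str]:
    """Collect contract violations without changing any workflow state."""
    errors: list[str] = []
    # Names the step never declared.
    for name in outputs:
        if name not in contract:
            errors.append(f"unknown output: {name}")
    for name, spec in contract.items():
        if name not in outputs:
            if spec["required"]:
                errors.append(f"missing required output: {name}")
            continue
        value = outputs[name]
        # "files" expects a list of paths, "file" expects a single path string.
        if (spec["type"] == "files") != isinstance(value, list):
            errors.append(f"wrong type for {name}: expected {spec['type']}")
            continue
        for rel in value if isinstance(value, list) else [value]:
            if not (root / rel).exists():
                errors.append(f"file not found for {name}: {rel}")
    return errors


# Hypothetical contract shape: output name -> {"type": "file" | "files", "required": bool}.
contract = {
    "report": {"type": "file", "required": True},
    "extras": {"type": "files", "required": False},
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "report.md").write_text("ok")
    ok_errors = dry_run_validate({"report": "report.md"}, contract, root)
    bad_errors = dry_run_validate({"report": ["report.md"], "bogus": "x"}, contract, root)

print(ok_errors)
print(bad_errors)
```

Returning every error at once, rather than failing on the first, matches the `errors: string[]` field in the response and lets the caller fix the whole payload before calling `finished_step`.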

---

### 4. `start_workflow`

Start a new workflow session. Initializes state tracking and returns the first step's instructions. Supports nested workflows — starting a workflow while one is active pushes onto a stack.

@@ -82,7 +139,7 @@ Start a new workflow session. Initializes state tracking and returns the first s

---

### 3. `finished_step`
### 5. `finished_step`

Report that you've finished a workflow step. Validates outputs and runs quality reviews (from step definitions and .deepreview rules), then returns the next action.

@@ -121,7 +178,7 @@ Report that you've finished a workflow step. Validates outputs and runs quality

---

### 4. `abort_workflow`
### 6. `abort_workflow`

Abort the current workflow and return to the parent workflow (if nested). Use this when a workflow cannot be completed.

@@ -149,7 +206,7 @@ Abort the current workflow and return to the parent workflow (if nested). Use th

---

### 5. `go_to_step`
### 7. `go_to_step`

Navigate back to a prior step in the current workflow. Clears all progress from the target step onward, forcing re-execution of subsequent steps to ensure consistency. Use this when earlier outputs need revision or quality issues are discovered in later steps.
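
The "clears all progress from the target step onward" rule can be pictured as a filter over the completed-step list. The sketch below is only an illustration under assumed names (`rewind_to_step`, a flat ordered list of step IDs); the real state handling is not shown in this document.

```python
def rewind_to_step(step_order: list[str], completed: list[str], target: str) -> list[str]:
    """Return the completed steps that survive a rewind to `target`."""
    if target not in step_order:
        raise ValueError(f"unknown step: {target}")
    # Only steps strictly before the target keep their progress.
    survivors = set(step_order[: step_order.index(target)])
    return [step for step in completed if step in survivors]


remaining = rewind_to_step(
    ["gather", "draft", "review", "publish"], ["gather", "draft", "review"], "draft"
)
print(remaining)  # ['gather']
```

The target step itself is dropped from the completed list as well, since it has to be re-executed.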

@@ -181,7 +238,7 @@ Navigate back to a prior step in the current workflow. Clears all progress from

---

### 6. `get_review_instructions`
### 8. `get_review_instructions`

Run a review of changed files based on `.deepreview` configuration files and DeepSchema-generated synthetic review rules. Returns a list of review tasks to invoke in parallel. Each task has `description`, `subagent_type`, and `prompt` fields for the Agent tool.

@@ -201,7 +258,7 @@ A plain string with one of:

---

### 7. `get_configured_reviews`
### 9. `get_configured_reviews`

List all configured review rules from `.deepreview` files and DeepSchema-generated synthetic rules. Returns each rule's name, description, and defining file location. Optionally filters to rules matching specific files.

@@ -225,7 +282,7 @@ Array<{

---

### 8. `mark_review_as_passed`
### 10. `mark_review_as_passed`

Mark a review as passed so it won't be re-run while reviewed files remain unchanged. Call this when a review has no findings, when all findings have been fixed, or when remaining findings have been explicitly dismissed by the user. The `review_id` is provided in the instruction file's "After Review" section.

@@ -245,7 +302,7 @@ A plain string with either:

---

### 9. `get_named_schemas`
### 11. `get_named_schemas`

List all named DeepSchemas discovered across all schema sources (project-local, standard, and env var). Returns each schema's name, summary, and matcher patterns.

@@ -263,7 +320,7 @@ Array<{
}>
```

### 10. `register_session_job`
### 12. `register_session_job`

Register a transient job definition scoped to the current session. The job is validated against the job schema and stored so that `start_workflow` can discover it. Can be called multiple times to overwrite.

@@ -288,7 +345,7 @@ Register a transient job definition scoped to the current session. The job is va

On validation failure, returns `{ error: string }` with details about what failed.

### 11. `get_session_job`
### 13. `get_session_job`

Retrieve the YAML content of a session-scoped job definition previously registered with `register_session_job`.

@@ -375,7 +432,9 @@ The `finished_step` tool returns one of three statuses:
|
3. Execute step instructions, create outputs
|
4. finished_step(outputs, session_id)
4. validate_step_outputs(outputs, session_id) // optional dry run
|
5. finished_step(outputs, session_id)
|
+-- status = "needs_work" -> Fix issues, goto 4
+-- status = "next_step" -> Execute new instructions, goto 4
27 changes: 23 additions & 4 deletions src/deepwork/jobs/mcp/quality_gate.py
@@ -22,7 +22,7 @@
)
from deepwork.review.config import ReviewRule, ReviewTask
from deepwork.review.discovery import load_all_rules
from deepwork.review.formatter import format_for_claude
from deepwork.review.formatter import FORMATTERS, format_for_claude
from deepwork.review.instructions import (
    write_instruction_files,
)
@@ -461,16 +461,35 @@ def run_quality_gate(
return None

    # 9. Format as review instructions
    review_output = format_for_claude(task_files, project_root)
    formatter = FORMATTERS.get(platform, format_for_claude)
    review_output = formatter(task_files, project_root)

    # 10. Build complete response with guidance
    guidance = _build_review_guidance(review_output)
    guidance = _build_review_guidance(review_output, platform)

    return guidance


def _build_review_guidance(review_output: str) -> str:
def _build_review_guidance(review_output: str, platform: str = "claude") -> str:
    """Build the complete review guidance including /review skill instructions."""
    if platform == "openclaw":
        return f"""Quality reviews are required before this step can advance.

{review_output}

## How to Run Reviews

For each review task listed above, launch it as a parallel OpenClaw sub-agent with `sessions_spawn`.

- Spawn every listed review before waiting for any completion event.
- Use each instruction path exactly as written, relative to the workspace root. Do not rewrite it as an absolute host path.
- Do not set `timeoutSeconds` on these review spawns; let the runtime default apply. If the tool requires a timeout value, use `0`.
- After all spawns are accepted, use `sessions_yield` to wait for completion events before continuing.

## After Reviews

For any failing reviews, if you believe the issue is invalid, then you can call `mark_review_as_passed` on it. Otherwise, you should act on any feedback from the review to fix the issues. Once done, call `finished_step` again to see if you will pass now."""

    return f"""Quality reviews are required before this step can advance.

{review_output}
35 changes: 32 additions & 3 deletions src/deepwork/jobs/mcp/roots.py
@@ -1,10 +1,16 @@
"""MCP root resolution via listRoots client capability.

Resolves the project root dynamically by asking the MCP client for its
filesystem roots. When ``--path`` is explicitly passed on the CLI the
resolver always returns that path. Otherwise it calls ``ctx.list_roots()``
filesystem roots. When ``--path`` is explicitly passed on the CLI the
resolver always returns that path. Otherwise it calls ``ctx.list_roots()``
on every tool invocation so it tracks workspace changes (e.g. git worktree
switches) without caching stale values.

For OpenClaw bundle installs, the MCP server can be launched from the plugin
bundle directory itself (for example ``plugins/openclaw``) when the host does
not expose a usable ``listRoots`` capability. In that case we normalize the
bundle directory back to the enclosing workspace root when we can detect
OpenClaw workspace markers.
"""

from __future__ import annotations
@@ -19,6 +25,9 @@

logger = logging.getLogger("deepwork.jobs.mcp")

_OPENCLAW_PLUGIN_MARKER = Path(".codex-plugin") / "plugin.json"
_OPENCLAW_WORKSPACE_MARKER = Path(".openclaw") / "workspace-state.json"


async def resolve_project_root(ctx: Context, fallback: Path) -> Path:
    """Ask the MCP client for its filesystem root.
@@ -78,4 +87,24 @@ async def get_root(self, ctx: Context) -> Path:
        """
        if self._explicit:
            return self._fallback
        return await resolve_project_root(ctx, self._fallback)
        candidate = await resolve_project_root(ctx, self._fallback)
        return _normalize_openclaw_bundle_root(candidate)


def _normalize_openclaw_bundle_root(candidate: Path) -> Path:
    """Map an OpenClaw plugin bundle path back to the workspace root."""

    resolved = candidate.resolve()
    if not (resolved / _OPENCLAW_PLUGIN_MARKER).exists():
        return resolved

    for ancestor in (resolved, *resolved.parents):
        if (ancestor / _OPENCLAW_WORKSPACE_MARKER).exists():
            logger.debug(
                "Normalized OpenClaw plugin bundle root %s to workspace root %s",
                resolved,
                ancestor,
            )
            return ancestor

    return resolved
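
The normalization above can be exercised against a throwaway directory tree. The following is a standalone sketch, not the module itself: it re-implements the same ancestor walk under a different name (`normalize_bundle_root`), omits logging, and assumes a `workspace/plugins/openclaw` layout purely for illustration.

```python
import tempfile
from pathlib import Path

# Marker paths copied from the diff above.
PLUGIN_MARKER = Path(".codex-plugin") / "plugin.json"
WORKSPACE_MARKER = Path(".openclaw") / "workspace-state.json"


def normalize_bundle_root(candidate: Path) -> Path:
    """Standalone copy of the ancestor walk, without logging."""
    resolved = candidate.resolve()
    if not (resolved / PLUGIN_MARKER).exists():
        return resolved  # not a plugin bundle, leave untouched
    for ancestor in (resolved, *resolved.parents):
        if (ancestor / WORKSPACE_MARKER).exists():
            return ancestor
    return resolved  # bundle without a detectable workspace


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: the bundle lives two levels below the workspace root.
    workspace = Path(tmp).resolve() / "workspace"
    bundle = workspace / "plugins" / "openclaw"
    for marker in (workspace / WORKSPACE_MARKER, bundle / PLUGIN_MARKER):
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.write_text("{}")

    bundle_maps_to_workspace = normalize_bundle_root(bundle) == workspace
    workspace_unchanged = normalize_bundle_root(workspace) == workspace

print(bundle_maps_to_workspace, workspace_unchanged)
```

Because the walk starts at the candidate itself, a directory that carries both markers normalizes to itself, and a path without the plugin marker is returned unchanged apart from `resolve()`.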
94 changes: 94 additions & 0 deletions src/deepwork/jobs/mcp/schemas.py
@@ -144,6 +144,31 @@ class FinishedStepInput(BaseModel):
    )


class ValidateStepOutputsInput(BaseModel):
    """Input for validate_step_outputs tool."""

    outputs: dict[str, ArgumentValue] = Field(
        description=(
            "Map of planned step output names to values. "
            "Validation uses the active step's declared output contract without "
            "advancing the workflow or running quality reviews."
        )
    )
    session_id: str = Field(
        description=(
            "The persistent DeepWork session ID for the current host session. "
            "In Claude Code this is CLAUDE_CODE_SESSION_ID."
        ),
    )
    agent_id: str | None = Field(
        default=None,
        description=(
            "Optional host-specific agent identifier for agent-scoped workflow state. "
            "In Claude Code this is CLAUDE_CODE_AGENT_ID."
        ),
    )


class AbortWorkflowInput(BaseModel):
    """Input for abort_workflow tool."""

@@ -182,6 +207,24 @@ class GoToStepInput(BaseModel):
    )


class GetActiveWorkflowInput(BaseModel):
    """Input for get_active_workflow tool."""

    session_id: str = Field(
        description=(
            "The persistent DeepWork session ID for the current host session. "
            "In Claude Code this is CLAUDE_CODE_SESSION_ID."
        ),
    )
    agent_id: str | None = Field(
        default=None,
        description=(
            "Optional host-specific agent identifier for agent-scoped workflow state. "
            "In Claude Code this is CLAUDE_CODE_AGENT_ID."
        ),
    )


# =============================================================================
# Tool Output Models
# NOTE: Changes to these models affect MCP tool return types.
@@ -320,6 +363,23 @@ class FinishedStepResponse(BaseModel):
    )


class ValidateStepOutputsResponse(BaseModel):
    """Response from validate_step_outputs tool."""

    valid: bool = Field(description="Whether the submitted outputs satisfy the active step contract")
    errors: list[str] = Field(
        default_factory=list,
        description="Validation errors that must be fixed before calling finished_step",
    )
    current_step: ActiveStepInfo = Field(
        description="The current step, including the declared expected outputs",
    )
    stack: list[StackEntry] = Field(
        default_factory=list,
        description="Current workflow stack after validation",
    )


class AbortWorkflowResponse(BaseModel):
    """Response from abort_workflow tool."""

@@ -349,6 +409,40 @@ class GoToStepResponse(BaseModel):
    )


class ActiveWorkflowState(BaseModel):
    """Current active workflow session details."""

    job_name: str = Field(description="Name of the active job")
    workflow_name: str = Field(description="Name of the active workflow")
    goal: str = Field(description="Goal originally supplied when the workflow started")
    started_at: str = Field(description="ISO timestamp when the workflow started")
    step_number: int = Field(description="1-based index of the current step")
    total_steps: int = Field(description="Total number of steps in the workflow")
    completed_steps: list[str] = Field(
        default_factory=list,
        description="Step IDs already completed in this workflow session",
    )
    current_step: ActiveStepInfo = Field(
        description="The active step and its current resolved instructions",
    )


class GetActiveWorkflowResponse(BaseModel):
    """Response from get_active_workflow tool."""

    has_active_workflow: bool = Field(
        description="Whether the given session currently has an active workflow"
    )
    stack: list[StackEntry] = Field(
        default_factory=list,
        description="Current workflow stack visible to this session/agent",
    )
    active_workflow: ActiveWorkflowState | None = Field(
        default=None,
        description="Details of the active workflow when one exists",
    )


# =============================================================================
# Session Job Models
# NOTE: These models support register_session_job / get_session_job tools.