diff --git a/README.md b/README.md index d6fcc66..a6d3048 100644 --- a/README.md +++ b/README.md @@ -42,15 +42,16 @@ arguments. The server exposes the following tools for interacting with the Enapter EMS: -| Tool | Description | -| --------------------------- | ---------------------------------------------------------------- | -| `search_sites` | Search among all sites with name and timezone regex filtering | -| `search_devices` | Search devices by site, type, and name regex filtering | -| `search_command_executions` | Search the history of command executions | -| `read_blueprint` | Access device blueprint sections (properties, telemetry, alerts) | -| `get_historical_telemetry` | Retrieve time-series telemetry with configurable granularity | -| `search_rules` | Search for automation rules within a specific site | -| `read_rule` | Read the paginated lines of a rule's Lua script | +| Tool | Description | Access | Default | +| --------------------------- | ---------------------------------------------------------------- | ---------- | -------- | +| `search_sites` | Search among all sites with name and timezone regex filtering | Read-only | Enabled | +| `search_devices` | Search devices by site, type, and name regex filtering | Read-only | Enabled | +| `search_command_executions` | Search the history of command executions | Read-only | Enabled | +| `read_blueprint` | Access device blueprint sections (properties, telemetry, alerts) | Read-only | Enabled | +| `get_historical_telemetry` | Retrieve time-series telemetry with configurable granularity | Read-only | Enabled | +| `search_rules` | Search for automation rules within a specific site | Read-only | Enabled | +| `read_rule` | Read the paginated lines of a rule's Lua script | Read-only | Enabled | +| `execute_command` | Execute a command on a device | Read-write | Disabled | ## Usage Examples @@ -114,6 +115,36 @@ Here are realistic examples of how you can interact with your Enapter devices us - Retrieves the rule's Lua script using `read_rule` - Analyzes the logic and confirms the exact threshold and conditions that trigger the electrolyser +### Example 5: Executing a Command (Human Confirmation Required) + +> ⚠️ `execute_command` is **destructive** — it acts on real physical hardware +> (pumps, electrolysers, valves, inverters). It is **disabled by default**. Enable +> it with `--command-execution-enabled` on the command line or by setting +> `ENAPTER_COMMAND_EXECUTION_ENABLED=1`. + +**User prompt:** + +> The electrolyser at the Alpha site has been running for a long time. Please +> reboot it for me. + +**What happens:** + +- Server locates the electrolyser device using `search_devices` +- Reads the device blueprint with `read_blueprint(section="commands")` to + discover the `reboot` command and checks whether it declares a `confirmation` + block +- The `reboot` command declares a `confirmation` with a `title` and + `description` (e.g. _"Reboot the electrolyser"_ / _"This will restart the + device and interrupt production."_), so the assistant **presents these to the + human and waits for explicit approval** — it does not act on its own initiative +- Only after the human confirms does the assistant call `execute_command` with + `human_confirmed_this_action=True` +- The device runs the command and the tool returns the resulting + `CommandExecution`, whose `state` field reports the outcome + (`success`/`error`/`timeout`/`unsync`) +- The returned execution `id` can later be referenced or audited via + `search_command_executions` + ## Support For issues, questions, or contributions, please: diff --git a/specs/SPEC-007-command-execution.md b/specs/SPEC-007-command-execution.md new file mode 100644 index 0000000..18f6b97 --- /dev/null +++ b/specs/SPEC-007-command-execution.md @@ -0,0 +1,293 @@ +# SPEC-007: Command Execution + +## Context + +The MCP server has so far been read-only: every tool is registered with +`readOnlyHint=True` and only reads sites, devices, telemetry, blueprints, rules, +and the command-execution *history* (`search_command_executions`). The upstream +Enapter API can already execute commands on physical devices, and the +access-control model (SPEC-001/002/003) already defines who may execute what. +What is missing is a tool that actually executes a command — and because Enapter +devices are real-world energy hardware (pumps, electrolysers, valves, inverters), +acting on the world has physical consequences. This spec adds command execution +behind layered defenses so the server can act safely. + +The upstream `enapter.http.api.commands.Client` exposes two execution +primitives: + +- `execute(device_id, name, arguments)` — **blocking**: returns when the command + completes or the upstream's own timeout fires (which yields the terminal + `timeout` state). +- `create_execution(device_id, name, arguments)` — **fire-and-forget**: returns a + tracked execution immediately; pollable via `get_execution`. + +We use the blocking `execute` primitive — one call that returns when the command +completes. This is deliberately simple. The fire-and-forget `create_execution` +plus a poll loop (and with it, progress logging) is a possible future evolution, +deferred until production usage justifies the complexity (see Decision 8). + +## Architectural Decisions + +### 1. Opt-in kill switch, default OFF + +Command execution is dangerous, so the `execute_command` tool is registered +**only** when the server is explicitly configured to allow it +(`ServerConfig.command_execution_enabled`, driven by +`ENAPTER_COMMAND_EXECUTION_ENABLED` / `--command-execution-enabled`, default +`"0"`). All existing deployments stay read-only with zero changes. This is the +reliable, deployment-level off switch — it does not depend on client behavior. + +### 2. Single synchronous tool using the blocking `execute` primitive + +`execute_command` is one tool. It calls +`commands.execute(device_id, name, arguments)`, which blocks until the command +finishes or the upstream's own timeout fires (yielding the terminal `timeout` +state), then returns the `Execution`, which we map to `CommandExecution` and +return. There is no `wait` argument, no second tool, no server-side poll loop, no +`ctx` parameter, and no progress logging. We keep v1 deliberately simple and will +revisit based on production usage. + +The backend owns the command-level timeout; the MCP client owns the HTTP-call +timeout. If the client cancels or times out the call before it returns, the agent +recovers the outcome via the existing `search_command_executions` by +`device_id`/time (the agent has no `id` in that case). We make no claim about +whether the device command continues after a client cancellation — that is the +upstream's behaviour, not ours. + +### 3. Return `CommandExecution`; let the SDK raise + +Three cases, cleanly separating "we refused / the call failed to submit" (raise) +from "the command ran, here is its outcome" (a result, possibly an error state): + +- **Unknown command name** (Decision 7): `command_name` is not declared in the + device manifest → we raise **before** calling `execute`; no execution is + created. +- **Confirmation gate refuses** (Decision 6): the command declares `confirmation` + and `human_confirmed_this_action` is `False` → we raise **before** calling + `execute`; no execution is created. +- **`execute` fails** (e.g. upstream 403 for insufficient role, network error) → + the SDK raises on its own and the exception propagates to the agent. We do not + catch it. +- **`execute` returns** → map the `Execution` to `CommandExecution` and return + it. The `state` field communicates the outcome (`success`/`error`/`timeout`/ + `unsync`), consistent with how `search_command_executions` already returns + items of any state — device error/timeout states are **values, not + exceptions**. The returned `id` is the handle the agent needs to reference, + audit, or follow up the execution. + +### 4. No argument validation in v1 + +Arguments are passed through to the device unchecked beyond what the upstream +API enforces. The agent discovers argument names/types via +`read_blueprint(section="commands")`; the device validates authoritatively. +Local validation, numeric bounds, and `default`/`sensitive` handling are +deferred. Command-*name* existence is checked against the manifest — that is a +structural check, not argument validation (see Decision 7). + +### 5. RBAC unchanged; defense in depth + +Access control is not re-implemented. The upstream API enforces +`authorized_role >= command.access_level` and returns 403 on insufficient +privilege, which the SDK raises and which propagates to the agent (Decision 3); +the MCP server never bypasses or pre-authorizes. The tool is registered with +`readOnlyHint=False`, `destructiveHint=True` as an additional advisory signal to +clients. + +### 6. Server-enforced confirmation via `human_confirmed_this_action` + +Blueprint manifests may declare a per-command `confirmation` block +(`severity`/`title`/`description`) for commands the vendor considers +consequential. We expose this and enforce a server-side gate: + +- `execute_command` takes `human_confirmed_this_action: bool` (default `False`). +- The server fetches the manifest and resolves the command declaration **itself** + (it does not trust the agent's representation of whether confirmation is + required). If the resolved command declares `confirmation` **and** + `human_confirmed_this_action` is `False`, the tool **raises before `execute`** + — no execution is created. The error message includes the `confirmation`'s + `title`/`description` so the agent knows what to ask. +- For commands without a `confirmation` block, the flag is ignored (the default + `False` is fine). + +The parameter uses **`human`, not `user`**, deliberately. In auth contexts +"user" is the authenticated principal, which an LLM can map onto *itself* (it +holds the token, it is the API caller). `human` forces the disambiguation: a +flesh-and-blood person must have confirmed. The name is the social-engineering +lever — well-behaved agents read the attestation, ask the human via their native +conversational ability (presenting the `title`/`description`), and only then set +the flag. The `CommandDeclaration` and `execute_command` docstrings spell this +out: check the declaration via `read_blueprint`; for a confirmation-declared +command, obtain a human's approval before setting `human_confirmed_this_action` — +the name reads as a claim the caller is making, which is the friction we want. + +This is an **enforced assertion, not a guarantee.** The server verifies the flag +was set for confirmation-declared commands; it cannot verify that a human +actually confirmed. An ill-behaved agent can set the flag without asking, and +nothing in v1 prevents that. A server-enforced `ctx.elicit` round-trip — which +actually reaches a human through the client — is the planned follow-up (Decision +7) to close that gap; elicitation is deferred because client support for it is +currently poor. + +`confirmation` is captured on `CommandDeclaration` (domain and MCP), mapped by +`EnapterDataMapper` from the manifest when present (otherwise `None`), and +surfaced via `read_blueprint(section="commands")`. + +### 7. Command-name existence check (not argument validation) + +The server resolves the command declaration from the device manifest itself +(the same fetch the confirmation gate uses — Decision 6). If the resolved +`command_name` is not declared in the manifest, the tool raises **before** +calling `execute` — no execution is created — and the error message names the +available commands so the agent can recover (correct a typo or hallucination). + +This is a structural check against the device's own blueprint, not a permission +check: it does not duplicate or drift from upstream RBAC (Decision 5), and it is +distinct from the *argument* validation (`min`/`max`, type/enum/required) that +Decision 4 explicitly defers. The manifest is authoritative for which commands a +device exposes, so failing fast locally is safe and avoids a wasted round-trip. +A command present in the manifest without a `confirmation` block still proceeds +at the default `human_confirmed_this_action=False` (Decision 6). + +### 8. Deferred to a follow-up spec + +The following are intentionally **out of scope for this spec**: + +- **Server-enforced `ctx.elicit`** as a real human round-trip on commands that + declare `confirmation` — closes the "agent can lie" gap left by the + `human_confirmed_this_action` flag (Decision 6). Deferred due to poor current + client support for elicitation. +- **An async execution model** (`create_execution` + poll), **progress logging**, + and/or **a separate wait tool** (or `wait` argument) to stream progress and + handle long commands without a single blocking call — deferred until production + usage shows it is needed. +- **Local argument validation** (`min`/`max`, type/enum/required checks). + +The v1 defenses are therefore: the opt-in kill switch, upstream RBAC, +`destructiveHint=True`, the `human_confirmed_this_action` gate plus docstring +guidance, and the execution audit trail (recoverable via +`search_command_executions`). + +## Constraints + +- Do **not** change the semantics or annotations of the existing seven tools; + they remain read-only (`readOnlyHint=True`). +- Do **not** change the default behavior of the server: command execution stays + OFF unless explicitly enabled. +- Do **not** bypass, weaken, or re-implement upstream RBAC. The SDK's 403 + propagates; do not pre-check permissions in a way that could drift from the + server. +- Do **not** implement server-side elicitation or validate arguments in this + spec (both deferred — see Decision 8). In particular, do not add `min`/`max` to + command argument declarations. (Capturing and exposing the `confirmation` block + and the `human_confirmed_this_action` gate **are** in scope — see Decision 6.) +- Do **not** add rate limiting (deferred). +- Do **not** add a `wait`/`timeout` argument, a second execution tool, a `ctx` + parameter, or progress logging to `execute_command`. It is one blocking tool + that uses `commands.execute` directly (not `create_execution` + polling). These + are deferred per Decision 8. +- Do **not** catch the SDK's exceptions for `execute` (e.g. 403, network); let + them propagate. +- The confirmation parameter is named exactly `human_confirmed_this_action` (the + `human` framing is intentional — see Decision 6). +- The wire format of existing models is unchanged except for the additive, + optional `confirmation` field on command declarations. + +## Acceptance Criteria + +1. **Kill switch (off).** With `command_execution_enabled` falsy (the default), + `tools/list` returns exactly the seven existing tools and does not include + `execute_command`. + +2. **Kill switch (on).** With `command_execution_enabled` truthy, `tools/list` + returns eight tools including `execute_command`. + +3. **CLI/config wiring.** `--command-execution-enabled` (choices `0`/`1`, + default from `ENAPTER_COMMAND_EXECUTION_ENABLED`, default `"0"`) sets + `ServerConfig.command_execution_enabled`, which controls criteria 1 and 2. + +4. **Tool annotations.** `execute_command` is registered with + `readOnlyHint=False`, `destructiveHint=True`, and a non-empty `title`. The + seven existing tools remain `readOnlyHint=True`. + +5. **Tool input schema.** `execute_command` accepts `device_id: str` (required), + `command_name: str` (required), `arguments: object | null` (optional, + `default: null`; `null`/omitted means no arguments — the upstream SDK + normalizes `null` to `{}`), and `human_confirmed_this_action: bool` + (default `False`) — and no other parameters. + `tests/integration/schemas/execute_command.json` is committed and asserts this + shape and the exact parameter name. + +6. **`confirmation` enrichment.** `domain.CommandDeclaration` has an optional + `confirmation` capturing `severity`, `title`, and `description`. The MCP-layer + `CommandDeclaration` mirrors it. `EnapterDataMapper` maps it from the manifest + when present and leaves it `None` when absent. + +7. **`confirmation` exposed.** `tests/integration/schemas/read_blueprint.json` is + regenerated and shows `confirmation` on command declarations. + +8. **Confirmation gate.** For a command whose manifest declaration includes a + `confirmation` block, `execute_command` with `human_confirmed_this_action=False` + raises **before** calling `execute` and the message includes the block's + `title`/`description`; with `human_confirmed_this_action=True` it proceeds. For + a command without a `confirmation` block, the flag is ignored and execution + proceeds at the default `False`. The server resolves the declaration from the + manifest itself. (Verifiable with a mock API: `execute` is **not** called on + refusal, and **is** called otherwise.) + +9. **Confirmation docstrings.** The `CommandDeclaration` model docstring and the + `execute_command` tool docstring direct the agent to obtain a **human's** + approval (presenting the `title`/`description`) before executing a command + whose `confirmation` is present, and to set `human_confirmed_this_action=True` + to attest it. Both docstrings appear in their committed schema snapshots. + +10. **Execution.** Once the gate passes, `core` calls + `commands.execute(device_id, command_name, arguments)`, which blocks until the + command completes or the upstream timeout fires, then maps the returned + `Execution` to `CommandExecution` and returns it. There is no poll loop and no + logging. + +11. **Return on completion.** When `execute` returns, the tool returns the full + MCP `CommandExecution` (including `id`, `state`, and `response_payload`). The + `state` field communicates the outcome for every terminal state, including + `error`/`timeout`/`unsync` (the tool does **not** raise for these — it returns + the `CommandExecution` with that state). + +12. **SDK errors propagate.** If `commands.execute` raises (e.g. upstream 403, + network error), the exception propagates to the agent unmodified (the tool + does not catch it). + +13. **Recovery after cancellation.** If the tool call is cancelled or times out + before returning, the agent can recover the outcome via + `search_command_executions` by `device_id`/time (an execution record is + created by `execute`; the agent has no `id` in this case). + +14. **RBAC surfacing.** An upstream authorization failure (403) reaches the agent + via the propagated SDK exception (criterion 12); the server does not pre-check + or bypass it. + +15. **Tests.** Unit tests cover: the unknown-command-name gate — a name not in + the manifest → raises and `execute` is not called; the confirmation gate — + confirmation-declared + flag `False` → raises and `execute` is not called; + confirmation-declared + flag `True` → proceeds; a command present in the + manifest without `confirmation` + flag `False` → proceeds; `execute` + returning a success `Execution` (returns a `CommandExecution` with + `response_payload`); `execute` returning an `error`/`timeout`/`unsync` + `Execution` (returns a `CommandExecution` with that state); and an `execute` + SDK raise propagating. Plus `EnapterDataMapper` mapping `confirmation` + (present → value, absent → `None`). The integration test still asserts exactly + seven tools under the default (disabled) config, and an enabled-config case + asserts eight tools and the `execute_command` schema snapshot. + +16. **Unknown command name.** For a `command_name` not present in the device + manifest, `execute_command` raises **before** calling `execute` (no execution + is created) and the message names the available commands. (Verifiable with a + mock API: `execute` is not called.) + +17. **README.** `README.md` is updated: the `execute_command` tool is added to + the "Available Tools" table (noting it is disabled by default and enabled via + `--command-execution-enabled` / `ENAPTER_COMMAND_EXECUTION_ENABLED`), and a + usage example demonstrates executing a command including the + human-confirmation flow (presenting the `confirmation` `title`/`description` + and setting `human_confirmed_this_action=True`). + +18. `make lint` and `make test` pass. diff --git a/src/enapter_mcp_server/cli/serve_command.py b/src/enapter_mcp_server/cli/serve_command.py index 4a300fb..19e0691 100644 --- a/src/enapter_mcp_server/cli/serve_command.py +++ b/src/enapter_mcp_server/cli/serve_command.py @@ -38,6 +38,7 @@ ) ENAPTER_OAUTH_PROXY_JWT_SIGNING_KEY = os.getenv("ENAPTER_OAUTH_PROXY_JWT_SIGNING_KEY") ENAPTER_CORS_ALLOW_ORIGINS = os.getenv("ENAPTER_CORS_ALLOW_ORIGINS") +ENAPTER_COMMAND_EXECUTION_ENABLED = os.getenv("ENAPTER_COMMAND_EXECUTION_ENABLED", "0") class ServeCommand(Command): @@ -125,6 +126,13 @@ def register(parent: Subparsers) -> None: default=ENAPTER_OAUTH_PROXY_JWT_SIGNING_KEY, help="Signing key for JWTs issued by OAuth proxy. Required if OAuth proxy is enabled", ) + parser.add_argument( + "--command-execution-enabled", + choices=["0", "1"], + default=ENAPTER_COMMAND_EXECUTION_ENABLED, + help="Enable the destructive `execute_command` tool (kill switch). " + "When disabled (the default), the tool is not registered at all", + ) @staticmethod async def run(args: argparse.Namespace) -> None: @@ -165,6 +173,7 @@ async def run(args: argparse.Namespace) -> None: oauth_proxy_config=oauth_proxy_config, logo_url=args.logo_url, cors_allow_origins=cors_allow_origins, + command_execution_enabled=args.command_execution_enabled == "1", ) async with asyncio.TaskGroup() as task_group: async with http.EnapterAPI( diff --git a/src/enapter_mcp_server/core/__init__.py b/src/enapter_mcp_server/core/__init__.py index ea47853..34ad38e 100644 --- a/src/enapter_mcp_server/core/__init__.py +++ b/src/enapter_mcp_server/core/__init__.py @@ -5,6 +5,8 @@ from .device_search_query import DeviceSearchQuery from .enapter_api import EnapterAPI from .errors import ( + CommandNotFound, + ConfirmationRequired, GatewayUnavailable, LatestTelemetryUnavailable, SearchQueryTooBroad, @@ -19,6 +21,8 @@ "ApplicationServer", "AuthConfig", "CommandExecutionSearchQuery", + "CommandNotFound", + "ConfirmationRequired", "DeviceDTO", "DeviceSearchQuery", "EnapterAPI", diff --git a/src/enapter_mcp_server/core/application_server.py b/src/enapter_mcp_server/core/application_server.py index 229299e..eb9472c 100644 --- a/src/enapter_mcp_server/core/application_server.py +++ b/src/enapter_mcp_server/core/application_server.py @@ -1,6 +1,7 @@ import asyncio import datetime import re +from typing import Any from enapter_mcp_server import domain @@ -9,7 +10,12 @@ from .device_dto import DeviceDTO from .device_search_query import DeviceSearchQuery from .enapter_api import EnapterAPI -from .errors import GatewayUnavailable, SearchQueryTooBroad +from .errors import ( + CommandNotFound, + ConfirmationRequired, + GatewayUnavailable, + SearchQueryTooBroad, +) from .rule_search_query import RuleSearchQuery from .site_dto import SiteDTO from .site_search_query import SiteSearchQuery @@ -310,6 +316,44 @@ async def read_blueprint( entities.sort(key=key) return entities[offset : offset + limit] + async def execute_command( + self, + auth: AuthConfig, + device_id: str, + command_name: str, + arguments: dict[str, Any] | None = None, + human_confirmed_this_action: bool = False, + ) -> domain.CommandExecution: + commands = await self._resolve_manifest_commands(auth, device_id) + declaration = commands.get(command_name) + if declaration is None: + available = ", ".join(sorted(commands)) or "(none)" + raise CommandNotFound( + f"Command {command_name!r} is not declared in the manifest of" + f" device {device_id!r}. Available commands: {available}." + ) + if declaration.confirmation is not None and not human_confirmed_this_action: + confirmation = declaration.confirmation + title = confirmation.title or declaration.display_name + description = confirmation.description + raise ConfirmationRequired( + f"Command {command_name!r} on device {device_id!r} requires human" + f" confirmation before execution. Title: {title}." + f" Description: {description}." + ) + return await self._enapter_api.execute_command( + auth, device_id, command_name, arguments + ) + + async def _resolve_manifest_commands( + self, auth: AuthConfig, device_id: str + ) -> dict[str, domain.CommandDeclaration]: + device_dto = await self._enapter_api.get_device( + auth, device_id, expand_manifest=True + ) + assert device_dto.manifest is not None + return device_dto.manifest.commands + async def get_historical_telemetry( self, auth: AuthConfig, diff --git a/src/enapter_mcp_server/core/enapter_api.py b/src/enapter_mcp_server/core/enapter_api.py index f8c582e..75e8629 100644 --- a/src/enapter_mcp_server/core/enapter_api.py +++ b/src/enapter_mcp_server/core/enapter_api.py @@ -53,6 +53,14 @@ async def get_device( expand_active_alerts: bool = False, ) -> DeviceDTO: ... + async def execute_command( + self, + auth: AuthConfig, + device_id: str, + command_name: str, + arguments: dict[str, Any] | None, + ) -> domain.CommandExecution: ... + @enapter.async_.generator async def list_command_executions( self, diff --git a/src/enapter_mcp_server/core/errors.py b/src/enapter_mcp_server/core/errors.py index 1b94dcf..70b7916 100644 --- a/src/enapter_mcp_server/core/errors.py +++ b/src/enapter_mcp_server/core/errors.py @@ -8,3 +8,11 @@ class SearchQueryTooBroad(Exception): class GatewayUnavailable(Exception): pass + + +class ConfirmationRequired(Exception): + pass + + +class CommandNotFound(Exception): + pass diff --git a/src/enapter_mcp_server/domain/__init__.py b/src/enapter_mcp_server/domain/__init__.py index a8a7157..797a4fd 100644 --- a/src/enapter_mcp_server/domain/__init__.py +++ b/src/enapter_mcp_server/domain/__init__.py @@ -5,6 +5,7 @@ from .blueprint_section import BlueprintSection from .blueprint_summary import BlueprintSummary from .command_argument_declaration import CommandArgumentDeclaration +from .command_confirmation import CommandConfirmation from .command_declaration import CommandDeclaration from .command_execution import CommandExecution from .command_execution_state import CommandExecutionState @@ -33,6 +34,7 @@ "BlueprintSection", "BlueprintSummary", "CommandArgumentDeclaration", + "CommandConfirmation", "CommandDeclaration", "CommandExecution", "CommandExecutionState", diff --git a/src/enapter_mcp_server/domain/command_confirmation.py b/src/enapter_mcp_server/domain/command_confirmation.py new file mode 100644 index 0000000..f5f506d --- /dev/null +++ b/src/enapter_mcp_server/domain/command_confirmation.py @@ -0,0 +1,8 @@ +import dataclasses + + +@dataclasses.dataclass(frozen=True, kw_only=True) +class CommandConfirmation: + severity: str | None = None + title: str | None = None + description: str | None = None diff --git a/src/enapter_mcp_server/domain/command_declaration.py b/src/enapter_mcp_server/domain/command_declaration.py index 9d27f82..9d03141 100644 --- a/src/enapter_mcp_server/domain/command_declaration.py +++ b/src/enapter_mcp_server/domain/command_declaration.py @@ -2,6 +2,7 @@ from .access_role import AccessRole from .command_argument_declaration import CommandArgumentDeclaration +from .command_confirmation import CommandConfirmation @dataclasses.dataclass(frozen=True, kw_only=True) @@ -12,3 +13,4 @@ class CommandDeclaration: description: str | None arguments: list[CommandArgumentDeclaration] implements: list[str] + confirmation: CommandConfirmation | None = None diff --git a/src/enapter_mcp_server/http/enapter_api.py b/src/enapter_mcp_server/http/enapter_api.py index c3267b7..66398ef 100644 --- a/src/enapter_mcp_server/http/enapter_api.py +++ b/src/enapter_mcp_server/http/enapter_api.py @@ -119,6 +119,19 @@ async def get_device( ) return self._data_mapper.to_device_dto(device) + async def execute_command( + self, + auth: core.AuthConfig, + device_id: str, + command_name: str, + arguments: dict[str, Any] | None, + ) -> domain.CommandExecution: + async with self._new_client(auth) as client: + execution = await client.commands.execute( + device_id, command_name, arguments + ) + return self._data_mapper.to_command_execution(execution) + @enapter.async_.generator async def list_command_executions( self, diff --git a/src/enapter_mcp_server/http/enapter_data_mapper.py b/src/enapter_mcp_server/http/enapter_data_mapper.py index 200458f..66a2612 100644 --- a/src/enapter_mcp_server/http/enapter_data_mapper.py +++ b/src/enapter_mcp_server/http/enapter_data_mapper.py @@ -124,6 +124,18 @@ def to_command_declaration( for arg_name, arg_dto in (dto.get("arguments") or {}).items() ], implements=dto.get("implements") or [], + confirmation=self.to_command_confirmation(dto.get("confirmation")), + ) + + def to_command_confirmation( + self, dto: dict[str, Any] | None + ) -> domain.CommandConfirmation | None: + if dto is None: + return None + return domain.CommandConfirmation( + severity=dto.get("severity"), + title=dto.get("title"), + description=dto.get("description"), ) def to_command_argument_declaration( diff --git a/src/enapter_mcp_server/mcp/models/__init__.py b/src/enapter_mcp_server/mcp/models/__init__.py index 12014e7..38427c7 100644 --- a/src/enapter_mcp_server/mcp/models/__init__.py +++ b/src/enapter_mcp_server/mcp/models/__init__.py @@ -5,6 +5,7 @@ from .blueprint_section import BlueprintSection from .blueprint_summary import BlueprintSummary from .command_argument_declaration import CommandArgumentDeclaration +from .command_confirmation import CommandConfirmation from .command_declaration import CommandDeclaration from .command_execution import CommandExecution from .command_execution_state import CommandExecutionState @@ -32,6 +33,7 @@ "BlueprintSection", "BlueprintSummary", "CommandArgumentDeclaration", + "CommandConfirmation", "CommandDeclaration", "CommandExecution", "CommandExecutionState", diff --git a/src/enapter_mcp_server/mcp/models/command_confirmation.py b/src/enapter_mcp_server/mcp/models/command_confirmation.py new file mode 100644 index 0000000..54f464d --- /dev/null +++ b/src/enapter_mcp_server/mcp/models/command_confirmation.py @@ -0,0 +1,21 @@ +from typing import Self + +import pydantic + +from enapter_mcp_server import domain + + +class CommandConfirmation(pydantic.BaseModel): + """Vendor-declared confirmation block for a consequential command.""" + + severity: str | None = None + title: str | None = None + description: str | None = None + + @classmethod + def from_domain(cls, confirmation: domain.CommandConfirmation) -> Self: + return cls( + severity=confirmation.severity, + title=confirmation.title, + description=confirmation.description, + ) diff --git a/src/enapter_mcp_server/mcp/models/command_declaration.py b/src/enapter_mcp_server/mcp/models/command_declaration.py index 95ca097..c053683 100644 --- a/src/enapter_mcp_server/mcp/models/command_declaration.py +++ b/src/enapter_mcp_server/mcp/models/command_declaration.py @@ -6,14 +6,15 @@ from .access_role import AccessRole from .command_argument_declaration import CommandArgumentDeclaration +from .command_confirmation import CommandConfirmation class CommandDeclaration(pydantic.BaseModel): """A declaration of a device command. - The `access_level` field defines the minimum role required to execute - this command. A user can execute this command only if their - `authorized_role` for the device is at or after this `access_level`. + The `access_level` field defines the minimum role required to execute this command. A user can execute this command only if their `authorized_role` for the device is at or after this `access_level`. + + The `confirmation` field, when present, marks the command as consequential per the vendor's declaration. """ name: str @@ -22,6 +23,7 @@ class CommandDeclaration(pydantic.BaseModel): description: str | None arguments: list[CommandArgumentDeclaration] implements: list[str] + confirmation: CommandConfirmation | None = None @classmethod def from_domain(cls, declaration: domain.CommandDeclaration) -> Self: @@ -34,4 +36,9 @@ def from_domain(cls, declaration: domain.CommandDeclaration) -> Self: CommandArgumentDeclaration.from_domain(a) for a in declaration.arguments ], implements=declaration.implements, + confirmation=( + CommandConfirmation.from_domain(declaration.confirmation) + if declaration.confirmation is not None + else None + ), ) diff --git a/src/enapter_mcp_server/mcp/server.py b/src/enapter_mcp_server/mcp/server.py index ff5c68c..249da4b 100644 --- a/src/enapter_mcp_server/mcp/server.py +++ b/src/enapter_mcp_server/mcp/server.py @@ -1,6 +1,7 @@ import asyncio import datetime import urllib.parse +from typing import Any import enapter import fastmcp @@ -120,6 +121,16 @@ def _register_tools(self, fastmcp_server: fastmcp.FastMCP) -> None: ), ) + if self._config.command_execution_enabled: + fastmcp_server.tool( + self.execute_command, + annotations=mcp.types.ToolAnnotations( + readOnlyHint=False, + destructiveHint=True, + title="Execute Command", + ), + ) + def _new_middleware(self) -> list[starlette.middleware.Middleware]: middleware = [] if self._config.cors_allow_origins is not None: @@ -389,6 +400,35 @@ async def search_command_executions( ) return [models.CommandExecution.from_domain(e) for e in executions] + async def execute_command( + self, + device_id: str, + command_name: str, + arguments: dict[str, Any] | None = None, + human_confirmed_this_action: bool = False, + ) -> models.CommandExecution: + """Execute a command on a specific device and return its outcome. + + This tool performs a real-world action on physical energy hardware. It is destructive: use it only when acting is intended. + + The `state` field of the returned `CommandExecution` communicates the outcome. If the underlying call fails to submit (e.g. insufficient role or a network error), the exception propagates. + + Some commands declare a `confirmation` block because they are consequential. Before executing such a command you MUST obtain a human's explicit approval through a structured form: present the device, the command, and the confirmation's `title`/`description` as a prompt with discrete choices whose options include exactly one explicit "approve" choice, then wait for the human to select it. Approval counts ONLY when the returned answer exactly matches that approve option. Never infer approval from free-text conversation ("yes", "maybe", "sure", "I think so", silence, or any other reply), and never accept a typed/free-text answer even when the form permits one instead of a selection — if the response is anything other than an exact selection of the approve option, re-present the form or do not execute. Only after such a form approval may you set `human_confirmed_this_action=True` to attest it. For commands without a `confirmation` block, the flag is ignored. + + Related tools: + - `read_blueprint`: Use `section="commands"` to discover a device's commands, arguments, and `confirmation` blocks. + - `search_command_executions`: Audit past executions or recover the outcome of a cancelled/timed-out call. + """ + auth = await self._get_auth_config() + execution = await self._app.execute_command( + auth=auth, + device_id=device_id, + command_name=command_name, + arguments=arguments, + human_confirmed_this_action=human_confirmed_this_action, + ) + return models.CommandExecution.from_domain(execution) + async def get_historical_telemetry( self, device_id: str, diff --git a/src/enapter_mcp_server/mcp/server_config.py b/src/enapter_mcp_server/mcp/server_config.py index 8e2fd8e..f665494 100644 --- a/src/enapter_mcp_server/mcp/server_config.py +++ b/src/enapter_mcp_server/mcp/server_config.py @@ -12,6 +12,7 @@ class ServerConfig: oauth_proxy_config: OAuthProxyConfig | None = None logo_url: str | None = None cors_allow_origins: list[str] | None = None + command_execution_enabled: bool = False @property def address(self) -> str: diff --git a/tests/integration/schemas/execute_command.json b/tests/integration/schemas/execute_command.json new file mode 100644 index 0000000..bfc4bc7 --- /dev/null +++ b/tests/integration/schemas/execute_command.json @@ -0,0 +1,112 @@ +{ + "annotations": { + "destructiveHint": true, + "idempotentHint": null, + "openWorldHint": null, + "readOnlyHint": false, + "title": "Execute Command" + }, + "description": "Execute a command on a specific device and return its outcome.\n\nThis tool performs a real-world action on physical energy hardware. It is destructive: use it only when acting is intended.\n\nThe `state` field of the returned `CommandExecution` communicates the outcome. If the underlying call fails to submit (e.g. insufficient role or a network error), the exception propagates.\n\nSome commands declare a `confirmation` block because they are consequential. Before executing such a command you MUST obtain a human's explicit approval through a structured form: present the device, the command, and the confirmation's `title`/`description` as a prompt with discrete choices whose options include exactly one explicit \"approve\" choice, then wait for the human to select it. Approval counts ONLY when the returned answer exactly matches that approve option. Never infer approval from free-text conversation (\"yes\", \"maybe\", \"sure\", \"I think so\", silence, or any other reply), and never accept a typed/free-text answer even when the form permits one instead of a selection \u2014 if the response is anything other than an exact selection of the approve option, re-present the form or do not execute. Only after such a form approval may you set `human_confirmed_this_action=True` to attest it. For commands without a `confirmation` block, the flag is ignored.\n\nRelated tools:\n- `read_blueprint`: Use `section=\"commands\"` to discover a device's commands, arguments, and `confirmation` blocks.\n- `search_command_executions`: Audit past executions or recover the outcome of a cancelled/timed-out call.", + "execution": null, + "icons": null, + "inputSchema": { + "additionalProperties": false, + "properties": { + "arguments": { + "anyOf": [ + { + "additionalProperties": true, + "type": "object" + }, + { + "type": "null" + } + ], + "default": null + }, + "command_name": { + "type": "string" + }, + "device_id": { + "type": "string" + }, + "human_confirmed_this_action": { + "default": false, + "type": "boolean" + } + }, + "required": [ + "device_id", + "command_name" + ], + "type": "object" + }, + "meta": { + "fastmcp": { + "tags": [] + } + }, + "name": "execute_command", + "outputSchema": { + "description": "Represents a command execution history record.", + "properties": { + "arguments": { + "anyOf": [ + { + "additionalProperties": true, + "type": "object" + }, + { + "type": "null" + } + ], + "default": null + }, + "command_name": { + "type": "string" + }, + "created_at": { + "format": "date-time", + "type": "string" + }, + "device_id": { + "type": "string" + }, + "id": { + "type": "string" + }, + "response_payload": { + "anyOf": [ + { + "additionalProperties": true, + "type": "object" + }, + { + "type": "null" + } + ], + "default": null + }, + "state": { + "enum": [ + "new", + "in_progress", + "success", + "error", + "timeout", + "unsync" + ], + "type": "string" + } + }, + "required": [ + "id", + "device_id", + "command_name", + "state", + "created_at" + ], + "type": "object" + }, + "title": "Execute Command" +} \ No newline at end of file diff --git a/tests/integration/schemas/read_blueprint.json b/tests/integration/schemas/read_blueprint.json index f784dea..1d24342 100644 --- a/tests/integration/schemas/read_blueprint.json +++ b/tests/integration/schemas/read_blueprint.json @@ -305,7 +305,7 @@ "type": "object" }, { - "description": "A declaration of a device command.\n\nThe `access_level` field defines the minimum role required to execute\nthis command. A user can execute this command only if their\n`authorized_role` for the device is at or after this `access_level`.", + "description": "A declaration of a device command.\n\nThe `access_level` field defines the minimum role required to execute this command. A user can execute this command only if their `authorized_role` for the device is at or after this `access_level`.\n\nThe `confirmation` field, when present, marks the command as consequential per the vendor's declaration.", "properties": { "access_level": { "description": "Roles in priority order: readonly < user < owner < installer < vendor < system.", @@ -379,6 +379,53 @@ }, "type": "array" }, + "confirmation": { + "anyOf": [ + { + "description": "Vendor-declared confirmation block for a consequential command.", + "properties": { + "description": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null + }, + "severity": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null + }, + "title": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null + } + }, + "type": "object" + }, + { + "type": "null" + } + ], + "default": null + }, "description": { "anyOf": [ { diff --git a/tests/integration/test_mcp.py b/tests/integration/test_mcp.py index d8e1cb9..bbeae3c 100644 --- a/tests/integration/test_mcp.py +++ b/tests/integration/test_mcp.py @@ -8,6 +8,24 @@ from enapter_mcp_server import core, http, mcp +def _assert_schema(name: str, actual: dict[str, Any]) -> None: + schema_dir: pathlib.Path = pathlib.Path(__file__).parent / "schemas" + schema_dir.mkdir(exist_ok=True) + schema_path: pathlib.Path = schema_dir / f"{name}.json" + + if os.getenv("UPDATE_SCHEMAS") or not schema_path.exists(): + schema_path.write_text( + json.dumps(actual, indent=2, sort_keys=True), encoding="utf-8" + ) + if not os.getenv("UPDATE_SCHEMAS"): + raise AssertionError( + f"Schema snapshot created at {schema_path}. Please re-run the tests." + ) + + expected: dict[str, Any] = json.loads(schema_path.read_text(encoding="utf-8")) + assert actual == expected + + @pytest.mark.asyncio(loop_scope="class") class TestServer: @@ -35,7 +53,9 @@ async def test_tool_schemas(self, mcp_client: mcp.Client) -> None: """Fully checks the total count and schema of each returned tool.""" tools_result: list[Any] = await mcp_client.list_tools() - # 1. Verify the exact hard-coded count of tools + # 1. Verify the exact hard-coded count of tools. With the default + # (disabled) config the destructive `execute_command` tool is NOT + # registered. assert len(tools_result) == 7 # 2. Assert on each tool's schema individually @@ -53,21 +73,96 @@ async def test_tool_schemas(self, mcp_client: mcp.Client) -> None: assert tool is not None, f"Tool '{name}' not found" actual: dict[str, Any] = json.loads(tool.model_dump_json()) - self._assert_schema(name, actual) + _assert_schema(name, actual) + + # `execute_command` must NOT be registered under the default (disabled) + # config. + assert all(t.name != "execute_command" for t in tools_result) - def _assert_schema(self, name: str, actual: dict[str, Any]) -> None: - schema_dir: pathlib.Path = pathlib.Path(__file__).parent / "schemas" - schema_dir.mkdir(exist_ok=True) - schema_path: pathlib.Path = schema_dir / f"{name}.json" - if os.getenv("UPDATE_SCHEMAS") or not schema_path.exists(): - schema_path.write_text( - json.dumps(actual, indent=2, sort_keys=True), encoding="utf-8" +@pytest.mark.asyncio(loop_scope="class") +class TestServerWithCommandExecution: + """A separate class that enables the command execution kill switch.""" + + @pytest.fixture(scope="class") + async def mcp_client(self) -> AsyncGenerator[mcp.Client, None]: + config: mcp.ServerConfig = mcp.ServerConfig( + host="127.0.0.1", + port=12346, + enapter_http_api_url="", + command_execution_enabled=True, + ) + async with http.EnapterAPI(base_url=config.enapter_http_api_url) as enapter_api: + app: core.ApplicationServer = core.ApplicationServer( + enapter_api=enapter_api ) - if not os.getenv("UPDATE_SCHEMAS"): - raise AssertionError( - f"Schema snapshot created at {schema_path}. Please re-run the tests." - ) + async with mcp.Server(app=app, config=config): + async with mcp.Client(url=f"http://{config.address}/mcp") as client: + yield client + + async def test_registers_eight_tools_including_execute_command( + self, mcp_client: mcp.Client + ) -> None: + tools_result: list[Any] = await mcp_client.list_tools() + + assert len(tools_result) == 8 + assert any(t.name == "execute_command" for t in tools_result) + + # The existing seven tools stay read-only. + for tool in tools_result: + if tool.name == "execute_command": + continue + assert tool.annotations.readOnlyHint is True + + async def test_execute_command_annotations(self, mcp_client: mcp.Client) -> None: + tools_result: list[Any] = await mcp_client.list_tools() + tool: Any | None = next( + (t for t in tools_result if t.name == "execute_command"), None + ) + assert tool is not None + + assert tool.annotations.readOnlyHint is False + assert tool.annotations.destructiveHint is True + assert tool.title == "Execute Command" + + async def test_execute_command_schema(self, mcp_client: mcp.Client) -> None: + tools_result: list[Any] = await mcp_client.list_tools() + tool: Any | None = next( + (t for t in tools_result if t.name == "execute_command"), None + ) + assert tool is not None + + actual: dict[str, Any] = json.loads(tool.model_dump_json()) + assert actual["name"] == "execute_command" + + properties: dict[str, Any] = actual["inputSchema"]["properties"] + # Exactly the four prescribed parameters and no others. + assert set(properties.keys()) == { + "device_id", + "command_name", + "arguments", + "human_confirmed_this_action", + } + # device_id and command_name are required; the other two are optional. + assert set(actual["inputSchema"]["required"]) == {"device_id", "command_name"} + # arguments is an optional object-or-null defaulting to null. The + # upstream SDK normalizes null to {}. + assert properties["arguments"]["anyOf"] == [ + {"additionalProperties": True, "type": "object"}, + {"type": "null"}, + ] + assert properties["arguments"]["default"] is None + # human_confirmed_this_action defaults to False. + assert properties["human_confirmed_this_action"]["type"] == "boolean" + assert properties["human_confirmed_this_action"]["default"] is False + + # Annotations. + assert actual["annotations"]["readOnlyHint"] is False + assert actual["annotations"]["destructiveHint"] is True + assert actual["annotations"]["title"] == "Execute Command" + + # Docstring must guide the agent to obtain a human's approval. + assert "human" in actual["description"].lower() + assert "human_confirmed_this_action=True" in actual["description"] - expected: dict[str, Any] = json.loads(schema_path.read_text(encoding="utf-8")) - assert actual == expected + _assert_schema("execute_command", actual) diff --git a/tests/unit/core/test_application_server.py b/tests/unit/core/test_application_server.py index d9b636d..46da518 100644 --- a/tests/unit/core/test_application_server.py +++ b/tests/unit/core/test_application_server.py @@ -2,6 +2,7 @@ from typing import Any, AsyncGenerator import enapter +import pytest from enapter_mcp_server import core, domain @@ -38,6 +39,8 @@ def __init__( command_executions: dict[str, list[domain.CommandExecution]] | None = None, rule_engine_states: dict[str, core.RuleEngineDTO] | None = None, rules: dict[str, list[core.RuleDTO]] | None = None, + execute_command_result: domain.CommandExecution | None = None, + execute_command_raises: BaseException | None = None, ): self._sites = sites or [] self._devices = devices or [] @@ -47,8 +50,11 @@ def __init__( self._command_executions = command_executions or {} self._rule_engine_states = rule_engine_states or {} self._rules = rules or {} + self._execute_command_result = execute_command_result + self._execute_command_raises = execute_command_raises self.latest_telemetry_batch_calls = 0 self.get_rule_engine_calls = 0 + self.execute_command_calls: list[dict[str, Any]] = [] async def get_rule_engine( self, auth: core.AuthConfig, site_id: str @@ -148,6 +154,26 @@ async def get_device( return device raise ValueError(f"Device {device_id} not found") + async def execute_command( + self, + auth: core.AuthConfig, + device_id: str, + command_name: str, + arguments: dict[str, Any] | None, + ) -> domain.CommandExecution: + self.execute_command_calls.append( + { + "device_id": device_id, + "command_name": command_name, + "arguments": arguments, + } + ) + if self._execute_command_raises is not None: + raise self._execute_command_raises + if self._execute_command_result is None: + raise NotImplementedError() + return self._execute_command_result + async def get_latest_telemetry( self, auth: core.AuthConfig, attributes_by_device: dict[str, list[str]] ) -> dict[str, dict[str, Any]]: @@ -1332,3 +1358,313 @@ async def test_search_command_executions_requires_scope(self) -> None: assert str(exc) == "Please provide `device_id` to narrow down the search." else: raise AssertionError("Expected SearchQueryTooBroad") + + @staticmethod + def _device_with_commands( + device_id: str, commands: dict[str, domain.CommandDeclaration] + ) -> core.DeviceDTO: + return core.DeviceDTO( + blueprint_id="bp-1", + id=device_id, + name=device_id, + site_id="s1", + type=domain.DeviceType.NATIVE, + authorized_role=domain.AccessRole.OWNER, + manifest=make_device_manifest(commands=commands), + ) + + async def test_execute_command_unknown_name_raises_and_not_called(self) -> None: + device = self._device_with_commands( + "dev-1", + { + "reboot": domain.CommandDeclaration( + name="reboot", + display_name="Reboot", + access_level=domain.AccessRole.OWNER, + description=None, + arguments=[], + implements=[], + ), + "status": domain.CommandDeclaration( + name="status", + display_name="Status", + access_level=domain.AccessRole.USER, + description=None, + arguments=[], + implements=[], + ), + }, + ) + api = MockEnapterAPI(devices=[device]) + app = core.ApplicationServer(api) + auth = core.AuthConfig(token="test") + + try: + await app.execute_command( + auth, + device_id="dev-1", + command_name="does_not_exist", + arguments={}, + ) + except core.CommandNotFound as exc: + message = str(exc) + assert "does_not_exist" in message + # The message must name an available command so the agent can + # recover (correct a typo or hallucination). + assert "reboot" in message + assert "status" in message + else: + raise AssertionError("Expected CommandNotFound") + + # `execute` must NOT have been called when the name is unknown. + assert api.execute_command_calls == [] + + async def test_execute_command_confirmation_declared_flag_false_raises_and_not_called( + self, + ) -> None: + device = self._device_with_commands( + "dev-1", + { + "reboot": domain.CommandDeclaration( + name="reboot", + display_name="Reboot", + access_level=domain.AccessRole.OWNER, + description=None, + arguments=[], + implements=[], + confirmation=domain.CommandConfirmation( + severity="warning", + title="Reboot the device", + description="This will restart the device.", + ), + ) + }, + ) + api = MockEnapterAPI(devices=[device]) + app = core.ApplicationServer(api) + auth = core.AuthConfig(token="test") + + try: + await app.execute_command( + auth, + device_id="dev-1", + command_name="reboot", + arguments={}, + human_confirmed_this_action=False, + ) + except core.ConfirmationRequired as exc: + message = str(exc) + assert "Reboot the device" in message + assert "This will restart the device." in message + else: + raise AssertionError("Expected ConfirmationRequired") + + # `execute` must NOT have been called on refusal. + assert api.execute_command_calls == [] + + async def test_execute_command_confirmation_declared_flag_true_proceeds( + self, + ) -> None: + device = self._device_with_commands( + "dev-1", + { + "reboot": domain.CommandDeclaration( + name="reboot", + display_name="Reboot", + access_level=domain.AccessRole.OWNER, + description=None, + arguments=[], + implements=[], + confirmation=domain.CommandConfirmation( + severity="warning", + title="Reboot the device", + description="This will restart the device.", + ), + ) + }, + ) + result = domain.CommandExecution( + id="exec-1", + device_id="dev-1", + command_name="reboot", + state=domain.CommandExecutionState.SUCCESS, + created_at=datetime.datetime(2024, 1, 1), + arguments={"x": 1}, + response_payload={"ok": True}, + ) + api = MockEnapterAPI(devices=[device], execute_command_result=result) + app = core.ApplicationServer(api) + auth = core.AuthConfig(token="test") + + out = await app.execute_command( + auth, + device_id="dev-1", + command_name="reboot", + arguments={"x": 1}, + human_confirmed_this_action=True, + ) + + assert out == result + assert api.execute_command_calls == [ + {"device_id": "dev-1", "command_name": "reboot", "arguments": {"x": 1}} + ] + + async def test_execute_command_no_confirmation_flag_false_proceeds(self) -> None: + device = self._device_with_commands( + "dev-1", + { + "status": domain.CommandDeclaration( + name="status", + display_name="Status", + access_level=domain.AccessRole.USER, + description=None, + arguments=[], + implements=[], + ) + }, + ) + result = domain.CommandExecution( + id="exec-1", + device_id="dev-1", + command_name="status", + state=domain.CommandExecutionState.SUCCESS, + created_at=datetime.datetime(2024, 1, 1), + ) + api = MockEnapterAPI(devices=[device], execute_command_result=result) + app = core.ApplicationServer(api) + auth = core.AuthConfig(token="test") + + out = await app.execute_command( + auth, + device_id="dev-1", + command_name="status", + arguments={}, + human_confirmed_this_action=False, + ) + + assert out == result + assert len(api.execute_command_calls) == 1 + + async def test_execute_command_returns_command_execution_with_response_payload( + self, + ) -> None: + device = self._device_with_commands( + "dev-1", + { + "status": domain.CommandDeclaration( + name="status", + display_name="Status", + access_level=domain.AccessRole.USER, + description=None, + arguments=[], + implements=[], + ) + }, + ) + result = domain.CommandExecution( + id="exec-1", + device_id="dev-1", + command_name="status", + state=domain.CommandExecutionState.SUCCESS, + created_at=datetime.datetime(2024, 1, 1), + arguments={"a": 1}, + response_payload={"state": "running"}, + ) + api = MockEnapterAPI(devices=[device], execute_command_result=result) + app = core.ApplicationServer(api) + + out = await app.execute_command( + core.AuthConfig(token="test"), + device_id="dev-1", + command_name="status", + arguments={"a": 1}, + ) + + assert out.id == "exec-1" + assert out.state == domain.CommandExecutionState.SUCCESS + assert out.response_payload == {"state": "running"} + + @pytest.mark.parametrize( + "state", + [ + domain.CommandExecutionState.ERROR, + domain.CommandExecutionState.TIMEOUT, + domain.CommandExecutionState.UNSYNC, + ], + ) + async def test_execute_command_returns_terminal_states_without_raising( + self, state: domain.CommandExecutionState + ) -> None: + device = self._device_with_commands( + "dev-1", + { + "status": domain.CommandDeclaration( + name="status", + display_name="Status", + access_level=domain.AccessRole.USER, + description=None, + arguments=[], + implements=[], + ) + }, + ) + result = domain.CommandExecution( + id="exec-1", + device_id="dev-1", + command_name="status", + state=state, + created_at=datetime.datetime(2024, 1, 1), + ) + api = MockEnapterAPI(devices=[device], execute_command_result=result) + app = core.ApplicationServer(api) + + out = await app.execute_command( + core.AuthConfig(token="test"), + device_id="dev-1", + command_name="status", + arguments={}, + ) + + assert out.state == state + + async def test_execute_command_sdk_raise_propagates(self) -> None: + import httpx + + device = self._device_with_commands( + "dev-1", + { + "status": domain.CommandDeclaration( + name="status", + display_name="Status", + access_level=domain.AccessRole.USER, + description=None, + arguments=[], + implements=[], + ) + }, + ) + api_error = httpx.HTTPStatusError( + "Forbidden", + request=httpx.Request( + "POST", "https://api.enapter.com/v3/devices/dev-1/execute_command" + ), + response=httpx.Response(403, request=httpx.Request("POST", "")), + ) + api = MockEnapterAPI( + devices=[device], + execute_command_result=None, + execute_command_raises=api_error, + ) + app = core.ApplicationServer(api) + + try: + await app.execute_command( + core.AuthConfig(token="test"), + device_id="dev-1", + command_name="status", + arguments={}, + ) + except httpx.HTTPStatusError as exc: + assert exc is api_error + else: + raise AssertionError("Expected the SDK exception to propagate") diff --git a/tests/unit/http/test_enapter_api.py b/tests/unit/http/test_enapter_api.py index c9fbd47..d07ccd8 100644 --- a/tests/unit/http/test_enapter_api.py +++ b/tests/unit/http/test_enapter_api.py @@ -1,9 +1,11 @@ import contextlib -from typing import AsyncGenerator, cast +import datetime +from typing import Any, AsyncGenerator, Callable, Coroutine, cast import enapter +import pytest -from enapter_mcp_server import core, http +from enapter_mcp_server import core, domain, http class FailingTelemetryClient: @@ -45,3 +47,343 @@ async def test_get_latest_telemetry_raises_latest_telemetry_unavailable( pass else: raise AssertionError("Expected LatestTelemetryUnavailable") + + +# --------------------------------------------------------------------------- +# Fakes for execute_command tests +# --------------------------------------------------------------------------- + +_SdkExecuteFn = ( + Callable[ + [str, str, dict[str, Any] | None], + Coroutine[None, None, enapter.http.api.commands.Execution], + ] + | None +) + + +class _SpyCommandsClient: + """Records calls to `execute` and returns a pre-configured result or raises.""" + + def __init__( + self, + *, + result: enapter.http.api.commands.Execution | None = None, + raises: Exception | None = None, + side_effect: _SdkExecuteFn = None, + ) -> None: + self.result = result + self.raises = raises + self.side_effect = side_effect + self.calls: list[tuple[str, str, dict[str, Any] | None]] = [] + + async def execute( + self, + device_id: str, + command_name: str, + arguments: dict[str, Any] | None, + ) -> enapter.http.api.commands.Execution: + self.calls.append((device_id, command_name, arguments)) + if self.side_effect is not None: + return await self.side_effect(device_id, command_name, arguments) + if self.raises is not None: + raise self.raises + if self.result is not None: + return self.result + raise NotImplementedError("No result, raises, or side_effect configured") + + +class _CommandFakeClient: + """Fake SDK Client exposing only `commands`.""" + + def __init__(self, commands: _SpyCommandsClient) -> None: + self.commands = commands + + +class _CommandStubEnapterAPI(http.EnapterAPI): + """Stub that injects a fake `commands` client for execute_command tests.""" + + def __init__(self, base_url: str, commands: _SpyCommandsClient) -> None: + super().__init__(base_url) + self._commands = commands + + @contextlib.asynccontextmanager + async def _new_client( + self, auth: core.AuthConfig + ) -> AsyncGenerator[enapter.http.api.Client, None]: + yield cast(enapter.http.api.Client, _CommandFakeClient(self._commands)) + + +# --------------------------------------------------------------------------- +# Fixture helpers +# --------------------------------------------------------------------------- + + +def _make_execution( + *, + exec_id: str = "exec-1", + device_id: str = "dev-1", + state: enapter.http.api.commands.ExecutionState = ( + enapter.http.api.commands.ExecutionState.SUCCESS + ), + created_at: datetime.datetime | None = None, + command_name: str = "cmd.power_on", + arguments: dict[str, Any] | None = None, + response_payload: dict[str, Any] | None = None, + response_state: enapter.http.api.commands.response.ResponseState = ( + enapter.http.api.commands.response.ResponseState.SUCCEEDED + ), +) -> enapter.http.api.commands.Execution: + if created_at is None: + created_at = datetime.datetime(2025, 1, 1, tzinfo=datetime.timezone.utc) + if arguments is None: + arguments = {} + resp: enapter.http.api.commands.response.Response | None = None + if response_payload is not None: + resp = enapter.http.api.commands.response.Response( + state=response_state, + payload=response_payload, + received_at=created_at, + ) + return enapter.http.api.commands.Execution( + id=exec_id, + device_id=device_id, + state=state, + created_at=created_at, + request=enapter.http.api.commands.request.Request( + name=command_name, arguments=arguments + ), + response=resp, + log=None, + ) + + +# --------------------------------------------------------------------------- +# Tests +# --------------------------------------------------------------------------- + + +class TestExecuteCommand: + """Unit tests for http.EnapterAPI.execute_command.""" + + # -- helpers ----------------------------------------------------------- + + @staticmethod + def _api( + commands: _SpyCommandsClient, + ) -> _CommandStubEnapterAPI: + return _CommandStubEnapterAPI(base_url="http://example.test", commands=commands) + + @staticmethod + def _auth() -> core.AuthConfig: + return core.AuthConfig(token="test-token") + + # -- call-args --------------------------------------------------------- + + async def test_execute_passes_device_id_command_name_and_arguments( + self, + ) -> None: + """commands.execute receives every positional argument unchanged.""" + spy = _SpyCommandsClient( + result=_make_execution( + device_id="dev-arg", command_name="cmd.foo", arguments={"a": 1} + ), + ) + api = self._api(spy) + + await api.execute_command( + auth=self._auth(), + device_id="dev-arg", + command_name="cmd.foo", + arguments={"a": 1}, + ) + + assert spy.calls == [("dev-arg", "cmd.foo", {"a": 1})] + + async def test_execute_passes_none_arguments_unchanged(self) -> None: + """None for `arguments` is passed through — the adapter does not pre-normalize.""" + spy = _SpyCommandsClient( + result=_make_execution(arguments={}), + ) + api = self._api(spy) + + await api.execute_command( + auth=self._auth(), + device_id="dev-1", + command_name="cmd.foo", + arguments=None, + ) + + assert spy.calls == [("dev-1", "cmd.foo", None)] + + # -- mapping on success ------------------------------------------------- + + async def test_execute_maps_success_to_command_execution(self) -> None: + """A successful SDK Execution maps to a domain.CommandExecution.""" + created = datetime.datetime(2025, 6, 1, 12, 0, 0, tzinfo=datetime.timezone.utc) + sdk_exec = _make_execution( + exec_id="exec-mapped", + device_id="dev-mapped", + command_name="cmd.power_on", + arguments={"power": 42}, + response_payload={"status": "on"}, + created_at=created, + ) + spy = _SpyCommandsClient(result=sdk_exec) + api = self._api(spy) + + result = await api.execute_command( + auth=self._auth(), + device_id="dev-mapped", + command_name="cmd.power_on", + arguments={"power": 42}, + ) + + assert isinstance(result, domain.CommandExecution) + assert result.id == "exec-mapped" + assert result.device_id == "dev-mapped" + assert result.command_name == "cmd.power_on" + assert result.state == domain.CommandExecutionState.SUCCESS + assert result.created_at == created + assert result.arguments == {"power": 42} + assert result.response_payload == {"status": "on"} + + async def test_execute_maps_null_response_to_none_payload(self) -> None: + """When SDK response is None, response_payload is None.""" + sdk_exec = _make_execution( + exec_id="exec-noresp", + device_id="dev-noresp", + command_name="cmd.foo", + response_payload=None, + ) + spy = _SpyCommandsClient(result=sdk_exec) + api = self._api(spy) + + result = await api.execute_command( + auth=self._auth(), + device_id="dev-noresp", + command_name="cmd.foo", + arguments={}, + ) + + assert result.response_payload is None + assert result.id == "exec-noresp" + + # -- terminal states returned, not raised ------------------------------ + + @pytest.mark.parametrize( + "sdk_state,expected_state", + [ + ( + enapter.http.api.commands.ExecutionState.ERROR, + domain.CommandExecutionState.ERROR, + ), + ( + enapter.http.api.commands.ExecutionState.TIMEOUT, + domain.CommandExecutionState.TIMEOUT, + ), + ( + enapter.http.api.commands.ExecutionState.UNSYNC, + domain.CommandExecutionState.UNSYNC, + ), + ], + ) + async def test_terminal_state_returned_not_raised( + self, + sdk_state: enapter.http.api.commands.ExecutionState, + expected_state: domain.CommandExecutionState, + ) -> None: + """Terminal SDK states are mapped and returned — the adapter never raises.""" + sdk_exec = _make_execution( + exec_id="exec-term", + state=sdk_state, + response_payload=None, + ) + spy = _SpyCommandsClient(result=sdk_exec) + api = self._api(spy) + + result = await api.execute_command( + auth=self._auth(), + device_id="dev-term", + command_name="cmd.fail", + arguments={}, + ) + + assert isinstance(result, domain.CommandExecution) + assert result.state == expected_state + assert result.id == "exec-term" + + # -- SDK exceptions propagate uncaught --------------------------------- + + async def test_sdk_exception_propagates_unmodified(self) -> None: + """When commands.execute raises, the exact exception propagates — no wrapping.""" + exc = enapter.http.api.Error( + message="forbidden", + code="missing_permission", + details={"role": "viewer"}, + ) + spy = _SpyCommandsClient(raises=exc) + api = self._api(spy) + + with pytest.raises(enapter.http.api.Error) as exc_info: + await api.execute_command( + auth=self._auth(), + device_id="dev-1", + command_name="cmd.foo", + arguments=None, + ) + + assert exc_info.value is exc + + async def test_non_sdk_exception_propagates_unmodified(self) -> None: + """An arbitrary exception from execute also propagates unmodified.""" + exc = ValueError("unexpected value") + spy = _SpyCommandsClient(raises=exc) + api = self._api(spy) + + with pytest.raises(ValueError) as exc_info: + await api.execute_command( + auth=self._auth(), + device_id="dev-1", + command_name="cmd.foo", + arguments=None, + ) + + assert exc_info.value is exc + + # -- side_effect callback coverage ------------------------------------ + + async def test_side_effect_receives_args_and_result_is_returned( + self, + ) -> None: + """When a side_effect callable is configured, its args match and its + return value flows through the adapter.""" + received: list[tuple[str, str, dict[str, Any] | None]] = [] + + async def _side_effect( + device_id: str, + command_name: str, + arguments: dict[str, Any] | None, + ) -> enapter.http.api.commands.Execution: + received.append((device_id, command_name, arguments)) + return _make_execution( + exec_id="exec-side", + device_id=device_id, + command_name=command_name, + arguments=arguments if arguments is not None else {}, + ) + + spy = _SpyCommandsClient(side_effect=_side_effect) + api = self._api(spy) + + result = await api.execute_command( + auth=self._auth(), + device_id="dev-side", + command_name="cmd.test", + arguments={"k": "v"}, + ) + + assert received == [("dev-side", "cmd.test", {"k": "v"})] + assert spy.calls == [("dev-side", "cmd.test", {"k": "v"})] + assert result.id == "exec-side" + assert result.arguments == {"k": "v"} diff --git a/tests/unit/http/test_enapter_data_mapper.py b/tests/unit/http/test_enapter_data_mapper.py index 1eff86e..4d906e8 100644 --- a/tests/unit/http/test_enapter_data_mapper.py +++ b/tests/unit/http/test_enapter_data_mapper.py @@ -385,3 +385,74 @@ def test_to_command_execution(self) -> None: assert mapped.created_at == created_at assert mapped.arguments == {"on": True} assert mapped.response_payload == {"status": "ok"} + + def test_command_declaration_maps_confirmation_when_present(self) -> None: + manifest = http.EnapterDataMapper().to_device_manifest( + { + "commands": { + "reboot": { + "display_name": "Reboot", + "confirmation": { + "severity": "warning", + "title": "Reboot the device", + "description": "Restarts the device.", + }, + } + } + } + ) + + assert manifest is not None + confirmation = manifest.commands["reboot"].confirmation + assert confirmation is not None + assert confirmation.severity == "warning" + assert confirmation.title == "Reboot the device" + assert confirmation.description == "Restarts the device." + + def test_command_declaration_confirmation_absent_is_none(self) -> None: + manifest = http.EnapterDataMapper().to_device_manifest( + { + "commands": { + "status": { + "display_name": "Status", + } + } + } + ) + + assert manifest is not None + assert manifest.commands["status"].confirmation is None + + def test_command_declaration_confirmation_null_value_is_none(self) -> None: + manifest = http.EnapterDataMapper().to_device_manifest( + { + "commands": { + "reboot": { + "display_name": "Reboot", + "confirmation": None, + } + } + } + ) + + assert manifest is not None + assert manifest.commands["reboot"].confirmation is None + + def test_command_declaration_confirmation_partial_block_is_defensive(self) -> None: + manifest = http.EnapterDataMapper().to_device_manifest( + { + "commands": { + "reboot": { + "display_name": "Reboot", + "confirmation": {"title": "Reboot the device"}, + } + } + } + ) + + assert manifest is not None + confirmation = manifest.commands["reboot"].confirmation + assert confirmation is not None + assert confirmation.title == "Reboot the device" + assert confirmation.severity is None + assert confirmation.description is None