Merged
41 changes: 40 additions & 1 deletion .github/workflows/ci.yml
@@ -132,9 +132,48 @@ jobs:
fi
echo "No leaked secrets detected."

conventions:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Enforce code conventions (see CLAUDE.md)
run: |
set -o pipefail
fail=0

echo "::group::throw new Error outside allowed files"
if grep -rn "throw new Error" src --include='*.ts' | grep -vE '(app-config|migrate|\.spec\.ts|circuit-breaker\.ts|proto-registry\.service\.ts|grpc-helpers\.ts|run\.repository\.ts|webhook\.service\.ts|run-recovery\.service\.ts)'; then
echo "::error::throw new Error found outside allowed files. Use AppException or a Nest built-in."
fail=1
fi
echo "::endgroup::"

echo "::group::console.* outside migrate / specs"
if grep -rn 'console\.' src --include='*.ts' | grep -vE '(migrate|\.spec\.ts)'; then
echo "::error::console.* found outside migrate / tests. Use Nest Logger instead."
fail=1
fi
echo "::endgroup::"

echo "::group::process.env outside allowed files"
if grep -rn 'process\.env' src --include='*.ts' | grep -vE '(app-config|main\.ts|telemetry/telemetry\.ts|db/migrate\.ts|\.spec\.ts)'; then
echo "::error::process.env read outside app-config/main/telemetry/migrate. Centralize in AppConfigService."
fail=1
fi
echo "::endgroup::"

echo "::group::Observer-only: no write-path endpoints reintroduced"
if grep -rn 'POST.*runs/:id/\(messages\|signal\|context\)' src --include='*.ts' | grep -vE '(gone\(|\.spec\.ts|REMOVED|deprecated)'; then
echo "::error::Write-path endpoint references found. Control-plane is observer-only (direct-agent-auth)."
fail=1
fi
echo "::endgroup::"

exit $fail
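Each check above applies its `grep -vE` allowlist to the whole `path:line:content` line that `grep -rn` prints, so a pattern such as `migrate` or `deprecated` also exempts a hit whose source text merely mentions that word. A minimal sketch of a path-only filter follows; `parseGrepLine` and `findViolations` are hypothetical names for illustration and are not part of this PR.

```typescript
// Hypothetical helpers, not part of this PR: apply the allowlist to the file
// path only, instead of to the whole `path:line:content` grep output line.
interface GrepHit {
  path: string;
  line: number;
  content: string;
}

// Parses one line of `grep -rn` output ("path:line:content").
// Returns null for anything that does not have that shape.
function parseGrepLine(raw: string): GrepHit | null {
  const first = raw.indexOf(":");
  const second = raw.indexOf(":", first + 1);
  if (first < 0 || second < 0) return null;
  const line = Number(raw.slice(first + 1, second));
  if (!Number.isInteger(line) || line <= 0) return null;
  return { path: raw.slice(0, first), line, content: raw.slice(second + 1) };
}

// Returns the hits whose *path* is not covered by the allowlist.
function findViolations(grepOutput: string[], allowedPaths: RegExp): GrepHit[] {
  return grepOutput
    .map(parseGrepLine)
    .filter((hit): hit is GrepHit => hit !== null)
    .filter((hit) => !allowedPaths.test(hit.path));
}
```

With this shape, a `console.warn('see migrate docs')` call in a service file would still be reported, whereas the line-based filter would let it through. Whether that stricter behavior is wanted is a design choice for the repository owners.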

build:
runs-on: ubuntu-latest
needs: [lint, typecheck, test, check-env-secrets]
needs: [lint, typecheck, test, check-env-secrets, conventions]
permissions:
contents: read
packages: read
8 changes: 4 additions & 4 deletions docs/API.md
@@ -414,12 +414,12 @@ Metrics summary including token usage and estimated cost:
}
```

**Token usage convention:** Agents include token data in message metadata:
**Token usage convention:** Agents include token data in envelope metadata when sending via `macp-sdk-*` directly to the runtime. The control-plane observes that envelope on its read-only stream:
```json
POST /runs/:id/messages
// Envelope emitted by the agent via the SDK (e.g. session.send(...))
{
"from": "fraud-agent",
"messageType": "Evaluation",
"sender": "fraud-agent",
"payload": { ... },
"metadata": {
"tokenUsage": {
@@ -805,7 +805,7 @@ Emitted for each commitment the runtime evaluates against the active policy.

Emitted in two cases:
1. The runtime sends a `PolicyDenied` stream message.
2. A send-ack (`POST /runs/:id/messages`) returns `error.code = "POLICY_DENIED"`. The control-plane synthesizes the event so deny reasons are visible on the event stream even if the runtime doesn't echo them back.
2. A runtime-emitted send-ack observed on the stream carries `error.code = "POLICY_DENIED"` (the agent's `Send` RPC was rejected by policy). The control-plane synthesizes the event so deny reasons are visible on the event stream even if the runtime doesn't echo them back as a dedicated `PolicyDenied` envelope.

```json
{
2 changes: 1 addition & 1 deletion docs/INTEGRATION.md
@@ -8,7 +8,7 @@

Key methods to implement (observer-only surface, post direct-agent-auth):
- `initialize()` — protocol version negotiation.
- `subscribeSession({runId, runtimeSessionId})` — read-only `StreamSession` observer; returns `{events, abort}`. **Never writes envelopes.**
- `subscribeSession({runId, runtimeSessionId, afterSequence?})` — read-only `StreamSession` observer; returns `{events, abort}`. **Never writes envelopes.** Per RFC-MACP-0006 §3.2 the provider writes a single passive-subscribe frame (`{subscribeSessionId, afterSequence}`) and immediately half-closes the write side; the runtime then replays accepted history from `afterSequence` (default 0 = full replay) before switching to live broadcast.
- `getSession()` — poll for session state (used by the observer's `pollForOpenSession` loop).
- `cancelSession()` — only called when `run.metadata.cancellationDelegated === true` (Option B in direct-agent-auth §Cancellation design).
- `getManifest()` / `listModes()` / `listRoots()` / `health()` — metadata.
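The passive-subscribe handshake described for `subscribeSession` can be sketched as below. This is an illustration under stated assumptions, not the provider's actual code: `DuplexLike` is a hypothetical stand-in for the gRPC duplex call, `sendPassiveSubscribe` is a hypothetical helper name, and mapping `subscribeSessionId` to `runtimeSessionId` is assumed rather than confirmed by the diff.

```typescript
// Hypothetical minimal duplex-stream shape; a real provider would wrap the
// gRPC StreamSession call, which exposes write() and end() in the same way.
interface DuplexLike<Frame> {
  write(frame: Frame): void;
  end(): void; // half-closes the write side; reads stay open
}

// Frame fields as named in the INTEGRATION.md text.
interface SubscribeFrame {
  subscribeSessionId: string;
  afterSequence: number;
}

// Writes the single passive-subscribe frame and half-closes immediately.
function sendPassiveSubscribe(
  stream: DuplexLike<SubscribeFrame>,
  req: { runtimeSessionId: string; afterSequence?: number },
): SubscribeFrame {
  const frame: SubscribeFrame = {
    // Assumption: the frame's session id is the run's runtimeSessionId.
    subscribeSessionId: req.runtimeSessionId,
    afterSequence: req.afterSequence ?? 0, // 0 requests a full replay
  };
  stream.write(frame); // the only frame the observer ever writes
  stream.end(); // no further writes are possible after this point
  return frame;
}
```

Half-closing right after the first frame makes the "never writes envelopes" guarantee structural: once `end()` has been called, the observer cannot emit anything else on that stream.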
28 changes: 14 additions & 14 deletions docs/TROUBLESHOOTING.md
@@ -43,21 +43,21 @@
4. Manually cancel: `POST /runs/{id}/cancel`
5. If recovery is enabled (`RUN_RECOVERY_ENABLED=true`), the system auto-recovers orphaned runs on startup

## Message Send Failures
## Legacy Write Endpoints Return 410 Gone

**Symptom:** `POST /runs/:id/messages` returns 400 or 502
**Symptom:** `POST /runs/:id/messages`, `/signal`, or `/context` returns `410 Gone` with `errorCode: ENDPOINT_REMOVED`.

**Common causes:**
- Run not in `binding_session` or `running` state → check `GET /runs/:id`
- Invalid payload encoding → use `payloadEnvelope` with `encoding: "proto"` for real runtime
- Non-existent participant → `from` must match a registered participant ID
- Session expired → check session TTL
**Explanation:** The control-plane is observer-only as of the 2026-04-15 direct-agent-auth refactor. Agents authenticate to the runtime directly and emit their own envelopes via `macp-sdk-python` / `macp-sdk-typescript`. See `docs/API.md` § "Messages & Signals — emission is NOT via the control-plane" for migration guidance.
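Because `SESSION_EXPIRED` also maps to HTTP 410 in the error table of this file, a client that only inspects the status code cannot tell a removed endpoint from an expired session. A minimal client-side sketch follows, assuming the error body exposes `errorCode` as shown in the symptom line; `classifyGone` and the `ErrorBody` shape are hypothetical.

```typescript
// Assumed error-body shape: only the `errorCode` field is confirmed by the docs.
interface ErrorBody {
  errorCode?: string;
}

type GoneReason = "endpoint-removed" | "session-expired" | "other";

// Returns null for any non-410 status; otherwise classifies by errorCode.
function classifyGone(status: number, body: ErrorBody): GoneReason | null {
  if (status !== 410) return null;
  switch (body.errorCode) {
    case "ENDPOINT_REMOVED":
      return "endpoint-removed"; // migrate the caller to the SDK; retrying will not help
    case "SESSION_EXPIRED":
      return "session-expired"; // the run's runtime session is gone
    default:
      return "other";
  }
}
```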

## Signals Not Appearing in Projection
## Agent Envelopes Not Appearing in Projection

**Symptom:** `POST /runs/:id/signal` succeeds but signals don't appear in `GET /runs/:id/state`
**Symptom:** Agents call `session.send(...)` via the SDK but events don't appear in `GET /runs/:id/state`.

**Explanation:** Signals are recorded as `message.sent` events by the control plane. The `signal.emitted` projection entries only appear when the runtime echoes signals back via the gRPC stream as `stream-envelope` events with `messageType: Signal`. With a mock runtime that doesn't echo signals, only `message.sent` events (with `subject.kind: signal`) appear.
**Checks:**
1. Confirm the run's `runtimeSessionId` matches the `session_id` the agent is writing to (`GET /runs/:id`).
2. Check stream consumer logs for `StreamSession` reconnection loops — the observer subscribes read-only and must be connected.
3. Confirm the runtime echoes envelopes back on the stream (some runtimes only echo certain message types). `signal.emitted` and `message.sent` canonical events require `stream-envelope` entries on the observer stream.
4. For session discovery, verify `SESSION_DISCOVERY_ENABLED=true` so externally-launched sessions auto-create runs.

## SSE Stream Drops

@@ -101,11 +101,11 @@
| `CIRCUIT_BREAKER_OPEN` | 503 | Runtime circuit breaker is open |
| `STREAM_EXHAUSTED` | 500 | Max stream reconnection retries reached |
| `SESSION_EXPIRED` | 410 | Runtime session has expired |
| `KICKOFF_FAILED` | 502 | Kickoff message failed after retries |
| `MODE_NOT_SUPPORTED` | 400 | Runtime does not support requested mode |
| `VALIDATION_ERROR` | 400 | Request body validation failed |
| `MESSAGE_SEND_FAILED` | 502 | Runtime rejected a session message |
| `SIGNAL_DISPATCH_FAILED` | 502 | Runtime rejected a signal |
| `CONTEXT_UPDATE_FAILED` | 502 | Runtime rejected context update |
| `INVALID_SESSION_ID` | 400 | Session ID not recognized by runtime |
| `UNKNOWN_POLICY_VERSION` | 400 | Policy version not found in registry |
| `POLICY_DENIED` | 403 | Commitment rejected by policy rules |
| `INVALID_POLICY_DEFINITION` | 400 | Policy rules fail schema validation |
| `SESSION_ALREADY_EXISTS` | 409 | Duplicate session start attempt |
| `INTERNAL_ERROR` | 500 | Unexpected server error |
8 changes: 4 additions & 4 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion package.json
@@ -23,7 +23,7 @@
"dev:link-protos": "cd ../multiagentcoordinationprotocol/packages/proto-npm && npm link && cd - && npm link @multiagentcoordinationprotocol/proto"
},
"dependencies": {
"@multiagentcoordinationprotocol/proto": "^0.1.0",
"@multiagentcoordinationprotocol/proto": "^0.1.2",
"@grpc/grpc-js": "^1.14.0",
"@grpc/proto-loader": "^0.8.0",
"@nestjs/common": "^11.1.16",
16 changes: 5 additions & 11 deletions src/contracts/runtime.ts
@@ -76,17 +76,6 @@ export interface RuntimeInitializeResult {
instructions?: string;
}

/**
* Result shape returned once the observer has confirmed the session is OPEN.
* `initiator` is learned from the runtime's session metadata (GetSession response),
* never chosen by the control-plane.
*/
export interface RuntimeSessionOpenResult {
runtimeSessionId: string;
initiator: string;
ack: RuntimeAck;
}

export interface RuntimeGetSessionRequest {
runId: string;
runtimeSessionId: string;
@@ -156,6 +145,11 @@ export interface RuntimeCallOptions {
export interface RuntimeSubscribeSessionRequest {
runId: string;
runtimeSessionId: string;
/**
* RFC-MACP-0006 §3.2: replay log sequence to start from. 0 (default) replays
* the full accepted history before switching to live broadcast.
*/
afterSequence?: number;
}

/**
4 changes: 0 additions & 4 deletions src/errors/error-codes.ts
@@ -5,15 +5,11 @@ export enum ErrorCode {
RUNTIME_TIMEOUT = 'RUNTIME_TIMEOUT',
STREAM_EXHAUSTED = 'STREAM_EXHAUSTED',
SESSION_EXPIRED = 'SESSION_EXPIRED',
KICKOFF_FAILED = 'KICKOFF_FAILED',
VALIDATION_ERROR = 'VALIDATION_ERROR',
INTERNAL_ERROR = 'INTERNAL_ERROR',
MODE_NOT_SUPPORTED = 'MODE_NOT_SUPPORTED',
CIRCUIT_BREAKER_OPEN = 'CIRCUIT_BREAKER_OPEN',
SIGNAL_DISPATCH_FAILED = 'SIGNAL_DISPATCH_FAILED',
CONTEXT_UPDATE_FAILED = 'CONTEXT_UPDATE_FAILED',
INVALID_SESSION_ID = 'INVALID_SESSION_ID',
MESSAGE_SEND_FAILED = 'MESSAGE_SEND_FAILED',
UNKNOWN_POLICY_VERSION = 'UNKNOWN_POLICY_VERSION',
POLICY_DENIED = 'POLICY_DENIED',
INVALID_POLICY_DEFINITION = 'INVALID_POLICY_DEFINITION',
2 changes: 1 addition & 1 deletion src/metrics/metrics.service.ts
@@ -15,7 +15,7 @@ function safeNumber(val: unknown, fallback = 0): number {
* { "tokenUsage": { "promptTokens": N, "completionTokens": N, "model": "..." } }
*
* This can appear in:
* - event.data.metadata.tokenUsage (sent via POST /runs/:id/messages metadata)
* - event.data.metadata.tokenUsage (envelope metadata emitted by agents via the macp-sdk)
* - event.data.decodedPayload.tokenUsage (embedded in proto payload)
* - event.data.payloadDescriptor.tokenUsage (from payload descriptor)
* - event.data.tokenUsage (direct)
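The four lookup locations listed in this comment could be walked roughly as follows. This is a sketch, not the service's implementation: the diff shows only the signature of `safeNumber`, so its body here is assumed, `extractTokenUsage` and `field` are hypothetical names, and checking the locations in the listed order is also an assumption.

```typescript
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  model?: string;
}

// Assumed behavior: only the signature of safeNumber appears in the diff.
function safeNumber(val: unknown, fallback = 0): number {
  return typeof val === "number" && Number.isFinite(val) ? val : fallback;
}

// Reads obj[key] if obj is a non-null object, otherwise undefined.
function field(obj: unknown, key: string): unknown {
  if (typeof obj !== "object" || obj === null) return undefined;
  return (obj as Record<string, unknown>)[key];
}

// Hypothetical lookup over event.data; returns the first tokenUsage object found.
function extractTokenUsage(data: unknown): TokenUsage | null {
  const candidates: unknown[] = [
    field(field(data, "metadata"), "tokenUsage"),
    field(field(data, "decodedPayload"), "tokenUsage"),
    field(field(data, "payloadDescriptor"), "tokenUsage"),
    field(data, "tokenUsage"),
  ];
  for (const candidate of candidates) {
    if (typeof candidate !== "object" || candidate === null) continue;
    const usage: TokenUsage = {
      promptTokens: safeNumber(field(candidate, "promptTokens")),
      completionTokens: safeNumber(field(candidate, "completionTokens")),
    };
    const model = field(candidate, "model");
    if (typeof model === "string") usage.model = model;
    return usage;
  }
  return null;
}
```

Coercing non-numeric counts to the fallback keeps a malformed envelope from poisoning aggregate totals, at the cost of silently under-counting that envelope.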