Commit message #17
Merged
ajit-zer07 merged 1 commit into main on Apr 16, 2026
Observer-only control-plane: direct-agent-auth + quality cleanup
Two-phase change. Phase 1 (direct-agent-auth, plan CP-1..CP-13) makes the
control-plane a scenario-agnostic observer: agents authenticate to the runtime
directly and emit their own envelopes; the control-plane never calls Send.
Phase 2 (quality-cleanup) removes dead code from the refactor, replaces
hardcoded test sleeps with conditional polling, documents conventions, and
extracts gRPC marshalling helpers.
Why: RFC-MACP-0004 §4 requires `envelope.sender` to derive from authenticated
identity. The previous control-plane forged envelopes and consumed
scenario-specific fields (`kickoff[]`, `participants[].role`, `policyHints`,
`commitments[]`, `initiatorParticipantId`) that violated the documented
"scenario-agnostic observer" boundary. Once that was unwound, the codebase
had stale tests, dead types, and inconsistent conventions that this commit
also addresses.
Behavioural changes (HTTP):
- POST /runs accepts a scenario-agnostic RunDescriptor only; unknown keys
rejected (forbidNonWhitelisted: true). Returns {runId, sessionId, status,
traceId} — sessionId is allocated by the control-plane (UUID v4) or echoed
back when caller provides a valid one.
- POST /runs/:id/{messages,signal,context} return 410 Gone with
errorCode: ENDPOINT_REMOVED. Agents migrate to macp-sdk-python /
macp-sdk-typescript.
- POST /runs/:id/cancel: Option A proxies to metadata.cancelCallback.url;
Option B (metadata.cancellationDelegated=true) calls runtime.CancelSession.
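The two HTTP rules above that carry branching logic (session ID allocation and cancel routing) can be sketched as follows. This is a minimal illustration, not the control-plane's actual code: `resolveSessionId`, `chooseCancelRoute`, and the `CancelRoute` type are hypothetical names. It also assumes that a "valid" caller-supplied ID means a UUID v4, that an invalid ID is rejected rather than replaced, and that Option A takes precedence when both metadata fields are present; the commit message does not state any of these.

```typescript
import { randomUUID } from 'node:crypto';

// Assumption: "valid" means a well-formed UUID v4.
const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Allocate a fresh UUID v4, or echo the caller's ID when it is valid.
// Assumption: an invalid ID is rejected instead of being silently replaced.
export function resolveSessionId(requested?: string): string {
  if (requested === undefined) return randomUUID();
  if (!UUID_V4.test(requested)) {
    throw new Error('sessionId must be a UUID v4');
  }
  return requested;
}

// Only the two metadata fields named in the commit message are modelled.
export interface CancelMetadata {
  cancelCallback?: { url: string };
  cancellationDelegated?: boolean;
}

export type CancelRoute =
  | { kind: 'callback'; url: string } // Option A: proxy to the callback URL
  | { kind: 'runtime' }               // Option B: call runtime.CancelSession
  | { kind: 'none' };                 // neither option configured

// Assumption: Option A wins when both fields are set.
export function chooseCancelRoute(metadata: CancelMetadata): CancelRoute {
  const url = metadata.cancelCallback?.url;
  if (url) return { kind: 'callback', url };
  if (metadata.cancellationDelegated === true) return { kind: 'runtime' };
  return { kind: 'none' };
}
```

Returning a tagged route instead of performing the call keeps the decision testable without an HTTP client or a gRPC stub.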
Internal changes:
- RuntimeProvider narrowed to observer surface: subscribeSession() opens a
read-only StreamSession and ends the write side immediately.
- New RunExecutor.execute() flow: initialize → pollForOpenSession (100ms→1s
backoff, 60s timeout) → bindSession → subscribeSession → stream consumer.
- RuntimeCredentialResolverService reverted to single-bearer form; per-agent
tokens (RUNTIME_AGENT_TOKENS_JSON) removed.
- Extracted src/runtime/grpc-helpers.ts (fromEnvelope, fromAck,
fromSessionMetadata, buildMetadata, getClientMethod) — 553→459 lines on
the provider.
- ProtoRegistryService.decodeKnown now falls back to JSON when proto decode
throws (mock runtime emits JSON; real runtime emits proto).
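The `pollForOpenSession` timing (100ms→1s backoff, 60s timeout) can be illustrated with a small schedule calculator. This is a sketch under assumptions: the commit message gives only the endpoints of the backoff, so the doubling growth factor is a guess, `backoffDelays` and `PollConfig` are hypothetical names, and the timeout budget here counts sleep time only, not the time spent in each `GetSession` call.

```typescript
export interface PollConfig {
  baseMs: number;    // first delay, e.g. 100
  maxMs: number;     // delay ceiling, e.g. 1000
  timeoutMs: number; // total sleep budget, e.g. 60000
}

// Returns the sequence of sleeps between probes.
// Assumption: delays double until they reach maxMs; the final delay is
// trimmed so the sum never exceeds timeoutMs.
export function backoffDelays({ baseMs, maxMs, timeoutMs }: PollConfig): number[] {
  const delays: number[] = [];
  let delay = baseMs;
  let elapsed = 0;
  while (elapsed < timeoutMs) {
    const wait = Math.min(delay, timeoutMs - elapsed);
    delays.push(wait);
    elapsed += wait;
    delay = Math.min(delay * 2, maxMs);
  }
  return delays;
}

const schedule = backoffDelays({ baseMs: 100, maxMs: 1000, timeoutMs: 60000 });
console.log(schedule.slice(0, 6)); // [ 100, 200, 400, 800, 1000, 1000 ]
```

A poller would probe once, then sleep for each entry in turn and probe again, giving up when the schedule is exhausted.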
Invariants enforced by tests:
- src/runtime/observer-invariant.spec.ts: fails CI if `provider.send(`,
`openSession(`, `chooseInitiator(`, or `retryKickoff(` appears anywhere in src/.
- src/projection/projection-coverage.spec.ts — every CANONICAL_EVENT_TYPES
entry must have a reducer branch in ProjectionService.applyEvents.
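The observer invariant amounts to a substring scan over the source tree. The sketch below shows the core check on an in-memory map of file contents; `findForbiddenCalls` and `Violation` are hypothetical names, and the real spec presumably reads files from disk and excludes itself, which is not shown here.

```typescript
// The four call patterns the commit message lists as forbidden in src/.
const FORBIDDEN_CALLS = [
  'provider.send(',
  'openSession(',
  'chooseInitiator(',
  'retryKickoff(',
];

export interface Violation {
  file: string;
  line: number; // 1-based
  token: string;
}

// Scans each file line by line and records every forbidden substring.
// This is a plain substring match, so it would also flag comments or
// longer identifiers that happen to end in one of the tokens.
export function findForbiddenCalls(sources: Record<string, string>): Violation[] {
  const violations: Violation[] = [];
  for (const [file, text] of Object.entries(sources)) {
    text.split('\n').forEach((content, index) => {
      for (const token of FORBIDDEN_CALLS) {
        if (content.includes(token)) {
          violations.push({ file, line: index + 1, token });
        }
      }
    });
  }
  return violations;
}
```

A spec built on this would assert that the returned array is empty and print the violations otherwise.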
Quality cleanup:
- Deleted dead code: MockRuntimeProvider, encodeSessionContext /
encodePayloadEnvelope / encodeMessage + their types, ExecutionRequest type
alias (migrated 6 imports → RunDescriptor), test-agents/ Python harness
(drove removed HTTP endpoints), 5 mode-specific integration specs +
runs-messaging spec, 3 stale write-path DTOs, ts-agent.ts test helper.
- Replaced 17 hardcoded sleep() calls with new test/helpers/wait-for.ts
conditional polling (~80s of fixed waits → ~5s of polls).
- Tightened 15 instrumentation assertions to verify metric type + labels
+ round-trip observation.
- Documented 5 conventions in CLAUDE.md §Conventions with grep rules
(errors, logger, ValidationPipe, env vars, metrics).
- Migrated THROTTLE_TTL_MS / THROTTLE_LIMIT to AppConfigService via
ThrottlerModule.forRootAsync.
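The conditional polling that replaced the fixed sleeps can be sketched as a small helper. This is not the contents of test/helpers/wait-for.ts: the signature, option names, and defaults below are assumptions made for illustration.

```typescript
export interface WaitForOptions {
  timeoutMs?: number;  // give up after this long (assumed default)
  intervalMs?: number; // pause between probes (assumed default)
}

// Re-runs `probe` until it returns a truthy value, then resolves with that
// value. Rejects if the deadline passes first. Unlike a fixed sleep, the
// wait ends as soon as the condition holds.
export async function waitFor<T>(
  probe: () => T | Promise<T>,
  { timeoutMs = 5000, intervalMs = 25 }: WaitForOptions = {},
): Promise<NonNullable<T>> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (value) return value as NonNullable<T>;
    if (Date.now() >= deadline) {
      throw new Error(`waitFor: condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In a test, a call such as `await waitFor(() => events.length > 0)` would stand in for a fixed sleep followed by an assertion, where `events` is whatever state the test observes.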
Schema:
- runs.runtime_session_id is now populated at INSERT (so GET /runs/:id
returns the sessionId immediately after POST /runs).
- runtime_sessions.initiator_participant_id is populated from the runtime's
GetSession snapshot (was forged by chooseInitiator). Used by run
recovery as the subscriberId fallback.
Docs:
- README.md, docs/API.md, docs/ARCHITECTURE.md, docs/INTEGRATION.md, CLAUDE.md
rewritten for observer model. Fixed test counts (407→577) and endpoint
table (write paths now 410 Gone).
Save as the GitHub PR body:
Summary
- `direct-agent-auth` plan tasks CP-1..CP-13: `POST /runs` accepts a scenario-agnostic `RunDescriptor` and returns `{runId, sessionId}`; the control-plane polls `GetSession` until the initiator agent opens it, then subscribes to a read-only `StreamSession`. `POST /runs/:id/{messages,signal,context}` return 410 Gone. Agents authenticate to the runtime directly via `macp-sdk-python` / `macp-sdk-typescript`. RFC-MACP-0004 §4 conformant.
- `plans/quality-cleanup.md` (Phases 0–3): dead code purge, hardcoded sleeps → conditional `waitFor()` polling, instrumentation assertions tightened, conventions documented + enforced by grep, gRPC helpers extracted from the runtime provider.
- `observer-invariant.spec.ts` (no `provider.send(` / `openSession(` / `chooseInitiator(` / `retryKickoff(` in `src/`) and `projection-coverage.spec.ts` (every canonical event type has a reducer).
Behavioural changes (call out for reviewers)

| Endpoint | Before | After |
|---|---|---|
| `POST /runs` | `ExecutionRequest` with `kickoff[]`, `participants[].role`, `policyHints`, `commitments[]`, `initiatorParticipantId` | `RunDescriptor` only (`forbidNonWhitelisted: true`); returns `sessionId` |
| `POST /runs/:id/messages` | `provider.send` | 410 Gone (`ENDPOINT_REMOVED`) |
| `POST /runs/:id/signal` | | 410 Gone (`ENDPOINT_REMOVED`) |
| `POST /runs/:id/context` | | 410 Gone (`ENDPOINT_REMOVED`) |
| `POST /runs/:id/cancel` | `runtime.CancelSession` with control-plane's identity | Option A: proxy to `metadata.cancelCallback.url`. Option B: `runtime.CancelSession` only when `metadata.cancellationDelegated === true` |

Examples-service and the SDKs landed their direct-agent-auth slices before this PR; they already produce `RunDescriptor` shapes against the new contract (see `ui-console/plans/direct-agent-auth.md`).
Schema changes
- `runs.runtime_session_id` is now populated at INSERT (was set later via `markBindingSession`). `GET /runs/:id` returns the sessionId immediately.
- `runtime_sessions.initiator_participant_id` is populated from the runtime's `GetSession` snapshot (was forged by the deleted `chooseInitiator`). Still used by recovery.
Removed (deletion notice: coordinate with downstream consumers)
- `RUNTIME_AGENT_TOKENS_JSON` env var.
- `KICKOFF_MAX_RETRIES` env var.
- `MockRuntimeProvider` class (use `ScriptedMockRuntimeProvider` for tests).
- `test-agents/` Python harness + `scripts/run-e2e.sh`.
- Write-path DTOs: `send-run-message.dto.ts`, `send-signal.dto.ts`, `update-context.dto.ts`.
- Integration specs: `decision-mode`, `proposal-mode`, `task-mode`, `quorum-mode`, `handoff-mode`, `runs-messaging` (replaced by consolidated `observer-mode.integration.spec.ts`).
- `ExecutionRequest` type alias (use `RunDescriptor`).
- `encodeSessionContext` / `encodePayloadEnvelope` / `encodeMessage` on `ProtoRegistryService` and their types (`PayloadEnvelopeInput`, `ProtoPayload`, `PayloadEncoding`).
- `TestClient.sendMessage / sendSignal / updateContext` helpers + `ts-agent.ts`.
New env vars
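
| Variable | Default | Notes |
|---|---|---|
| `SESSION_POLL_BASE_MS` | `100` | |
| `SESSION_POLL_MAX_MS` | `1000` | |
| `SESSION_POLL_TIMEOUT_MS` | `60000` | |
| `CANCEL_CALLBACK_TIMEOUT_MS` | `5000` | |
| `THROTTLE_TTL_MS` / `THROTTLE_LIMIT` | `60000` / `100` | Migrated to `AppConfigService` |

Test plan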
- `npm run lint --max-warnings=0`: clean
- `npm run build`: clean
- `npm test`: 577/577 across 44 suites
- `npm run test:integration`: 59/59 across 14 suites
- With `examples-service` at this branch, launch a fraud scenario; verify `POST /runs` returns `{runId, sessionId}`, agents emit envelopes directly (runtime gRPC logs), control-plane shows zero `Send` RPCs, projection populates from runtime `Proposal` / `Vote` / `Commitment` events, UI cancel routes via the initiator's `cancelCallback`.
- Confirm `RUNTIME_BEARER_TOKEN` is set with a least-privilege identity (`can_start_sessions: false`); confirm `RUNTIME_AGENT_TOKENS_JSON` removed from Railway env; confirm dev-mode toggles (`MACP_ALLOW_DEV_SENDER_HEADER`, `RUNTIME_USE_DEV_HEADER`) are NOT set in prod.
Migration impact
- examples-service already produces `RunDescriptor` (compiler shipped 2026-04-15). Reading the response's `sessionId` and forwarding the initiator's `cancelCallback` URL into `session.metadata.cancelCallback` are the only follow-ups needed there.
- `POST /runs/:id/{messages,signal,context}` callers need to be removed. Display `sessionId` in run details (already returned by `GET /runs/:id`).
- Remove `RUNTIME_AGENT_TOKENS_JSON`. Set `RUNTIME_BEARER_TOKEN` to one least-privilege observer token. The runtime's `MACP_AUTH_TOKENS_JSON` entry for the control-plane must have `can_start_sessions: false`.
Net diff
80 files changed, 2,056 insertions, 7,605 deletions — net −5,549 LoC.