Observer-only control-plane + quality cleanup + session discovery sup…#19

Merged
ajit-zer07 merged 1 commit into main from auth_context-update
Apr 18, 2026

@ajit-zer07
Contributor

Summary

  • Observer-only control-plane (direct-agent-auth CP-1..CP-13). POST /runs accepts a scenario-agnostic RunDescriptor and returns {runId, sessionId}. The control-plane
    polls GetSession until the initiator agent opens the session, then subscribes read-only via StreamSession. Write endpoints (/messages, /signal, /context) return
    410 Gone. Agents use macp-sdk-python / macp-sdk-typescript directly.
  • Session discovery. New SessionDiscoveryService subscribes to WatchSessions and auto-creates run records for sessions started by external launchers.
  • Quality cleanup (Phases 0–3). Dead code purge, sleep() → waitFor() polling, instrumentation assertions tightened, conventions documented, gRPC helpers extracted,
    --max-warnings=0 enforced.

Behavioral changes

| Endpoint | Before | After |
| --- | --- | --- |
| POST /runs | Accepted ExecutionRequest with kickoff[], participants[].role, etc. | Accepts RunDescriptor only (forbidNonWhitelisted); returns sessionId |
| POST /runs/:id/messages | 200 (CP forged envelope) | 410 Gone |
| POST /runs/:id/signal | 200 (CP forged envelope) | 410 Gone |
| POST /runs/:id/context | 200 (CP forged envelope) | 410 Gone |
| POST /runs/:id/cancel | Always called runtime.CancelSession | Option A: proxies to cancelCallback.url. Option B: CancelSession when cancellationDelegated=true |
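
For reviewers, the new cancel behavior can be read as a small decision function. This is a minimal sketch, not the code in this PR: the `RunMetadata` shape, the `planCancel` name, the precedence of Option A over Option B, and the `unsupported` fallback are all assumptions made for illustration.

```typescript
// Hypothetical metadata shape; field names follow the PR text only.
interface RunMetadata {
  cancelCallback?: { url: string };
  cancellationDelegated?: boolean;
}

type CancelAction =
  | { kind: "proxy"; url: string } // Option A: forward the cancel to the agent callback
  | { kind: "runtime-cancel" }     // Option B: call runtime.CancelSession
  | { kind: "unsupported" };       // assumed fallback when neither option is configured

function planCancel(metadata: RunMetadata): CancelAction {
  // Assumption: a configured callback URL wins over delegation.
  if (metadata.cancelCallback?.url) {
    return { kind: "proxy", url: metadata.cancelCallback.url };
  }
  if (metadata.cancellationDelegated === true) {
    return { kind: "runtime-cancel" };
  }
  return { kind: "unsupported" };
}
```

Keeping the decision pure makes it easy to unit-test separately from the HTTP proxy call and the gRPC call.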

New features

  • Session discovery via WatchSessions gRPC stream (SESSION_DISCOVERY_ENABLED=true default). Auto-creates runs for externally-started sessions.
  • listSessions() / watchSessions() on RuntimeProvider interface.
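
The discovery behavior boils down to "create a run record for any watched session that does not already have one". The sketch below illustrates that idea under stated assumptions: `SessionDiscoverySketch`, `WatchedSession`, `RunRecord`, and the `discovered-N` ID scheme are hypothetical and do not reflect the real SessionDiscoveryService or the WatchSessions message types.

```typescript
// Hypothetical shapes for illustration; the real stream and record types are not shown in this PR text.
interface WatchedSession {
  sessionId: string;
}

interface RunRecord {
  runId: string;
  sessionId: string;
  source: "api" | "discovered";
}

class SessionDiscoverySketch {
  private readonly runsBySession = new Map<string, RunRecord>();
  private counter = 0;

  // Called when POST /runs creates a run, so discovery skips that session later.
  registerLaunched(runId: string, sessionId: string): void {
    this.runsBySession.set(sessionId, { runId, sessionId, source: "api" });
  }

  // Called once per session reported by the watch stream.
  // Returns the new record, or undefined when the session already has a run.
  onSessionSeen(session: WatchedSession): RunRecord | undefined {
    if (this.runsBySession.has(session.sessionId)) {
      return undefined;
    }
    this.counter += 1;
    const record: RunRecord = {
      runId: `discovered-${this.counter}`, // placeholder ID scheme, assumed
      sessionId: session.sessionId,
      source: "discovered",
    };
    this.runsBySession.set(session.sessionId, record);
    return record;
  }
}
```

The handler is idempotent per sessionId, so a watch stream that replays sessions after a reconnect would not create duplicate runs in this sketch.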

New env vars

| Var | Default | Purpose |
| --- | --- | --- |
| SESSION_POLL_BASE_MS | 100 | Observer GetSession poll interval |
| SESSION_POLL_MAX_MS | 1000 | Capped backoff |
| SESSION_POLL_TIMEOUT_MS | 60000 | Timeout waiting for initiator |
| CANCEL_CALLBACK_TIMEOUT_MS | 5000 | UI cancel → agent callback |
| SESSION_DISCOVERY_ENABLED | true | Auto-discover external sessions |
| THROTTLE_TTL_MS / THROTTLE_LIMIT | 60000 / 100 | Now via AppConfigService |
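
To show how the three SESSION_POLL_* values might interact, here is a rough sketch of a capped-backoff poll loop. The PR text only states a 100 ms to 1 s backoff with a 60 s timeout; the doubling schedule, the `PollConfig` type, and the `delayFor` / `pollUntilOpen` names are assumptions for illustration, and `getSession` stands in for whatever the provider actually exposes.

```typescript
// Hypothetical config holder mirroring the SESSION_POLL_* defaults above.
interface PollConfig {
  baseMs: number;
  maxMs: number;
  timeoutMs: number;
}

const DEFAULT_POLL: PollConfig = { baseMs: 100, maxMs: 1000, timeoutMs: 60000 };

// Delay after a failed attempt (0-based): base * 2^attempt, capped at maxMs.
// Doubling is an assumption; the PR does not state the growth factor.
function delayFor(attempt: number, cfg: PollConfig = DEFAULT_POLL): number {
  return Math.min(cfg.baseMs * Math.pow(2, attempt), cfg.maxMs);
}

// Polls until getSession returns a value or the timeout budget is exhausted.
async function pollUntilOpen<T>(
  getSession: () => Promise<T | undefined>,
  cfg: PollConfig = DEFAULT_POLL,
): Promise<T> {
  const deadline = Date.now() + cfg.timeoutMs;
  for (let attempt = 0; ; attempt++) {
    const session = await getSession();
    if (session !== undefined) {
      return session;
    }
    const delay = delayFor(attempt, cfg);
    if (Date.now() + delay > deadline) {
      throw new Error(`session not opened within ${cfg.timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
  }
}
```

With the defaults, the assumed schedule reaches the 1000 ms cap on the fifth wait and stays there until the timeout.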

Removed

  • RUNTIME_AGENT_TOKENS_JSON, KICKOFF_MAX_RETRIES env vars
  • MockRuntimeProvider, encode methods + types (PayloadEnvelopeInput, ProtoPayload)
  • ExecutionRequest type alias (use RunDescriptor)
  • test-agents/ Python harness, scripts/run-e2e.sh
  • DTOs: send-run-message.dto.ts, send-signal.dto.ts, update-context.dto.ts
  • Integration specs: 5 mode-specific + runs-messaging (replaced by observer-mode.integration.spec.ts)
  • TestClient.sendMessage/sendSignal/updateContext, ts-agent.ts

Invariant tests (CI gates)

  • observer-invariant.spec.ts — fails on any provider.send(, openSession(, chooseInitiator(, retryKickoff( in src/
  • projection-coverage.spec.ts — every CANONICAL_EVENT_TYPES entry must have a reducer branch
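
Both gates are simple static checks. The following sketch shows the general technique, assuming sources are passed in as strings; the actual spec files may read src/ from disk and match differently. `findViolations` and `missingReducers` are hypothetical names, and only the four forbidden patterns come from this PR.

```typescript
// Patterns listed in this PR as forbidden anywhere under src/.
const FORBIDDEN_CALLS = ["provider.send(", "openSession(", "chooseInitiator(", "retryKickoff("];

interface Violation {
  file: string;
  line: number; // 1-based line number of the match
  pattern: string;
}

// Plain substring scan over file contents keyed by path (hypothetical helper).
function findViolations(files: Record<string, string>): Violation[] {
  const violations: Violation[] = [];
  for (const file of Object.keys(files)) {
    files[file].split("\n").forEach((text, index) => {
      for (const pattern of FORBIDDEN_CALLS) {
        if (text.includes(pattern)) {
          violations.push({ file, line: index + 1, pattern });
        }
      }
    });
  }
  return violations;
}

// Returns event types that have no own-property entry in the reducer map (hypothetical helper).
function missingReducers(eventTypes: readonly string[], reducers: Record<string, unknown>): string[] {
  return eventTypes.filter((type) => !Object.prototype.hasOwnProperty.call(reducers, type));
}
```

A spec built on helpers like these would assert that both functions return an empty array. Note that a substring scan also flags comments and longer identifiers ending in a forbidden name, which is usually acceptable for a CI gate of this kind.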

Test plan

  • npm run lint --max-warnings=0 — clean
  • npm run build — clean
  • npm test: 577/577 (44 suites)
  • npm run test:integration: 59/59 (14 suites)
  • npm run format — applied
  • Manual smoke: launch a scenario via examples-service, verify agents emit envelopes directly, CP projects them, UI cancel proxies via callback
  • Production: set RUNTIME_BEARER_TOKEN (observer, can_start_sessions: false), remove RUNTIME_AGENT_TOKENS_JSON

Net diff

93 files changed, 1,767 insertions, 1,636 deletions — net +131 LoC (includes new session discovery service, invariant tests, waitFor helper, and gRPC helpers; offset by
deleted dead code).

…port

  Transform the control-plane into a scenario-agnostic observer per the
  direct-agent-auth plan (CP-1..CP-13). Agents authenticate to the runtime
  directly (RFC-MACP-0004 §4); the control-plane never calls Send.

  Followed by a four-phase quality cleanup (plans/quality-cleanup.md):
  dead code removal, test-suite hygiene (sleep→polling), convention
  documentation, and gRPC helper extraction.

  Behavioral changes:
  - POST /runs accepts RunDescriptor only (forbidNonWhitelisted); returns
    {runId, sessionId, status, traceId}. sessionId allocated via UUID v4.
  - POST /runs/:id/{messages,signal,context} → 410 Gone (ENDPOINT_REMOVED).
  - POST /runs/:id/cancel: Option A proxies to metadata.cancelCallback.url;
    Option B calls runtime.CancelSession when cancellationDelegated=true.
  - New observer execute flow: initialize → pollForOpenSession (100ms→1s
    backoff, 60s timeout) → bindSession → subscribeSession (read-only) →
    stream consumer.
  - Session discovery: SessionDiscoveryService subscribes to WatchSessions
    and auto-creates runs for externally-started sessions.

  Internal changes:
  - RuntimeProvider narrowed: subscribeSession() replaces openSession/send/
    startSession/streamSession. Added listSessions/watchSessions.
  - Extracted grpc-helpers.ts (fromEnvelope, fromAck, fromSessionMetadata,
    buildMetadata, getClientMethod) — provider 553→459 lines.
  - RuntimeCredentialResolverService reverted to single-bearer (CP-9).
  - ProtoRegistryService.decodeKnown falls back to JSON on proto failure.
  - Deleted: MockRuntimeProvider, encode methods + types, ExecutionRequest
    alias, test-agents/ harness, 5 mode-specific integration specs,
    runs-messaging spec, 3 write-path DTOs, ts-agent.ts helper.

  Quality:
  - waitFor() polling replaces 17 hardcoded sleep() calls in integration
    tests (~80s→~5s conditional polling).
  - Two invariant tests: observer-invariant.spec.ts (no provider.send),
    projection-coverage.spec.ts (every canonical event has a reducer).
  - Conventions documented in CLAUDE.md §Conventions with grep rules.
  - forbidNonWhitelisted applied to all class-DTO @Body endpoints.
  - THROTTLE_TTL_MS/THROTTLE_LIMIT migrated to AppConfigService.
  - npm run lint upgraded to --max-warnings=0.

  Verification:
  - lint clean, build clean
  - 577/577 unit tests (44 suites)
  - 59/59 integration tests (14 suites)
@ajit-zer07 ajit-zer07 merged commit e3ac247 into main Apr 18, 2026
7 checks passed
@ajit-zer07 ajit-zer07 deleted the auth_context-update branch April 18, 2026 22:35