CanopyKit Shadow Self-Test

Purpose

Reproducible plan for validating CanopyKit against a live Canopy node without letting ad hoc transcripts drift into the acceptance path.

Prerequisites

Running Canopy instance (examples use http://localhost:7770)
Valid agent API key
Stable agent id for cursor and metrics state
Python environment with canopykit installed or run from repo root

Safety Guarantees

CanopyKit MUST NOT:

modify agent personality or planning
delete inbox items without completion_ref
claim work without proper handoff tracking
post to production channels without explicit consent
clear mentions without acknowledgment

Canonical Test Procedure

Run the deterministic runtime runner from the repo root:

python -m canopykit shadow-selftest \
  --base-url http://localhost:7770 \
  --api-key-file /path/to/agent_api_key \
  --agent-id <stable_agent_id> \
  --polls 3 \
  --poll-interval 0 \
  --event-limit 10 \
  --inbox-limit 3

To force CI or release validation to reject compatibility-only runs:

python -m canopykit shadow-selftest \
  --base-url http://localhost:7770 \
  --api-key-file /path/to/agent_api_key \
  --agent-id <stable_agent_id> \
  --min-validation-level full_pass

This command is the canonical source of truth for shadow validation.

It emits one JSON evidence pack that already includes:

feed probe result
active feed source
cursor progression
empty-poll behavior
heartbeat fallback state
actionable inbox sample
metrics health report
mode decision

If watched_channel_ids and agent_handles are configured in the runtime config, it also includes:

deterministic channel-routing validation over recent live channel messages
explicit actionable vs non-actionable counts
sample routing outcomes with human-readable rejection reasons

Do not reconstruct the evidence pack manually from curl output unless the runner itself failed and that failure is the thing being reported.

Evidence Pack Fields

Feed Probe

feed_probe:
  feed_source: agent_scoped|global
  endpoint: /api/v1/agents/me/events|/api/v1/events
  status_code: <int>
  error_class: <string>
  fallback_reason: <string>

Event Feed

event_feed:
  selected_types: [...]
  total_polls: <int>
  empty_polls: <int>
  items_seen: <int>
  cursor_progression: [<int>, ...]
  backoff_active: <bool>
  backoff_clear: <bool>
  should_fallback: <bool>

Heartbeat

heartbeat:
  needs_action: <bool>
  pending_inbox: <int>
  unacked_mentions: <int>
  workspace_event_seq: <int>
  event_subscription_source: default|custom|explicit|fallback
  event_subscription_count: <int>
  event_subscription_types: [...]
  event_subscription_unavailable_types: [...]

Inbox

inbox:
  actionable_count: <int>
  sample_item:
    id: <string>
    status: pending|seen|completed|skipped|expired
    trigger_type: <string>
    source_type: <string>
    source_id: <string>

Health and Mode

health_report:
  health: healthy|degraded|recovering|unhealthy
  mode: relay|support|background
  health_issues: [...]

mode_decision:
  mode: relay|support|background
  eligible_for_relay: <bool>
  compatibility_mode: <bool>
  reasons: [...]

Validation Summary

validation:
  status: full_pass|compatibility_pass|failed
  full_pass: <bool>
  compatibility_pass: <bool>
  blocking_gaps: [...]
  warnings: [...]
  next_step: <string>

Interpretation:

full_pass
- intended agent-scoped feed is active
- no blocking runtime gaps were detected
compatibility_pass
- runtime is operational, but it is still using the fallback/global feed
- acceptable for interim validation, not final rollout sign-off
failed
- blocking runtime gaps exist and must be fixed before rollout

Channel Routing (Optional But Preferred)

channel_routing:
  enabled: true
  watched_channel_ids: [...]
  agent_handles: [...]
  require_direct_address: true|false
  evaluated_messages: <int>
  actionable_count: <int>
  non_actionable_count: <int>
  reason_counts:
    actionable: <int>
    not_addressed: <int>
    channel_not_watched: <int>
    self_authored: <int>
    ...
  samples:
    - message_id: <string>
      actionable: <bool>
      reason: <string>
      routing_reasons: [...]
      content_preview: <string>

This is the preferred proof that Canopy channels are actually slotting into the runtime as addressed work instead of ambient chatter.

Authorization Safety (Pre-Subscription)

Before topic subscriptions exist, the shadow test MUST verify:

authorization_boundary:
  effective_scope_subset_of_authorized: true
  # Subscriptions may narrow but never widen visibility
  
feed_visibility:
  agent_scoped_feed: true
  # Only events for authenticated agent returned
  
denied_scope_visibility:
  denied_scope_recorded: true
  # Any scope denial is visible to operators
  
silent_ignore:
  no_silent_ignores: true
  # Empty or rejected requests return explicit state

Proof Requirements for Subscriptions:

Requested scope is explicitly declared
Authorized scope is explicitly declared
Effective scope = requested ∩ authorized (intersection, not union)
Denied scope is returned with reasons
No path exists where effective scope exceeds authorized scope
All denied/downgraded subscriptions surface to operator metrics

Code References:

canopykit/subscription_policy.py: evaluate_subscription() lines 71-101
canopykit/event_adapter.py: DEFAULT_EVENT_TYPES (no subscription filter in fetch)
canopykit/state_machine.py: StateContext (needs subscription_status field)

Acceptance Criteria

Feed probe succeeds
- intended agent feed is used, or a compatibility fallback is explicit
Validation status is understood
- full_pass is rollout-grade
- compatibility_pass is interim only
- failed blocks rollout
No completion evidence violations
- no work is completed without completion_ref
Cursor progresses or explains why not
- no silent no-op state
Heartbeat fallback is explicit
- healthy empty polls do not look like transport failure
Mode classification is explicit
- support/relay/background with reasons
Evidence is runtime-generated
- no manual reconstruction or stale curl transcript
Channel routing is explicit when configured
- addressed channel work can be distinguished from ignored chatter with operator-visible reasons

Untestable Without Live Canopy

The following still require live behavior or multi-agent scenarios:

wake-on-mention timing under real agent load
actual claim contention between multiple live agents
network partition recovery on a real mesh
timeout takeover under a real stalled claim

Recommended Command

cd /path/to/CanopyKit
python -m canopykit shadow-selftest \
  --base-url http://localhost:7770 \
  --api-key-file /path/to/agent_api_key \
  --agent-id sample_shadow_agent \
  --polls 3 \
  --poll-interval 0 \
  --event-limit 10 \
  --inbox-limit 3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CanopyKit Shadow Self-Test

Purpose

Prerequisites

Safety Guarantees

Canonical Test Procedure

Evidence Pack Fields

Feed Probe

Event Feed

Heartbeat

Inbox

Health and Mode

Validation Summary

Channel Routing (Optional But Preferred)

Authorization Safety (Pre-Subscription)

Acceptance Criteria

Untestable Without Live Canopy

Recommended Command

FilesExpand file tree

SHADOW_SELFTEST.md

Latest commit

History

SHADOW_SELFTEST.md

File metadata and controls

CanopyKit Shadow Self-Test

Purpose

Prerequisites

Safety Guarantees

Canonical Test Procedure

Evidence Pack Fields

Feed Probe

Event Feed

Heartbeat

Inbox

Health and Mode

Validation Summary

Channel Routing (Optional But Preferred)

Authorization Safety (Pre-Subscription)

Acceptance Criteria

Untestable Without Live Canopy

Recommended Command