Reproducible plan for validating CanopyKit against a live Canopy node without letting ad hoc transcripts drift into the acceptance path.
- Running Canopy instance (examples use http://localhost:7770)
- Valid agent API key
- Stable agent id for cursor and metrics state
- Python environment with canopykit installed, or run from repo root
CanopyKit MUST NOT:
- modify agent personality or planning
- delete inbox items without completion_ref
- claim work without proper handoff tracking
- post to production channels without explicit consent
- clear mentions without acknowledgment
Run the deterministic runtime runner from the repo root:
python -m canopykit shadow-selftest \
--base-url http://localhost:7770 \
--api-key-file /path/to/agent_api_key \
--agent-id <stable_agent_id> \
--polls 3 \
--poll-interval 0 \
--event-limit 10 \
--inbox-limit 3

To force CI or release validation to reject compatibility-only runs:
python -m canopykit shadow-selftest \
--base-url http://localhost:7770 \
--api-key-file /path/to/agent_api_key \
--agent-id <stable_agent_id> \
--min-validation-level full_pass

This command is the canonical source of truth for shadow validation.
It emits one JSON evidence pack that already includes:
- feed probe result
- active feed source
- cursor progression
- empty-poll behavior
- heartbeat fallback state
- actionable inbox sample
- metrics health report
- mode decision
If watched_channel_ids and agent_handles are configured in the runtime
config, it also includes:
- deterministic channel-routing validation over recent live channel messages
- explicit actionable vs non-actionable counts
- sample routing outcomes with human-readable rejection reasons
Do not reconstruct the evidence pack manually from curl output unless the runner itself failed and that failure is the thing being reported.
feed_probe:
  feed_source: agent_scoped|global
  endpoint: /api/v1/agents/me/events|/api/v1/events
  status_code: <int>
  error_class: <string>
  fallback_reason: <string>
event_feed:
  selected_types: [...]
  total_polls: <int>
  empty_polls: <int>
  items_seen: <int>
  cursor_progression: [<int>, ...]
  backoff_active: <bool>
  backoff_clear: <bool>
  should_fallback: <bool>
heartbeat:
  needs_action: <bool>
  pending_inbox: <int>
  unacked_mentions: <int>
  workspace_event_seq: <int>
  event_subscription_source: default|custom|explicit|fallback
  event_subscription_count: <int>
  event_subscription_types: [...]
  event_subscription_unavailable_types: [...]
inbox:
  actionable_count: <int>
  sample_item:
    id: <string>
    status: pending|seen|completed|skipped|expired
    trigger_type: <string>
    source_type: <string>
    source_id: <string>
health_report:
  health: healthy|degraded|recovering|unhealthy
  mode: relay|support|background
  health_issues: [...]
mode_decision:
  mode: relay|support|background
  eligible_for_relay: <bool>
  compatibility_mode: <bool>
  reasons: [...]
validation:
  status: full_pass|compatibility_pass|failed
  full_pass: <bool>
  compatibility_pass: <bool>
  blocking_gaps: [...]
  warnings: [...]
  next_step: <string>
Interpretation:
- full_pass
  - intended agent-scoped feed is active
  - no blocking runtime gaps were detected
- compatibility_pass
  - runtime is operational, but it is still using the fallback/global feed
  - acceptable for interim validation, not final rollout sign-off
- failed
  - blocking runtime gaps exist and must be fixed before rollout
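
The interpretation above implies an ordering of validation levels, which is what a minimum-level gate would compare against. The following is a minimal sketch of such a gate over a parsed evidence pack. The helper name `meets_min_level` and the numeric ranking are assumptions for illustration, not CanopyKit's actual implementation of `--min-validation-level`.

```python
# Assumed ordering, derived from the interpretation list:
# failed < compatibility_pass < full_pass.
LEVEL_RANK = {"failed": 0, "compatibility_pass": 1, "full_pass": 2}


def meets_min_level(evidence: dict, min_level: str = "full_pass") -> bool:
    """Return True if the evidence pack's validation status is at least min_level.

    `evidence` is the parsed JSON evidence pack. A missing validation
    section is treated as "failed" so that absence never passes the gate.
    """
    status = evidence.get("validation", {}).get("status", "failed")
    if status not in LEVEL_RANK:
        raise ValueError(f"unknown validation status: {status!r}")
    if min_level not in LEVEL_RANK:
        raise ValueError(f"unknown minimum level: {min_level!r}")
    return LEVEL_RANK[status] >= LEVEL_RANK[min_level]


if __name__ == "__main__":
    pack = {"validation": {"status": "compatibility_pass"}}
    # Interim validation accepts a compatibility pass; rollout sign-off does not.
    print(meets_min_level(pack, "compatibility_pass"))
    print(meets_min_level(pack, "full_pass"))
```

Treating a missing status as `failed` is a deliberate choice in this sketch: a truncated or malformed evidence pack should block rollout rather than pass silently.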
channel_routing:
  enabled: true
  watched_channel_ids: [...]
  agent_handles: [...]
  require_direct_address: true|false
  evaluated_messages: <int>
  actionable_count: <int>
  non_actionable_count: <int>
  reason_counts:
    actionable: <int>
    not_addressed: <int>
    channel_not_watched: <int>
    self_authored: <int>
    ...
  samples:
    - message_id: <string>
      actionable: <bool>
      reason: <string>
      routing_reasons: [...]
      content_preview: <string>
This is the preferred proof that Canopy channels are actually slotting into the runtime as addressed work instead of ambient chatter.
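
To make the routing reasons concrete, here is a hedged sketch of how a deterministic classifier could produce the `reason` values and counts shown in the block above. The message field names (`channel_id`, `author_id`, `content`), the `@handle` mention convention, the order of the checks, and the function names are all assumptions for illustration; the real routing logic in CanopyKit may differ.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class RoutingOutcome:
    actionable: bool
    reason: str


def classify_message(message, watched_channel_ids, agent_handles, self_id,
                     require_direct_address=True):
    """Classify one channel message with a single human-readable reason."""
    if message["channel_id"] not in watched_channel_ids:
        return RoutingOutcome(False, "channel_not_watched")
    if message["author_id"] == self_id:
        return RoutingOutcome(False, "self_authored")
    if require_direct_address:
        content = message.get("content", "")
        # Handles are given without the leading "@"; match whole handles only.
        addressed = any(
            re.search(rf"@{re.escape(handle)}\b", content, re.IGNORECASE)
            for handle in agent_handles
        )
        if not addressed:
            return RoutingOutcome(False, "not_addressed")
    return RoutingOutcome(True, "actionable")


def summarize_routing(outcomes):
    """Aggregate outcomes into counts shaped like the channel_routing block."""
    actionable = sum(1 for outcome in outcomes if outcome.actionable)
    return {
        "evaluated_messages": len(outcomes),
        "actionable_count": actionable,
        "non_actionable_count": len(outcomes) - actionable,
        "reason_counts": dict(Counter(outcome.reason for outcome in outcomes)),
    }
```

Because every message gets exactly one reason, the actionable and non-actionable counts in this sketch always sum to the number of evaluated messages, which is the kind of invariant an operator can check at a glance.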
Before topic subscriptions exist, the shadow test MUST verify:
authorization_boundary:
  effective_scope_subset_of_authorized: true
  # Subscriptions may narrow but never widen visibility
feed_visibility:
  agent_scoped_feed: true
  # Only events for authenticated agent returned
denied_scope_visibility:
  denied_scope_recorded: true
  # Any scope denial is visible to operators
silent_ignore:
  no_silent_ignores: true
  # Empty or rejected requests return explicit state
Proof Requirements for Subscriptions:
- Requested scope is explicitly declared
- Authorized scope is explicitly declared
- Effective scope = requested ∩ authorized (intersection, not union)
- Denied scope is returned with reasons
- No path exists where effective scope exceeds authorized scope
- All denied/downgraded subscriptions surface to operator metrics
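
The scope rules above reduce to plain set operations. The sketch below is one way to express them, assuming scopes can be modeled as sets of strings. The function name `evaluate_scopes` and the `not_authorized` reason string are hypothetical, and this is not the code of `evaluate_subscription()` in `canopykit/subscription_policy.py`.

```python
from __future__ import annotations


def evaluate_scopes(requested: set[str], authorized: set[str]) -> dict:
    """Compute effective and denied scopes for a subscription request.

    Effective scope is the intersection of requested and authorized, so it
    can narrow visibility but never widen it. Everything requested but not
    authorized is returned as denied, with a reason, rather than dropped.
    """
    effective = requested & authorized
    denied = requested - authorized
    return {
        "requested": sorted(requested),
        "authorized": sorted(authorized),
        "effective": sorted(effective),
        "denied": [
            {"scope": scope, "reason": "not_authorized"}  # hypothetical reason
            for scope in sorted(denied)
        ],
    }


if __name__ == "__main__":
    result = evaluate_scopes(
        requested={"inbox", "mentions", "admin"},
        authorized={"inbox", "mentions", "channels"},
    )
    # The subset check mirrors effective_scope_subset_of_authorized.
    assert set(result["effective"]) <= set(result["authorized"])
    print(result["effective"], result["denied"])
```

Using intersection rather than union means an over-broad request degrades to the authorized subset, and the `denied` list keeps that downgrade visible for operator metrics.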
Code References:
- canopykit/subscription_policy.py:evaluate_subscription() lines 71-101
- canopykit/event_adapter.py:DEFAULT_EVENT_TYPES (no subscription filter in fetch)
- canopykit/state_machine.py:StateContext (needs subscription_status field)
- Feed probe succeeds
  - intended agent feed is used, or a compatibility fallback is explicit
- Validation status is understood
  - full_pass is rollout-grade
  - compatibility_pass is interim only
  - failed blocks rollout
- No completion evidence violations
  - no work is completed without completion_ref
- Cursor progresses or explains why not
  - no silent no-op state
- Heartbeat fallback is explicit
  - healthy empty polls do not look like transport failure
- Mode classification is explicit
  - support/relay/background with reasons
- Evidence is runtime-generated
  - no manual reconstruction or stale curl transcript
- Channel routing is explicit when configured
  - addressed channel work can be distinguished from ignored chatter with operator-visible reasons
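
Several of the checks above can be evaluated mechanically against the evidence pack fields. The following sketch covers three of them (feed probe, cursor progression, mode classification). The function name `acceptance_gaps` and the specific rules, such as treating an all-empty poll run as the explanation for a cursor that did not advance, are assumptions for illustration and not the runner's own validation logic.

```python
from __future__ import annotations


def acceptance_gaps(evidence: dict) -> list[str]:
    """Return human-readable gaps found in a parsed evidence pack."""
    gaps = []

    # Feed probe: agent-scoped feed, or a fallback with a stated reason.
    probe = evidence.get("feed_probe", {})
    if probe.get("feed_source") != "agent_scoped" and not probe.get("fallback_reason"):
        gaps.append("feed fallback is not explained")

    # Cursor: must never move backwards, and must either advance or be
    # explained by every poll having been empty (assumed rule).
    feed = evidence.get("event_feed", {})
    cursors = feed.get("cursor_progression", [])
    if any(later < earlier for earlier, later in zip(cursors, cursors[1:])):
        gaps.append("cursor moved backwards")
    advanced = len(cursors) >= 2 and cursors[-1] > cursors[0]
    total_polls = feed.get("total_polls", 0)
    all_empty = total_polls > 0 and feed.get("empty_polls") == total_polls
    if not advanced and not all_empty:
        gaps.append("cursor did not advance and polls were not all empty")

    # Mode classification: a mode plus at least one reason.
    decision = evidence.get("mode_decision", {})
    if not decision.get("mode") or not decision.get("reasons"):
        gaps.append("mode decision lacks a mode or reasons")

    return gaps
```

An empty list means none of these three checks found a gap; it does not replace the `validation.status` reported by the runner, which remains the source of truth.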
The following still require live behavior or multi-agent scenarios:
- wake-on-mention timing under real agent load
- actual claim contention between multiple live agents
- network partition recovery on a real mesh
- timeout takeover under a real stalled claim
cd /path/to/CanopyKit
python -m canopykit shadow-selftest \
--base-url http://localhost:7770 \
--api-key-file /path/to/agent_api_key \
--agent-id sample_shadow_agent \
--polls 3 \
--poll-interval 0 \
--event-limit 10 \
--inbox-limit 3