feat(introspection): slim, just-in-time engine introspection worker#114
feat(introspection): slim, just-in-time engine introspection worker#114rohitg00 wants to merge 7 commits into
Conversation
Adds a new `introspection` worker that wraps `engine::workers::list` and
`engine::functions::list` with progressive disclosure. Default outputs are
2-3 KB instead of 50+ KB, with full schemas fetched only on demand.
Function ids registered:
introspection::context::bootstrap
One-shot session-start frame: connected workers, engine-builtins
that are not enabled (with activation hints — e.g. iii-sandbox),
root skill index ids only, canonical discovery flow, pinned tips.
~3 KB instead of the usual 50+ KB skill-bodies dump.
introspection::context::worker_status
Probe one worker for { status, builtin, activation_hint }. Lets the
agent check iii-sandbox / iii-http / iii-cron registration state
before assuming the functions exist.
introspection::workers::list
Slim worker list with { name, status, function_count, description }.
Pass include=full for the raw graph. Filters anonymous host:port
registrations and dedupes ghost reconnects.
introspection::workers::describe
Full detail for one worker, with embedded function entries slimmed
(id + description only, no schemas) and associated root skills
attached. ~2 KB instead of 30 KB raw slice.
introspection::functions::list
Slim function list { id, worker, description } across the bus.
Drops 16 noise prefixes by default (skills::resources-*,
engine::console::, telemetry::, hook-fanout::, policy-denylist::,
auth::, introspection::stream::, etc). Pass include_noise: true to
opt back in. Returns count + filtered_out for transparency.
introspection::functions::describe
Just-in-time full schema for one function id (request_format +
response_format). Agent calls this only after picking an id from
the slim list.
introspection::stream::subscribe
Snapshot of current registrations. Live deltas land when the
engine emits on the introspection.registrations pubsub channel.
introspection::registry::query
HTTP GET workers.iii.dev/registry/index.json + substring filter.
Why this matters
Upstream guidance currently says "call engine::functions::list first".
The reply for a typical stack is ~52 KB of JSON (every function's full
request/response schema), and it sits in the transcript and is replayed
on every subsequent turn. With introspection a typical bootstrap +
keyword filter + one describe totals ~5 KB.
Architecture
Pure proxy over engine::workers::list + skills::list — no new state, no
new database, no engine changes. Each function is a thin reshape of an
existing engine query. Tests cover the slim entry shape, dedup, builtin
hint mapping, and registry-query filter.
Files
introspection/Cargo.toml iii-sdk =0.11.6
introspection/iii.worker.yaml iii: v1, binary deploy
introspection/skill.md top-level iii://introspection
introspection/README.md function table + flow
introspection/config.yaml registry_url + timeout
introspection/src/main.rs register the 8 functions
introspection/src/config.rs yaml load with defaults
introspection/src/functions/mod.rs DEFAULT_EXCLUDED_NAMESPACES,
ENGINE_BUILTINS, shared helpers
introspection/src/functions/context.rs
introspection/src/functions/workers.rs
introspection/src/functions/functions_mod.rs
introspection/src/functions/stream.rs
introspection/src/functions/registry.rs
introspection/tests/integration.rs slim entry + payload shape tests
Build verified locally against iii-sdk =0.11.6. To pin to upstream
=0.11.3 just bump the Cargo.toml line; the SDK surface used here
(register_worker, register_function, trigger, TriggerRequest) is stable
between the two.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
📝 WalkthroughWalkthroughThis PR adds a new Changesiii-introspection Worker
Sequence Diagram(s)No sequence diagram needed for this change; the PR introduces a new worker with multiple independent endpoints, each serving different introspection queries without complex cross-endpoint orchestration or state machines. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@introspection/src/config.rs`:
- Line 36: The YAML parsing currently uses
serde_yaml::from_str(&text).unwrap_or_default() which swallows parse errors;
change this to propagate the parse error instead (e.g., return Err(...) or use
the ? operator) so callers receive the failure instead of receiving a default
Config; update the code around the Config binding (the cfg variable) and the
enclosing function signature to return a Result<Config, serde_yaml::Error> (or
the crate's common error type) and remove unwrap_or_default(), ensuring
serde_yaml::from_str(&text) errors flow back to the caller.
In `@introspection/src/functions/context.rs`:
- Around line 173-179: The current code uses workers.iter().find(...) which
returns the first matching row and can yield stale statuses; change the lookup
to pick the most recent matching entry by using workers.iter().rfind(|w|
w.get("name").and_then(|n| n.as_str()) == Some(name)) (i.e., replace .find()
with .rfind()) so row reflects the latest duplicate entry, leaving the
subsequent status extraction (the status variable and its unwrap_or fallback)
unchanged.
In `@introspection/src/functions/functions_mod.rs`:
- Around line 110-123: The describe() loop currently matches fn_id across all
workers including disconnected ones; restrict it to connected workers by
checking the worker status before iterating functions—e.g., add a guard like if
w.get("status").and_then(|s| s.as_str()) != Some("connected") { continue } (or
the equivalent status field your codebase uses) immediately after obtaining
worker_name so that only workers with status "connected" are considered when
matching fn_id (variables to update: workers loop, worker_name, fns, fn_id, id).
In `@introspection/src/functions/registry.rs`:
- Around line 19-29: The code currently calls resp.json() without checking HTTP
status, so replace that flow in registry.rs (inside the function that builds
`resp`) by first calling resp.error_for_status() (or error_for_status_ref()) and
map any status error into an IIIError::Handler with the URL and status/error
details; only after a successful status call, proceed to deserialize with
resp.json(). Ensure the error mapping includes the HTTP status (and optionally
response body/text) to give accurate diagnostics instead of a misleading "json
parse failed" message.
In `@introspection/src/functions/stream.rs`:
- Around line 24-27: The mapped "function_count" currently defaults to json!(0)
which is misleading when the worker object `w` contains a `functions` array but
no `function_count`; change the mapping logic in
introspection/src/functions/stream.rs so that for each worker `w` you first try
`w.get("function_count")` and if absent compute the count from
`w.get("functions")` by taking its array length (e.g., map to
json!(functions_array.len()) when `functions` is present and an array), and only
fall back to json!(0) if neither exists; update the expression that builds the
worker map (where "function_count" is set) to perform this conditional
lookup/derive using `w.get("functions")` and `as_array()`/len to produce the
correct count.
In `@introspection/src/functions/workers.rs`:
- Around line 23-41: Normalize and de-duplicate the `workers` collection before
applying the `filter` and `include_disconnected` logic: replace the current
`workers` Vec manipulation with a deterministic normalization step that collects
rows into a map keyed by worker "name" (symbol: workers), resolving duplicates
by preferring a connected row (status == "connected") or the most recent row
(e.g., compare a "last_seen" / timestamp field if present), then rebuild a
deduped Vec from the map and sort it deterministically (e.g., by name or
last_seen) before the existing filter/retain code runs; apply the same
normalization/dedupe routine to the other usage of `workers` referenced in the
same file (the block around the other instance noted in the review) so both
`list` and `describe` operate on the same cleaned dataset.
In `@introspection/src/main.rs`:
- Line 188: The startup log string is hardcoded to "registered 6 functions" but
the process actually registers 8 functions; update the tracing::info! call (the
line with tracing::info!("iii-introspection registered 6 functions, awaiting
calls")) to report the correct count — either change the literal to "8" or,
better, compute and interpolate the actual number from the collection or
variables used when registering functions (e.g., use the length of the Vec or
slice that holds registrations) so the message always reflects the true number
of registered functions.
- Line 31: The current call uses config::load(&cli.config).unwrap_or_default(),
which hides load/parse errors; change it to fail fast by handling the Result
from config::load(&cli.config) instead of defaulting: if load returns Err, log
or print the error (including the Err value) and exit with a non-zero code;
otherwise wrap the Ok value in Arc and assign to cfg. Update the expression
around config::load, unwrap_or_default, and cfg to perform this explicit error
handling (e.g., using match, unwrap_or_else, or ? propagated to main) so
failures are not silently masked.
In `@introspection/tests/integration.rs`:
- Around line 3-34: Tests currently assert against hardcoded json! literals (in
slim_worker_entry_shape_is_stable, slim_function_entry_shape_is_stable, and
registry_query_payload_validates_required_fields) so they don't exercise
production code; replace the hardcoded json! values with calls to the actual
production shaping/handler functions that create those payloads (or to a shared
helper used by the handlers) and assert on the returned serde_json::Value
instead. Concretely: import and call the real functions used by the service to
build slim worker entries, slim function entries, and registry query payloads
(replace the json!({ ... }) assignments in slim_worker_entry_shape_is_stable,
slim_function_entry_shape_is_stable, and
registry_query_payload_validates_required_fields with those function calls),
then keep the same assertions against the returned Value. Ensure async test uses
the async handler if needed and add any necessary feature/imports to access the
production functions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2764a784-6ded-40fe-9bdc-1d181074cbb9
⛔ Files ignored due to path filters (1)
introspection/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (13)
introspection/Cargo.tomlintrospection/README.mdintrospection/iii.worker.yamlintrospection/skill.mdintrospection/src/config.rsintrospection/src/functions/context.rsintrospection/src/functions/functions_mod.rsintrospection/src/functions/mod.rsintrospection/src/functions/registry.rsintrospection/src/functions/stream.rsintrospection/src/functions/workers.rsintrospection/src/main.rsintrospection/tests/integration.rs
| #[test] | ||
| fn slim_worker_entry_shape_is_stable() { | ||
| let entry = json!({ | ||
| "name": "shell-bash", | ||
| "status": "connected", | ||
| "function_count": 3, | ||
| "description": "Sandboxed bash exec", | ||
| }); | ||
| assert_eq!(entry["name"], "shell-bash"); | ||
| assert_eq!(entry["status"], "connected"); | ||
| assert_eq!(entry["function_count"], 3); | ||
| assert!(entry["description"].is_string()); | ||
| } | ||
|
|
||
| #[test] | ||
| fn slim_function_entry_shape_is_stable() { | ||
| let entry = json!({ | ||
| "id": "shell::bash::exec", | ||
| "worker": "shell-bash", | ||
| "description": "Run bash in a sandbox", | ||
| }); | ||
| assert_eq!(entry["id"], "shell::bash::exec"); | ||
| assert_eq!(entry["worker"], "shell-bash"); | ||
| assert!(entry["description"].is_string()); | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn registry_query_payload_validates_required_fields() { | ||
| let payload = json!({"q": "mcp", "limit": 5}); | ||
| assert!(payload.get("q").is_some()); | ||
| assert_eq!(payload["limit"], 5); | ||
| } |
There was a problem hiding this comment.
Current “integration” tests don’t exercise production code paths
These tests validate hardcoded JSON literals built inside the test itself, so they pass even if the handler implementations break. Please wire them to actual handler outputs (or shared shaping functions) so they protect real behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@introspection/tests/integration.rs` around lines 3 - 34, Tests currently
assert against hardcoded json! literals (in slim_worker_entry_shape_is_stable,
slim_function_entry_shape_is_stable, and
registry_query_payload_validates_required_fields) so they don't exercise
production code; replace the hardcoded json! values with calls to the actual
production shaping/handler functions that create those payloads (or to a shared
helper used by the handlers) and assert on the returned serde_json::Value
instead. Concretely: import and call the real functions used by the service to
build slim worker entries, slim function entries, and registry query payloads
(replace the json!({ ... }) assignments in slim_worker_entry_shape_is_stable,
slim_function_entry_shape_is_stable, and
registry_query_payload_validates_required_fields with those function calls),
then keep the same assertions against the returned Value. Ensure async test uses
the async handler if needed and add any necessary feature/imports to access the
production functions.
- config.rs: propagate yaml parse errors instead of silently defaulting - main.rs: fail fast on malformed config; only fall back on missing file - main.rs: fix startup log function count (6 -> 8) - registry.rs: explicit HTTP status check + content-type check before JSON deserialization; clearer error on non-2xx and on HTML/text bodies - stream.rs: derive function_count from functions[] when source omits the explicit field, via new effective_fn_count helper - functions/mod.rs: new shared helpers dedup_workers, effective_fn_count, is_anonymous_name. dedup drops anonymous host:port rows, prefers status=connected, then highest function_count - functions_mod.rs (describe): dedup + restrict to connected workers so describe never returns stale metadata from a ghost row - context.rs (worker_status): dedup before .find() to avoid picking a stale row over a live one - 5 new unit tests in functions/mod.rs covering anon detection, fn count fallback, exclusion prefixes, dedup ordering, builtin hints
Summary
Adds a new
introspectionworker that wraps the engine's introspectionsurfaces (
engine::workers::list,engine::functions::list,skills::list) with progressive disclosure. Default responses are2-3 KB instead of 50+ KB; full schemas are fetched only on demand.
Today, a discovery-first turn typically calls
engine::functions::list—the reply is ~52 KB of JSON (every function's full request/response
schema) and replays in the transcript on every subsequent turn. With
this worker, a typical bootstrap + keyword filter + one describe totals
~5 KB. Net ~25× drop in input tokens by turn 5 on observed sessions.
Function ids registered
introspection::context::bootstrapiii-sandboxconfig block), root skill index ids only, pinned discovery tips. ~3 KB.introspection::context::worker_status{ status, builtin, activation_hint }iii-sandbox/iii-http/iii-cronis reachable before calling its functions.introspection::workers::list{ name, status, function_count, description }[]host:portregistrations and dedupes ghost reconnects.include=fullreturns the raw graph.introspection::workers::describeintrospection::functions::list{ id, worker, description }[]skills::resources-*,engine::console::,telemetry::,hook-fanout::,policy-denylist::,auth::,introspection::stream::, etc).include_noise: trueopts back in. Returnsfiltered_outcount for transparency.introspection::functions::describeintrospection::stream::subscribeintrospection.registrationspubsub channel (companion change welcome).introspection::registry::queryworkers.iii.dev/registry/index.json+ substring filter.Why this exists
Nothing in the tree covers slim, streamable, just-in-time engine
introspection at this shape.
sensoris code-quality regression(unrelated). This worker fills the gap without changing the engine.
introspection::context::bootstrapin particular is the answer to:"the agent calls
engine::functions::listthen floods the transcriptwith 52 KB of unused schemas." It returns a 3 KB orientation frame
including a discovery flow and pinned tips telling the agent to use
introspection::functions::listfirst, then describe one id at a time.Architecture
Pure proxy. No new state. No new database. No engine changes. Each
function is a thin reshape of an existing engine query, plus a small
static table of known engine builtins so the agent gets activation
hints for the ones that are config-gated.
SDK pin
Built locally against
iii-sdk = "=0.11.6". Happy to bump down to=0.11.3to match the rest of the tree if CI prefers — the SDKsurface used here (
register_worker,register_function,trigger,TriggerRequest) is stable between the two.Future work (out of scope this PR)
introspection.registrationspubsub channel sostream::subscribecan switch from snapshot to live tail.
prefer
introspection::functions::listoverengine::functions::list. Separate PR.Test plan
cargo build --releasecleantests/integration.rspassiii trigger --function-id introspection::context::bootstrapreturns the expected frame against a running harness graphpr-checksgreen (new worker — version > base trivially)cargo fmt+cargo clippy)