Skip to content

feat(sandbox): router worker dispatching by provider field#119

Open
rohitg00 wants to merge 3 commits into
mainfrom
feat/sandbox-router
Open

feat(sandbox): router worker dispatching by provider field#119
rohitg00 wants to merge 3 commits into
mainfrom
feat/sandbox-router

Conversation

@rohitg00
Copy link
Copy Markdown
Contributor

@rohitg00 rohitg00 commented May 11, 2026

Mirrors provider-router for the sandbox family. Caller-facing surface
moves from per-provider sandbox::<provider>::* to a single
sandbox::* on the router; adapters register only
sandbox::provider::<name>::* underneath. Picks provider from the
payload, falls back to default_provider (default local).

Why

Current PRs #92-#97 each shadow sandbox::* under their own provider
segment, which collides with the engine-shipped iii-sandbox namespace
and gives callers no way to swap providers via config.

What ships here

  • sandbox-abi crate — shared CreateRequest / ExecRequest / ...
    shapes, the SCode space (existing S100/S200/S300/S400/S404/S500/
    S501/S502/S503 plus new S600 unknown provider), and an
    http-status -> SCode helper for REST adapters
  • sandbox router worker — registers sandbox::{create, exec, stop, list, snapshot, expose_port, branch, fs::{read, write}} and forwards
    each call to sandbox::provider::<name>::<leaf> via iii.trigger
    with provider stripped from the payload
  • sandbox-CONTEXT.md — updated for the router + adapter split,
    documents S600, calls out that provider="local" errors until the
    engine-shipped iii-sandbox registers its iii-function surface

Validation

cargo fmt --check       # clean
cargo clippy --all-targets --all-features -- -D warnings  # clean
cargo test              # 7 passed (3 abi + 4 router)

Adapter PRs

Each existing sandbox-* PR has been updated on its own branch with the
namespace rename:

Handler logic, tests, and S-code mapping unchanged in every adapter.

Caller migration

Before:

iii.trigger("sandbox::e2b::create", json!({ "image": "base" })).await?;

After:

iii.trigger("sandbox::create", json!({ "provider": "e2b", "image": "base" })).await?;

Direct invocation of sandbox::provider::e2b::* remains supported and
stable for callers that want to bypass the router.

Open question

provider="local" is reserved for the engine-shipped iii-sandbox
microVM. Until that registers an iii-function surface, the router
returns S600 for calls without an explicit provider. Recommend
setting default_provider in config.yaml to an installed adapter
(e.g. default_provider: e2b) for now.

Summary by CodeRabbit

  • New Features

    • Sandbox Worker Family now available with core operations: create, exec, stop, and list sandboxes.
    • Optional capabilities including snapshots, branching, port exposure, and filesystem operations.
    • Provider-based routing with configurable default provider.
  • Documentation

    • Added Sandbox ABI specification documenting canonical function shapes, stable error codes, and evolution policy.

Review Change Stack

rohitg00 added 3 commits May 11, 2026 19:33
Mirrors provider-router shape: sandbox owns the caller-facing
sandbox::{create,exec,stop,list,snapshot,expose_port,branch,fs::{read,write}}
ids and dispatches by the payload's provider field to
sandbox::provider::<name>::<leaf>. Default provider = local
(engine-shipped iii-sandbox).

Adapters (sandbox-e2b/-daytona/-vercel/-morph/-modal/-cloudflare) must
register only sandbox::provider::<name>::* and stop shadowing the bare
sandbox::* namespace. Refactor of the existing 6 PRs follows.

Adds sandbox-abi crate with shared CreateRequest/ExecRequest/... shapes,
the S-code error space (S100/S200/S300/S400/S404/S500/S501/S502/S503,
plus new S600 unknown_provider), and an http-status -> S-code helper for
REST adapters.
Caller surface (sandbox::*) is now owned by the sandbox router worker;
adapters register only sandbox::provider::<name>::*. Adds S600 unknown
provider, the router-owned caller table, and notes that provider=local
errors until the engine-shipped iii-sandbox registers its iii-function
surface.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 11, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a complete sandbox worker family system: a shared ABI library (sandbox-abi) defining canonical request/response types and error codes, a dispatch router (sandbox) that forwards operations to provider-specific adapters based on a provider field, and comprehensive documentation of the architecture, call flows, and known cross-provider variations.

Changes

Sandbox Worker Family

Layer / File(s) Summary
ABI Contract & Data Model
sandbox-CONTEXT.md, sandbox-abi/src/lib.rs
Canonical ABI defines request/response payloads for create/exec/stop/list, Capability enum, SCode stable error codes, AbiError with code mapping, and helper to translate HTTP status to canonical errors.
ABI Package Setup
sandbox-abi/Cargo.toml, sandbox-abi/README.md
Library crate with serde/serde_json/thiserror dependencies; README documents shared types consumed by router and adapters, namespace constraints, and provider dispatch defaults.
Router Configuration & Startup
sandbox/src/config.rs, sandbox/src/main.rs
Config struct with YAML deserialization and default_provider field; CLI parser for config path and engine URL; Tokio async entrypoint initializing tracing, loading config with fallback, registering worker and handlers, and awaiting Ctrl+C.
Router Dispatch & Provider Resolution
sandbox/src/register.rs
Ctx wrapper holding Arc<Config>; resolve_provider extracts/defaults provider from request; async forward strips provider field, builds provider-scoped function id, triggers adapter via iii.trigger with timeout, maps errors to AbiError; register_all wires all sandbox function IDs to dispatchers; unit tests cover provider resolution and fallback scenarios.
Router Package & Deployment
sandbox/Cargo.toml, sandbox/src/lib.rs, sandbox/iii.worker.yaml
Library and binary targets; dependency on sandbox-abi and iii-sdk; iii.worker.yaml defines dispatch-by-provider routing and default_provider=local; public module exports config and register.
Documentation & Guides
sandbox/README.md
Dispatch overview, installation instructions, Rust quickstart calling sandbox::create, error code documentation, and link to full ABI reference.

🎯 3 (Moderate) | ⏱️ ~20 minutes

A sandbox spawned fresh each day,
Routes requests through adapters at play.
Provider-dispatched calls take flight,
While stable errors keep all things right. 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.46% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately summarizes the main change: introducing a sandbox router that dispatches calls based on a provider field.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/sandbox-router

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (1)
sandbox/README.md (1)

63-68: ⚡ Quick win

Clarify relationship between default configuration and adapter installation.

This section documents how to set default_provider but doesn't mention that:

  1. The shipped default (local) won't work until iii-sandbox registers (per sandbox-CONTEXT.md line 128)
  2. Users should configure this after installing at least one adapter

Consider adding a note to make this clearer for new users.

Suggested clarification
 Omit `provider` to use `default_provider`. Set `default_provider` per
-deployment in `config.yaml`:
+deployment in `config.yaml` (requires an installed adapter):

 ```yaml
+# Set this to an installed adapter (e2b, daytona, modal, etc.)
+# Note: The default "local" provider requires iii-sandbox registration
 default_provider: e2b

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @sandbox/README.md around lines 63 - 68, Update the README text that explains
default_provider to clarify that the value must correspond to an installed
adapter (e2b, daytona, modal, etc.) and that the shipped default "local"
provider will not work until iii-sandbox is registered (see sandbox-CONTEXT.md)
— also recommend users set default_provider after installing at least one
adapter; reference the default_provider and provider terms so readers know
to replace or omit provider to use default_provider.


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @sandbox-abi/src/lib.rs:

  • Around line 185-194: map_http_status currently classifies all non-key 4xx
    codes as ProviderUnavailable; add explicit match arms in map_http_status for
    client-side errors (e.g., 400, 404, 409, 410, 422 and similar) that return
    AbiError::BadInput containing the status and body instead of falling through to
    the _ arm, and leave 500..=599 and truly unexpected codes mapped to
    AbiError::ProviderUnavailable; update the match in the map_http_status function
    to include these new arms and ensure the AbiError enum has a BadInput variant
    that carries status+body so callers / S-code mapping can surface them correctly.
  • Around line 163-183: AbiError::BadInput is incorrectly mapped to
    SCode::ProviderUnavailable in AbiError::code(), so add a new SCode variant for
    bad input (e.g. SCode::BadInput with the chosen S700 token) and map
    AbiError::BadInput => SCode::BadInput inside the code() match; also update the
    error formatting for the BadInput variant so it emits the standard “[Sxxx]”
    prefix like other variants (adjust the #[error(...)] or the error-display
    routine) to keep the wire-format stable.

In @sandbox/Cargo.toml:

  • Line 24: Replace the deprecated serde_yaml dependency in Cargo.toml with the
    maintained fork by declaring serde_yaml = { package = "yaml_serde", version =
    "0.10" } and then run cargo update; this preserves the existing crate name so
    calls like serde_yaml::from_str in config.rs do not need code changes but will
    use the maintained yaml_serde implementation.

In @sandbox/iii.worker.yaml:

  • Around line 8-9: The config sets default_provider: local which causes
    immediate S600 "unknown provider" errors until an adapter is registered; remove
    the default_provider entry (or set it to an explicit empty placeholder like
    default_provider: "") from the YAML so callers must opt-in after installing an
    adapter, and update any README or description to document that an adapter must
    be installed before setting default_provider; look for the default_provider key
    in the sandbox/iii.worker.yaml and change or remove it accordingly.

In @sandbox/src/main.rs:

  • Around line 13-20: The CLI flag is currently exposed as --url on struct Cli
    (field url) but docs/PR expect --engine-url; change the CLI to use --engine-url
    by renaming the arg and field (e.g., field engine_url) on struct Cli and update
    all call sites that reference cli.url (notably register_worker(&cli.url, ...))
    to use the new field (register_worker(&cli.engine_url, ...)); ensure the
    #[arg(long, default_value = DEFAULT_ENGINE_URL)] attribute is updated to
    #[arg(long = "engine-url", default_value = DEFAULT_ENGINE_URL)] or equivalent so
    the flag and defaults match the documentation.
  • Around line 38-43: The register flow races: call register_worker(&cli.url,
    InitOptions::default()) and then register_all(&iii, ctx) without waiting allows
    the engine to route sandbox::* invocations before handlers are propagated; fix
    by synchronizing initialization — either (a) add a 500ms
    tokio::time::sleep(Duration::from_millis(500)).await after register_worker(...)
    to match the repo test pattern, or (b) make register_all(...) async and await it
    (preferred) so registration completes before proceeding; adjust calls around
    register_worker, register_all, and any use of iii (e.g., iii.shutdown_async())
    accordingly to ensure handlers are registered before the router starts.

In @sandbox/src/register.rs:

  • Around line 60-69: Replace fragile substring checks on e.to_string() in the
    .map_err closure with a structured match on the IIIError::Remote variant:
    inspect the error as IIIError::Remote { code, .. } and compare code to the exact
    SDK code used for a missing adapter (use the same literal used in
    turn-orchestrator, e.g. "function_not_found" equivalent for adapters) and only
    then return to_iii(AbiError::UnknownProvider(...)); otherwise return
    IIIError::Handler(...) as before; update the closure that currently builds msg
    and checks contains(...) to use matches! or a match arm on IIIError::Remote {
    code, .. } and keep the same messages involving AbiError::UnknownProvider,
    SCode::ProviderUnavailable and the target/provider text.

Nitpick comments:
In @sandbox/README.md:

  • Around line 63-68: Update the README text that explains default_provider to
    clarify that the value must correspond to an installed adapter (e2b, daytona,
    modal, etc.) and that the shipped default "local" provider will not work until
    iii-sandbox is registered (see sandbox-CONTEXT.md) — also recommend users set
    default_provider after installing at least one adapter; reference the
    default_provider and provider terms so readers know to replace or omit
    provider to use default_provider.

</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended)
- [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `c4b3bf2f-06f8-4288-b4bb-cfb1e5e93437`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 213a294f3fdbb06b9d43adbb2f52298270173319 and b3f4553de66c6a06704aa36e524631ced25880aa.

</details>

<details>
<summary>⛔ Files ignored due to path filters (2)</summary>

* `sandbox-abi/Cargo.lock` is excluded by `!**/*.lock`
* `sandbox/Cargo.lock` is excluded by `!**/*.lock`

</details>

<details>
<summary>📒 Files selected for processing (11)</summary>

* `sandbox-CONTEXT.md`
* `sandbox-abi/Cargo.toml`
* `sandbox-abi/README.md`
* `sandbox-abi/src/lib.rs`
* `sandbox/Cargo.toml`
* `sandbox/README.md`
* `sandbox/iii.worker.yaml`
* `sandbox/src/config.rs`
* `sandbox/src/lib.rs`
* `sandbox/src/main.rs`
* `sandbox/src/register.rs`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread sandbox-abi/src/lib.rs
Comment on lines +163 to +183
#[error("bad input: {0}")]
BadInput(String),
}

impl AbiError {
pub fn code(&self) -> SCode {
match self {
Self::ImageNotAllowed(_) => SCode::ImageNotAllowed,
Self::ResourceOversize(_) => SCode::ResourceOversize,
Self::HostCannotBoot(_) => SCode::HostCannotBoot,
Self::ConcurrencyCapReached(_) => SCode::ConcurrencyCapReached,
Self::CapabilityUnsupported(_) => SCode::CapabilityUnsupported,
Self::RateLimited => SCode::RateLimited,
Self::QuotaExhausted => SCode::QuotaExhausted,
Self::ProviderUnavailable(_) => SCode::ProviderUnavailable,
Self::AuthInvalid => SCode::AuthInvalid,
Self::UnknownProvider(_) => SCode::UnknownProvider,
Self::BadInput(_) => SCode::ProviderUnavailable,
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

AbiError::BadInput.code() returns the wrong SCode (provider unavailable instead of a client error).

BadInput is a caller/input-validation error, but code() maps it to SCode::ProviderUnavailable (S502). Callers and retry layers that pattern-match on the leading [Sxxx] token will treat malformed-input failures as a transient provider outage and may retry with backoff, even though the request will never succeed. This also conflicts with the docstring on SCode describing it as the stable wire contract.

Either (a) add a dedicated SCode variant for bad input (e.g. BadInput => "S700") and map AbiError::BadInput to it, or (b) drop the BadInput variant from this typed surface if callers should construct their own IIIError::BadInput directly. Note also that the #[error("bad input: {0}")] message for BadInput is the only variant that doesn't emit a [Sxxx] prefix, which already hints at this gap.

🛠️ Suggested fix — add a dedicated S-code
 pub enum SCode {
     ImageNotAllowed,
     ...
     UnknownProvider,
+    BadInput,
 }

 impl SCode {
     pub fn as_str(self) -> &'static str {
         match self {
             ...
             Self::UnknownProvider => "S600",
+            Self::BadInput => "S700",
         }
     }
 }
 pub enum AbiError {
     ...
-    #[error("bad input: {0}")]
+    #[error("[{}] bad input: {0}", SCode::BadInput.as_str())]
     BadInput(String),
 }
     pub fn code(&self) -> SCode {
         match self {
             ...
-            Self::BadInput(_) => SCode::ProviderUnavailable,
+            Self::BadInput(_) => SCode::BadInput,
         }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox-abi/src/lib.rs` around lines 163 - 183, AbiError::BadInput is
incorrectly mapped to SCode::ProviderUnavailable in AbiError::code(), so add a
new SCode variant for bad input (e.g. SCode::BadInput with the chosen S700
token) and map AbiError::BadInput => SCode::BadInput inside the code() match;
also update the error formatting for the BadInput variant so it emits the
standard “[Sxxx]” prefix like other variants (adjust the #[error(...)] or the
error-display routine) to keep the wire-format stable.

Comment thread sandbox-abi/src/lib.rs
Comment on lines +185 to +194
/// Map an HTTP status from a provider REST API onto the canonical S-code.
pub fn map_http_status(status: u16, body: &str) -> AbiError {
match status {
401 | 403 => AbiError::AuthInvalid,
402 => AbiError::QuotaExhausted,
429 => AbiError::RateLimited,
500..=599 => AbiError::ProviderUnavailable(format!("status {status}: {body}")),
_ => AbiError::ProviderUnavailable(format!("unexpected status {status}: {body}")),
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

map_http_status collapses every non-special 4xx into ProviderUnavailable.

Statuses like 400 (bad request), 404 (not found), 409 (conflict), and 422 fall into the _ arm and surface as S502 provider unavailable. That misclassifies caller-side errors as transient infra failures, encouraging retries that will never succeed and obscuring the real cause in logs.

Consider routing 400/404/409/410/422 (and similar) through a dedicated client-error path (e.g. an AbiError::BadInput carrying status+body, once BadInput has its own SCode per the comment above), and keeping the generic fallback only for truly unexpected codes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox-abi/src/lib.rs` around lines 185 - 194, map_http_status currently
classifies all non-key 4xx codes as ProviderUnavailable; add explicit match arms
in map_http_status for client-side errors (e.g., 400, 404, 409, 410, 422 and
similar) that return AbiError::BadInput containing the status and body instead
of falling through to the `_` arm, and leave 500..=599 and truly unexpected
codes mapped to AbiError::ProviderUnavailable; update the match in the
map_http_status function to include these new arms and ensure the AbiError enum
has a BadInput variant that carries status+body so callers / S-code mapping can
surface them correctly.

Comment thread sandbox/Cargo.toml
iii-sdk = "=0.11.6"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde_yaml = "0.9"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Is the serde_yaml crate (version 0.9) still maintained in 2026, and what is the currently recommended replacement (e.g. serde_yml, serde_yaml_ng)?

💡 Result:

serde_yaml (including v0.9.x) is no longer maintained; in 2024 it was published with a +deprecated tag and its crate/docs explicitly state the project is no longer maintained [1][2]. As of 2026-05-11, the latest published 0.9 series artifact is 0.9.34+deprecated, last updated 2024-03-25 [3]. Recommended replacement (current, actively maintained fork): yaml_serde (crate name yaml_serde, version line 0.10.x). The official YAML organization’s actively maintained fork states that the original serde_yaml crate is no longer maintained, and that this fork continues development with full compatibility; it also provides a Cargo renaming migration example (serde_yaml = { package = "yaml_serde", version = "0.10" }) [4]. A current yaml_serde release exists (e.g., 0.10.4 updated 2026-03-11) [5]. Other possible alternatives you mentioned exist as forks/related crates, but the most clearly “recommended” actively maintained replacement (with explicit guidance to migrate from serde_yaml) is yaml_serde: - serde_yml (crate serde_yml) has releases and a homepage, but it is not presented as an official actively maintained fork of serde_yaml in the sources above [6]. - serde_yaml_ng also exists as a fork/alternative (e.g., latest 0.10.0 updated 2024-05-26) [7].

Citations:


serde_yaml 0.9 is unmaintained — switch to the officially maintained fork.

The upstream serde_yaml crate was archived and marked deprecated by its author in early 2024 (flagged by cargo audit as RUSTSEC-2024-0320). The official YAML organization's actively maintained fork is yaml_serde, which provides full compatibility. Since this is a brand-new crate and YAML use here is trivial (one from_str call in config.rs), migration is straightforward via Cargo package renaming: serde_yaml = { package = "yaml_serde", version = "0.10" }.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox/Cargo.toml` at line 24, Replace the deprecated serde_yaml dependency
in Cargo.toml with the maintained fork by declaring serde_yaml = { package =
"yaml_serde", version = "0.10" } and then run cargo update; this preserves the
existing crate name so calls like serde_yaml::from_str in config.rs do not need
code changes but will use the maintained yaml_serde implementation.

Comment thread sandbox/iii.worker.yaml
Comment on lines +8 to +9
config:
default_provider: local
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Default provider will fail until an adapter is installed.

The default_provider: local will return S600 errors until iii-sandbox registers its surface (per line 128 of sandbox-CONTEXT.md). This creates a poor out-of-box experience where callers get "unknown provider" errors immediately after installation.

Consider one of these approaches:

  1. Recommended: Omit default_provider from the default config and require explicit configuration after adapter installation
  2. Add a prominent warning in the description or README that users must install an adapter first
  3. Use a placeholder value (e.g., default_provider: "") that makes the requirement more obvious
Suggested config adjustment
 config:
-  default_provider: local
+  # default_provider must be set after installing an adapter (e2b, daytona, etc.)
+  # Example: default_provider: e2b
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
config:
default_provider: local
config:
# default_provider must be set after installing an adapter (e2b, daytona, etc.)
# Example: default_provider: e2b
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox/iii.worker.yaml` around lines 8 - 9, The config sets
default_provider: local which causes immediate S600 "unknown provider" errors
until an adapter is registered; remove the default_provider entry (or set it to
an explicit empty placeholder like default_provider: "") from the YAML so
callers must opt-in after installing an adapter, and update any README or
description to document that an adapter must be installed before setting
default_provider; look for the default_provider key in the
sandbox/iii.worker.yaml and change or remove it accordingly.

Comment thread sandbox/src/main.rs
Comment on lines +13 to +20
#[derive(Parser, Debug)]
#[command(name = "iii-sandbox")]
struct Cli {
#[arg(long, default_value = "./config.yaml")]
config: PathBuf,
#[arg(long, default_value = DEFAULT_ENGINE_URL)]
url: String,
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

CLI flag is --url, but PR description / docs reference --engine-url.

The PR objectives and the AI summary both describe a --engine-url flag, while the actual Cli exposes --url. Either the flag should be renamed to match the documented interface, or the README/PR description should be updated. Picking a slightly more specific name also avoids collisions if more URL-style flags are added later.

🛠️ Optional rename
-    #[arg(long, default_value = DEFAULT_ENGINE_URL)]
-    url: String,
+    #[arg(long = "engine-url", default_value = DEFAULT_ENGINE_URL)]
+    engine_url: String,

…and update the register_worker(&cli.url, ...) call site accordingly.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[derive(Parser, Debug)]
#[command(name = "iii-sandbox")]
struct Cli {
#[arg(long, default_value = "./config.yaml")]
config: PathBuf,
#[arg(long, default_value = DEFAULT_ENGINE_URL)]
url: String,
}
#[derive(Parser, Debug)]
#[command(name = "iii-sandbox")]
struct Cli {
#[arg(long, default_value = "./config.yaml")]
config: PathBuf,
#[arg(long = "engine-url", default_value = DEFAULT_ENGINE_URL)]
engine_url: String,
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox/src/main.rs` around lines 13 - 20, The CLI flag is currently exposed
as --url on struct Cli (field url) but docs/PR expect --engine-url; change the
CLI to use --engine-url by renaming the arg and field (e.g., field engine_url)
on struct Cli and update all call sites that reference cli.url (notably
register_worker(&cli.url, ...)) to use the new field
(register_worker(&cli.engine_url, ...)); ensure the #[arg(long, default_value =
DEFAULT_ENGINE_URL)] attribute is updated to #[arg(long = "engine-url",
default_value = DEFAULT_ENGINE_URL)] or equivalent so the flag and defaults
match the documentation.

Comment thread sandbox/src/main.rs
Comment on lines +38 to +43
let iii = register_worker(&cli.url, InitOptions::default());
register_all(&iii, ctx);

tracing::info!("sandbox router registered, awaiting invocations");
tokio::signal::ctrl_c().await?;
iii.shutdown_async().await;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look at how other workers in this repo order register_worker vs handler registration,
# and whether iii-sdk exposes a start/ready primitive.
rg -nP --type=rust -C5 '\bregister_worker\s*\(' -g '!target/**'
rg -nP --type=rust -C2 '\b(InitOptions|register_function_with|shutdown_async)\b' -g '!target/**' | head -100

Repository: iii-hq/workers

Length of output: 34835


🏁 Script executed:

# Find the register_all function implementation in sandbox
rg -nP --type=rust '\bregister_all\b' sandbox/ -A 10

Repository: iii-hq/workers

Length of output: 1741


🏁 Script executed:

# Find iii-sdk source - check if it's in dependencies or local workspace
fd -t f 'Cargo.toml' | head -5 | xargs rg -l 'iii.sdk|iii_sdk'

Repository: iii-hq/workers

Length of output: 145


🏁 Script executed:

# Check sandbox/src/main.rs for imports to understand where register_worker and register_all come from
head -50 sandbox/src/main.rs

Repository: iii-hq/workers

Length of output: 1313


🏁 Script executed:

# Get the full register_all function
cat -n sandbox/src/register.rs | head -200

Repository: iii-hq/workers

Length of output: 6135


🏁 Script executed:

# Search for register_function_with signature to see if it's async
rg -nP --type=rust '\bregister_function_with\b' -A 3 | head -50

Repository: iii-hq/workers

Length of output: 1376


🏁 Script executed:

# Check if iii_sdk is a local path dependency or external crate
rg -nP '\[dependencies\]' Cargo.toml -A 20 | head -60

Repository: iii-hq/workers

Length of output: 109


🏁 Script executed:

# Find root Cargo.toml or check workspace structure
fd -t f 'Cargo.toml' | grep -E '(^Cargo.toml|workspace)' | head -3
cat Cargo.toml | grep -A 50 '\[workspace\]' | head -60

Repository: iii-hq/workers

Length of output: 101


🏁 Script executed:

# Search for iii_sdk in dependencies across the workspace
rg -nP 'iii.sdk|iii_sdk' Cargo.toml | head -20

Repository: iii-hq/workers

Length of output: 109


🏁 Script executed:

# Check if register_function_with returns anything or is fire-and-forget
rg -nP --type=rust '\.register_function_with\(' -B 2 -A 5 | head -80

Repository: iii-hq/workers

Length of output: 1221


🏁 Script executed:

# Find where iii_sdk comes from - check in subdirectories or external references
find . -name "*.rs" -type f | xargs grep -l "pub fn register_worker" | head -5

Repository: iii-hq/workers

Length of output: 40


🏁 Script executed:

# Look for any workspace Cargo.toml files
fd 'Cargo.toml' -type f | xargs rg -l 'register_worker\|iii.sdk' | head -10

Repository: iii-hq/workers

Length of output: 229


🏁 Script executed:

# Check image-resize which uses register_function_with and see what _fn_ref is (discarded)
cat -n image-resize/src/main.rs | sed -n '60,85p'

Repository: iii-hq/workers

Length of output: 1607


🏁 Script executed:

# Search for documentation or comments about register_function_with timing
rg -nP 'register_function_with|register_worker' -B 3 -A 1 --type=rust | grep -i -E 'async|await|buffer|queue|immediate|sync|registr' | head -40

Repository: iii-hq/workers

Length of output: 1941


🏁 Script executed:

# Check test patterns - do they consistently sleep after register_worker?
rg -nP 'register_worker' -A 5 --type=rust -g '*/tests/*' | grep -E '(sleep|Duration|delay|register_worker)' | head -50

Repository: iii-hq/workers

Length of output: 2720


🏁 Script executed:

# Look at if there are any comments in main.rs files about registration timing
rg -nP --type=rust 'register_worker|register_all|register_function' -B 5 sandbox/src/main.rs turn-orchestrator/src/main.rs | grep -E '(//|/\*|register)' | head -40

Repository: iii-hq/workers

Length of output: 523


🏁 Script executed:

# Check if iii_sdk is in this workspace or external - look for it in any Cargo.toml
find . -name 'Cargo.toml' -exec grep -l 'iii.sdk\|iii_sdk' {} \; 2>/dev/null

Repository: iii-hq/workers

Length of output: 897


🏁 Script executed:

# Look at the actual timing - check if test sleeps have comments explaining them
rg -nP 'sleep.*500|sleep.*1500' -B 10 --type=rust | grep -E '(//|sleep|Duration|register_worker)' | head -60

Repository: iii-hq/workers

Length of output: 3461


🏁 Script executed:

# Check if there's any async aspect to register_function_with by looking at return type usage
rg -nP 'register_function_with' -A 8 --type=rust | grep -E '(await|\.await|async|fn_ref|_fn)' | head -40

Repository: iii-hq/workers

Length of output: 132


🏁 Script executed:

# Look at what sandbox does differently - compare with other workers that do work immediately
rg -nP --type=rust 'register_worker.*\(|register_all|register_function' -A 3 oauth-openai-codex/src/main.rs shell/src/main.rs

Repository: iii-hq/workers

Length of output: 4595


🏁 Script executed:

# Check oauth-openai-codex fully to see the async pattern
cat -n oauth-openai-codex/src/main.rs

Repository: iii-hq/workers

Length of output: 916


🏁 Script executed:

# Check if sandbox should be using an async pattern - look for register_with_iii or similar
fd -t f '*.rs' -path '*/sandbox/*' -exec grep -l 'register_with_iii\|await.*register' {} \;

Repository: iii-hq/workers

Length of output: 227


🏁 Script executed:

# Look at what happens after register_all in sandbox - does it await anything?
cat -n sandbox/src/main.rs | sed -n '35,45p'

Repository: iii-hq/workers

Length of output: 430


🏁 Script executed:

# Check the III type signature for register_function_with to understand if it's truly synchronous
rg -nP 'pub.*fn register_function_with' -A 3 --type=rust

Repository: iii-hq/workers

Length of output: 40


🏁 Script executed:

# Find the return type and documentation for register_function_with
rg -nP 'register_function_with.*\{|pub.*register_function_with' -A 5 --type=rust | head -40

Repository: iii-hq/workers

Length of output: 40


🏁 Script executed:

# Look for any III trait/impl documentation or comments about registration timing
rg -nP 'impl III|trait III' -A 20 --type=rust | head -80

Repository: iii-hq/workers

Length of output: 5264


🏁 Script executed:

# Check if there's any timeout or wait logic in register_worker or connection setup
rg -nP 'register_worker|InitOptions' -B 2 -A 8 --type=rust sandbox/src/main.rs

Repository: iii-hq/workers

Length of output: 693


🏁 Script executed:

# Compare test setup - check if tests that don't sleep ever fail or have issues
rg -nP 'register_worker' -A 6 --type=rust -g '*/tests/integration*' | grep -v 'sleep\|Duration' | head -60

Repository: iii-hq/workers

Length of output: 3862


🏁 Script executed:

# Check if there are any comments explaining the sleep in test setup
rg -nP 'sleep.*500' -B 5 --type=rust -g '*/tests/*' | grep -E '(//|/\*|sleep|Duration)' | head -40

Repository: iii-hq/workers

Length of output: 634


🏁 Script executed:

# Look for any open issues or TODOs about registration timing
rg -nP 'TODO|FIXME|XXX|race|registra' --type=rust sandbox/ | head -30

Repository: iii-hq/workers

Length of output: 40


🏁 Script executed:

# Check how shell worker handles this (it doesn't seem to sleep and uses register_function)
cat -n shell/src/main.rs | sed -n '75,160p' | head -50

Repository: iii-hq/workers

Length of output: 2390


🏁 Script executed:

# Look at turn-orchestrator which also sleeps - check surrounding context
cat -n turn-orchestrator/src/main.rs | sed -n '55,90p'

Repository: iii-hq/workers

Length of output: 1520


🏁 Script executed:

# Check what register_with_iii does (it's async in turn-orchestrator)
rg -nP 'pub.*fn register_with_iii|pub async fn register_with_iii' -A 10 --type=rust | head -50

Repository: iii-hq/workers

Length of output: 3120


🏁 Script executed:

# Check sandbox's register_all more carefully - does it communicate with engine?
rg -nP 'register_function_with' -B 3 -A 8 sandbox/src/register.rs | head -80

Repository: iii-hq/workers

Length of output: 584


🏁 Script executed:

# Look for what register_function_with does - check if it's just local or sends to engine
rg -nP 'fn register_function_with|pub fn register_function_with' -A 15 --type=rust | head -60

Repository: iii-hq/workers

Length of output: 40


🏁 Script executed:

# Critical: Check if the III type is async - does register_function_with actually communicate with engine?
rg -nP 'pub fn register_function|register_function_with' -B 2 -A 10 --type=rust | grep -E '(async|await|pub fn|ws|send|channel)' | head -50

Repository: iii-hq/workers

Length of output: 460


Add initialization synchronization between register_worker and handler registration.

The test code across this repo (storage, session-tree, session-inbox, harness) consistently adds a 500ms sleep after register_worker() before triggering functions. In contrast, sandbox calls synchronous register_all() immediately without waiting. If the engine begins routing sandbox::* invocations before all handlers propagate from register_all(), early calls will fail with "function not found" errors. Either add a sleep delay (matching the test pattern), make register_all() async and await it (like turn-orchestrator's registration), or verify from iii-sdk's contract that registrations are queued until an explicit start point.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox/src/main.rs` around lines 38 - 43, The register flow races: call
register_worker(&cli.url, InitOptions::default()) and then register_all(&iii,
ctx) without waiting allows the engine to route sandbox::* invocations before
handlers are propagated; fix by synchronizing initialization — either (a) add a
500ms tokio::time::sleep(Duration::from_millis(500)).await after
register_worker(...) to match the repo test pattern, or (b) make
register_all(...) async and await it (preferred) so registration completes
before proceeding; adjust calls around register_worker, register_all, and any
use of iii (e.g., iii.shutdown_async()) accordingly to ensure handlers are
registered before the router starts.

Comment thread sandbox/src/register.rs
Comment on lines +60 to +69
.map_err(|e| {
let msg = e.to_string();
if msg.contains("not found") || msg.contains("not registered") {
to_iii(AbiError::UnknownProvider(format!(
"{provider}: no adapter registered for {target}; run `iii worker add sandbox-{provider}`"
)))
} else {
IIIError::Handler(format!("[{}] forward to {target} failed: {msg}", SCode::ProviderUnavailable.as_str()))
}
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check iii-sdk's IIIError surface for a typed "function not found" variant
# that could replace the substring match.
fd -t f 'Cargo.lock' --max-depth 2
rg -nP --type=rust -C3 '\b(IIIError|TriggerError|FunctionNotFound|NotRegistered|UnknownFunction)\b' -g '!target/**'
rg -nP --type=rust -C2 'fn\s+trigger\s*\(' -g '!target/**'

Repository: iii-hq/workers

Length of output: 50371


Substring matching on error strings is fragile and already addressed elsewhere in the codebase—use typed error variants instead.

The IIIError enum exposes a Remote { code, message, stacktrace } variant with a structured code field. The turn-orchestrator module already fixed this exact pattern: it switched from substring matching on Display to matches!(err, IIIError::Remote { code, .. } if code == "function_not_found") with focused tests to prevent regression (see turn-orchestrator/src/agent_call.rs:80-87 and test is_function_not_found_ignores_substring_in_other_variants).

Apply the same approach here: determine the exact code value iii-sdk emits for a missing adapter, then match on it instead of substring checks on msg. This avoids silent misclassification if future SDK versions rephrase or restructure error messages.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sandbox/src/register.rs` around lines 60 - 69, Replace fragile substring
checks on e.to_string() in the .map_err closure with a structured match on the
IIIError::Remote variant: inspect the error as IIIError::Remote { code, .. } and
compare code to the exact SDK code used for a missing adapter (use the same
literal used in turn-orchestrator, e.g. "function_not_found" equivalent for
adapters) and only then return to_iii(AbiError::UnknownProvider(...)); otherwise
return IIIError::Handler(...) as before; update the closure that currently
builds msg and checks contains(...) to use matches! or a match arm on
IIIError::Remote { code, .. } and keep the same messages involving
AbiError::UnknownProvider, SCode::ProviderUnavailable and the target/provider
text.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant