feat(auth): add OAuth authority worker#132
Conversation
📝 WalkthroughWalkthroughAdds a new OAuth/OIDC "auth" worker crate (binary + lib), configuration and validation, pluggable storage backends, full OAuth flows (DCR, token, refresh, introspect, revoke, JWKS rotation), skills/docs, registry entry, and comprehensive tests (unit, integration, BDD, and E2E). ChangesAuth Worker Addition
Sequence Diagram(s): sequenceDiagram
participant Client
participant AuthWorker
participant AuthStore
participant III as iii_sdk::III
Client->>AuthWorker: POST /register or /token
AuthWorker->>AuthStore: get/set client, keyset, tokens
AuthWorker->>AuthStore: rotate keyset / rotate refresh token
AuthWorker->>III: trigger state ops (IiiStateAuthStore)
AuthWorker-->>Client: token / jwks / introspect / revoke responses
🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
|
skill-check — worker0 verified, 26 skipped (no docs/).
Three for three. Nicely done. |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
auth/tests/skill.rs (1)
1-8: ⚡ Quick winConsider verifying all exposed functions in skill documentation.
The test checks for
auth::validate,auth::register, andauth::token, but the PR objectives list 8 exposed functions total (auth::validate,auth::server_metadata,auth::resource_metadata,auth::register,auth::jwks,auth::jwks_rotate,auth::token). Consider adding assertions for the remaining functions to ensure complete documentation coverage.📋 Suggested additions to enhance documentation coverage
#[test] fn skill_starts_with_heading_and_lists_functions() { let body = include_str!("../skill.md"); assert!(body.starts_with("# auth\n")); assert!(body.contains("auth::validate")); + assert!(body.contains("auth::server_metadata")); + assert!(body.contains("auth::resource_metadata")); assert!(body.contains("auth::register")); + assert!(body.contains("auth::jwks")); + assert!(body.contains("auth::jwks_rotate")); assert!(body.contains("auth::token")); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/tests/skill.rs` around lines 1 - 8, The test function skill_starts_with_heading_and_lists_functions only asserts three exposed functions; update it to also assert that the skill documentation contains the remaining exposed symbols: auth::server_metadata, auth::resource_metadata, auth::jwks, and auth::jwks_rotate (in addition to the existing auth::validate, auth::register, auth::token checks). Locate the test by the function name skill_starts_with_heading_and_lists_functions and the include_str! usage for the skill document, and add assert!(body.contains("auth::<symbol>")) lines for each missing symbol to ensure complete coverage.auth/src/lib.rs (1)
402-415: 💤 Low valueReuse the already-resolved
client_nameinstead of re-parsing the payload.
client_namewas computed on lines 383-385 with the same fallback used here on line 405; reusing it keeps the response consistent if the resolution logic ever changes and avoids a second JSON lookup.♻️ Suggested change
let mut out = json!({ "client_id": client_id, "client_id_issued_at": now(), - "client_name": string_field(&body, "client_name").unwrap_or("iii client"), + "client_name": record.client_name, "redirect_uris": redirect_uris, "grant_types": grant_types, "scope": scopes.join(" "), "token_endpoint_auth_method": method, });This requires reading
record.client_namebeforestore.set_client(record).await?consumesrecord; either clone the string up front or move thejson!block above the store call.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/lib.rs` around lines 402 - 415, The response currently reparses client_name from the request body instead of reusing the already-resolved value; to fix, use the previously-computed client_name (or clone it) when building the json! response instead of calling string_field(&body, "client_name") again — either move the json! creation to before calling store.set_client(record).await? so you can move record.client_name into the response, or clone record.client_name into a local String before store.set_client(record).await? and use that local when populating the "client_name" field in out (and keep the rest of the json! construction as-is, including conditionally inserting client_secret).auth/src/store.rs (1)
210-217: ⚖️ Poor tradeoffRevocation entries are written-once and never cleaned up.
revoke()writesrevoked:{token_id}to theauth:tokensscope with no TTL or compaction path, and there is no purge inrotate_jwksor refresh-token expiry. Over time the revocation namespace will accumulate one entry per revoked jti, which is fine in-memory but unbounded in the iii-state backend.Consider attaching the refresh-token expiry (
expires_at) as a TTL hint on the revocation record so a janitor / sweep step can prune entries whose underlying token has already expired.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/store.rs` around lines 210 - 217, revoke() currently writes a permanent revocation entry via set_value(TOKENS_SCOPE, &format!("revoked:{token_id}"), json!({ "revoked": true })) which leaves unbounded entries; modify revoke (and any code path that creates revocation records) to include a TTL based on the underlying refresh-token expiry (expires_at) so the store can auto-expire the revocation entry—fetch or accept the token's expires_at, compute a remaining duration, and pass that TTL/expiry hint to set_value (or the store API you use) when writing the revoked:{token_id} record; ensure rotate_jwks/refresh-token logic does not rely on permanent entries and that the key name revoked:{token_id} and function revoke remain the places you change.auth/src/main.rs (1)
216-224: ⚡ Quick win
wait_for_connection_readysilently returns after ~1s without surfacing failure.The loop runs at most 100 × 10ms = 1s, then falls through and returns
()regardless of whether the connection ever becameConnected. Downstream calls (register_with_iii,register_triggers,register_skill_with_retry) will proceed against a half-open SDK and fail in confusing ways. Consider returningResult<()>and logging a warning (or aborting startup) on timeout, and/or making the bound configurable.♻️ Suggested change
-async fn wait_for_connection_ready(iii: &iii_sdk::III) { - for _ in 0..100 { +async fn wait_for_connection_ready(iii: &iii_sdk::III) { + for attempt in 0..100 { if iii.get_connection_state() == iii_sdk::IIIConnectionState::Connected { tokio::time::sleep(Duration::from_millis(50)).await; return; } tokio::time::sleep(Duration::from_millis(10)).await; + if attempt == 99 { + tracing::warn!( + state = ?iii.get_connection_state(), + "iii connection not ready after 1s; proceeding anyway" + ); + } } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/main.rs` around lines 216 - 224, wait_for_connection_ready currently polls iii.get_connection_state() up to 100 times then returns silently even if not Connected, letting register_with_iii, register_triggers and register_skill_with_retry run against a half-open SDK; change the async fn wait_for_connection_ready(iii: &iii_sdk::III) to return Result<(), Error> (or anyhow::Result) and on timeout return an Err with a clear message (and/or log a warning and abort startup), make the retry count/duration configurable (function params or const) and ensure callers handle the Result (propagate error or abort startup) so failures are surfaced instead of proceeding.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@auth/build.rs`:
- Around line 2-5: The build script currently falls back to "unknown" when
std::env::var("TARGET") fails, silently masking missing TARGET; change the
retrieval to fail-fast (e.g., use expect or unwrap with a clear message) so that
the build script panics if TARGET is not set, and then use that value in the
println! call (referencing the std::env::var("TARGET") call and the println!
macro in auth/build.rs).
In `@auth/src/config.rs`:
- Around line 120-124: After deserializing in load_config, validate AuthConfig
invariants before returning: ensure numeric TTLs (e.g., token_ttl, refresh_ttl,
or similarly named fields on AuthConfig) are positive/non-zero and that
default_scopes is a subset of supported_scopes (and that supported_scopes is
non-empty if required). If any check fails, return an Err with a clear message
instead of Ok(cfg). Implement these checks immediately after
serde_yaml::from_str in load_config so malformed configs are rejected at load
time.
In `@auth/src/lib.rs`:
- Around line 491-506: The Basic/Bearer prefix checks in basic_credentials and
bearer_token_from_payload are case-sensitive and will miss valid auth schemes;
change them to perform a case-insensitive scheme match (e.g., compare the
header's scheme lowercased to "basic " / "bearer " or use an ASCII
case-insensitive starts_with) and then extract the remainder token from the
original header string (preserving credential casing). Update basic_credentials
and bearer_token_from_payload to normalize only the scheme for matching and then
strip the matching prefix to return the decoded credentials or token.
- Around line 566-593: The refresh flow in issue_for_refresh_token currently
issues a new access token but never rotates or revokes the used refresh token;
update issue_for_refresh_token to revoke the presented refresh_token via the
AuthStore (call store.revoke(refresh_token).await?) and create/persist a new
refresh token record (use the store's API for creating refresh tokens — e.g.,
store.create_refresh_token or store.save_refresh_token_record) tied to the same
client/subject/scopes and TTL, then return the new refresh_token in the JSON
response along with the access_token, token_type, expires_in and scope; keep the
existing checks (get_refresh_token, is_revoked, get_client) and ensure you
reference the same key functions: issue_for_refresh_token, store.revoke, and the
store method you use to save/create the new refresh token.
- Around line 105-116: The Claims struct uses usize for JWT time claims which
can overflow on 32-bit targets; change the Claims fields exp, iat, and nbf from
usize to u64, and update any places that set these (where now() is cast
currently with `as usize`) to cast to u64 instead (e.g., the code paths
assigning to Claims.exp, Claims.iat, Claims.nbf); ensure serde/jsonwebtoken
compatibility by keeping the fields as u64 so serialization/deserialization
remains correct.
In `@auth/src/main.rs`:
- Around line 118-152: The registered HTTP routes in register_triggers are using
api_path values with a leading '/' which causes double-slash mismatches in the
iii-engine; update the http_routes array (used when building each
RegisterTriggerInput) to remove the leading slash characters (e.g., change
"/.well-known/oauth-authorization-server" to
".well-known/oauth-authorization-server", "/register" to "register", etc.) so
the api_path JSON in each RegisterTriggerInput has no leading slash; leave the
cron trigger unchanged and verify any emitted server_metadata_document still
constructs external URLs with a single '/' as needed.
---
Nitpick comments:
In `@auth/src/lib.rs`:
- Around line 402-415: The response currently reparses client_name from the
request body instead of reusing the already-resolved value; to fix, use the
previously-computed client_name (or clone it) when building the json! response
instead of calling string_field(&body, "client_name") again — either move the
json! creation to before calling store.set_client(record).await? so you can move
record.client_name into the response, or clone record.client_name into a local
String before store.set_client(record).await? and use that local when populating
the "client_name" field in out (and keep the rest of the json! construction
as-is, including conditionally inserting client_secret).
In `@auth/src/main.rs`:
- Around line 216-224: wait_for_connection_ready currently polls
iii.get_connection_state() up to 100 times then returns silently even if not
Connected, letting register_with_iii, register_triggers and
register_skill_with_retry run against a half-open SDK; change the async fn
wait_for_connection_ready(iii: &iii_sdk::III) to return Result<(), Error> (or
anyhow::Result) and on timeout return an Err with a clear message (and/or log a
warning and abort startup), make the retry count/duration configurable (function
params or const) and ensure callers handle the Result (propagate error or abort
startup) so failures are surfaced instead of proceeding.
In `@auth/src/store.rs`:
- Around line 210-217: revoke() currently writes a permanent revocation entry
via set_value(TOKENS_SCOPE, &format!("revoked:{token_id}"), json!({ "revoked":
true })) which leaves unbounded entries; modify revoke (and any code path that
creates revocation records) to include a TTL based on the underlying
refresh-token expiry (expires_at) so the store can auto-expire the revocation
entry—fetch or accept the token's expires_at, compute a remaining duration, and
pass that TTL/expiry hint to set_value (or the store API you use) when writing
the revoked:{token_id} record; ensure rotate_jwks/refresh-token logic does not
rely on permanent entries and that the key name revoked:{token_id} and function
revoke remain the places you change.
In `@auth/tests/skill.rs`:
- Around line 1-8: The test function
skill_starts_with_heading_and_lists_functions only asserts three exposed
functions; update it to also assert that the skill documentation contains the
remaining exposed symbols: auth::server_metadata, auth::resource_metadata,
auth::jwks, and auth::jwks_rotate (in addition to the existing auth::validate,
auth::register, auth::token checks). Locate the test by the function name
skill_starts_with_heading_and_lists_functions and the include_str! usage for the
skill document, and add assert!(body.contains("auth::<symbol>")) lines for each
missing symbol to ensure complete coverage.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 077f0a2e-8984-4a6b-b907-e62787104902
⛔ Files ignored due to path filters (1)
auth/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (24)
README.mdauth/Cargo.tomlauth/README.mdauth/build.rsauth/config.yamlauth/iii.worker.yamlauth/skill.mdauth/skills/jwks.mdauth/skills/jwks_rotate.mdauth/skills/register.mdauth/skills/resource_metadata.mdauth/skills/server_metadata.mdauth/skills/token.mdauth/skills/validate.mdauth/src/config.rsauth/src/io.rsauth/src/lib.rsauth/src/main.rsauth/src/manifest.rsauth/src/store.rsauth/tests/integration.rsauth/tests/manifest.rsauth/tests/skill.rsregistry/index.json
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (1)
auth/src/config.rs (1)
153-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMissing invariant:
rotation_overlap_secondsmust be strictly less thanrefresh_token_ttl_seconds.If
rotation_overlap_seconds >= refresh_token_ttl_seconds, every refresh token sits permanently within the rotation grace window, defeating rotation. Defaults satisfy this, but the validator should enforce it so misconfigured deployments fail at load.🛡️ Proposed fix
if cfg.refresh_token_ttl_seconds <= 0 { anyhow::bail!("refresh_token_ttl_seconds must be positive"); } if cfg.rotation_overlap_seconds <= 0 { anyhow::bail!("rotation_overlap_seconds must be positive"); } + if cfg.rotation_overlap_seconds >= cfg.refresh_token_ttl_seconds { + anyhow::bail!( + "rotation_overlap_seconds must be less than refresh_token_ttl_seconds" + ); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/config.rs` around lines 153 - 158, Add a validator that enforces rotation_overlap_seconds < refresh_token_ttl_seconds: in the same config validation block where cfg.refresh_token_ttl_seconds and cfg.rotation_overlap_seconds are checked, after verifying both are positive, bail with an error (e.g., anyhow::bail!) if cfg.rotation_overlap_seconds >= cfg.refresh_token_ttl_seconds and include a clear message such as "rotation_overlap_seconds must be less than refresh_token_ttl_seconds" so misconfigurations fail at startup; keep the existing positive checks intact.
🧹 Nitpick comments (2)
auth/src/config.rs (1)
176-183: ⚡ Quick winUse positive scheme allowlist for production URLs.
The current check only rejects the two known-insecure prefixes. An empty
engine_url/issuer, a typo, or any other scheme (e.g.http2://,file://) silently passes production validation.♻️ Proposed fix
if cfg.environment.eq_ignore_ascii_case("production") { - if cfg.engine_url.starts_with("ws://") { - anyhow::bail!("production auth config requires wss:// engine_url"); - } - if cfg.issuer.starts_with("http://") { - anyhow::bail!("production auth config requires https:// issuer"); - } + if !cfg.engine_url.starts_with("wss://") { + anyhow::bail!("production auth config requires wss:// engine_url"); + } + if !cfg.issuer.starts_with("https://") { + anyhow::bail!("production auth config requires https:// issuer"); + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/config.rs` around lines 176 - 183, In production validation (where cfg.environment.eq_ignore_ascii_case("production") is true) replace the current negative checks with positive allowlists: ensure cfg.engine_url is non-empty and begins with "wss://" and ensure cfg.issuer is non-empty and begins with "https://", otherwise bail with a clear error; update the checks around cfg.engine_url and cfg.issuer in config.rs to validate the required secure schemes rather than only rejecting "ws://" or "http://".auth/src/main.rs (1)
108-108: ⚡ Quick winBackground skill-registration task can race with
unregister_skillon early shutdown.
spawn_skill_registeris fire-and-forget — noJoinHandle, no cancellation signal. If aSIGTERMarrives whileregister_skill_with_retryis mid-backoff (up to 3 minutes per skill),wait_for_shutdownreturns andunregister_skillissues unregisters while the spawned task may still callskills::registerimmediately afterwards, potentially re-registering a skill that's just been removed (or, on the SUB_SKILLS loop, registering some skills after the main has been unregistered). Holding theJoinHandleand either awaiting it with a short timeout orabort()-ing it before unregister would make shutdown deterministic. Same applies to dropping the runtime mid-task, which silently cancels the in-flighttriggercall.♻️ Sketch: signal cancellation before unregister
-fn spawn_skill_register(iii: Arc<iii_sdk::III>, cfg: Arc<AuthConfig>) { - tokio::spawn(async move { +fn spawn_skill_register( + iii: Arc<iii_sdk::III>, + cfg: Arc<AuthConfig>, + cancel: tokio_util::sync::CancellationToken, +) -> tokio::task::JoinHandle<()> { + tokio::spawn(async move { + tokio::select! { + _ = cancel.cancelled() => return, + _ = async { register_skill_with_retry( &iii, iii_auth::SKILL_ID, iii_auth::SKILL_MD, cfg.skills_register_timeout_ms, ) .await; for (id, body) in iii_auth::SUB_SKILLS { register_skill_with_retry(&iii, id, body, cfg.skills_register_timeout_ms).await; } + } => {} + } }) }Then in
maincancel the token (and optionally.awaitthe handle with a short timeout) before callingunregister_skill.Also applies to: 181-194, 196-215
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/main.rs` at line 108, The fire-and-forget call to spawn_skill_register is racing with shutdown/unregister; change spawn_skill_register to return a JoinHandle (or accept a CancellationToken) and keep the handle(s) when you call spawn_skill_register (also adjust the other calls around lines referenced that use the same helper). On shutdown (in wait_for_shutdown) signal cancellation or call handle.abort() and then await the handle(s) with a short timeout before calling unregister_skill (or use the token to stop register_skill_with_retry early), ensuring register_skill_with_retry and any in-flight trigger calls cannot complete after unregister_skill runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@auth/skills/register.md`:
- Around line 31-47: Replace the realistic-looking credentials in the sample
JSON (the "authorization" header value and the returned "client_id" and
"client_secret" fields) with clearly fake placeholders (e.g., "Bearer
<ADMIN_TOKEN>", "<CLIENT_ID_PLACEHOLDER>", "<CLIENT_SECRET_PLACEHOLDER>") so
readers won't mistake them for real secrets; update the example request/response
JSON around the "headers.authorization", "client_name", "scope", and returned
"client_id"/"client_secret"/"client_name" fields to use these explicit
placeholder tokens.
In `@auth/skills/token.md`:
- Around line 23-33: The Basic-auth example's Authorization header encodes a
different client_secret than the nearby client_secret_post example; update the
"Client-secret-basic input" sample so its "authorization" header is the Base64
encoding of the same client_id:client_secret pair used in the client_secret_post
example (i.e., replace the current "Basic
UUd4R3E3bTZiY3FYa0ZZN1EwYzFwMkpmOnNlY3JldA==" value with the Base64 of the exact
client_id:client_secret from the client_secret_post sample) so both examples are
consistent for copy/paste testing.
In `@auth/src/lib.rs`:
- Around line 753-766: The current flow revokes the presented refresh token then
writes the replacement which can leave the client tokenless if the second write
fails; change this to use an atomic rotate on the AuthStore (or a transactional
operation) instead of separate calls: add a new method like
AuthStore::rotate_refresh_token(old_token_id: &str, new_record:
RefreshTokenRecord) -> Result<()> that performs revoke+insert atomically in the
storage backend (or wraps both ops in a DB transaction for iii-state), and
replace the pair of calls to store.revoke(refresh_token) and
store.set_refresh_token(...) in the code that issues tokens (the code around
ensure_keyset, issue_access_token, random_url_token and creation of
RefreshTokenRecord) with a single call to
store.rotate_refresh_token(refresh_token, new_refresh_record).
- Around line 431-458: ensure_keyset and rotate_jwks perform non-atomic
read-modify-write on the entire KeySet so concurrent callers can clobber each
other; fix by adding a store-level atomic primitive (e.g.,
AuthStore::create_keyset_if_absent or AuthStore::compare_and_swap_keyset /
set_keyset_cas) and use it from ensure_keyset and rotate_jwks: in ensure_keyset
attempt an atomic create-if-absent (returning the existing set if present), and
in rotate_jwks loop read the keyset, compute the rotated keyset, then call the
CAS primitive to persist only if unchanged (retry on CAS failure a few times or
return a conflict error); reference the existing methods get_keyset and
set_keyset and replace their direct use in ensure_keyset and rotate_jwks with
the new atomic create/compare-and-swap calls so no two writers can overwrite
each other.
In `@auth/src/main.rs`:
- Line 170: The use of Duration::from_mins in the auth crate (e.g., the
comparison started.elapsed() > Duration::from_mins(3) and the similar call at
the other site) requires Rust 1.91; either declare the MSRV by adding
rust-version = "1.91" to auth/Cargo.toml, or change the calls to a
backwards-compatible form such as Duration::from_secs(3 * 60) (and any
single-minute uses to Duration::from_secs(60)); update both occurrences (the one
comparing started.elapsed() and the other at the second reported site)
accordingly.
- Around line 42-44: Replace the hardcoded connection-ready constants
(CONNECTION_READY_ATTEMPTS, CONNECTION_READY_INTERVAL, CONNECTION_READY_SETTLE
in auth/src/main.rs) with values read from the worker config (add new config
fields like connection_ready_attempts: usize = 150 and
connection_ready_interval_ms: u64 = 200 with CONNECTION_READY_SETTLE kept as
reasonable default or made configurable), update any polling loops that use
those constants (also the similar code at the second occurrence around lines
217-229) to use the config values, and instrument the poll loop to emit
tracing::debug! on each failed attempt including attempt index and error/result
so timeouts are visible in logs. Ensure defaults match ~150 attempts × 200 ms
(~30s) and preserve the settle delay behavior.
In `@auth/src/store.rs`:
- Around line 16-25: The AuthStore trait lacks methods to remove expired or old
revoked tokens leading to unbounded growth; add cleanup APIs (e.g., async fn
delete_refresh_token(&self, token_id: &str) -> anyhow::Result<()> and async fn
purge_revoked_older_than(&self, cutoff: chrono::DateTime<Utc>) ->
anyhow::Result<()>) and/or an async fn cleanup_expired_tokens(&self) ->
anyhow::Result<()> to the trait, implement these in all concrete stores, and
update the worker that issues tokens to call these cleanup methods periodically
(or when issuing tokens) so refresh_tokens and revoked entries are pruned;
reference the trait AuthStore and methods get_refresh_token, set_refresh_token,
is_revoked, revoke when adding and wiring the new deletion/cleanup methods.
---
Duplicate comments:
In `@auth/src/config.rs`:
- Around line 153-158: Add a validator that enforces rotation_overlap_seconds <
refresh_token_ttl_seconds: in the same config validation block where
cfg.refresh_token_ttl_seconds and cfg.rotation_overlap_seconds are checked,
after verifying both are positive, bail with an error (e.g., anyhow::bail!) if
cfg.rotation_overlap_seconds >= cfg.refresh_token_ttl_seconds and include a
clear message such as "rotation_overlap_seconds must be less than
refresh_token_ttl_seconds" so misconfigurations fail at startup; keep the
existing positive checks intact.
---
Nitpick comments:
In `@auth/src/config.rs`:
- Around line 176-183: In production validation (where
cfg.environment.eq_ignore_ascii_case("production") is true) replace the current
negative checks with positive allowlists: ensure cfg.engine_url is non-empty and
begins with "wss://" and ensure cfg.issuer is non-empty and begins with
"https://", otherwise bail with a clear error; update the checks around
cfg.engine_url and cfg.issuer in config.rs to validate the required secure
schemes rather than only rejecting "ws://" or "http://".
In `@auth/src/main.rs`:
- Line 108: The fire-and-forget call to spawn_skill_register is racing with
shutdown/unregister; change spawn_skill_register to return a JoinHandle (or
accept a CancellationToken) and keep the handle(s) when you call
spawn_skill_register (also adjust the other calls around lines referenced that
use the same helper). On shutdown (in wait_for_shutdown) signal cancellation or
call handle.abort() and then await the handle(s) with a short timeout before
calling unregister_skill (or use the token to stop register_skill_with_retry
early), ensuring register_skill_with_retry and any in-flight trigger calls
cannot complete after unregister_skill runs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: de6f698d-893a-465e-af50-a689d2e055aa
⛔ Files ignored due to path filters (1)
auth/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (22)
auth/Cargo.tomlauth/README.mdauth/build.rsauth/config.yamlauth/skills/index.mdauth/skills/introspect.mdauth/skills/jwks.mdauth/skills/jwks_rotate.mdauth/skills/register.mdauth/skills/resource_metadata.mdauth/skills/revoke.mdauth/skills/server_metadata.mdauth/skills/token.mdauth/skills/validate.mdauth/src/config.rsauth/src/lib.rsauth/src/main.rsauth/src/manifest.rsauth/src/store.rsauth/tests/integration.rsauth/tests/skill.rsregistry/index.json
✅ Files skipped from review due to trivial changes (6)
- auth/skills/server_metadata.md
- auth/skills/validate.md
- auth/skills/jwks_rotate.md
- auth/skills/resource_metadata.md
- auth/config.yaml
- auth/skills/jwks.md
🚧 Files skipped from review as they are similar to previous changes (6)
- auth/tests/skill.rs
- registry/index.json
- auth/src/manifest.rs
- auth/Cargo.toml
- auth/README.md
- auth/tests/integration.rs
|
Resolved the remaining review-body items in 896da72 as well: enforced rotation_overlap_seconds < refresh_token_ttl_seconds, switched production URL checks to positive wss:// / https:// allowlists, and made skill registration shutdown deterministic by aborting/awaiting the JoinHandle before unregistering skills. Validation run: cargo test --manifest-path auth/Cargo.toml; cargo clippy --manifest-path auth/Cargo.toml --all-targets -- -D warnings; cargo fmt --manifest-path auth/Cargo.toml -- --check; git diff --check. Also tested against a local iii engine with register -> token -> refresh rotation -> old refresh rejected. |
Move auth skill docs into the nested bundle layout expected by the worker documentation guidelines, including an index skill and one auth how-to per function. Register typed auth function metadata and schemas so skill generation exposes the real auth::* contracts instead of loose JSON-only handlers. Expand unit, integration, BDD, and skill bundle tests around provider normalization, API key rotation, OAuth round-trips, environment fallback, redaction, delete idempotency, list semantics, and malformed iii-state payloads. Add real-engine coverage for auth-credentials with an explicit iii-state engine fixture, plus restart coverage that waits for auth::list_providers before asserting stored credentials survive worker re-registration. Document the live engine test flow so reviewers can run the auth worker with iii-state registered explicitly instead of relying on default engine config.
Auth credential skill/test hardeningAdded commit What changed
Validation
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
auth-credentials/tests/integration.rs (1)
433-433: 💤 Low valueConsider returning
anyhow::Result<()>for consistency.Most tests in this file return
anyhow::Result<()>and use?for early error propagation. While this test doesn't need it (it callsunwrap_err()directly), matching the signature would maintain consistency.♻️ Consistent signature
#[tokio::test] -async fn blank_provider_is_rejected_before_storage() { +async fn blank_provider_is_rejected_before_storage() -> anyhow::Result<()> { let store = auth_credentials::InMemoryStore::new(); let err = auth_credentials::handle_set_token( &store, auth_credentials::SetTokenInput { provider: " ".into(), credential: auth_credentials::Credential::ApiKey { key: "sk-invalid".into(), }, }, ) .await .unwrap_err(); assert!(err.to_string().contains("provider must be non-empty")); + Ok(()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth-credentials/tests/integration.rs` at line 433, Change the test function signature for blank_provider_is_rejected_before_storage to return anyhow::Result<()> (i.e., async fn blank_provider_is_rejected_before_storage() -> anyhow::Result<()>) for consistency with other tests; update the body to use ? where appropriate and end with Ok(()) so it compiles and integrates with the existing test harness while keeping the current unwrap_err() usage unchanged.auth/src/store.rs (1)
343-357: 💤 Low valueDocument rollback behavior when revocation fails.
If
revoke(old_token_id)fails (line 350), the code attempts to delete the newly inserted refresh token to maintain consistency. However, ifdelete_valuealso fails, the new token remains in state without the old token being revoked, creating a split state.This is acceptable for iii-state (assumed reliable), but worth documenting so maintainers understand the rollback is best-effort only.
📝 Add clarifying comment
async fn rotate_refresh_token( &self, old_token_id: &str, new_record: RefreshTokenRecord, ) -> anyhow::Result<()> { let _guard = self.lock.lock().await; self.set_refresh_token(new_record.clone()).await?; + // Best-effort rollback: if revoke fails, delete the new token to avoid split state. + // If delete also fails, the new token may remain without revoking the old. if let Err(err) = self.revoke(old_token_id).await { let _ = self .delete_value(TOKENS_SCOPE, &format!("refresh:{}", new_record.token_id)) .await; return Err(err); } Ok(()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/src/store.rs` around lines 343 - 357, The rollback when revoking the old refresh token in rotate_refresh_token is best-effort: if revoke(old_token_id) fails the code attempts to delete the newly inserted token via delete_value(TOKENS_SCOPE, &format!("refresh:{}", new_record.token_id)) but that deletion can also fail leaving a split state; add a clear comment inside the rotate_refresh_token function (near calls to set_refresh_token, revoke, and delete_value) that documents this behavior as a best-effort rollback, explains the potential split-state scenario, and notes any assumptions (e.g., iii-state or why this approach was chosen) so maintainers understand the trade-off.auth-credentials/tests/engine_e2e.rs (1)
22-23: 💤 Low valueConsider preserving stderr for debugging failed E2E runs.
Redirecting both stdout and stderr to
/dev/nullimproves output cleanliness but hides worker crash logs. If the E2E test fails during development, capturing stderr to a temp file or inheriting it would simplify debugging.🔍 Optional: capture stderr for debugging
.env("OPENAI_API_KEY", format!("sk-env-openai-{nonce}")) .env("RUST_LOG", "warn") .stdout(Stdio::null()) - .stderr(Stdio::null()) + .stderr(Stdio::inherit()) .spawn()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth-credentials/tests/engine_e2e.rs` around lines 22 - 23, The test currently silences both stdout and stderr by calling .stdout(Stdio::null()) and .stderr(Stdio::null()) in the process builder; change the stderr handling so failures are visible by replacing .stderr(Stdio::null()) with either .stderr(Stdio::inherit()) to stream logs to the test runner or with code that writes stderr to a temp file (using tempfile::NamedTempFile and .stderr(File::from_std(tempfile.reopen()?))) to preserve logs for post-failure inspection; update the code near the process builder call in engine_e2e.rs where .stdout(...) and .stderr(...) are set (keep stdout null if you still want quiet output).auth-credentials/tests/skill.rs (1)
80-101: 💤 Low valueConsider extracting shared test helpers to reduce duplication.
The
strip_json_comments,json_blocks, andsplit_frontmatterfunctions are duplicated betweenskill.rsandbdd.rs. Extracting them to a shared test module (e.g.,tests/common/skill_helpers.rs) would reduce maintenance overhead if the parsing logic changes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth-credentials/tests/skill.rs` around lines 80 - 101, Duplicate test helpers strip_json_comments, json_blocks, and split_frontmatter should be extracted into a single shared test helper module (e.g., a new skill_helpers test module); move the implementations there, mark the functions pub, and replace the duplicated copies in both skill.rs and bdd.rs with imports like use skill_helpers::{strip_json_comments, json_blocks, split_frontmatter}; update the test modules' mod/import declarations so both tests reference the single helper module.auth/skills/token.md (1)
18-18: ⚡ Quick winUse explicit placeholders for secret-like sample values.
Line 18, Line 42, Line 50, and Line 53 currently look like real secrets/tokens and trigger secret-scanner alerts. Swapping them to placeholders avoids security-noise churn while preserving the example intent.
Suggested doc tweak
{ "grant_type": "client_credentials", - "client_id": "QGxGq7m6bcqXkFY7Q0c1p2Jf", - "client_secret": "iN1PqXZJDEU6M5HsR3uHz12vQk1eQJ3TR1T1lPYU6Oc", + "client_id": "<client_id>", + "client_secret": "<client_secret>", "scope": "mcp:tools" }{ "headers": { - "authorization": "Basic UUd4R3E3bTZiY3FYa0ZZN1EwYzFwMkpmOmlOMVBxWFpKREVVNk01SHNSM3VIejEydlFrMWVRSjNUUjFUMWxQWVU2T2M=" + "authorization": "Basic <base64(<client_id>:<client_secret>)>" }, "grant_type": "client_credentials", "scope": "mcp:tools" }{ "grant_type": "refresh_token", - "refresh_token": "old-refresh-token", - "client_id": "QGxGq7m6bcqXkFY7Q0c1p2Jf", - "client_secret": "iN1PqXZJDEU6M5HsR3uHz12vQk1eQJ3TR1T1lPYU6Oc" + "refresh_token": "<old_refresh_token>", + "client_id": "<client_id>", + "client_secret": "<client_secret>" }Also applies to: 42-42, 50-50, 53-53
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@auth/skills/token.md` at line 18, Replace any real-looking secret/sample values in the doc with explicit placeholder strings: change the "client_secret" value and the other token-like examples to clear placeholders such as "<CLIENT_SECRET>", "<ACCESS_TOKEN>", "<REFRESH_TOKEN>" or "<API_KEY>" so they no longer resemble real secrets; update the JSON examples in token.md where these sample credentials appear to use those placeholders while preserving the same keys (e.g., "client_secret") and example structure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@auth/src/lib.rs`:
- Around line 338-340: The current check only blocks "function:*" and
"trigger:*" but allows other wildcard scopes; change the validation to
generically reject any scope containing the wildcard character (e.g., if
scope.contains('*') { anyhow::bail!(...); }) so no scopes with '*' can be
issued; update the error message to state wildcard scopes are policy-time only
and include the offending scope variable name (scope) to aid debugging.
---
Nitpick comments:
In `@auth-credentials/tests/engine_e2e.rs`:
- Around line 22-23: The test currently silences both stdout and stderr by
calling .stdout(Stdio::null()) and .stderr(Stdio::null()) in the process
builder; change the stderr handling so failures are visible by replacing
.stderr(Stdio::null()) with either .stderr(Stdio::inherit()) to stream logs to
the test runner or with code that writes stderr to a temp file (using
tempfile::NamedTempFile and .stderr(File::from_std(tempfile.reopen()?))) to
preserve logs for post-failure inspection; update the code near the process
builder call in engine_e2e.rs where .stdout(...) and .stderr(...) are set (keep
stdout null if you still want quiet output).
In `@auth-credentials/tests/integration.rs`:
- Line 433: Change the test function signature for
blank_provider_is_rejected_before_storage to return anyhow::Result<()> (i.e.,
async fn blank_provider_is_rejected_before_storage() -> anyhow::Result<()>) for
consistency with other tests; update the body to use ? where appropriate and end
with Ok(()) so it compiles and integrates with the existing test harness while
keeping the current unwrap_err() usage unchanged.
In `@auth-credentials/tests/skill.rs`:
- Around line 80-101: Duplicate test helpers strip_json_comments, json_blocks,
and split_frontmatter should be extracted into a single shared test helper
module (e.g., a new skill_helpers test module); move the implementations there,
mark the functions pub, and replace the duplicated copies in both skill.rs and
bdd.rs with imports like use skill_helpers::{strip_json_comments, json_blocks,
split_frontmatter}; update the test modules' mod/import declarations so both
tests reference the single helper module.
In `@auth/skills/token.md`:
- Line 18: Replace any real-looking secret/sample values in the doc with
explicit placeholder strings: change the "client_secret" value and the other
token-like examples to clear placeholders such as "<CLIENT_SECRET>",
"<ACCESS_TOKEN>", "<REFRESH_TOKEN>" or "<API_KEY>" so they no longer resemble
real secrets; update the JSON examples in token.md where these sample
credentials appear to use those placeholders while preserving the same keys
(e.g., "client_secret") and example structure.
In `@auth/src/store.rs`:
- Around line 343-357: The rollback when revoking the old refresh token in
rotate_refresh_token is best-effort: if revoke(old_token_id) fails the code
attempts to delete the newly inserted token via delete_value(TOKENS_SCOPE,
&format!("refresh:{}", new_record.token_id)) but that deletion can also fail
leaving a split state; add a clear comment inside the rotate_refresh_token
function (near calls to set_refresh_token, revoke, and delete_value) that
documents this behavior as a best-effort rollback, explains the potential
split-state scenario, and notes any assumptions (e.g., iii-state or why this
approach was chosen) so maintainers understand the trade-off.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 44c78dc2-ea7b-407a-87ab-fbdf11f809a6
⛔ Files ignored due to path filters (1)
auth-credentials/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (33)
auth-credentials/Cargo.tomlauth-credentials/README.mdauth-credentials/docs/quickstart.mdauth-credentials/skill.mdauth-credentials/skills/auth/delete_token.mdauth-credentials/skills/auth/get_token.mdauth-credentials/skills/auth/list_providers.mdauth-credentials/skills/auth/set_token.mdauth-credentials/skills/auth/status.mdauth-credentials/skills/delete_token.mdauth-credentials/skills/get_token.mdauth-credentials/skills/index.mdauth-credentials/skills/list_providers.mdauth-credentials/skills/set_token.mdauth-credentials/skills/status.mdauth-credentials/src/lib.rsauth-credentials/src/store_iii_state.rsauth-credentials/tests/bdd.rsauth-credentials/tests/e2e/engine-state.yamlauth-credentials/tests/engine_e2e.rsauth-credentials/tests/features/auth_functions.featureauth-credentials/tests/features/auth_skill_bundle.featureauth-credentials/tests/integration.rsauth-credentials/tests/restart_e2e.rsauth-credentials/tests/skill.rsauth/README.mdauth/config.yamlauth/skills/register.mdauth/skills/token.mdauth/src/config.rsauth/src/lib.rsauth/src/main.rsauth/src/store.rs
💤 Files with no reviewable changes (6)
- auth-credentials/skill.md
- auth-credentials/skills/delete_token.md
- auth-credentials/skills/list_providers.md
- auth-credentials/skills/set_token.md
- auth-credentials/skills/status.md
- auth-credentials/skills/get_token.md
✅ Files skipped from review due to trivial changes (6)
- auth-credentials/skills/auth/delete_token.md
- auth-credentials/skills/auth/get_token.md
- auth-credentials/docs/quickstart.md
- auth-credentials/skills/auth/set_token.md
- auth-credentials/skills/index.md
- auth/skills/register.md
🚧 Files skipped from review as they are similar to previous changes (3)
- auth/README.md
- auth/src/main.rs
- auth/src/config.rs
| if scope == "function:*" || scope == "trigger:*" { | ||
| anyhow::bail!("wildcard scopes can be registered but cannot be issued: {scope}"); | ||
| } |
There was a problem hiding this comment.
Block wildcard scope issuance generically, not just function:*/trigger:*.
Current token-scope validation still allows wildcard scopes like iii:* to be minted if configured/registered. That contradicts the stated behavior that wildcard scopes are policy-time only and must not appear in issued tokens.
Suggested fix
- if scope == "function:*" || scope == "trigger:*" {
+ if scope.ends_with(":*") {
anyhow::bail!("wildcard scopes can be registered but cannot be issued: {scope}");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if scope == "function:*" || scope == "trigger:*" { | |
| anyhow::bail!("wildcard scopes can be registered but cannot be issued: {scope}"); | |
| } | |
| if scope.ends_with(":*") { | |
| anyhow::bail!("wildcard scopes can be registered but cannot be issued: {scope}"); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@auth/src/lib.rs` around lines 338 - 340, The current check only blocks
"function:*" and "trigger:*" but allows other wildcard scopes; change the
validation to generically reject any scope containing the wildcard character
(e.g., if scope.contains('*') { anyhow::bail!(...); }) so no scopes with '*' can
be issued; update the error message to state wildcard scopes are policy-time
only and include the offending scope variable name (scope) to aid debugging.
Summary
authRust worker that owns theauth::*OAuth authority functionsfunction:*,trigger:*, andiii:*behind an admin registration token so public DCR cannot mint engine-level authorityiii-stateby default, with bounded state timeouts and an in-memory fallback for tests/local experimentsFunctions
auth::validateauth::server_metadataauth::resource_metadataauth::registerauth::jwksauth::jwks_rotateauth::tokenauth::introspectauth::revokeHTTP routes
GET /.well-known/oauth-authorization-server→auth::server_metadataGET /.well-known/oauth-protected-resource→auth::resource_metadataGET /.well-known/jwks.json→auth::jwksPOST /register→auth::registerPOST /token→auth::tokenPOST /introspect→auth::introspectPOST /revoke→auth::revokeSecurity hardening
mcp:toolsanda2a:messageAuthorization: Bearer $III_AUTH_REGISTRATION_TOKENoradmin_tokenclient_secret_basicandclient_secret_postare enforced according to each registered client's configured auth methodauth::validatehttp://issuer andws://engine URLsclient_credentialsandrefresh_tokenValidation
cargo fmt --manifest-path auth/Cargo.toml --all -- --checkcargo clippy --manifest-path auth/Cargo.toml --all-targets --all-features -- -D warningscargo test --manifest-path auth/Cargo.toml --all-featuresuv run --with pyyaml python .github/scripts/validate_worker.py --worker auth --base-ref main --source-changed '["auth"]'git diff --checkpython3 -m json.tool registry/index.jsonReal engine proof
Tested against a separate local iii engine from
/Users/rohitghumare/iiiso the existing agentmemory engine was not touched.iii 0.11.7-next.1iii-sdk 0.11.7-next.149234WebSocket,3211HTTPReal-engine test result:
The real test covered:
auth::registerpublic registration formcp:toolsfunction:*/iii:*scopes through bothiii triggerand HTTPIII_AUTH_REGISTRATION_TOKENauth::tokenclient credentials flowauth::validatepublic and privileged RBAC decisionsauth::introspectactive-token responseauth::revokeinvalidating an access tokenclient_secret_basicsuccess path and wrong-method rejectionGET /.well-known/oauth-authorization-serverGET /.well-known/jwks.jsonPOST /register,POST /token, andPOST /introspectstate::getunderauth:clientsEngine follow-up found by real testing
The auth worker itself passed the real engine test, but full worker-manager RBAC enablement still needs an iii engine follow-up.
When
iii-worker-manageris hot-reloaded to setrbac.auth_function_id: auth::validate, the engine restarts worker-manager, drops the existing auth worker connection, unregistersauth::validate, and then the auth worker cannot reconnect because RBAC now depends on the missing auth function. The same shape needs a clean bootstrap/trusted-internal path for cold start too.This PR should be reviewed as the auth worker foundation. Treat engine-level RBAC bootstrap as a separate engine PR before claiming end-to-end worker-manager enforcement is fully baked.
Notes
External IdP bridge adapters are intentionally left as a follow-up. This PR proves the local OAuth authority path first and keeps the interface ready for IdP-backed issuance later.
Summary by CodeRabbit