fix(inference): fail closed when BYOK intent cannot resolve a provider#2489
Conversation
When inference_url is set to a non-openhuman endpoint but no matching cloud_providers entry exists, the factory was silently falling back to the managed OpenHuman backend — violating the trust boundary for BYOK and custom-routing configs. Changes: - Add BYOK_INCOMPLETE_SENTINEL constant returned by resolve_primary_cloud_provider_string when BYOK intent is detected (inference_url points at a non-openhuman host) but no matching cloud_providers entry can be found. - has_custom_inference_intent() helper detects the BYOK signal. - create_chat_provider_from_string() catches the sentinel early and surfaces an actionable BYOK_INCOMPLETE error naming the inference_url and telling the user how to fix it (add a cloud_providers entry or clear inference_url). - Update legacy_inference_url_without_matching_provider test: now correctly expects the sentinel, not "openhuman". - Six new regression tests covering BYOK sentinel paths, explicit-route bypass, openhuman-URL non-triggering, and error message content. Closes tinyhumansai#2483
📝 WalkthroughWalkthroughThis PR adds a BYOK fail-closed sentinel and detection: when a non-OpenHuman ChangesFail-closed BYOK inference routing
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/inference/provider/factory_test.rs (1)
742-745: ⚡ Quick winStrengthen the matching-entry BYOK assertion to an exact expected route.
Line 742 currently only asserts “not sentinel”. This would still pass if resolution regressed to
"openhuman". Assert the exact provider string (e.g."custom:gpt-4o") to lock the intended behavior.Suggested patch
- assert_ne!( - provider_for_role("reasoning", &config), - BYOK_INCOMPLETE_SENTINEL - ); + assert_eq!( + provider_for_role("reasoning", &config), + "custom:gpt-4o" + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/inference/provider/factory_test.rs` around lines 742 - 745, The test currently uses assert_ne!(provider_for_role("reasoning", &config), BYOK_INCOMPLETE_SENTINEL) which only ensures the result is not the sentinel; change it to assert_eq! and verify the exact expected provider string (e.g. assert_eq!(provider_for_role("reasoning", &config), "custom:gpt-4o")) so the test locks the exact routing behavior of provider_for_role rather than allowing regressions to other non-sentinel values; keep usage of the same provider_for_role("reasoning", &config) call and BYOK_INCOMPLETE_SENTINEL symbol in the surrounding assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/inference/provider/factory.rs`:
- Around line 364-368: Change the two log::warn! calls that print
config.inference_url to a lower verbosity (debug! or trace!) and avoid printing
the full URL: redact or extract only the host portion before logging (e.g.,
parse config.inference_url with Url::parse or a helper redact_inference_url()
and log only host_or_redacted). Update the messages in the BYOK detection branch
(the log that mentions the fail-closed sentinel) and the later diagnostic branch
(the second warn at the other site) to use debug!/trace! and the redacted host
value instead of the full config.inference_url so no credential-bearing URL is
written to logs.
---
Nitpick comments:
In `@src/openhuman/inference/provider/factory_test.rs`:
- Around line 742-745: The test currently uses
assert_ne!(provider_for_role("reasoning", &config), BYOK_INCOMPLETE_SENTINEL)
which only ensures the result is not the sentinel; change it to assert_eq! and
verify the exact expected provider string (e.g.
assert_eq!(provider_for_role("reasoning", &config), "custom:gpt-4o")) so the
test locks the exact routing behavior of provider_for_role rather than allowing
regressions to other non-sentinel values; keep usage of the same
provider_for_role("reasoning", &config) call and BYOK_INCOMPLETE_SENTINEL symbol
in the surrounding assertions unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 598a54f4-3356-42b1-909e-26aab9653781
📒 Files selected for processing (3)
src/openhuman/inference/provider/factory.rssrc/openhuman/inference/provider/factory_test.rssrc/openhuman/inference/provider/mod.rs
…est precision Downgrade BYOK sentinel log calls from warn to debug (these are configuration-in-progress states, not errors), redact the inference_url to only expose the host portion so credentials in query strings are never logged, and tighten the test assertion to lock the exact resolved provider string instead of just ruling out the sentinel.
|
Addressed the two CodeRabbit review items in commit 08473ee:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/inference/provider/factory.rs (1)
400-421: 💤 Low valueIPv6 literal edge case in host extraction.
The
rfind(':')on Line 413 will incorrectly strip part of an IPv6 address (e.g.,[::1]:8080→[after processing). Since this is only for logging and defaults to"<redacted>"on issues, the impact is cosmetic. Consider handling bracketed IPv6 literals if BYOK endpoints commonly use them.let host = authority .rfind('@') .map_or(authority, |i| &authority[i + 1..]); - let host = host.rfind(':').map_or(host, |i| &host[..i]); + // Strip port only if not an IPv6 literal (bracketed). + let host = if host.starts_with('[') { + host.find("]:").map_or(host, |i| &host[..i + 1]) + } else { + host.rfind(':').map_or(host, |i| &host[..i]) + };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/inference/provider/factory.rs` around lines 400 - 421, redact_inference_url's host-port stripping incorrectly chops bracketed IPv6 literals; update the logic in redact_inference_url so that if the extracted authority starts with '[' you locate the matching ']' and keep the bracketed literal (and do not apply rfind(':') across the whole string), otherwise keep the existing rfind(':') behavior to strip a port; reference the authority/host variables in the function and conditionally branch on host.starts_with('[') to find ']' and slice accordingly before returning the host or "<redacted>".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/inference/provider/factory.rs`:
- Around line 400-421: redact_inference_url's host-port stripping incorrectly
chops bracketed IPv6 literals; update the logic in redact_inference_url so that
if the extracted authority starts with '[' you locate the matching ']' and keep
the bracketed literal (and do not apply rfind(':') across the whole string),
otherwise keep the existing rfind(':') behavior to strip a port; reference the
authority/host variables in the function and conditionally branch on
host.starts_with('[') to find ']' and slice accordingly before returning the
host or "<redacted>".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e87a1a49-14a7-40cf-8ac5-e4c4c6c09217
📒 Files selected for processing (2)
src/openhuman/inference/provider/factory.rssrc/openhuman/inference/provider/factory_test.rs
Summary
inference_urlpointed at a non-openhuman endpoint but no matchingcloud_providersentry existed,resolve_primary_cloud_provider_stringsilently fell back toPROVIDER_OPENHUMAN— routing prompts through the managed backend even when the user intended BYOK/custom routing.BYOK_INCOMPLETE_SENTINELstring is returned instead, whichcreate_chat_provider_from_stringcatches early and turns into an actionableBYOK_INCOMPLETEerror naming the configuredinference_urland telling the user exactly how to fix it.reasoning_provider = "openhuman") and explicitly managed-backend configs are unaffected.Changes
factory.rsBYOK_INCOMPLETE_SENTINELconstant;has_custom_inference_intent(); sentinel path inresolve_primary_cloud_provider_string; early sentinel detection + error increate_chat_provider_from_stringfactory_test.rslegacy_…_stays_on_openhuman_primary→ expects sentinel; 6 new BYOK regression testsprovider/mod.rsBYOK_INCOMPLETE_SENTINELTest plan
cargo check -p openhuman— cleancargo fmt --all -- --check— cleancargo test -p openhuman -- factory_test— 54/54 pass (all pre-existing + 6 new BYOK tests)Related
Closes #2483
Summary by CodeRabbit