Skip to content

fix(inference): fail closed when BYOK intent cannot resolve a provider#2489

Merged
senamakel merged 2 commits into
tinyhumansai:mainfrom
M3gA-Mind:fix/inference-routing-fail-closed
May 22, 2026
Merged

fix(inference): fail closed when BYOK intent cannot resolve a provider#2489
senamakel merged 2 commits into
tinyhumansai:mainfrom
M3gA-Mind:fix/inference-routing-fail-closed

Conversation

@M3gA-Mind
Copy link
Copy Markdown
Contributor

@M3gA-Mind M3gA-Mind commented May 22, 2026

Summary

  • When inference_url pointed at a non-openhuman endpoint but no matching cloud_providers entry existed, resolve_primary_cloud_provider_string silently fell back to PROVIDER_OPENHUMAN — routing prompts through the managed backend even when the user intended BYOK/custom routing.
  • This PR adds fail-closed behavior: a BYOK_INCOMPLETE_SENTINEL string is returned instead, which create_chat_provider_from_string catches early and turns into an actionable BYOK_INCOMPLETE error naming the configured inference_url and telling the user exactly how to fix it.
  • Explicit workload routes (reasoning_provider = "openhuman") and explicitly managed-backend configs are unaffected.

Changes

File Change
factory.rs BYOK_INCOMPLETE_SENTINEL constant; has_custom_inference_intent(); sentinel path in resolve_primary_cloud_provider_string; early sentinel detection + error in create_chat_provider_from_string
factory_test.rs Rename legacy_…_stays_on_openhuman_primary → expects sentinel; 6 new BYOK regression tests
provider/mod.rs Re-export BYOK_INCOMPLETE_SENTINEL

Test plan

  • cargo check -p openhuman — clean
  • cargo fmt --all -- --check — clean
  • cargo test -p openhuman -- factory_test — 54/54 pass (all pre-existing + 6 new BYOK tests)

Related

Closes #2483

Summary by CodeRabbit

  • Bug Fixes
    • Fail-closed behavior added when a custom/incoming inference URL lacks a matching cloud-provider route, preventing silent fallback to the managed backend.
    • Error messages improved to clearly identify the problematic inference URL and direct remediation toward the cloud provider route and inference_url configuration.
    • Routing now respects explicit per-workload provider settings and surfaces configuration issues earlier.

Review Change Stack

When inference_url is set to a non-openhuman endpoint but no matching
cloud_providers entry exists, the factory was silently falling back to
the managed OpenHuman backend — violating the trust boundary for BYOK
and custom-routing configs.

Changes:
- Add BYOK_INCOMPLETE_SENTINEL constant returned by
  resolve_primary_cloud_provider_string when BYOK intent is detected
  (inference_url points at a non-openhuman host) but no matching
  cloud_providers entry can be found.
- has_custom_inference_intent() helper detects the BYOK signal.
- create_chat_provider_from_string() catches the sentinel early and
  surfaces an actionable BYOK_INCOMPLETE error naming the inference_url
  and telling the user how to fix it (add a cloud_providers entry or
  clear inference_url).
- Update legacy_inference_url_without_matching_provider test: now
  correctly expects the sentinel, not "openhuman".
- Six new regression tests covering BYOK sentinel paths, explicit-route
  bypass, openhuman-URL non-triggering, and error message content.

Closes tinyhumansai#2483
@M3gA-Mind M3gA-Mind requested a review from a team May 22, 2026 12:41
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 22, 2026

📝 Walkthrough

Walkthrough

This PR adds a BYOK fail-closed sentinel and detection: when a non-OpenHuman inference_url intent exists but no matching cloud_providers entry or workload route is found, provider resolution returns a sentinel and provider creation returns an actionable configuration error instead of falling back to the managed OpenHuman backend.

Changes

Fail-closed BYOK inference routing

Layer / File(s) Summary
Fail-closed sentinel definition and public export
src/openhuman/inference/provider/factory.rs, src/openhuman/inference/provider/mod.rs
Introduce BYOK_INCOMPLETE_SENTINEL constant and export it from the provider module.
Custom inference intent detection and sentinel return
src/openhuman/inference/provider/factory.rs
Add redact_inference_url and has_custom_inference_intent helpers and update resolve_primary_cloud_provider_string to return the sentinel when custom/BYOK inference_url intent exists but no matching cloud_providers entry is resolved.
Sentinel detection and provider-creation error handling
src/openhuman/inference/provider/factory.rs
Update create_chat_provider_from_string to detect BYOK_INCOMPLETE_SENTINEL early and return an actionable configuration error mentioning the configured inference_url and missing cloud_providers entry.
Test coverage for BYOK fail-closed routing
src/openhuman/inference/provider/factory_test.rs
Adjust legacy routing expectation and add tests covering sentinel behavior, successful resolution when cloud_providers matches, managed-backend preservation, explicit workload-provider bypass, and sentinel error message contents.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2014: Modifies src/openhuman/inference/provider/factory.rs provider creation/resolution flow and is directly related to the same decision points.

Suggested reviewers

  • senamakel
  • graycyrus

Poem

🐰 A little rabbit checks the cloud,
When custom routes are not allowed,
A sentinel stands, clear and bright,
"Fix your config" — it shows the light,
Hop on, sort hosts, and set things right.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: implementing fail-closed behavior when BYOK (Bring Your Own Key) intent cannot resolve a provider, which is the primary objective of the PR.
Linked Issues check ✅ Passed The PR fully addresses issue #2483 objectives: prevents silent fallback to OpenHuman when custom inference intent lacks a matching provider entry, returns fail-closed sentinel with actionable diagnostics, preserves explicit managed-backend routing, and includes comprehensive test coverage.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the linked issue #2483: adding BYOK_INCOMPLETE_SENTINEL, fail-closed logic in resolve_primary_cloud_provider_string, error detection in create_chat_provider_from_string, and BYOK-focused regression tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the bug label May 22, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/openhuman/inference/provider/factory_test.rs (1)

742-745: ⚡ Quick win

Strengthen the matching-entry BYOK assertion to an exact expected route.

Line 742 currently only asserts “not sentinel”. This would still pass if resolution regressed to "openhuman". Assert the exact provider string (e.g. "custom:gpt-4o") to lock the intended behavior.

Suggested patch
-    assert_ne!(
-        provider_for_role("reasoning", &config),
-        BYOK_INCOMPLETE_SENTINEL
-    );
+    assert_eq!(
+        provider_for_role("reasoning", &config),
+        "custom:gpt-4o"
+    );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/provider/factory_test.rs` around lines 742 - 745, The
test currently uses assert_ne!(provider_for_role("reasoning", &config),
BYOK_INCOMPLETE_SENTINEL) which only ensures the result is not the sentinel;
change it to assert_eq! and verify the exact expected provider string (e.g.
assert_eq!(provider_for_role("reasoning", &config), "custom:gpt-4o")) so the
test locks the exact routing behavior of provider_for_role rather than allowing
regressions to other non-sentinel values; keep usage of the same
provider_for_role("reasoning", &config) call and BYOK_INCOMPLETE_SENTINEL symbol
in the surrounding assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/inference/provider/factory.rs`:
- Around line 364-368: Change the two log::warn! calls that print
config.inference_url to a lower verbosity (debug! or trace!) and avoid printing
the full URL: redact or extract only the host portion before logging (e.g.,
parse config.inference_url with Url::parse or a helper redact_inference_url()
and log only host_or_redacted). Update the messages in the BYOK detection branch
(the log that mentions the fail-closed sentinel) and the later diagnostic branch
(the second warn at the other site) to use debug!/trace! and the redacted host
value instead of the full config.inference_url so no credential-bearing URL is
written to logs.

---

Nitpick comments:
In `@src/openhuman/inference/provider/factory_test.rs`:
- Around line 742-745: The test currently uses
assert_ne!(provider_for_role("reasoning", &config), BYOK_INCOMPLETE_SENTINEL)
which only ensures the result is not the sentinel; change it to assert_eq! and
verify the exact expected provider string (e.g.
assert_eq!(provider_for_role("reasoning", &config), "custom:gpt-4o")) so the
test locks the exact routing behavior of provider_for_role rather than allowing
regressions to other non-sentinel values; keep usage of the same
provider_for_role("reasoning", &config) call and BYOK_INCOMPLETE_SENTINEL symbol
in the surrounding assertions unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 598a54f4-3356-42b1-909e-26aab9653781

📥 Commits

Reviewing files that changed from the base of the PR and between 733fcfe and b740f2a.

📒 Files selected for processing (3)
  • src/openhuman/inference/provider/factory.rs
  • src/openhuman/inference/provider/factory_test.rs
  • src/openhuman/inference/provider/mod.rs

Comment thread src/openhuman/inference/provider/factory.rs Outdated
…est precision

Downgrade BYOK sentinel log calls from warn to debug (these are
configuration-in-progress states, not errors), redact the inference_url
to only expose the host portion so credentials in query strings are never
logged, and tighten the test assertion to lock the exact resolved
provider string instead of just ruling out the sentinel.
@M3gA-Mind
Copy link
Copy Markdown
Contributor Author

Addressed the two CodeRabbit review items in commit 08473ee:

  1. Log verbosity — downgraded both BYOK sentinel log calls from warn to debug. These fire during a configuration-in-progress state (user has set inference_url but hasn't added the matching cloud_providers entry yet), which is expected during BYOK setup and not an operational error.

  2. URL redaction — replaced inference_url={:?} (full URL including path/query) with host={} via a new redact_inference_url() helper that extracts only the host component. This prevents any credentials embedded in query strings from appearing in logs.

  3. Test precision — changed assert_ne!(provider_for_role("reasoning", &config), BYOK_INCOMPLETE_SENTINEL) to assert_eq!(provider_for_role("reasoning", &config), "custom:gpt-4o") so the test locks the exact routing outcome rather than just ruling out the sentinel.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/openhuman/inference/provider/factory.rs (1)

400-421: 💤 Low value

IPv6 literal edge case in host extraction.

The rfind(':') on Line 413 will incorrectly strip part of an IPv6 address (e.g., [::1]:8080[ after processing). Since this is only for logging and defaults to "<redacted>" on issues, the impact is cosmetic. Consider handling bracketed IPv6 literals if BYOK endpoints commonly use them.

         let host = authority
             .rfind('@')
             .map_or(authority, |i| &authority[i + 1..]);
-        let host = host.rfind(':').map_or(host, |i| &host[..i]);
+        // Strip port only if not an IPv6 literal (bracketed).
+        let host = if host.starts_with('[') {
+            host.find("]:").map_or(host, |i| &host[..i + 1])
+        } else {
+            host.rfind(':').map_or(host, |i| &host[..i])
+        };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/provider/factory.rs` around lines 400 - 421,
redact_inference_url's host-port stripping incorrectly chops bracketed IPv6
literals; update the logic in redact_inference_url so that if the extracted
authority starts with '[' you locate the matching ']' and keep the bracketed
literal (and do not apply rfind(':') across the whole string), otherwise keep
the existing rfind(':') behavior to strip a port; reference the authority/host
variables in the function and conditionally branch on host.starts_with('[') to
find ']' and slice accordingly before returning the host or "<redacted>".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/inference/provider/factory.rs`:
- Around line 400-421: redact_inference_url's host-port stripping incorrectly
chops bracketed IPv6 literals; update the logic in redact_inference_url so that
if the extracted authority starts with '[' you locate the matching ']' and keep
the bracketed literal (and do not apply rfind(':') across the whole string),
otherwise keep the existing rfind(':') behavior to strip a port; reference the
authority/host variables in the function and conditionally branch on
host.starts_with('[') to find ']' and slice accordingly before returning the
host or "<redacted>".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e87a1a49-14a7-40cf-8ac5-e4c4c6c09217

📥 Commits

Reviewing files that changed from the base of the PR and between b740f2a and 08473ee.

📒 Files selected for processing (2)
  • src/openhuman/inference/provider/factory.rs
  • src/openhuman/inference/provider/factory_test.rs

@senamakel senamakel merged commit e7e660f into tinyhumansai:main May 22, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fail closed when custom inference routing cannot leave the hosted backend

3 participants