Refactor inference around external Ollama routing#1975

Merged
senamakel merged 22 commits into
tinyhumansai:mainfrom
senamakel:codex/inference-external-ollama-routing
May 17, 2026

Member

@senamakel senamakel commented May 17, 2026

Summary

  • add a first-class src/openhuman/inference core module and register a dedicated openhuman.inference_* controller surface
  • keep the local-model UI talking only to core while shifting Ollama integration to direct external runtime routing via core/inference
  • remove legacy app-managed Ollama lifecycle RPCs and strip renderer calls that assumed install/start/bootstrap ownership
  • update local-model diagnostics, onboarding, and debug UI copy to reflect external-runtime management instead of app-managed Ollama setup
  • add focused Rust and frontend coverage for inference routing, external-Ollama failure messaging, and the remaining local-model UI surfaces

Problem

  • inference behavior was coupled to the legacy local_ai catch-all module, which still assumed OpenHuman could install, launch, and manage Ollama directly
  • that contract conflicted with the intended product model: the desktop UI must always talk to core, and core should route inference to an already-running Ollama-compatible endpoint
  • the old RPC/UI surface also left stale lifecycle controls in place, which made the app promise runtime-management behavior it should not own

Solution

  • introduced a dedicated inference domain in core and rewired callers onto it for status, prompt, summarize, chat, and lightweight inference helpers
  • changed Ollama request paths to surface explicit external-runtime guidance on connection failure instead of raw low-level transport errors
  • removed registered legacy Ollama-management RPCs (download, download_all_assets, set_ollama_path, shutdown_owned) and simplified onboarding/bootstrap helpers to preset application only
  • preserved the Ollama-facing UI panels, but turned them into endpoint-status/diagnostics/model-visibility surfaces that instruct users to manage Ollama externally
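
The connection-failure handling described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: `external_runtime_guidance`, `probe_runtime`, and the message wording are hypothetical names and text chosen for the example.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Turns a low-level transport error into user-facing guidance.
/// The wording is illustrative, not the message this PR ships.
fn external_runtime_guidance(endpoint: &str, err: &io::Error) -> String {
    match err.kind() {
        // Both kinds usually mean "nothing is listening at that address".
        io::ErrorKind::ConnectionRefused | io::ErrorKind::TimedOut => format!(
            "Could not reach an Ollama-compatible runtime at {endpoint}. \
             Start Ollama outside the app, then retry."
        ),
        // Anything else is reported with the original error attached.
        _ => format!("Inference request to {endpoint} failed: {err}"),
    }
}

/// Health check only: attempts a TCP connection and never installs or
/// starts anything on the user's behalf.
fn probe_runtime(addr: SocketAddr, timeout: Duration) -> Result<(), String> {
    TcpStream::connect_timeout(&addr, timeout)
        .map(drop)
        .map_err(|err| external_runtime_guidance(&addr.to_string(), &err))
}
```

A caller would pass the configured endpoint (for example `127.0.0.1:11434`, Ollama's default port) to `probe_runtime` and show the returned string in the status panel instead of the raw transport error.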

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — N/A: not measured locally; CI coverage gate remains authoritative for this PR
  • Coverage matrix updated — N/A: behaviour/architecture refactor only; no matrix row rename/add/remove made
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no explicit matrix feature IDs were mapped during this refactor pass
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: no release smoke doc changes required for this code-only refactor
  • Linked issue closed via Closes #NNN in the ## Related section — N/A: no issue number was provided in-session

Impact

  • desktop/core inference behavior now consistently routes through core-owned controllers and external Ollama-compatible APIs
  • local-model settings/debug UX remains available, but runtime ownership expectations are removed in favor of external endpoint management
  • compatibility impact: callers using removed legacy lifecycle RPCs must move to the surviving inference/status/diagnostics flows
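
One way to picture the compatibility impact is a small resolver that forwards surviving legacy names and rejects the removed lifecycle RPCs. The sketch below is hypothetical: it assumes legacy methods follow an `openhuman.local_ai_<op>` naming pattern and that every surviving op has an `inference_` counterpart, which is a simplification of the real alias table in `src/core/legacy_aliases.rs`.

```rust
#[derive(Debug, PartialEq)]
enum Resolved {
    /// Method name to dispatch on the inference controller surface.
    Method(String),
    /// Lifecycle RPC that no longer exists; carries guidance for the caller.
    Removed(&'static str),
}

/// Hypothetical resolver: maps legacy names onto the new namespace and
/// flags the lifecycle RPCs that this PR removes.
fn resolve_method(method: &str) -> Resolved {
    // Op names taken from the PR description; the full method names are assumed.
    const REMOVED: [&str; 4] = [
        "download",
        "download_all_assets",
        "set_ollama_path",
        "shutdown_owned",
    ];
    match method.strip_prefix("openhuman.local_ai_") {
        Some(op) if REMOVED.contains(&op) => Resolved::Removed(
            "Ollama is managed externally; use the inference status/diagnostics flows",
        ),
        Some(op) => Resolved::Method(format!("openhuman.inference_{op}")),
        // Already on the new surface (or unrelated): pass through unchanged.
        None => Resolved::Method(method.to_string()),
    }
}
```

Returning an explicit `Removed` variant, rather than a generic "unknown method" error, lets a caller show migration guidance to anyone still invoking the old lifecycle RPCs.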

Related

  • Closes: N/A
  • Follow-up PR(s)/TODOs:
    • measure changed-line coverage against the full merged gate and close remaining gaps if CI reports them
    • optionally migrate more surviving local_ai_* read-path callers to the inference_* namespace over time

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: codex/inference-external-ollama-routing
  • Commit SHA: 12bed1a522dec88b292f509796eee6e76d75add6

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • Focused tests:
    • pnpm debug unit src/components/settings/panels/local-model/ModelStatusSection.test.tsx src/utils/localAiBootstrap.test.ts src/pages/onboarding/steps/__tests__/LocalAIStep.test.tsx
    • pnpm debug unit src/services/api/__tests__/aiSettingsApi.test.ts
    • pnpm debug unit src/components/settings/panels/local-model/ModelDownloadSection.test.tsx
    • pnpm debug unit src/components/settings/panels/local-model/DeviceCapabilitySection.test.tsx
    • cargo test --manifest-path Cargo.toml inference:: --lib
    • cargo test --manifest-path Cargo.toml inference_connection_failure_mentions_external_ollama_runtime --lib
    • cargo test --manifest-path Cargo.toml json_rpc_inference_namespace_lm_studio_prompt_and_status --test json_rpc_e2e
    • cargo test --manifest-path Cargo.toml json_rpc_inference_prompt_requires_external_ollama_runtime_when_unreachable --test json_rpc_e2e
  • Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml --all --check, cargo check --manifest-path Cargo.toml
  • Tauri fmt/check (if changed): cargo fmt --manifest-path app/src-tauri/Cargo.toml --all --check, cargo check --manifest-path app/src-tauri/Cargo.toml

Validation Blocked

  • command: pnpm test:coverage
  • error: not completed during shipping pass; no final coverage report captured locally
  • impact: changed-line coverage confirmation is deferred to CI unless rerun locally in a follow-up

Behavior Changes

  • Intended behavior change: OpenHuman no longer advertises app-managed Ollama lifecycle behavior; core routes inference to external Ollama-compatible APIs instead
  • User-visible effect: local-model UI still exists, but it now guides users toward an already-running Ollama runtime rather than trying to install/start one itself

Parity Contract

  • Legacy behavior preserved: UI still talks only to core; local-model status/diagnostics/test surfaces remain present
  • Guard/fallback/dispatch parity checks: targeted Rust and JSON-RPC tests cover the new inference_* surface plus external-Ollama unreachable and LM Studio happy paths

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this PR
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • New Features

    • OpenAI-compatible /v1 HTTP API for chat completions and model listing.
    • Per-model temperature suppression for specific reasoning models.
  • Bug Fixes

    • Clearer error and guidance text when external inference runtimes are unreachable.
  • Behavior Changes

    • Ollama treated as an external runtime — install/download/repair flows removed from the app; UI guidance and onboarding text updated.
    • Tenor/GIF decision/search paths removed.
  • Tests & Docs

    • Added Rust Wiremock E2E tests and updated local-inference docs.
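
As a rough illustration of the OpenAI-compatible surface listed above, the sketch below matches the two `/v1` routes and applies a bearer-token check before dispatch. The route paths come from this PR; the `route` function, its types, and the bearer-token scheme are assumptions made for the example and are not the router in `src/openhuman/inference/http/*`.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    ChatCompletions,
    ListModels,
}

#[derive(Debug, PartialEq)]
enum RouteError {
    Unauthorized,
    NotFound,
}

/// Hypothetical dispatcher: checks the `Authorization` header first,
/// then matches the method and path against the two `/v1` endpoints.
fn route(
    method: &str,
    path: &str,
    authorization: Option<&str>,
    expected_token: &str,
) -> Result<Route, RouteError> {
    // Assumed scheme: `Authorization: Bearer <token>`.
    let token = authorization.and_then(|header| header.strip_prefix("Bearer "));
    if token != Some(expected_token) {
        return Err(RouteError::Unauthorized);
    }
    match (method, path) {
        ("POST", "/v1/chat/completions") => Ok(Route::ChatCompletions),
        ("GET", "/v1/models") => Ok(Route::ListModels),
        _ => Err(RouteError::NotFound),
    }
}
```

Checking authorization before route matching means an unauthenticated caller cannot learn which paths exist, which is one reasonable ordering for a locally exposed API.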

@senamakel senamakel requested a review from a team May 17, 2026 00:26
@coderabbitai
Contributor

coderabbitai Bot commented May 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new inference namespace, routes AI RPCs through it, treats Ollama as an externally managed runtime (removes in-app install/download/path controls), updates provider APIs (temperature suppression, model listing), implements OpenAI-compatible /v1 HTTP endpoints, and updates frontend/backend tests and CI e2e tooling.

Changes

Inference externalization and routing

| Layer | File(s) | Summary |
|---|---|---|
| Frontend tauri & debug UI | app/src/utils/tauriCommands/localAi.ts, app/src/components/settings/panels/*, app/src/utils/localAiBootstrap.ts | Re-route frontend calls to openhuman.inference_*, remove download/bootstrap/path controls from debug/settings UI, hardcode/silence removed callbacks, and simplify bootstrap helper to only select/apply presets. |
| AI settings façade | app/src/services/api/aiSettingsApi.ts, app/src/services/rpcMethods.ts | Trim localProvider façade to presets and enablement; update CORE_RPC_METHODS and legacy aliases to inference_*; change listProviderModels RPC target. |
| Inference module and RPC schemas/ops | src/openhuman/inference/*, src/core/all.rs, src/core/legacy_aliases.rs | Add inference module with ops/schemas/registries, re-export controller schemas, register controllers in core, and migrate legacy aliases to new inference methods. |
| Local runtime service | src/openhuman/inference/local/*, src/openhuman/inference/local/service/* | Treat Ollama as external: health checks only, remove automatic install/start/download repair actions, update diagnostics/asset-status to prefer runtime reachability, and expose revised visibility for needed helpers. |
| Provider refactor & model listing | src/openhuman/inference/provider/*, src/openhuman/inference/provider/ops.rs | Move provider APIs under inference::provider, add temperature suppression (glob patterns), add list_configured_models/ModelInfo, update compatible provider construction to accept suppression list. |
| OpenAI-compatible HTTP endpoints | src/openhuman/inference/http/*, src/core/jsonrpc.rs | Add /v1/chat/completions and /v1/models router and types, mount under core router, and add auth-tests. |
| Frontend tests & e2e | app/src/components/*test*.tsx, app/test/e2e/*, app/src/utils/__tests__/* | Update unit tests for UI guidance and preset application, skip UI bootstrap e2e, update tauri command tests to expect inference_* methods. |
| Backend tests & e2e | src/openhuman/inference/*_tests.rs, tests/inference_provider_e2e.rs, Cargo.toml, e2e/docker-compose.yml, scripts/ | Add Rust inference ops/schemas tests, Wiremock-based provider e2e tests, dev-dep wiremock, CI docker service and runner script for inference e2e. |
| Broad import/path updates | many src/* and app/* files | Update imports and type paths from providers/local_ai to inference::provider/inference::local, preserving runtime behavior while moving module boundaries. |
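
The "temperature suppression (glob patterns)" entry for the provider layer can be illustrated with a short sketch. It assumes a glob dialect that supports only `*`; the function names and the `o1*` pattern are hypothetical, and the matcher the PR actually uses may behave differently.

```rust
/// Minimal glob match supporting only `*` (any run of characters).
fn glob_match(pattern: &str, text: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        // No wildcard: require an exact match.
        return pattern == text;
    }
    let (first, last) = (parts[0], parts[parts.len() - 1]);
    if !text.starts_with(first) {
        return false;
    }
    let mut rest = &text[first.len()..];
    // Match each middle literal at its leftmost position, in order.
    for part in &parts[1..parts.len() - 1] {
        match rest.find(part) {
            Some(idx) => rest = &rest[idx + part.len()..],
            None => return false,
        }
    }
    // Whatever remains must end with the final literal.
    rest.ends_with(last)
}

/// Drops the temperature parameter for models matching any suppression pattern.
fn effective_temperature(
    model: &str,
    requested: Option<f32>,
    suppress: &[&str],
) -> Option<f32> {
    if suppress.iter().any(|pattern| glob_match(pattern, model)) {
        None
    } else {
        requested
    }
}
```

With a suppression list of `["o1*"]`, a request for a model named `o1-mini` would be sent without a temperature field, while `gemma3:1b-it-qat` would keep the requested value.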

Sequence Diagram(s)

sequenceDiagram
  participant App as Frontend
  participant RPC as Tauri / frontend RPC
  participant Core as Core inference RPC
  participant Local as Local inference service
  participant Ext as External runtime (Ollama / LM Studio)

  App->>RPC: inference_prompt
  RPC->>Core: openhuman.inference_prompt
  Core->>Local: inference_prompt(...)
  alt provider = ollama/lm_studio
    Local->>Ext: HTTP /v1/chat or Ollama API
    Ext-->>Local: response/stream
  else cloud provider
    Local->>Ext: provider endpoint (with auth)
    Ext-->>Local: response/errors
  end
  Local-->>Core: RpcOutcome
  Core-->>RPC: result
  RPC-->>App: render status/guidance
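
The provider branch in the diagram above can be sketched as a small dispatch function. This is an assumption-heavy illustration: the `Provider` enum and `chat_url` are hypothetical, the local addresses are the default ports of Ollama (11434) and LM Studio (1234), and the real routing reads its endpoints from configuration.

```rust
/// Hypothetical provider selection mirroring the `alt` branch in the diagram.
enum Provider {
    Ollama,
    LmStudio,
    Cloud { base_url: String },
}

/// Picks the chat endpoint for a provider. Local runtimes are assumed to be
/// already running on their default ports; nothing is started here.
fn chat_url(provider: &Provider) -> String {
    match provider {
        // Ollama's native chat API.
        Provider::Ollama => "http://127.0.0.1:11434/api/chat".to_string(),
        // LM Studio exposes an OpenAI-compatible API.
        Provider::LmStudio => "http://127.0.0.1:1234/v1/chat/completions".to_string(),
        // Cloud providers: assumed OpenAI-compatible path under the base URL.
        Provider::Cloud { base_url } => {
            format!("{}/v1/chat/completions", base_url.trim_end_matches('/'))
        }
    }
}
```

Keeping endpoint selection in one function means the frontend only ever names an `inference_*` method, and the choice of runtime stays inside core.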

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

"A rabbit taps the runtime’s rim,
No installs now — a simpler hymn.
Inference paths all routed through,
Ollama hums outside the crew.
Tests pass, docs note how to run —
We hop, we link, the work is done. 🐇"

@coderabbitai coderabbitai Bot added the "working" label (A PR that is being worked on by the team.) May 17, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/openhuman/local_ai/README.md (1)

3-3: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the module synopsis to match external-runtime ownership.

Line 3 still says local_ai “owns the bundled Ollama runtime” and “download + install management,” which contradicts the new external-runtime contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/local_ai/README.md` at line 3, Update the module synopsis in
the README for the local_ai module to remove claims that it "owns the bundled
Ollama runtime" and "download + install management" and instead state that
Ollama is an externally managed runtime that the user installs and starts
outside the app; keep references to what local_ai still owns (e.g., LM Studio
integration, whisper.cpp, Piper, sentiment scoring, vision-embedding routing,
GIF heuristic, and the per-session LocalAiService singleton) and clarify that
remote-provider HTTP transport (providers/) and agent tool loop (agent/) remain
out of scope.
src/openhuman/local_ai/service/ollama_admin.rs (1)

46-52: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stale contract docs for ensure_ollama_server_fresh.

The comment says this function forces a fresh install, but Line 52 now just delegates to ensure_ollama_server with no install/start behavior.

Suggested patch
-    /// Like `ensure_ollama_server`, but forces a fresh install of the Ollama binary
-    /// (ignoring cached/workspace binaries). Used as a retry after the first attempt fails.
+    /// Retry wrapper around `ensure_ollama_server`.
+    /// Kept for call-site compatibility; does not perform install/start side effects.
     pub(in crate::openhuman::local_ai::service) async fn ensure_ollama_server_fresh(
         &self,
         config: &Config,
     ) -> Result<(), String> {
         self.ensure_ollama_server(config).await
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/local_ai/service/ollama_admin.rs` around lines 46 - 52, The doc
comment for ensure_ollama_server_fresh is stale: the function currently just
delegates to ensure_ollama_server but the comment claims it forces a fresh
install; either update the comment to reflect delegation or implement the
fresh-install behavior. To fix, either (A) change the doc for
ensure_ollama_server_fresh to state that it currently calls ensure_ollama_server
and does not force reinstall, or (B) implement the intended behavior by making
ensure_ollama_server_fresh clear any cached/workspace binaries (or call an
existing install helper with a "force" flag) and then call the install/start
routines before returning; locate ensure_ollama_server_fresh and
ensure_ollama_server to add the cache-clear or force-install logic or to update
the comment accordingly.
🧹 Nitpick comments (5)
src/openhuman/local_ai/service/ollama_admin_tests.rs (1)

126-219: 🏗️ Heavy lift

Split this test module before adding more cases.

This file is already far above the Rust module size target, and these additions increase the maintenance/merge-conflict surface. Please move the new external-runtime diagnostics tests into a dedicated sibling test module.

As per coding guidelines: src/**/*.rs: Keep Rust source files to ≤ ~500 lines; split modules when growing larger.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/local_ai/service/ollama_admin_tests.rs` around lines 126 - 219,
The test module has grown too large; move the three new tests
(ensure_ollama_server_requires_external_runtime_when_unreachable,
ensure_ollama_server_reports_broken_external_runner_without_restart_attempt,
assets_status_marks_ollama_unavailable_when_runtime_is_down_even_if_binary_exists)
into a new sibling test module file to reduce module size and merge-conflict
surface. Create a separate test-only module file, import the same helpers (e.g.,
local_ai_test_guard, spawn_mock, Config, LocalAiService) and any env
setup/teardown logic, copy those three test functions verbatim into it, and
remove them from the original file; ensure the new file compiles by adding the
necessary use statements and keeping the #[tokio::test] annotations intact.
app/src/components/settings/panels/local-model/DeviceCapabilitySection.tsx (1)

45-51: ⚡ Quick win

Prune legacy install-state contract from this component.

These constants make install/retry paths permanently unreachable while the related props remain in the API. Remove the obsolete install props/branches now to avoid dead UI paths and no-op parent wiring.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/settings/panels/local-model/DeviceCapabilitySection.tsx`
around lines 45 - 51, Remove the dead install-state contract by deleting the
unused install-related constants and any conditional branches that reference
them: remove void onTriggerOllamaInstall, void isTriggeringInstall, void
installState, void installWarning, void installError and the const
installInProgress/installFailed flags, then eliminate any UI paths or prop
wiring that check those symbols (e.g., install retry buttons, install warnings,
and install error displays) and the prop types/signatures that still accept
install props so the component and its parent no longer expect or pass these
obsolete install values (refer to DeviceCapabilitySection and any handlers named
onTriggerOllamaInstall/isTriggeringInstall).
app/test/e2e/specs/local-model-runtime.spec.ts (1)

46-49: ⚡ Quick win

Add an explicit re-enable tracker for this skipped suite.

describe.skip here can silently become permanent. Please add a linked TODO/issue with concrete unblock criteria for re-enabling this flow once the harness is available.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/e2e/specs/local-model-runtime.spec.ts` around lines 46 - 49, Add an
explicit re-enable tracker comment immediately above the describe.skip('Local
model runtime flow') declaration: create or link to a TODO/GitHub issue ID and
include clear unblock criteria (e.g., "deterministic mockable local-runtime
harness available for WDIO"), an owner, and a target milestone or PR that will
re-enable the test; ensure the comment references describe.skip so reviewers can
find it easily and update the issue link/ID when the harness exists.
app/src/components/settings/panels/local-model/ModelStatusSection.tsx (1)

74-87: ⚡ Quick win

Remove the now-obsolete install/path/repair props from this component API.

The props are kept only to be voided, which preserves stale call-site requirements and obscures the new external-runtime behavior. Tightening the prop surface here will simplify callers and reduce drift.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/settings/panels/local-model/ModelStatusSection.tsx` around
lines 74 - 87, Remove the obsolete install/path/repair props from the
ModelStatusSection component API by deleting the unused prop declarations and
usages for isTriggeringDownload, bootstrapMessage, isInstalling, isInstallError,
showErrorDetail, ollamaPathInput, isSettingPath, runtimeEnabled,
onTriggerDownload, onSetOllamaPath, onClearOllamaPath, onSetOllamaPathInput,
onToggleErrorDetail, and onRepairAction; update the component's props
type/interface and its call sites to stop passing these props (use the new
external-runtime behavior instead), and remove any internal references or
conditional logic that only existed to support them so the component surface is
tightened and callers no longer need to supply voided props.
app/src/components/settings/panels/local-model/ModelStatusSection.test.tsx (1)

172-241: ⚡ Quick win

Extract a shared downloads fixture for the unavailable-Ollama tests.

The same large downloads object is duplicated across two tests, which makes schema updates error-prone.

♻️ Suggested cleanup
+const makeUnavailableOllamaDownloads = () => ({
+  state: 'idle',
+  warning: null,
+  progress: 0,
+  downloaded_bytes: null,
+  total_bytes: null,
+  speed_bps: null,
+  eta_seconds: null,
+  ollama_available: false,
+  chat: { id: 'gemma3:1b-it-qat', provider: 'ollama', state: 'missing', progress: null, downloaded_bytes: null, total_bytes: null, speed_bps: null, eta_seconds: null, warning: null, path: null },
+  vision: { id: '', provider: 'ollama', state: 'missing', progress: null, downloaded_bytes: null, total_bytes: null, speed_bps: null, eta_seconds: null, warning: null, path: null },
+  embedding: { id: 'bge-m3', provider: 'ollama', state: 'missing', progress: null, downloaded_bytes: null, total_bytes: null, speed_bps: null, eta_seconds: null, warning: null, path: null },
+  stt: { id: 'whisper', provider: 'whisper', state: 'missing', progress: null, downloaded_bytes: null, total_bytes: null, speed_bps: null, eta_seconds: null, warning: null, path: null },
+  tts: { id: 'piper', provider: 'piper', state: 'missing', progress: null, downloaded_bytes: null, total_bytes: null, speed_bps: null, eta_seconds: null, warning: null, path: null },
+});
...
-        downloads={{ ...large object... }}
+        downloads={makeUnavailableOllamaDownloads()}
...
-        downloads={{ ...large object... }}
+        downloads={makeUnavailableOllamaDownloads()}

Also applies to: 254-323

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/settings/panels/local-model/ModelStatusSection.test.tsx`
around lines 172 - 241, Duplicate large downloads object used in the
unavailable-Ollama tests should be extracted to a shared fixture to avoid drift;
create a single named constant (e.g., unavailableOllamaDownloads)
exported/defined near the top of ModelStatusSection.test.tsx and replace the two
inline downloads objects in the tests with that constant (references: the
downloads object literal in the tests and the test cases that assert unavailable
Ollama behavior). Ensure the fixture preserves all keys shown (state, chat,
vision, embedding, stt, tts, etc.) and update both places (the duplicate at
lines ~172-241 and the similar block at ~254-323) to reference the new constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/inference/ops_tests.rs`:
- Around line 4-12: The helper disabled_config() currently creates a TempDir
that gets dropped on return causing workspace_dir and config_path to point to a
deleted location; change disabled_config() to return both the Config and the
TempDir (e.g., (Config, TempDir)) so the TempDir lives for the duration of each
test, update all tests that call disabled_config() to use the (config, _tmp)
pattern, and ensure you set config.workspace_dir and config.config_path from
tmp.path() before returning the tuple (referencing disabled_config,
config.workspace_dir, and config.config_path).

In `@src/openhuman/inference/ops.rs`:
- Around line 11-85: Add debug/trace logging to each inference delegate wrapper
(inference_status, inference_summarize, inference_prompt,
inference_vision_prompt, inference_embed, inference_chat,
inference_should_react, inference_analyze_sentiment, inference_should_send_gif,
inference_tenor_search): log entry with key inputs, log exit with outcome
(success path) and any important return metadata, and log errors/failures
including the error string from the awaited call (local_ai::rpc::... or
local_ai::sentiment::..., local_ai::gif_decision::...). Use tracing::debug or
tracing::trace at entry/exit and tracing::error when the Result is Err, and
avoid logging sensitive data (e.g., redact full messages if needed).

In `@src/openhuman/local_ai/ops.rs`:
- Around line 351-354: Replace the capability-specific success message "local ai
voice asset download triggered" used in the RpcOutcome::single_log call with a
capability-neutral message (e.g., "local ai asset download triggered" or include
a dynamic type if an asset_type/variant is available) so it applies to
chat/vision/embedding assets as well; locate the RpcOutcome::single_log(output,
"...") invocation in the function that returns Ok(RpcOutcome::single_log(...))
and update the string literal to a neutral wording or interpolate the asset type
variable if present.

In `@src/openhuman/local_ai/service/bootstrap.rs`:
- Around line 304-309: In the Err arm handling
ensure_ollama_server(&effective_config).await, add an explicit error log before
mutating status and returning: log a development-oriented message (e.g.,
"bootstrap failure: cannot connect to runtime") that includes the error (err)
and relevant context from effective_config so traces show why
ensure_ollama_server failed; place this log call immediately before obtaining
self.status.lock(), then leave the existing status.state, status.error_category,
and status.warning assignment (which uses format_degraded_warning) and the
return unchanged.
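
The `disabled_config()` lifetime issue raised in the first inline comment can be illustrated with a standard-library-only sketch. `ScratchDir` is a hypothetical stand-in for `tempfile::TempDir`, `Config` is reduced to the two fields named in the comment, and `config.toml` is an assumed file name.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Stand-in for `tempfile::TempDir`: removes the directory when dropped.
struct ScratchDir(PathBuf);

impl ScratchDir {
    fn new(name: &str) -> std::io::Result<Self> {
        // The process id keeps concurrent test binaries from colliding.
        let path = std::env::temp_dir().join(format!("{name}-{}", std::process::id()));
        fs::create_dir_all(&path)?;
        Ok(Self(path))
    }

    fn path(&self) -> &Path {
        &self.0
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored on purpose.
        let _ = fs::remove_dir_all(&self.0);
    }
}

/// Hypothetical slice of the real `Config`.
struct Config {
    workspace_dir: PathBuf,
    config_path: PathBuf,
}

/// Returns the guard alongside the config so the directory outlives the call.
fn disabled_config() -> (Config, ScratchDir) {
    let tmp = ScratchDir::new("inference-ops-test").expect("create scratch dir");
    let config = Config {
        workspace_dir: tmp.path().to_path_buf(),
        config_path: tmp.path().join("config.toml"),
    };
    (config, tmp)
}
```

A test would bind the result as `let (config, _tmp) = disabled_config();`. Binding to `_tmp` rather than `_` matters, because a bare `_` pattern drops the guard immediately and deletes the directory before the test body runs.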

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: deaef292-c2ea-4738-8474-77952bd66846

📥 Commits

Reviewing files that changed from the base of the PR and between f0b5fdb and 12bed1a.

📒 Files selected for processing (39)
  • app/src/components/settings/panels/LocalModelDebugPanel.tsx
  • app/src/components/settings/panels/local-model/DeviceCapabilitySection.test.tsx
  • app/src/components/settings/panels/local-model/DeviceCapabilitySection.tsx
  • app/src/components/settings/panels/local-model/ModelDownloadSection.test.tsx
  • app/src/components/settings/panels/local-model/ModelDownloadSection.tsx
  • app/src/components/settings/panels/local-model/ModelStatusSection.test.tsx
  • app/src/components/settings/panels/local-model/ModelStatusSection.tsx
  • app/src/pages/onboarding/steps/LocalAIStep.tsx
  • app/src/services/api/__tests__/aiSettingsApi.test.ts
  • app/src/services/api/aiSettingsApi.ts
  • app/src/utils/__tests__/localAiBootstrap.test.ts
  • app/src/utils/localAiBootstrap.ts
  • app/src/utils/tauriCommands/localAi.ts
  • app/test/e2e/specs/local-model-runtime.spec.ts
  • src/core/all.rs
  • src/core/cli_tests.rs
  • src/core/jsonrpc_tests.rs
  • src/openhuman/app_state/ops.rs
  • src/openhuman/channels/providers/presentation.rs
  • src/openhuman/inference/mod.rs
  • src/openhuman/inference/ops.rs
  • src/openhuman/inference/ops_tests.rs
  • src/openhuman/inference/schemas.rs
  • src/openhuman/inference/schemas_tests.rs
  • src/openhuman/local_ai/README.md
  • src/openhuman/local_ai/mod.rs
  • src/openhuman/local_ai/ops.rs
  • src/openhuman/local_ai/schemas.rs
  • src/openhuman/local_ai/schemas_tests.rs
  • src/openhuman/local_ai/service/assets.rs
  • src/openhuman/local_ai/service/bootstrap.rs
  • src/openhuman/local_ai/service/ollama_admin.rs
  • src/openhuman/local_ai/service/ollama_admin_tests.rs
  • src/openhuman/local_ai/service/public_infer.rs
  • src/openhuman/local_ai/service/public_infer_tests.rs
  • src/openhuman/local_ai/types.rs
  • src/openhuman/mod.rs
  • src/openhuman/subconscious/executor.rs
  • tests/json_rpc_e2e.rs
💤 Files with no reviewable changes (4)
  • app/src/services/api/aiSettingsApi.ts
  • src/core/jsonrpc_tests.rs
  • src/openhuman/local_ai/schemas_tests.rs
  • src/openhuman/local_ai/schemas.rs

Comment thread src/openhuman/inference/ops_tests.rs Outdated
Comment thread src/openhuman/inference/ops.rs
Comment thread src/openhuman/local_ai/ops.rs
Comment thread src/openhuman/inference/local/service/bootstrap.rs
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 17, 2026
senamakel added 4 commits May 16, 2026 18:39
…/inference/

Move local_ai/, providers/, and voice inference files under inference/ so that
the entire inference surface lives in one place:

  inference/local/   ← Ollama / LM Studio / Whisper / Piper runtime
  inference/provider/ ← cloud + local provider trait, routing, reliability
  inference/voice/   ← STT and TTS inference implementations
  inference/http/    ← OpenAI-compatible /v1/chat/completions + /v1/models

src/openhuman/local_ai/ and src/openhuman/providers/ become thin re-export shims
for backward compatibility; src/openhuman/voice/ re-exports the moved modules.

Add OpenAI-compatible HTTP endpoint mounted at /v1 with three integration tests
(no-bearer → 401, bearer present → not 401/403, GET /v1/models → 401 without
bearer). Both cargo check manifests (core + Tauri shell) pass clean.
…cker wiring

- temperature_unsupported_models config field (glob, default ["o1*","o3*","o4*","gpt-5*"])
  added to Config; serde default and Config::default() both initialize it
- temperature.rs: glob_match helper + temperature_for_model; 13 unit tests cover
  prefix/suffix/contains globs and all default model patterns
- ApiChatRequest and NativeChatRequest temperature fields changed to Option<f64>
  with skip_serializing_if, so matched models get no temperature key on the wire
- OpenAiCompatibleProvider gains temperature_unsupported_models field and
  effective_temperature() helper; all six request-body construction sites updated
- Factory threads the config list through make_cloud_provider_by_slug and
  make_ollama_provider via new with_temperature_unsupported_models builder
- server.rs logs a warning when the caller supplies temperature for an unsupported model
- tests/inference_provider_e2e.rs: 14 tests via wiremock covering OpenAI-compat chat,
  temperature present/absent, Anthropic auth style, streaming, Ollama, /v1 HTTP auth
- scripts/test-rust-inference-e2e.sh + e2e/docker-compose.yml inference-e2e service
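
The temperature handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in `temperature.rs`: the function signatures, the `DEFAULT_UNSUPPORTED` constant name, and the `*`-only glob semantics are assumptions. Only the names `glob_match` and `temperature_for_model` and the default pattern list come from the commit message.

```rust
/// Default patterns taken from the commit message; the constant name is hypothetical.
const DEFAULT_UNSUPPORTED: [&str; 4] = ["o1*", "o3*", "o4*", "gpt-5*"];

/// Minimal glob matcher that understands only `*` (any run of characters).
/// Assumed semantics; the real helper may support more syntax.
fn glob_match(pattern: &str, text: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    // No wildcard at all: require an exact match.
    if parts.len() == 1 {
        return pattern == text;
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    // The text must begin with the literal before the first `*`.
    if !text.starts_with(first) {
        return false;
    }
    let mut rest = &text[first.len()..];
    // Literals between wildcards must appear in order.
    for mid in &parts[1..parts.len() - 1] {
        match rest.find(mid) {
            Some(idx) => rest = &rest[idx + mid.len()..],
            None => return false,
        }
    }
    // Whatever remains must end with the literal after the last `*`.
    rest.ends_with(last)
}

/// Returns `None` when the model matches an unsupported pattern, so that an
/// `Option<f64>` field with `skip_serializing_if` would omit the key entirely.
fn temperature_for_model(model: &str, requested: Option<f64>, unsupported: &[&str]) -> Option<f64> {
    if unsupported.iter().any(|pattern| glob_match(pattern, model)) {
        None
    } else {
        requested
    }
}
```

Under these assumptions, `temperature_for_model("o3-mini", Some(0.7), &DEFAULT_UNSUPPORTED)` yields `None`, while a model such as `"llama3"` keeps the requested value.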

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/openhuman/inference/voice/local_transcribe.rs (1)

18-23: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stale module path in voice-install docs.

Line 23 still references local_ai/paths.rs; after this refactor it should point to the inference paths module.

📝 Proposed doc fix
-//! `local_ai/paths.rs` picks it up automatically — no env var to set.
+//! `inference/paths.rs` picks it up automatically — no env var to set.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/voice/local_transcribe.rs` around lines 18 - 23,
Update the stale docs reference that currently points to local_ai/paths.rs:
change the module path to the new inference paths module so the comment about
install_whisper (crate::openhuman::inference::local::install_whisper) points to
the correct resolver (resolve_whisper_binary) in the inference paths module
(e.g., crate::openhuman::inference::paths::resolve_whisper_binary) so readers
can find the helper after the refactor.
src/openhuman/inference/voice/local_speech.rs (1)

29-42: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stale module reference in the Piper docs.

Line 42 still points to local_ai/paths.rs, but this flow now resolves via the inference paths module.

📝 Proposed doc fix
-//! helper in `local_ai/paths.rs` picks it up automatically.
+//! helper in `src/openhuman/inference/paths.rs` picks it up automatically.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/voice/local_speech.rs` around lines 29 - 42, Update
the stale doc link that currently points to local_ai/paths.rs so it references
the inference paths helper instead: replace the mention of local_ai/paths.rs
with the fully-qualified
crate::openhuman::inference::paths::resolve_piper_binary (and ensure surrounding
text still references the installer function
crate::openhuman::inference::local::install_piper and the same install flow), so
readers are directed to the correct resolve_piper_binary symbol in the inference
paths module.
🧹 Nitpick comments (2)
src/openhuman/inference/local/service/ollama_admin.rs (1)

42-45: ⚡ Quick win

Log unreachable external-runtime failures before returning.

This changed failure path returns a user-facing error but emits no log line with endpoint context. Add a warning log so production failures are grep-visible.

📈 Suggested fix
         let base_url = ollama_base_url();
+        log::warn!(
+            "[local_ai] external Ollama runtime unreachable at {}",
+            base_url
+        );
         Err(format!(
             "OpenHuman no longer starts or installs Ollama automatically. Start your inference runtime yourself and make sure it is reachable at {base_url}."
         ))
As per coding guidelines: "Ship heavy `debug`/`trace` level logs on new flows and critical checkpoints ... and error paths."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/local/service/ollama_admin.rs` around lines 42 - 45,
The current error path builds and returns an Err with the ollama_base_url but
does not emit any log; before returning the Err in the Ollama admin flow (the
spot that calls ollama_base_url() and returns Err(format!(...))), add a
warning-level log that includes the same base_url and a brief context string so
production failures are grep-visible (use the project's logging macro, e.g.
warn! or tracing::warn!, to log the endpoint and message) and then return the
Err as before.
src/openhuman/inference/schemas.rs (1)

213-389: 💤 Low value

Consider returning Result instead of panicking for unknown schema names.

The panic! at line 387 will crash the process if schemas() is ever called with an unrecognized function name. While current internal usage via all_controller_schemas() and all_registered_controllers() uses hardcoded valid strings, this function is pub and could be called by other code with dynamic input.

♻️ Suggested defensive alternative
-pub fn schemas(function: &str) -> ControllerSchema {
+pub fn schemas(function: &str) -> Option<ControllerSchema> {
     match function {
-        "status" => ControllerSchema {
+        "status" => Some(ControllerSchema {
             namespace: "inference",
             function: "status",
             // ... rest unchanged, wrap each arm in Some(...)
-        },
+        }),
         // ... other arms
-        other => panic!("unknown inference schema: {other}"),
+        _ => None,
     }
 }

Then update callers to .expect("known schema") or handle None.
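
A caller-side sketch of that suggestion, assuming a heavily simplified `ControllerSchema` (the real struct has more fields) and a hypothetical `"list_models"` arm alongside `"status"`:

```rust
/// Simplified stand-in for the real `ControllerSchema`; only two fields are modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct ControllerSchema {
    pub namespace: &'static str,
    pub function: &'static str,
}

/// Lookup that returns `None` for unknown names instead of panicking.
pub fn schemas(function: &str) -> Option<ControllerSchema> {
    match function {
        "status" => Some(ControllerSchema {
            namespace: "inference",
            function: "status",
        }),
        // Hypothetical second arm, included only to make the list non-trivial.
        "list_models" => Some(ControllerSchema {
            namespace: "inference",
            function: "list_models",
        }),
        _ => None,
    }
}

/// Internal caller with hardcoded names: `expect` documents the invariant
/// that every listed name has a schema.
pub fn all_controller_schemas() -> Vec<ControllerSchema> {
    ["status", "list_models"]
        .iter()
        .map(|name| schemas(name).expect("known schema"))
        .collect()
}
```

Callers that pass dynamic input can match on the `Option` and return an error, while the hardcoded internal list keeps its invariant visible through `expect`.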

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/schemas.rs` around lines 213 - 389, The schemas()
function currently panics on unknown names (panic!("unknown inference schema:
{other}")), which can crash the process; change its signature to return
Result<ControllerSchema, SchemaError> or Option<ControllerSchema> (e.g., pub fn
schemas(function: &str) -> Option<ControllerSchema>) and replace the final panic
arm with None or Err(...), then update callers such as all_controller_schemas()
and all_registered_controllers() to handle the None/Err (e.g., .expect("known
schema") where appropriate or propagate the error) so unknown schema lookups are
handled defensively instead of aborting.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/services/api/aiSettingsApi.ts`:
- Around line 272-275: Replace the direct callCoreRpc call in aiSettingsApi (the
call that invokes 'openhuman.inference_list_models' with params { provider_id:
providerId }) with the Tauri IPC wrapper used elsewhere: call
tauriCommands.invoke('core_rpc_relay', { method:
'openhuman.inference_list_models', params: { provider_id: providerId } }) and
unwrap the returned payload into the same shape ({ result: { models: ModelInfo[]
} }) so the rest of the function continues to operate unchanged; preserve
existing try/catch logic but rely on the invoke path so the call waits for the
in-process core relay instead of using callCoreRpc.

In `@src/openhuman/inference/local/service/ollama_admin.rs`:
- Around line 48-55: The docstring for ensure_ollama_server_fresh is stale: it
claims to force a fresh install/start but the function simply calls
ensure_ollama_server; either restore the fresh-install behavior or update the
docs. Fix by editing the function ensure_ollama_server_fresh: if you intend to
keep delegation, change the docstring to state that it currently delegates to
ensure_ollama_server and acts as an alias/retry wrapper; if you intend to
reinstate fresh behavior, implement the fresh-install logic (e.g., bypass
cache/workspace and force reinstall/start) within ensure_ollama_server_fresh
before calling or instead of calling ensure_ollama_server. Reference
ensure_ollama_server_fresh and ensure_ollama_server when making the change so
callers and reviewers see the intended contract.

In `@src/openhuman/inference/local/service/public_infer.rs`:
- Around line 11-16: The error message in external_ollama_request_error
currently embeds ollama_base_url() verbatim and may leak credentials; instead
parse the URL returned by ollama_base_url(), remove or replace any userinfo
(username/password) before formatting the message, and include the sanitized URL
string; locate external_ollama_request_error and use a URL-parsing utility
(e.g., url::Url::parse) to strip userinfo or substitute a placeholder like
"<redacted>" prior to including the value in the formatted error string.

In `@src/openhuman/inference/provider/ops.rs`:
- Around line 86-87: The match arm in ops.rs currently treats
AuthStyle::OpenhumanJwt like AuthStyle::None, sending unauthenticated requests;
update the match so AuthStyle::OpenhumanJwt follows the authenticated path used
for other token-based styles (i.e., call the same code that attaches the
JWT/bearer header or signs the request instead of returning plain request).
Locate the match on AuthStyle (the arm returning `request`) and change the
OpenhumanJwt branch to invoke the existing request-authentication helper (the
method that adds Authorization headers or signs requests) so model-list calls
include the provider JWT.
- Around line 96-102: The code currently returns a raw/truncated upstream body;
instead, call the module's secret-scrubbing helper before truncation and
returning the error. Replace use of body/truncated with a sanitized string
produced by e.g. crate::openhuman::util::scrub_secrets(&body) (or the module's
actual secret-scrubbing function), then pass that sanitized text into
crate::openhuman::util::truncate_with_ellipsis and include the
sanitized+truncated result in the Err message (keeping status.as_u16() as-is).

In `@tests/json_rpc_e2e.rs`:
- Around line 3626-3808: Add JSON-RPC e2e assertions for the new methods: call
openhuman.inference_list_models (via post_json_rpc) after configuring the
provider (e.g., in json_rpc_inference_namespace_lm_studio_prompt_and_status) and
assert_no_jsonrpc_error and that the returned list contains the expected
"local-model" entry; also call openhuman.inference_update_model_settings (via
post_json_rpc) with a payload that changes model settings and
assert_no_jsonrpc_error and that a subsequent inference_status or
inference_list_models reflects the updated setting. Locate the test flow around
post_json_rpc calls and helpers like assert_no_jsonrpc_error,
assert_jsonrpc_error, extract_string_outcome, and add these new calls and
assertions in the same tests so the RPC transport and parameter shapes for
inference_list_models and inference_update_model_settings are validated.
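
For the `AuthStyle::OpenhumanJwt` comment above, the intended routing might look roughly like the sketch below. Only `AuthStyle::None` and `AuthStyle::OpenhumanJwt` are named in the review; the `Bearer` variant, the `auth_header` helper, and its signature are assumptions made for illustration.

```rust
/// Simplified auth styles; `Bearer` is a hypothetical stand-in for the
/// other token-based styles mentioned in the review.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AuthStyle {
    None,
    Bearer,
    OpenhumanJwt,
}

/// Returns the header to attach to the outgoing request, if any.
/// `OpenhumanJwt` shares the bearer path instead of falling through to
/// the unauthenticated branch.
fn auth_header(style: AuthStyle, token: Option<&str>) -> Option<(String, String)> {
    match (style, token) {
        // Explicitly unauthenticated providers never get a header.
        (AuthStyle::None, _) => None,
        // Token-based styles attach `Authorization: Bearer <token>`.
        (AuthStyle::Bearer | AuthStyle::OpenhumanJwt, Some(t)) => {
            Some(("Authorization".to_string(), format!("Bearer {t}")))
        }
        // A token-based style without a token has nothing to attach.
        (_, None) => None,
    }
}
```

A real implementation would probably surface the missing-token case as an error rather than silently sending an unauthenticated request; the sketch keeps it as `None` for brevity.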


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 29c390c6-cb6a-425c-b040-cba30bb58fd6

📥 Commits

Reviewing files that changed from the base of the PR and between 9c89f06 and acaf8b5.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (197)
  • Cargo.toml
  • app/src/components/settings/panels/local-model/ModelDownloadSection.tsx
  • app/src/services/__tests__/coreRpcClient.test.ts
  • app/src/services/__tests__/rpcMethods.test.ts
  • app/src/services/api/__tests__/aiSettingsApi.test.ts
  • app/src/services/api/aiSettingsApi.ts
  • app/src/services/rpcMethods.ts
  • app/src/utils/__tests__/tauriCommands.test.ts
  • app/src/utils/tauriCommands/__tests__/config.test.ts
  • app/src/utils/tauriCommands/config.test.ts
  • app/src/utils/tauriCommands/config.ts
  • app/src/utils/tauriCommands/localAi.ts
  • e2e/docker-compose.yml
  • gitbooks/developing/e2e-testing.md
  • scripts/test-rust-inference-e2e.sh
  • src/api/rest.rs
  • src/core/all.rs
  • src/core/jsonrpc.rs
  • src/core/legacy_aliases.rs
  • src/core/observability.rs
  • src/main.rs
  • src/openhuman/agent/bus.rs
  • src/openhuman/agent/cost.rs
  • src/openhuman/agent/dispatcher.rs
  • src/openhuman/agent/dispatcher_tests.rs
  • src/openhuman/agent/harness/bughunt_tests.rs
  • src/openhuman/agent/harness/fork_context.rs
  • src/openhuman/agent/harness/harness_gap_tests.rs
  • src/openhuman/agent/harness/parse.rs
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/runtime.rs
  • src/openhuman/agent/harness/session/runtime_tests.rs
  • src/openhuman/agent/harness/session/tests.rs
  • src/openhuman/agent/harness/session/transcript.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/turn_tests.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/agent/harness/subagent_runner/extract_tool.rs
  • src/openhuman/agent/harness/subagent_runner/mod.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/agent/harness/subagent_runner/ops_tests.rs
  • src/openhuman/agent/harness/test_support.rs
  • src/openhuman/agent/harness/test_support_test.rs
  • src/openhuman/agent/harness/tests.rs
  • src/openhuman/agent/harness/tool_loop.rs
  • src/openhuman/agent/harness/tool_loop_tests.rs
  • src/openhuman/agent/multimodal.rs
  • src/openhuman/agent/schemas.rs
  • src/openhuman/agent/stop_hooks.rs
  • src/openhuman/agent/tests.rs
  • src/openhuman/agent/triage/evaluator.rs
  • src/openhuman/agent/triage/evaluator_tests.rs
  • src/openhuman/agent/triage/routing.rs
  • src/openhuman/app_state/ops.rs
  • src/openhuman/autocomplete/core/engine.rs
  • src/openhuman/channels/context.rs
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/routes.rs
  • src/openhuman/channels/routes_tests.rs
  • src/openhuman/channels/runtime/dispatch.rs
  • src/openhuman/channels/runtime/startup.rs
  • src/openhuman/channels/tests/common.rs
  • src/openhuman/channels/tests/context.rs
  • src/openhuman/channels/tests/discord_integration.rs
  • src/openhuman/channels/tests/memory.rs
  • src/openhuman/channels/tests/runtime_dispatch.rs
  • src/openhuman/channels/tests/runtime_tool_calls.rs
  • src/openhuman/channels/tests/telegram_integration.rs
  • src/openhuman/config/ops.rs
  • src/openhuman/config/schema/cloud_providers.rs
  • src/openhuman/config/schema/load.rs
  • src/openhuman/config/schema/types.rs
  • src/openhuman/config/schemas.rs
  • src/openhuman/context/guard.rs
  • src/openhuman/context/manager.rs
  • src/openhuman/context/manager_tests.rs
  • src/openhuman/context/microcompact.rs
  • src/openhuman/context/pipeline.rs
  • src/openhuman/context/summarizer.rs
  • src/openhuman/context/summarizer_tests.rs
  • src/openhuman/credentials/ops.rs
  • src/openhuman/cron/scheduler.rs
  • src/openhuman/doctor/core.rs
  • src/openhuman/embeddings/cloud.rs
  • src/openhuman/embeddings/factory.rs
  • src/openhuman/inference/device.rs
  • src/openhuman/inference/http/mod.rs
  • src/openhuman/inference/http/server.rs
  • src/openhuman/inference/http/tests.rs
  • src/openhuman/inference/http/types.rs
  • src/openhuman/inference/local/core.rs
  • src/openhuman/inference/local/install.rs
  • src/openhuman/inference/local/install_piper.rs
  • src/openhuman/inference/local/install_whisper.rs
  • src/openhuman/inference/local/lm_studio.rs
  • src/openhuman/inference/local/mod.rs
  • src/openhuman/inference/local/ollama.rs
  • src/openhuman/inference/local/ops.rs
  • src/openhuman/inference/local/ops_tests.rs
  • src/openhuman/inference/local/process_util.rs
  • src/openhuman/inference/local/provider.rs
  • src/openhuman/inference/local/schemas.rs
  • src/openhuman/inference/local/schemas_tests.rs
  • src/openhuman/inference/local/service/assets.rs
  • src/openhuman/inference/local/service/bootstrap.rs
  • src/openhuman/inference/local/service/lm_studio.rs
  • src/openhuman/inference/local/service/mod.rs
  • src/openhuman/inference/local/service/ollama_admin.rs
  • src/openhuman/inference/local/service/ollama_admin_tests.rs
  • src/openhuman/inference/local/service/public_infer.rs
  • src/openhuman/inference/local/service/public_infer_tests.rs
  • src/openhuman/inference/local/service/spawn_marker.rs
  • src/openhuman/inference/local/service/speech.rs
  • src/openhuman/inference/local/service/vision_embed.rs
  • src/openhuman/inference/local/service/whisper_engine.rs
  • src/openhuman/inference/local/voice_install_common.rs
  • src/openhuman/inference/mod.rs
  • src/openhuman/inference/model_ids.rs
  • src/openhuman/inference/ops.rs
  • src/openhuman/inference/ops_tests.rs
  • src/openhuman/inference/parse.rs
  • src/openhuman/inference/paths.rs
  • src/openhuman/inference/presets.rs
  • src/openhuman/inference/presets_tests.rs
  • src/openhuman/inference/provider/billing_error.rs
  • src/openhuman/inference/provider/compatible.rs
  • src/openhuman/inference/provider/compatible_dump.rs
  • src/openhuman/inference/provider/compatible_parse.rs
  • src/openhuman/inference/provider/compatible_stream.rs
  • src/openhuman/inference/provider/compatible_tests.rs
  • src/openhuman/inference/provider/compatible_types.rs
  • src/openhuman/inference/provider/factory.rs
  • src/openhuman/inference/provider/factory_test.rs
  • src/openhuman/inference/provider/mod.rs
  • src/openhuman/inference/provider/openhuman_backend.rs
  • src/openhuman/inference/provider/ops.rs
  • src/openhuman/inference/provider/reliable.rs
  • src/openhuman/inference/provider/reliable_tests.rs
  • src/openhuman/inference/provider/router.rs
  • src/openhuman/inference/provider/router_test.rs
  • src/openhuman/inference/provider/schemas.rs
  • src/openhuman/inference/provider/temperature.rs
  • src/openhuman/inference/provider/thread_context.rs
  • src/openhuman/inference/provider/traits.rs
  • src/openhuman/inference/provider/traits_tests.rs
  • src/openhuman/inference/schemas.rs
  • src/openhuman/inference/schemas_tests.rs
  • src/openhuman/inference/sentiment.rs
  • src/openhuman/inference/types.rs
  • src/openhuman/inference/voice/cloud_transcribe.rs
  • src/openhuman/inference/voice/hallucination.rs
  • src/openhuman/inference/voice/local_speech.rs
  • src/openhuman/inference/voice/local_transcribe.rs
  • src/openhuman/inference/voice/mod.rs
  • src/openhuman/inference/voice/postprocess.rs
  • src/openhuman/inference/voice/streaming.rs
  • src/openhuman/learning/linkedin_enrichment.rs
  • src/openhuman/learning/reflection.rs
  • src/openhuman/learning/reflection_tests.rs
  • src/openhuman/learning/transcript_ingest/extract.rs
  • src/openhuman/learning/transcript_ingest/tests.rs
  • src/openhuman/local_ai/README.md
  • src/openhuman/local_ai/mod.rs
  • src/openhuman/local_ai/schemas.rs
  • src/openhuman/mcp_server/tools.rs
  • src/openhuman/memory/store/factories.rs
  • src/openhuman/memory/tree/chat/cloud.rs
  • src/openhuman/memory/tree/chat/mod.rs
  • src/openhuman/memory/tree/score/embed/factory.rs
  • src/openhuman/migrations/mod_tests.rs
  • src/openhuman/migrations/phase_out_profile_md_tests.rs
  • src/openhuman/migrations/unify_ai_provider_settings.rs
  • src/openhuman/mod.rs
  • src/openhuman/providers/schemas.rs
  • src/openhuman/routing/factory.rs
  • src/openhuman/routing/mod.rs
  • src/openhuman/routing/provider.rs
  • src/openhuman/routing/provider_tests.rs
  • src/openhuman/screen_intelligence/processing_worker.rs
  • src/openhuman/subconscious/executor.rs
  • src/openhuman/threads/ops.rs
  • src/openhuman/tools/impl/agent/delegate.rs
  • src/openhuman/tools/impl/agent/spawn_parallel_agents_test.rs
  • src/openhuman/tools/impl/agent/spawn_worker_thread.rs
  • src/openhuman/tools/impl/agent/todo_write.rs
  • src/openhuman/tools/ops.rs
  • src/openhuman/tree_summarizer/engine.rs
  • src/openhuman/tree_summarizer/ops.rs
  • src/openhuman/voice/mod.rs
  • src/openhuman/voice/ops.rs
  • src/openhuman/voice/types.rs
  • tests/agent_builder_public.rs
  • tests/agent_harness_public.rs
  • tests/agent_multimodal_public.rs
  • tests/calendar_grounding_e2e.rs
  • tests/inference_provider_e2e.rs
  • tests/json_rpc_e2e.rs
💤 Files with no reviewable changes (4)
  • src/openhuman/providers/schemas.rs
  • src/openhuman/local_ai/mod.rs
  • src/openhuman/local_ai/README.md
  • src/openhuman/local_ai/schemas.rs
✅ Files skipped from review due to trivial changes (32)
  • src/openhuman/inference/http/mod.rs
  • scripts/test-rust-inference-e2e.sh
  • src/openhuman/embeddings/cloud.rs
  • src/openhuman/agent/harness/subagent_runner/mod.rs
  • app/src/utils/__tests__/tauriCommands.test.ts
  • src/openhuman/config/schema/cloud_providers.rs
  • src/openhuman/agent/harness/fork_context.rs
  • src/openhuman/autocomplete/core/engine.rs
  • src/openhuman/context/manager.rs
  • src/openhuman/inference/provider/thread_context.rs
  • src/openhuman/agent/multimodal.rs
  • gitbooks/developing/e2e-testing.md
  • src/openhuman/inference/local/service/mod.rs
  • src/openhuman/context/manager_tests.rs
  • src/openhuman/context/pipeline.rs
  • src/openhuman/learning/transcript_ingest/tests.rs
  • src/openhuman/agent/harness/test_support_test.rs
  • src/openhuman/context/microcompact.rs
  • src/openhuman/migrations/phase_out_profile_md_tests.rs
  • src/main.rs
  • src/openhuman/agent/harness/session/transcript.rs
  • src/openhuman/agent/harness/harness_gap_tests.rs
  • src/openhuman/routing/mod.rs
  • src/openhuman/migrations/unify_ai_provider_settings.rs
  • src/openhuman/agent/harness/parse.rs
  • src/openhuman/migrations/mod_tests.rs
  • tests/agent_harness_public.rs
  • src/openhuman/channels/tests/common.rs
  • src/openhuman/memory/tree/chat/cloud.rs
  • src/openhuman/tools/impl/agent/todo_write.rs
  • src/openhuman/inference/local/mod.rs
  • src/openhuman/tools/impl/agent/spawn_parallel_agents_test.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/openhuman/mod.rs
  • app/src/services/api/__tests__/aiSettingsApi.test.ts
  • src/openhuman/app_state/ops.rs
  • src/openhuman/inference/schemas_tests.rs

Comment thread app/src/services/api/aiSettingsApi.ts
Comment thread src/openhuman/inference/local/service/ollama_admin.rs Outdated
Comment thread src/openhuman/inference/local/service/public_infer.rs
Comment thread src/openhuman/inference/provider/ops.rs Outdated
Comment thread src/openhuman/inference/provider/ops.rs
Comment thread tests/json_rpc_e2e.rs
- ollama_admin: correct stale docstring on `ensure_ollama_server_fresh`
  to reflect external-runtime mode (no auto install/start anymore)
- public_infer: redact userinfo/query/fragment from ollama_base_url
  before embedding in error payloads; add unit tests
- inference::provider::ops: route AuthStyle::OpenhumanJwt through the
  Bearer header path instead of falling into the unauthenticated branch
- inference::provider::ops: sanitize upstream provider error bodies via
  sanitize_api_error before returning them in RPC errors
- tests/json_rpc_e2e: add coverage for openhuman.inference_list_models
  and openhuman.inference_update_model_settings over the RPC transport
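
The URL redaction mentioned in this commit could be approximated as below. This is a std-only sketch under stated assumptions: the function name `redact_url` is hypothetical, and the actual change may rely on a URL-parsing crate (the review suggested `url::Url::parse`) rather than string splitting.

```rust
/// Strips userinfo, query, and fragment from a URL-like string so it can be
/// embedded in an error message. Hypothetical helper; string-based on purpose
/// to stay dependency-free.
fn redact_url(raw: &str) -> String {
    // Drop the fragment, then the query.
    let no_fragment = raw.split('#').next().unwrap_or(raw);
    let no_query = no_fragment.split('?').next().unwrap_or(no_fragment);

    // Separate the scheme (if present) from the remainder.
    let (scheme, rest) = match no_query.split_once("://") {
        Some((s, r)) => (Some(s), r),
        None => (None, no_query),
    };

    // The authority runs up to the first '/'; the path is everything after.
    let (authority, path) = match rest.find('/') {
        Some(idx) => rest.split_at(idx),
        None => (rest, ""),
    };

    // Userinfo is everything up to the last '@' in the authority.
    let host = match authority.rfind('@') {
        Some(idx) => &authority[idx + 1..],
        None => authority,
    };

    match scheme {
        Some(s) => format!("{s}://{host}{path}"),
        None => format!("{host}{path}"),
    }
}
```

With this sketch, `redact_url("http://user:pass@localhost:11434/api?key=1#frag")` produces `http://localhost:11434/api`, and a URL without credentials passes through unchanged.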
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 17, 2026
senamakel added 2 commits May 17, 2026 01:42
…ernal-ollama-routing

# Conflicts:
#	src/openhuman/agent/harness/session/builder.rs
#	src/openhuman/tools/impl/agent/todo_write.rs
…guard

- inference_diagnostics: return value with empty logs (`RpcOutcome::new`)
  instead of `single_log` so callers see the diagnostics object directly,
  matching the legacy `local_ai_diagnostics` shape that the UI and
  json_rpc_e2e tests assert against (`provider`, `lm_studio_running`,
  `expected.chat_found`, etc.).
- rpcMethods drift guard: read schemas from `inference/provider/schemas.rs`
  rather than the deleted `providers/schemas.rs`.
…t config

`LocalAiService` is a process-wide `OnceCell` singleton whose cached
`provider` field is set at first init and never refreshed. After an
`inference_update_local_settings` call swaps providers (ollama → lm_studio)
the cached value goes stale, so `inference_status` returns the previous
provider even though the on-disk config has the new one.

CI ran this test as part of the full suite where an earlier ollama-config
test had already initialized the singleton, so the lm_studio assertion
failed; locally in isolation the singleton picked lm_studio first and the
test passed.

Overlay the current config's provider on the returned snapshot so
`local_ai_status` reflects on-disk config without disturbing other
service state (state machine, model ids, etc.).
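
The stale-singleton problem and the overlay fix can be illustrated with a small sketch. It uses `std::sync::OnceLock` as a stand-in for the `OnceCell` mentioned above, and the `StatusSnapshot` struct, `service`, and `status` functions are hypothetical simplifications rather than the real `LocalAiService` API.

```rust
use std::sync::OnceLock;

/// Simplified status snapshot; the real one carries more state.
#[derive(Debug, Clone, PartialEq)]
struct StatusSnapshot {
    provider: String,
    state: String,
}

/// Stand-in for the process-wide service whose fields are set once.
struct LocalAiService {
    cached: StatusSnapshot,
}

static SERVICE: OnceLock<LocalAiService> = OnceLock::new();

/// First caller wins: later calls ignore `initial_provider`, which is
/// exactly how the cached provider goes stale.
fn service(initial_provider: &str) -> &'static LocalAiService {
    SERVICE.get_or_init(|| LocalAiService {
        cached: StatusSnapshot {
            provider: initial_provider.to_string(),
            state: "ready".to_string(),
        },
    })
}

/// Overlay the current config's provider on a clone of the cached snapshot,
/// leaving the singleton's own state untouched.
fn status(config_provider: &str) -> StatusSnapshot {
    let mut snapshot = service(config_provider).cached.clone();
    snapshot.provider = config_provider.to_string();
    snapshot
}
```

In this sketch, calling `status("ollama")` and then `status("lm_studio")` returns the new provider on the second call even though the singleton still holds the value it was first initialized with.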

Labels

working A PR that is being worked on by the team.
