fix(providers): drop budget-exhausted 400s from Sentry (#3M, #12, #13) #1633

Merged
senamakel merged 7 commits into tinyhumansai:main from oxoxDev:fix/budget-noise
May 14, 2026

Conversation

@oxoxDev
Contributor

@oxoxDev oxoxDev commented May 13, 2026

Summary

  • Added a tight budget-exhaustion phrase matcher for provider/backend 400 bodies.
  • Demoted budget-exhausted 400 emit sites from Sentry error events to structured info logs while preserving Err propagation.
  • Added defense-in-depth before_send filtering in both the core binary and Tauri shell.
  • Added runtime Sentry smoke coverage proving budget events are dropped and non-budget 400s still surface.

Problem

  • Sentry is receiving high-volume deterministic user-state failures when the OpenHuman backend returns 400 Bad Request with budget/credits/balance copy.
  • These are not application bugs: the request should fail, and the UI already surfaces the credit/budget issue.
  • The noisy Sentry IDs are OPENHUMAN-TAURI-3M, OPENHUMAN-TAURI-12, and OPENHUMAN-TAURI-13 (~134 events total).

Solution

  • Added is_budget_exhausted_message with a deliberately small case-insensitive phrase list: insufficient budget, budget exceeded, add credits, insufficient balance.
  • Wrapped provider/backend non-2xx report sites so matching 400 responses log at info with kind=budget and skip report_error.
  • Added is_budget_event for before_send, keyed to failure=non_2xx, status=400, and budget copy in the event message/exception.
  • Wired the filter into both Sentry clients: core src/main.rs and shell app/src-tauri/src/lib.rs.
  • Preserved caller control flow: budget responses still return Err; only Sentry emission changes.
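A minimal sketch of the phrase matcher described in the list above, assuming a plain lowercase-and-substring approach. The phrase list is copied from this description; the constant name `BUDGET_PHRASES` is illustrative and not confirmed by the PR.

```rust
/// Phrases copied from the PR description. The constant name is an
/// assumption made for this sketch.
const BUDGET_PHRASES: [&str; 4] = [
    "insufficient budget",
    "budget exceeded",
    "add credits",
    "insufficient balance",
];

/// Returns true when `message` contains any budget phrase, ignoring case.
pub fn is_budget_exhausted_message(message: &str) -> bool {
    // Lowercase once so every phrase comparison is case-insensitive.
    let lowered = message.to_lowercase();
    BUDGET_PHRASES
        .iter()
        .any(|phrase| lowered.contains(*phrase))
}
```

Keeping the list short trades recall for precision: an unrecognized budget wording still reaches Sentry, which is the safer failure mode for a noise filter.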

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage >= 80% — N/A: focused Rust unit/smoke tests cover the changed paths; full diff-cover not run locally.
  • Coverage matrix updated — N/A: observability-only behavior change, no feature matrix row added/removed/renamed.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no matrix feature IDs; Sentry IDs listed under ## Related.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: no release-cut surface/manual smoke checklist change.
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Desktop/core observability only. No frontend UX change.
  • Budget-exhausted user-state errors stop paging Sentry at error level.
  • Non-budget 400s and transient aggregate outage signals still reach Sentry.

Related

  • Closes OPENHUMAN-TAURI-3M
  • Closes OPENHUMAN-TAURI-12
  • Closes OPENHUMAN-TAURI-13
  • Follow-up PR(s)/TODOs: N/A

AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/budget-noise
  • Commit SHA: 1b8b9a39039b33ab93c06cc4aa8ff37a87a0707e

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: no TS/frontend formatting changes in this PR.
  • pnpm typecheck — N/A: no TS/frontend type changes in this PR.
  • Focused tests:
    • cargo test --lib openhuman::providers
    • cargo test --test observability_smoke
  • Rust fmt/check (if changed):
    • cargo fmt --check
    • cargo check --manifest-path Cargo.toml
    • cargo clippy --workspace -- -D warnings (blocked; see below)
  • Tauri fmt/check (if changed):
    • cargo check --manifest-path app/src-tauri/Cargo.toml

Validation Blocked

  • command: cargo clippy --workspace -- -D warnings
  • error: Fails on existing workspace lint debt (78 errors across unrelated modules: unused imports, deprecated rand::thread_rng, derivable impls, type complexity, doc list indentation, etc.).
  • impact: Clippy is not green in this worktree; budget-specific checks and smoke tests pass.

Behavior Changes

  • Intended behavior change: budget-exhausted 400 responses no longer create Sentry error events.
  • User-visible effect: no UI/control-flow change; the existing error/toast behavior is preserved.

Parity Contract

  • Legacy behavior preserved: requests still return Err; retry/toast callers continue to observe failure.
  • Guard/fallback/dispatch parity checks: non-budget 400 smoke still captures one event; transient aggregate all_exhausted smoke still captures one event.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this PR
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • Bug Fixes
    • Suppress deterministic "budget exhausted" 400 responses across REST, provider APIs, streaming/chat flows, and agent runtime so they are logged as informational user-state events and no longer generate error alerts or get sent to error reporting.
  • Tests
    • Added observability tests to verify budget-exhaustion filtering and to ensure non-budget 400 responses continue to be reported.

Review Change Stack

@oxoxDev oxoxDev requested a review from a team May 13, 2026 11:41
@coderabbitai
Contributor

coderabbitai Bot commented May 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds deterministic detection of provider “budget exhausted” messages, classifies such responses as ExpectedErrorKind::BudgetExhausted, suppresses/logs them at provider and REST/agent layers, and drops matching Sentry events via before_send (with unit and smoke tests).

Changes

Budget-exhaustion detection and Sentry filtering

| Layer | File(s) | Summary |
| --- | --- | --- |
| Core budget-exhaustion message detection | src/openhuman/providers/billing_error.rs, src/openhuman/providers/mod.rs | New billing_error module exports is_budget_exhausted_message that identifies budget-related phrases via case-insensitive substring matching against a hardcoded allowlist, with unit tests. |
| Sentry event classification for budget conditions | src/core/observability.rs | Adds ExpectedErrorKind::BudgetExhausted, wires it into expected-error reporting to emit an info breadcrumb, adds is_budget_event classifier that matches failure="non_2xx" and status="400" and inspects message/exception values; includes test helper and unit tests. |
| Provider ops layer: budget detection and logging | src/openhuman/providers/ops.rs | Replaces previous suppression predicate with is_budget_exhausted_http_400 (400-only + delegated message check) and log_budget_exhausted_http_400; updates api_error to use them and rewrites tests accordingly. |
| Budget handling in OpenAI-compatible provider endpoints | src/openhuman/providers/compatible.rs | Updates several non-success paths (responses API, streaming/native chat, chat completions, native chat, spawned stream chat) to call log_budget_exhausted_http_400(...) for 400 budget-exhausted responses instead of the generic reporting path. |
| Upstream integration: REST, Agent, and Sentry before_send hooks | src/api/rest.rs, src/openhuman/agent/harness/session/runtime.rs, src/main.rs, app/src-tauri/src/lib.rs | Detects budget-exhausted 400s in BackendOAuthClient, updates agent error comments to reflect budget-noise handling, and adds early budget-event drops in main and Tauri Sentry before_send callbacks. |
| Observability smoke test: budget event filtering | tests/observability_smoke.rs | Extends the observability smoke tests with event_with_message helper and tests that budget-exhausted 400 events are filtered while non-budget 400s are kept. |
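The before_send classification summarized above can be sketched roughly as follows. This is not the PR's code: the `Event` struct is a hypothetical stand-in for the fields of `sentry::protocol::Event` that the filter reads, and `before_send` here only mirrors the hook's shape (returning `None` drops the event). The tag keys and values (`failure=non_2xx`, `status=400`) come from the PR description.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the parts of a Sentry event the filter inspects.
#[derive(Debug, Clone, Default)]
pub struct Event {
    pub tags: BTreeMap<String, String>,
    pub message: Option<String>,
    pub exception_values: Vec<String>,
}

fn is_budget_exhausted_message(message: &str) -> bool {
    let lowered = message.to_lowercase();
    ["insufficient budget", "budget exceeded", "add credits", "insufficient balance"]
        .iter()
        .any(|phrase| lowered.contains(*phrase))
}

fn tag<'a>(event: &'a Event, key: &str) -> Option<&'a str> {
    event.tags.get(key).map(String::as_str)
}

/// True only when both tags match and budget copy appears in the message
/// or in one of the exception values.
pub fn is_budget_event(event: &Event) -> bool {
    if tag(event, "failure") != Some("non_2xx") || tag(event, "status") != Some("400") {
        return false;
    }
    event
        .message
        .iter()
        .chain(event.exception_values.iter())
        .any(|text| is_budget_exhausted_message(text))
}

/// Shaped like a before_send hook: `None` drops the event, `Some` keeps it.
pub fn before_send(event: Event) -> Option<Event> {
    if is_budget_event(&event) {
        None
    } else {
        Some(event)
    }
}

/// Test helper in the spirit of the smoke test's event_with_message.
pub fn event_with_message(status: &str, message: &str) -> Event {
    let mut tags = BTreeMap::new();
    tags.insert("failure".to_string(), "non_2xx".to_string());
    tags.insert("status".to_string(), status.to_string());
    Event {
        tags,
        message: Some(message.to_string()),
        exception_values: Vec::new(),
    }
}
```

Requiring the tags before looking at text keeps the filter from dropping unrelated events that merely mention a budget.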

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • senamakel

Poem

🐰 I sniffed the logs where budgets sigh,
A tiny hop to hush the cry.
Quiet crumbs and gentle trace,
We drop the noise and keep the grace. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly describes the main change: filtering budget-exhausted HTTP 400 responses from Sentry reporting, which is the primary focus of this changeset across multiple files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/openhuman/agent/harness/session/runtime.rs (1)

505-527: ⚡ Quick win

Narrow run_single budget suppression to provider-originated errors.

Using phrase match alone here can suppress unrelated failures if their text contains budget phrases.

🎯 Suggested guard
                 let sanitized_message = Self::sanitize_event_error_message(&err);
                 let full_error = format!("{err:#}");
-                if providers::is_budget_exhausted_message(&full_error) {
+                let is_provider_budget = sanitized_message == "provider_error"
+                    && providers::is_budget_exhausted_message(&full_error);
+                if is_provider_budget {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/session/runtime.rs` around lines 505 - 527, The
budget-suppression currently uses providers::is_budget_exhausted_message on the
formatted message (full_error) and can wrongly suppress non-provider errors;
update the guard in run_single to first detect that the error originates from a
provider error (e.g., inspect err's source chain or downcast to the provider
error type) and only then call providers::is_budget_exhausted_message/full_error
to decide suppression; adjust the if condition so it checks both
"is_provider_error(&err)" (or an equivalent downcast/chain check) and
providers::is_budget_exhausted_message(&full_error) before skipping the
report_error call, leaving report_error(&err, "agent", "run_single", ...)
unchanged for all other errors.
src/core/observability.rs (1)

215-223: ⚡ Quick win

Scope budget-event filtering to known domains.

is_budget_event currently ignores domain, so any event with failure=non_2xx, status=400, and matching text can be dropped even outside provider/backend paths.

🔧 Proposed tightening
 pub fn is_budget_event(event: &sentry::protocol::Event<'_>) -> bool {
     let tags = &event.tags;
+    if !matches!(
+        tags.get("domain").map(String::as_str),
+        Some("llm_provider") | Some("backend_api")
+    ) {
+        return false;
+    }
     if tags.get("failure").map(String::as_str) != Some("non_2xx") {
         return false;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/core/observability.rs` around lines 215 - 223, is_budget_event currently
drops events solely by failure/status/message; restrict it to known domains by
checking the event.tags "domain" value (e.g., only accept "provider" or
"backend" or whichever canonical domain strings your system uses) before
returning true. Update the is_budget_event function to read
tags.get("domain").map(String::as_str) and ensure it matches the allowlist of
domains in addition to the existing failure=="non_2xx", status=="400", and
event_contains_budget_exhausted_message checks so only provider/backend-path
events are filtered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src-tauri/src/lib.rs`:
- Around line 1254-1257: The current budget-drop logging uses log::debug and
prints the full event.message (event.message.as_deref()), which may leak
sensitive text; change the log call in the budget-exhausted path to avoid
logging raw event.message and instead log only non-sensitive metadata such as
event.level, event.event_id or id, timestamp, and any non-PII tags/keys (e.g.
use event.level, event.id/event.event_id, event.timestamp, and a boolean
indicating whether message exists). Update the log::debug invocation in the same
block to construct a metadata-only message (no event.message content) and, if
needed, use event.message.is_some() to indicate presence without including the
text.

In `@src/openhuman/providers/compatible.rs`:
- Around line 738-745: When detecting a deterministic budget-400 in the
streaming path (after is_budget_exhausted_http_400 and
log_budget_exhausted_http_400), return a specific sentinel/typed error from
stream_native_chat (e.g., a ProviderError::BudgetExhausted or similar new
variant) instead of the current generic Err so the caller chat() can detect it
and short-circuit the non-streaming fallback; update stream_native_chat to
produce that typed error and ensure chat() checks for that sentinel and stops
retrying/does not invoke the non-streaming fallback, preventing duplicate logs
and duplicate provider calls.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: de8baa8e-d856-4b47-8c03-b1afc56d1f28

📥 Commits

Reviewing files that changed from the base of the PR and between 57c6d1b and 1b8b9a3.

📒 Files selected for processing (10)
  • app/src-tauri/src/lib.rs
  • src/api/rest.rs
  • src/core/observability.rs
  • src/main.rs
  • src/openhuman/agent/harness/session/runtime.rs
  • src/openhuman/providers/billing_error.rs
  • src/openhuman/providers/compatible.rs
  • src/openhuman/providers/mod.rs
  • src/openhuman/providers/ops.rs
  • tests/observability_smoke.rs

Comment thread app/src-tauri/src/lib.rs
Comment on lines +738 to +745
if super::is_budget_exhausted_http_400(status, &body) {
    super::log_budget_exhausted_http_400(
        "streaming_chat",
        self.name.as_str(),
        Some(native_request.model.as_str()),
        status,
    );
} else if super::should_report_provider_http_failure(status) {
Contributor

@coderabbitai coderabbitai Bot May 13, 2026

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Skip the non-streaming fallback for deterministic budget 400s.

This branch now knows the failure is a terminal user-state error, but stream_native_chat still returns a generic Err, so chat() falls through to the non-streaming retry on Line 1485. That adds a second doomed provider call and duplicates the budget log for every streaming budget miss. Please bubble up a typed/sentinel error here so the caller can short-circuit instead of retrying.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/providers/compatible.rs` around lines 738 - 745, When detecting
a deterministic budget-400 in the streaming path (after
is_budget_exhausted_http_400 and log_budget_exhausted_http_400), return a
specific sentinel/typed error from stream_native_chat (e.g., a
ProviderError::BudgetExhausted or similar new variant) instead of the current
generic Err so the caller chat() can detect it and short-circuit the
non-streaming fallback; update stream_native_chat to produce that typed error
and ensure chat() checks for that sentinel and stops retrying/does not invoke
the non-streaming fallback, preventing duplicate logs and duplicate provider
calls.
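One way the suggested short-circuit could look, as a sketch only. `ProviderError` and its variants are hypothetical (the thread notes that no such taxonomy exists yet), and `chat` takes the streaming and non-streaming attempts as closures purely to keep the example self-contained.

```rust
use std::fmt;

/// Hypothetical typed provider error; the variant names are assumptions.
#[derive(Debug, PartialEq)]
pub enum ProviderError {
    /// Deterministic user-state failure: retrying cannot succeed.
    BudgetExhausted,
    /// Any other failure, carrying a description.
    Other(String),
}

impl fmt::Display for ProviderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProviderError::BudgetExhausted => write!(f, "budget exhausted"),
            ProviderError::Other(detail) => write!(f, "provider error: {}", detail),
        }
    }
}

impl std::error::Error for ProviderError {}

/// Tries the streaming path first and falls back to the non-streaming path,
/// except when the streaming path reports a terminal budget error.
pub fn chat<S, F>(stream: S, fallback: F) -> Result<String, ProviderError>
where
    S: FnOnce() -> Result<String, ProviderError>,
    F: FnOnce() -> Result<String, ProviderError>,
{
    match stream() {
        Ok(reply) => Ok(reply),
        // Skip the fallback: a second provider call would fail the same way.
        Err(ProviderError::BudgetExhausted) => Err(ProviderError::BudgetExhausted),
        // Other streaming failures still get the non-streaming retry.
        Err(_) => fallback(),
    }
}
```

With a typed variant the caller matches on the error instead of re-parsing message text, so the budget case costs one provider call and one log line.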

Contributor Author

Acknowledged — this is a valid follow-up but out of scope for the noise-drop PR. The typed sentinel error would require a provider-error taxonomy refactor (ProviderError::BudgetExhausted variant + every chat()-consumer learning to short-circuit). I'd rather ship this Sentry-drop now (which already prevents the second event from reaching Sentry via before_send) and open a follow-up for the streaming-fallback short-circuit. Tracking as TODO.

Contributor

@oxoxDev, understood — the rationale is sound. The before_send defense-in-depth already prevents the second call's event from reaching Sentry, so the duplicate-log risk is bounded to structured tracing::info! only (not a Sentry noise problem). The full ProviderError::BudgetExhausted refactor is a meaningful taxonomy change that deserves its own PR.

Would you like me to open a follow-up GitHub issue to track the streaming-fallback short-circuit work (ProviderError::BudgetExhausted typed error + chat() short-circuit)?


🧠 Learnings used
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When using the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, don’t require individual execution-provider feature flags (e.g., `directml`, `coreml`, `cuda`) alongside `load-dynamic` to get EP registration code. The `ort` crate already compiles EP registration via `#[cfg(any(feature = "load-dynamic", feature = "<ep_name>"))]` guards, and adding per-EP feature flags can pull in static-linking dependencies that conflict with the dynamic loading approach. At runtime, EP availability is determined by what the dynamically loaded ONNX Runtime library (`onnxruntime.dll`/`.so`/`.dylib`) supports; ort docs indicate providers like `directml`/`xnnpack`/`coreml` are available in builds when the platform supports them.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When integrating the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, do NOT also require/enable individual provider EP Cargo features like `directml`, `coreml`, or `cuda`. In `ort` v2.x, EP registration for providers (e.g., DirectML, CoreML, CUDA, etc.) is already compiled in under source-level `#[cfg(any(feature = "load-dynamic", feature = "<provider>"))]` guards, such as in `ep/directml.rs`. Adding provider feature flags alongside `load-dynamic` can pull in static-linking dependencies that conflict with the dynamic-loading approach. Provider availability should be treated as runtime-determined by what the loaded `onnxruntime` library (`onnxruntime.dll`/`libonnxruntime.so`/`libonnxruntime.dylib`) actually supports.

Learnt from: oxoxDev
Repo: tinyhumansai/openhuman PR: 571
File: src/openhuman/local_ai/service/whisper_engine.rs:69-80
Timestamp: 2026-04-14T19:59:04.826Z
Learning: When reviewing Rust code in this repo that uses the upstream `whisper-rs` crate (v0.16.0), do not report `WhisperContextParameters::use_gpu(...)` or `WhisperContextParameters::flash_attn(...)` as missing/invalid APIs. These builder-style methods exist upstream and return `&mut Self`; they are not limited to `WhisperVadContextParams`.

Learnt from: graycyrus
Repo: tinyhumansai/openhuman PR: 1078
File: src/openhuman/agent/agents/welcome/prompt.rs:24-24
Timestamp: 2026-05-01T13:41:00.958Z
Learning: For Rust code under `src/openhuman/**/*.rs`, use `snake_case` for local variables (not `camelCase`). If a local variable name is written in `camelCase`, treat it as a style/lint issue because it will trigger Rust’s `non_snake_case` warning (and related clippy linting, if enabled). Avoid suggesting `camelCase` for any Rust local variable names in this repository.

Learnt from: senamakel
Repo: tinyhumansai/openhuman PR: 1173
File: tests/agent_memory_loader_public.rs:88-88
Timestamp: 2026-05-04T06:50:47.877Z
Learning: In this repository, the general camelCase naming guideline should not be applied to Rust source files. For all .rs files, Rust function (and related) names should use snake_case, and snake_case Rust function names should not be flagged—even for async test functions annotated with attributes like #[tokio::test]. This is consistent with Rust’s non_snake_case lint behavior.

oxoxDev added a commit to oxoxDev/openhuman that referenced this pull request May 13, 2026
tinyhumansai#1633 CR)

CodeRabbit major: the [sentry-budget-filter] log emitted event.message
verbatim, which can carry upstream provider error text including
tokens / pasted-through secrets. Per openhuman/CLAUDE.md "never log
secrets or full PII".

Switch to logging only the structured tag values (domain + status)
that gate the drop decision; the message text is no longer needed
for diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@oxoxDev
Contributor Author

oxoxDev commented May 13, 2026

CR Round 1 addressed:

  • [Major / Quick win] app/src-tauri/src/lib.rs:1257 budget-filter log redacted — now logs (domain, status) tag metadata only, no event.message text. (commit a44ab8f)
  • [Major / Heavy lift] src/openhuman/providers/compatible.rs:745 typed sentinel for streaming budget-exhausted — deferred as follow-up. Out of scope for this noise-drop PR; would require ProviderError::BudgetExhausted variant + caller short-circuit across every chat() consumer. The duplicate non-streaming-fallback call is already absorbed by before_send (both events drop), so net Sentry signal is unaffected. Tracking as TODO.

Replied inline on the compatible.rs thread with same reasoning.
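A rough sketch of what a metadata-only drop log could look like. The function name and the `has_message` field are assumptions for illustration (the latter follows the review suggestion to record presence without content); the PR itself states only that the domain and status tag values are logged.

```rust
use std::collections::BTreeMap;

/// Builds the log line for a dropped budget event from tag metadata only.
/// The message text is deliberately never formatted into the output.
pub fn budget_drop_log_line(tags: &BTreeMap<String, String>, message: Option<&str>) -> String {
    let domain = tags.get("domain").map(String::as_str).unwrap_or("<unknown>");
    let status = tags.get("status").map(String::as_str).unwrap_or("<unknown>");
    // Record only whether a message existed; its text may carry secrets.
    format!(
        "[sentry-budget-filter] dropped event domain={} status={} has_message={}",
        domain,
        status,
        message.is_some()
    )
}
```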

@graycyrus
Contributor

@oxoxDev this PR has merge conflicts with main — please rebase/resolve before review.

@senamakel
Member

Deferred question for backend team — ops.rs 402 suppression

In src/openhuman/providers/ops.rs, is_budget_exhausted_http_400 is deliberately scoped to status 400 only — a matching budget phrase on a 402 is not suppressed (there is now a dedicated smoke test budget_exhausted_402_is_not_suppressed in tests/observability_smoke.rs confirming this).

The rationale in the current implementation is that 402 Payment Required carries a different semantic: it could signal the backend adding a hard payment gate rather than a soft budget-exceeded user state, so it should still reach Sentry as a signal.

Question for backend team: Is this the correct policy?

  • If the backend will never return 402 for budget/credits reasons (only 400), the current behaviour is correct and no change is needed.
  • If the backend can return 402 with budget-exhausted bodies (payment wall, subscription lapsed, etc.) and those should also be suppressed from Sentry, then is_budget_exhausted_http_400 needs to be widened to also accept 402, and the smoke test + function name updated accordingly.

Please confirm which applies before this is merged. Tagging as unresolved.

@oxoxDev oxoxDev force-pushed the fix/budget-noise branch from a44ab8f to d5807e3 Compare May 13, 2026 20:06
@oxoxDev
Contributor Author

oxoxDev commented May 13, 2026

Defaulting to 400-only as the PR currently has it.

Rationale: every observed Sentry event in the cluster (OPENHUMAN-TAURI-3M ×72, -12 ×49, -13 ×13) returned HTTP 400, never 402. Suppressing 402 here would silence a future signal — if the backend introduces a hard-payment-wall on 402 (subscription lapsed, plan downgrade, etc.), we want that to surface as actionable rather than be lumped with soft budget-exceeded.

The smoke test budget_exhausted_402_is_not_suppressed deliberately locks this behaviour — if backend later adopts 402 for budget, the test will flag the policy decision and the fix is a one-line widening of is_budget_exhausted_http_400 (rename to is_budget_exhausted_http_status + accept 400 or 402). Easy to revert.

So: status quo is the right default. Happy to widen in a follow-up PR if @backend-team confirms 402 will be used for soft-budget. Marking your question as answered — leaving the test name as a tripwire for the future-widening case.
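The 400-only policy can be illustrated with a small sketch. The `u16` status parameter is an assumption made to keep the example dependency-free; the real function in ops.rs may take a typed status code.

```rust
fn is_budget_exhausted_message(message: &str) -> bool {
    let lowered = message.to_lowercase();
    ["insufficient budget", "budget exceeded", "add credits", "insufficient balance"]
        .iter()
        .any(|phrase| lowered.contains(*phrase))
}

/// Suppresses only HTTP 400 responses whose body carries budget copy.
/// 402 is intentionally excluded so a future hard payment wall still
/// surfaces in Sentry.
pub fn is_budget_exhausted_http_400(status: u16, body: &str) -> bool {
    status == 400 && is_budget_exhausted_message(body)
}
```

If the backend later adopts 402 for soft-budget failures, the widening described above would amount to changing the status comparison, for example to `matches!(status, 400 | 402)`, and renaming the function and its tripwire test.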

Conflict resolution also pushed — rebased onto current main.

@oxoxDev
Contributor Author

oxoxDev commented May 13, 2026

@graycyrus rebased onto current main (now on commit d5807e3, rebased from 2b64ea8 upstream tip).

Conflict resolution:

  • The only meaningful conflict was in src/openhuman/agent/harness/session/runtime.rs. main had landed #1601, fix(observability): skip Sentry for transport-level + transient-upstream errors (TAURI-32 / 5Z / 2G), which introduced the unified report_error_or_expected classifier and collided with this PR's parallel if providers::is_budget_exhausted_message ... branch. Resolved by folding budget into the unified classifier: added ExpectedErrorKind::BudgetExhausted to expected_error_kind in src/core/observability.rs and removed the duplicate if/else from runtime.rs. One pathway now handles transient HTTP, network-unreachable, budget, etc.
  • Three files auto-merged cleanly: src/api/rest.rs, src/openhuman/providers/compatible.rs, src/openhuman/providers/ops.rs.
  • No semantic change to the budget-filter path: is_budget_exhausted_http_400 is still 400-only and still backed by the smoke + unit tests.
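The idea of folding budget handling into one classifier can be sketched as below. Only `ExpectedErrorKind::BudgetExhausted`, `expected_error_kind`, and `report_error_or_expected` are named in this thread; the `NetworkUnreachable` variant, its match phrase, the `Outcome` type, and the string-based signatures are assumptions made so the example is self-contained. The real classifier may inspect error types rather than message text.

```rust
/// Kinds of failures treated as expected. `NetworkUnreachable` is a
/// hypothetical second variant added to show the single pathway.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExpectedErrorKind {
    NetworkUnreachable,
    BudgetExhausted,
}

/// Hypothetical return type standing in for the real side effects
/// (an info log versus a Sentry event).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    LoggedAsExpected(ExpectedErrorKind),
    ReportedToSentry,
}

/// Classifies an error message; `None` means the error is unexpected.
pub fn expected_error_kind(message: &str) -> Option<ExpectedErrorKind> {
    let lowered = message.to_lowercase();
    let budget_phrases = [
        "insufficient budget",
        "budget exceeded",
        "add credits",
        "insufficient balance",
    ];
    if budget_phrases.iter().any(|phrase| lowered.contains(*phrase)) {
        return Some(ExpectedErrorKind::BudgetExhausted);
    }
    // Assumed phrase, used only for illustration.
    if lowered.contains("network is unreachable") {
        return Some(ExpectedErrorKind::NetworkUnreachable);
    }
    None
}

/// Single entry point: expected kinds are logged, everything else is reported.
pub fn report_error_or_expected(message: &str) -> Outcome {
    match expected_error_kind(message) {
        Some(kind) => Outcome::LoggedAsExpected(kind),
        None => Outcome::ReportedToSentry,
    }
}
```

Routing every call site through one function means a new expected kind is added in one place instead of as another if/else at each emit site.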

Local checks green:

  • cargo check --manifest-path Cargo.toml
  • cargo check --manifest-path app/src-tauri/Cargo.toml
  • cargo fmt --check
  • cargo test --lib openhuman::providers — 168/168 ✓
  • cargo test --lib core::observability — 21/21 ✓ (includes the 4 budget-filter before_send cases)
  • cargo test --lib openhuman::providers::ops — 9/9 ✓ (includes budget_exhausted_402_is_not_suppressed)
  • cargo test --test observability_smoke — 6/6 ✓

@senamakel senamakel self-assigned this May 14, 2026
@senamakel
Member

PR Manager follow-up (2026-05-13)

Quality suite run against the conflict-resolved state (local pr/1633 branch, merged with main):

| Check | Result |
| --- | --- |
| cargo fmt (core + tauri) | PASS |
| cargo check (core) | PASS |
| cargo check (tauri) | PASS |
| cargo test --test observability_smoke | PASS (9/9) |
| cargo test --lib -- providers::ops | PASS (9/9) |
| pnpm typecheck | PASS |
| pnpm lint | PASS |
| pnpm format | PASS |

Deferred CR nits applied (from the original review):

  • is_budget_exhausted_http_400 in src/openhuman/providers/ops.rs: doc comment added explaining 402 is intentionally not suppressed
  • is_budget_event in src/core/observability.rs: doc comment added clarifying the domain tag is intentionally omitted (defense-in-depth over primary emit-site classifiers)
  • log_budget_exhausted_http_400: model.unwrap_or("") → model.unwrap_or("<unknown>")

Outstanding items for the author:

  1. PR body has 3 unchecked checklist items — the "PR Submission Checklist" CI job fails because [ ] items aren't counted as checked even with N/A annotations. Mark them [x] or remove.

  2. CodeRabbit CHANGES_REQUESTED — two nitpick suggestions remain from the original review (both are style/safety wins but not correctness bugs):

    • is_budget_event: consider adding domain tag guard (already addressed with a doc note explaining the intentional omission, but CR still marks CHANGES_REQUESTED)
    • run_single in runtime.rs: the phrase-match guard was replaced with report_error_or_expected in commit c4c8241e which routes through the proper classifier — this should satisfy the CR concern.
  3. PR is CONFLICTING (merge conflict with upstream main) — needs rebase on tinyhumansai/openhuman:main from the contributor fork.

  4. Inline comment on lib.rs (around line 1254): the budget-drop debug log prints event.message.as_deref() which may expose sensitive text. Consider logging only metadata (event level, event ID, tag keys) rather than the raw message content.

  5. Inline comment on compatible.rs (around line 738): streaming budget-400 path returns a generic Err — returning a typed ProviderError::BudgetExhausted would let chat() short-circuit the non-streaming fallback cleanly.

Items 4 and 5 are non-trivial architecture choices (new error variant) and deferred for the author.

senamakel added a commit to oxoxDev/openhuman that referenced this pull request May 14, 2026
…mansai#1633)

Conflict resolution (5 files):
- src/core/observability.rs: kept both BudgetExhausted and
  LocalAiCapabilityUnavailable variants; merged is_budget_event,
  is_transient_backend_http_status, and new transient-integrations helper
- src/main.rs: merged both before_send budget-drop hooks
- src/api/rest.rs: unified to is_transient_http_status_code (generalised)
- app/src-tauri/src/lib.rs: merged both budget-drop before_send hooks
- tests/observability_smoke.rs: combined all smoke tests (9 total)

CR nits applied:
- is_budget_exhausted_http_400: doc comment explaining 402 intentionally
  not suppressed
- is_budget_event: doc comment clarifying domain tag intentionally omitted
- log_budget_exhausted_http_400: model.unwrap_or("") → unwrap_or("<unknown>")

All checks pass: cargo fmt, cargo check (core + tauri), cargo test
(observability_smoke: 9/9, providers::ops: 9/9), pnpm typecheck,
pnpm lint, pnpm format.
@coderabbitai coderabbitai Bot added the working label May 14, 2026
oxoxDev and others added 7 commits May 14, 2026 12:59
tinyhumansai#1633 CR)

CodeRabbit major: the [sentry-budget-filter] log emitted event.message
verbatim, which can carry upstream provider error text including
tokens / pasted-through secrets. Per openhuman/CLAUDE.md "never log
secrets or full PII".

Switch to logging only the structured tag values (domain + status)
that gate the drop decision; the message text is no longer needed
for diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@oxoxDev oxoxDev force-pushed the fix/budget-noise branch from a4b7d61 to 99aad44 Compare May 14, 2026 07:40
@oxoxDev
Contributor Author

oxoxDev commented May 14, 2026

Addressed PR Manager follow-up (2026-05-14):

  1. Submission Checklist CI: Marked the three N/A items as checked with N/A: reasons (Diff coverage >= 80%, pnpm --filter openhuman-app format:check, pnpm typecheck). Re-running the check now.
  2. CR nits (Major): Both already addressed —
    • is_budget_event domain tag: doc comment in src/core/observability.rs explains intentional omission (defense-in-depth over the emit-site classifier).
    • run_single phrase-match guard: replaced with report_error_or_expected (linear-history equivalent of c6c5904d) which now routes through the unified ExpectedErrorKind classifier (incl. BudgetExhausted).
  3. Rebase: Already aligned with upstream/main tip 26727063; no new conflicts since the previous rebase folded the budget classifier into the unified ExpectedErrorKind taxonomy. Force-pushed fix/budget-noise to a clean linear history (tip 99aad444) — the previous merge-commit was replaced with the rebased form.
  4. Inline #4 (event.message redaction): Already addressed in 99aad444 (metadata-only logging in budget-filter before_send) for app/src-tauri/src/lib.rs. Verified that the core binary's budget-drop block in src/main.rs (lines 62-68) silently returns None without logging event.message, so no parity fix is needed.
  5. Inline #5 (typed ProviderError::BudgetExhausted): Deferring per your "non-trivial architecture choice" call; happy to open a follow-up issue. Current PR keeps the surface tight (filter + classifier only).
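
The metadata-only logging from item 4 can be sketched roughly as below. This is an illustration under assumptions, not the code in lib.rs: `EventLike` is a hypothetical stand-in for the Sentry event type, and `budget_drop_log_line` is an invented helper name. The [sentry-budget-filter] prefix and the domain/status tags are the ones named in the commit message.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a Sentry event. The real event type has many more
/// fields; the field names here are assumptions for illustration.
pub struct EventLike {
    pub message: Option<String>,
    pub tags: BTreeMap<String, String>,
}

/// Look up a tag value, falling back to a placeholder when it is absent.
fn tag<'a>(event: &'a EventLike, key: &str) -> &'a str {
    event.tags.get(key).map(String::as_str).unwrap_or("<unknown>")
}

/// Build the log line for a dropped budget event from tag values only.
/// `event.message` is deliberately never read, because it may carry upstream
/// provider text (tokens, pasted-through secrets).
pub fn budget_drop_log_line(event: &EventLike) -> String {
    format!(
        "[sentry-budget-filter] dropped event domain={} status={}",
        tag(event, "domain"),
        tag(event, "status"),
    )
}
```

The log line is built from the same tag values that gate the drop decision, so it stays useful for diagnostics without echoing provider text.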

@oxoxDev
Contributor Author

oxoxDev commented May 14, 2026

Re: CodeRabbit summary nitpicks —

1. runtime.rs:505-527 narrow budget suppression to provider-originated errors

The pre-rebase shape (direct providers::is_budget_exhausted_message(&full_error) check in run_single) was refactored away during the rebase onto upstream/main. run_single now delegates to crate::core::observability::report_error_or_expected which routes through the ExpectedErrorKind taxonomy — provider/budget classification is centralized in expected_error_kind(). Nit is moot for the current code shape; the explicit phrase-match in agent harness is gone.

2. observability.rs:215-223 scope is_budget_event to known domains

Intentional defense-in-depth — see the function's doc-comment at src/core/observability.rs:565-569:

Note: domain is intentionally not gated here as defense-in-depth over the emit-site classifier — any non_2xx/400 event that carries the budget-exhausted phrasing is dropped regardless of which domain produced it, so a future re-emitter under a different tag still gets filtered.

The failure=non_2xx + status=400 + budget-exhausted-message triple is already tight enough that false positives require synthesizing all three deliberately. Adding the domain gate would re-introduce the leak shape this PR was filed to close (events under domain=agent from run_single escaped the original domain=llm_provider filter; that's the cluster #3M / #12 / #13 we're tracking).
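
To make the triple concrete, here is a minimal sketch of the gate, assuming tags and message are passed in directly. The signature is hypothetical (the real is_budget_event inspects a Sentry event); the tag names, status value, and phrase list are the ones stated in this PR.

```rust
use std::collections::BTreeMap;

/// Phrase list from the PR summary, matched case-insensitively.
const BUDGET_PHRASES: [&str; 4] = [
    "insufficient budget",
    "budget exceeded",
    "add credits",
    "insufficient balance",
];

/// True when the message contains any of the budget-exhaustion phrases.
pub fn is_budget_exhausted_message(message: &str) -> bool {
    let lower = message.to_lowercase();
    BUDGET_PHRASES.iter().any(|phrase| lower.contains(*phrase))
}

/// Hypothetical signature: tags and message are taken as plain arguments so
/// the sketch stays standalone.
pub fn is_budget_event(tags: &BTreeMap<String, String>, message: &str) -> bool {
    let tag_is = |key: &str, want: &str| tags.get(key).map(String::as_str) == Some(want);
    // No check on the `domain` tag, by design: an event re-emitted under
    // domain=agent is dropped the same way as one under domain=llm_provider.
    tag_is("failure", "non_2xx")
        && tag_is("status", "400")
        && is_budget_exhausted_message(message)
}
```

An event tagged domain=agent with failure=non_2xx, status=400, and budget copy returns true here, whereas a gate that also required domain=llm_provider would let it through to Sentry.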

@senamakel senamakel merged commit bf9404a into tinyhumansai:main May 14, 2026
23 of 24 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
…nsai#12, tinyhumansai#13) (tinyhumansai#1633)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>