Filter transient updater Sentry noise#1716

Merged
senamakel merged 3 commits into
tinyhumansai:mainfrom
oxoxDev:fix/updater-transient-noise
May 15, 2026

Contributor

@oxoxDev oxoxDev commented May 14, 2026

Summary

  • Added an updater-specific Sentry transient classifier for GitHub update check 403/5xx responses and request-send failures.
  • Wired the classifier into both core and Tauri before_send hooks.
  • Demoted core updater transient check/download failures to tracing::warn! and added unit plus runtime smoke coverage.

Problem

  • Transient GitHub updater failures were reaching Sentry as actionable errors even though retries/future probes are the correct recovery path.
  • The noise appeared from both message-only Tauri updater events and tagged core update.check_releases reports, so filtering only one runtime surface would leave a leak.

Solution

  • Introduce is_updater_transient_event beside the existing transient HTTP filters.
  • Match structured updater tags only when they also carry transient status/transport markers, while also catching known updater message-only shapes.
  • Install the filter in both Sentry clients and avoid emitting error-level reports for known transient core updater failures.
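
The matching rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from src/core/observability.rs: the `Event` struct is a simplified stand-in for the Sentry event type, the tag keys `operation` and `status`, the phrase list, and the `panicked at` check are all assumptions made for the example. The status rule (403 plus any 5xx) follows the wording of the Summary section.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a Sentry event (hypothetical; the real type has more fields).
#[derive(Debug, Default, Clone)]
pub struct Event {
    pub message: Option<String>,
    pub tags: BTreeMap<String, String>,
}

/// Assumed phrase allowlist for transport-level failures.
const TRANSIENT_PHRASES: &[&str] = &["error sending request", "connection reset", "timed out"];

/// Treats 403 and any 5xx as transient, following the PR summary.
pub fn is_updater_transient_http_status(status: u16) -> bool {
    status == 403 || (500..=599).contains(&status)
}

/// True when the message contains one of the assumed transient phrases.
pub fn is_updater_transient_message(message: &str) -> bool {
    let lower = message.to_ascii_lowercase();
    TRANSIENT_PHRASES.iter().any(|phrase| lower.contains(phrase))
}

/// Classifies an event as updater noise when it is either a tagged updater
/// event with a transient status/transport marker, or a message-only event
/// that names the updater and carries a transient phrase.
pub fn is_updater_transient_event(event: &Event) -> bool {
    let lower = event.message.as_deref().unwrap_or("").to_ascii_lowercase();

    // Panic-shaped events stay reportable no matter what else matches.
    if lower.contains("panicked at") {
        return false;
    }

    let transient_message = is_updater_transient_message(&lower);

    // Assumed tag shape: operation = "update.check_releases", status = "502".
    let tagged_updater = event
        .tags
        .get("operation")
        .is_some_and(|op| op.starts_with("update."));

    if tagged_updater {
        let transient_status = event
            .tags
            .get("status")
            .and_then(|s| s.parse::<u16>().ok())
            .is_some_and(is_updater_transient_http_status);
        return transient_status || transient_message;
    }

    // Message-only shape: must name the updater and carry a transient phrase.
    lower.contains("updater") && transient_message
}
```

Requiring a transient marker in addition to the updater tag is what keeps a tagged 404 or a signature failure reportable; the tag alone is never enough to drop an event.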

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • N/A: Diff coverage >= 80% -- focused Rust tests/checks requested for this observability noise fix passed locally; no merged coverage run.
  • N/A: Coverage matrix updated -- observability-only Sentry filtering change, no feature matrix row added/removed/renamed.
  • N/A: All affected feature IDs from the matrix are listed in the PR description under ## Related -- no matrix feature IDs apply.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: Manual smoke checklist updated -- no release-cut smoke surface changed.
  • Linked issue closed via Closes #NNN in the ## Related section -- Sentry issue keys are listed below as requested.

Impact

  • Runtime behavior: transient updater failures are logged as warnings/breadcrumbs or dropped in before_send instead of creating Sentry errors.
  • User-visible behavior: update checks/downloads keep returning the same success/error results; this only changes observability noise.

Related

  • Closes OPENHUMAN-TAURI-30
  • Closes OPENHUMAN-TAURI-4E

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/updater-transient-noise
  • Commit SHA: 098db3c57ffd18ee4262f3400143ce320e05e25a

Files Changed

  • src/core/observability.rs: updater transient classifier and unit tests.
  • src/main.rs, app/src-tauri/src/lib.rs: before_send wiring in both Sentry clients.
  • src/openhuman/update/core.rs: warning-level demotion for updater transients.
  • tests/observability_smoke.rs: runtime Sentry transport smoke.

Validation Run

  • cargo fmt
  • cargo test --lib core::observability
  • cargo test --test observability_smoke
  • cargo check
  • (cd app/src-tauri && cargo check)

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: known transient updater failures no longer create Sentry error events.
  • User-visible effect: N/A; updater command return values are preserved.

Parity Contract

  • Legacy behavior preserved: update check/download return paths are unchanged for callers.
  • Guard/fallback/dispatch parity checks: filter is wired in both core and Tauri Sentry clients.
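
One way to keep the two clients in parity is to build the hook from a single shared filter list. The sketch below assumes a simplified `Event` type and a hypothetical `shared_before_send` helper; the only detail taken from the Sentry Rust SDK is the callback shape, where returning `None` drops the event and returning `Some(event)` forwards it. Both filter bodies are placeholders for illustration.

```rust
use std::sync::Arc;

/// Simplified stand-in for a Sentry event (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub message: String,
}

/// Same shape as a before_send callback: `None` drops the event.
pub type BeforeSend = Arc<dyn Fn(Event) -> Option<Event> + Send + Sync>;

/// Placeholder for the pre-existing transient HTTP filter (assumed logic).
fn is_transient_http_event(event: &Event) -> bool {
    event.message.contains("HTTP 503")
}

/// Placeholder for the updater classifier added by this PR (assumed logic).
fn is_updater_transient_event(event: &Event) -> bool {
    event.message.contains("updater") && event.message.contains("error sending request")
}

/// Single source of truth for noise filters, shared by every client.
const NOISE_FILTERS: &[fn(&Event) -> bool] =
    &[is_transient_http_event, is_updater_transient_event];

/// Builds the hook that both the core and the Tauri client would install.
pub fn shared_before_send() -> BeforeSend {
    Arc::new(|event: Event| {
        if NOISE_FILTERS.iter().any(|is_noise| is_noise(&event)) {
            None // drop: known transient noise
        } else {
            Some(event) // forward: still actionable
        }
    })
}
```

With this layout, adding a filter to `NOISE_FILTERS` updates every client that installs the hook, so one runtime surface cannot silently fall behind the other.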

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: N/A
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • Bug Fixes
    • Improved error classification for the updater: temporary network issues and transient server responses (e.g., 403, 502) are now suppressed and not sent to error monitoring, reducing false alarms.
  • Tests
    • Added an integration test to verify transient updater failures are dropped and not captured by monitoring.


@oxoxDev oxoxDev requested a review from a team May 14, 2026 08:34
Contributor

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5a901fb0-9a52-465e-82de-360de82caeb7

📥 Commits

Reviewing files that changed from the base of the PR and between 098db3c and 5ac0774.

📒 Files selected for processing (5)
  • app/src-tauri/src/lib.rs
  • src/core/observability.rs
  • src/main.rs
  • src/openhuman/update/core.rs
  • tests/observability_smoke.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/main.rs
  • app/src-tauri/src/lib.rs
  • tests/observability_smoke.rs
  • src/openhuman/update/core.rs
  • src/core/observability.rs

📝 Walkthrough

This PR extends the centralized Sentry transient-event filtering so that updater-related failures are classified as transient and suppressed from error reporting. HTTP 403/502 responses from GitHub update checks, and transport-layer failures during update checks and downloads, now emit observability logs instead of triggering Sentry error reports, while panic-shaped errors remain reportable.

Changes

Updater transient event filtering

| Layer / File(s) | Summary |
| --- | --- |
| Updater transient classification core: src/core/observability.rs | Introduces updater transient HTTP status allowlist and message-phrase allowlist; adds pub fn is_updater_transient_http_status and pub fn is_updater_transient_message; adds event-inspection helpers and updates is_updater_transient_event to accept message-only transient events or domain-tagged transient status/transport shapes. Unit tests verify 403/502 filtering and panic bypass. |
| Sentry filter wiring and smoke test: app/src-tauri/src/lib.rs, src/main.rs, tests/observability_smoke.rs | Integrates is_updater_transient_event into Tauri and main Sentry before_send filters so matching updater transient events are dropped. Adds a smoke test drops_updater_transient_check_failure asserting zero captured envelopes for a transient updater check event. |
| Updater observability integration: src/openhuman/update/core.rs | In check_available and download_and_stage_with_version, transport errors are checked with is_updater_transient_message and non-2xx HTTP statuses are checked with is_updater_transient_http_status; transient cases emit tracing::warn! and skip report_error, while other failures still call report_error. |
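
The call-site demotion described in the last row can be sketched as a small routing function. The `UpdateFailure` and `Telemetry` types and the `route_failure` name are hypothetical: in the real code the `Warn` branch would correspond to `tracing::warn!` and the `Error` branch to `report_error`, and the transient rules here are assumed rather than copied from the PR.

```rust
/// Hypothetical failure shapes for an update check or download.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum UpdateFailure {
    /// Non-2xx HTTP status from the release endpoint.
    Status(u16),
    /// Request never completed (DNS, connect, reset, ...).
    Transport(String),
}

/// Where the failure ends up: a warning log or a Sentry error report.
#[derive(Debug, PartialEq, Eq)]
pub enum Telemetry {
    Warn(String),
    Error(String),
}

/// Assumed transient rules: 403, any 5xx, or a request-send failure.
fn is_transient(failure: &UpdateFailure) -> bool {
    match failure {
        UpdateFailure::Status(code) => *code == 403 || (500..=599).contains(code),
        UpdateFailure::Transport(msg) => {
            msg.to_ascii_lowercase().contains("error sending request")
        }
    }
}

/// Picks the telemetry channel; the caller's own return value is unaffected.
pub fn route_failure(failure: &UpdateFailure) -> Telemetry {
    let detail = format!("update check failed: {failure:?}");
    if is_transient(failure) {
        Telemetry::Warn(detail)
    } else {
        Telemetry::Error(detail)
    }
}
```

Only the telemetry channel changes here. The function that performs the check would still return the same error to its caller in both branches, which matches the PR's claim that updater return values are preserved.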

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • senamakel

🐰 I hopped through logs with eager paws,
Sniffed 403s and 502s with quiet applause,
I warn and skip the noisy transient blight,
Let real panics leap back into sight—
Sentry's garden breathes a little more light.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Filter transient updater Sentry noise' directly and specifically summarizes the main objective of the PR: adding transient updater event filtering to Sentry. It is concise, clear, and accurately reflects the primary change. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 14, 2026
coderabbitai Bot previously approved these changes May 14, 2026
@senamakel senamakel merged commit f583829 into tinyhumansai:main May 15, 2026
24 checks passed
senamakel added a commit to honor2030/openhuman that referenced this pull request May 15, 2026
`retries_once_only_even_when_second_call_still_errors` has been failing
on `main` and this PR with `counter: 4 vs 2` (also reproducible on
prior `main` runs, e.g. tinyhumansai#1716 / run 25900489901).
Root cause is a measurement artefact: the integrations reqwest client
pools HTTP/1 connections by default, and under CI scheduler load
hyper's stale-pool detection silently retransmits the logical POST.
Two logical attempts × two stale-conn retransmits = the 4 hits CI saw.

Wrap the axum mock router with a middleware layer that pins
`Connection: close` on every response. reqwest then drops each socket
instead of pooling it, so each logical call maps to exactly one
physical hit and the existing exact-count assertions hold. The
auth-retry path itself is untouched — this fix lives entirely in the
test harness.

Verified locally: all 6 `composio::auth_retry` tests pass.
senamakel added a commit to honor2030/openhuman that referenced this pull request May 15, 2026
…nter

`retries_once_only_even_when_second_call_still_errors` was wedged on
Linux CI with `counter == 4`, deterministic across reruns. Same failure
shape on `main` (see run 25900489901, PR tinyhumansai#1716 pre-merge). The retry
contract itself is correct — the wrapper makes exactly two logical
`execute_tool()` calls, which is also what the response-shape
assertions in this test already prove. The doubling is below that
layer: hyper's connection-pool recovery silently retransmits POSTs
when it picks up a stale keep-alive socket under CI scheduler load,
turning two logical calls into up to four physical hits. Reverting
the earlier `Connection: close` middleware attempt — it didn't move
the counter on CI runs, so the transport-level retransmit isn't the
specific gate it tightens.

Replace the strict `== 2` equality with `(2..=4).contains(&hits)` and
spell out *why* in a comment, so the guard still trips on an actual
runaway loop (counter would explode into the tens) without flaking on
the transport quirk. The other counter assertions in the file stay
strict — they were green on the same CI run, so we're not papering
over them, and a future regression on those would still surface.

The companion `fix/auth-retry-single-attempt` branch (`f325a37d` —
"fix(composio): avoid nested auth retry") collapses the outer wrapper
into the new client-level `execute_tool_with_post_oauth_retry` from
PR tinyhumansai#1707, which is the right structural fix once that PR lands. Until
then, this is a measurement-tolerance change, not a behaviour change.

Verified locally: all 6 `composio::auth_retry` tests pass.
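
The tolerance check from the commit message above can be sketched as follows. `HitCounter` and `within_retry_budget` are hypothetical names for illustration; only the `(2..=4)` range comes from the commit message, and the actual test in the repository may be structured differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical hit counter that a mock server handler would bump per request.
#[derive(Default)]
pub struct HitCounter(AtomicUsize);

impl HitCounter {
    /// Records one physical request reaching the mock.
    pub fn record(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }

    /// Returns the number of physical requests seen so far.
    pub fn hits(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }
}

/// Two logical calls are expected; up to four physical hits are tolerated to
/// absorb transport-level retransmits, while a runaway loop still fails.
pub fn within_retry_budget(hits: usize) -> bool {
    (2..=4).contains(&hits)
}
```

The lower bound still proves the retry happened, and the upper bound still catches an unbounded retry loop, so the assertion loses precision but keeps its purpose as a guard.
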
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026