refactor(owhisper-client): extract shared utilities for adapters #2130

yujonglee · 2025-12-05T07:04:40Z

Summary

This PR reduces code duplication in the STT adapter implementations by extracting common patterns into shared utility modules:

adapter/audio.rs: Shared audio decoding utilities (decode_audio_to_linear16, decode_audio_to_bytes, mix_to_mono) that were previously duplicated across deepgram, argmax, and assemblyai batch adapters
adapter/http.rs: Shared HTTP error handling (ensure_success, parse_json_response, parse_provider_json) to standardize response validation
parsing.rs: Added TranscriptResponseBuilder and build_transcript_response helper for constructing StreamResponse objects
test_utils.rs: Added define_realtime_e2e_tests! macro to reduce boilerplate in E2E tests

The deepgram, argmax, and assemblyai batch adapters have been refactored to use these shared utilities.

Updates since last revision

Fixed audio resampling: The original duplicated code passed the source sample_rate to resample_audio(), which was a no-op. Now it properly resamples to 16kHz (TARGET_SAMPLE_RATE = 16000), which is the standard rate expected by STT services.
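To illustrate why passing the source rate as the target is a no-op, here is a minimal sketch. It assumes a hypothetical `resample_linear` helper using naive linear interpolation. This is not the `resample_audio` function from `crates/audio-utils`, whose signature and algorithm are not shown on this page, and a production resampler would also low-pass filter before downsampling.

```rust
/// Hypothetical naive resampler (linear interpolation, no anti-alias filter).
/// It only stands in for the real `resample_audio`, which is not shown here.
fn resample_linear(samples: &[f32], from_rate: u32, to_rate: u32) -> Vec<f32> {
    if samples.is_empty() || from_rate == to_rate {
        // Same rate in and out: the audio is returned untouched.
        return samples.to_vec();
    }
    let out_len = (samples.len() as u64 * to_rate as u64 / from_rate as u64) as usize;
    let step = from_rate as f64 / to_rate as f64;
    let last = samples.len() - 1;
    (0..out_len)
        .map(|i| {
            // Position of output sample `i` on the input time axis.
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            let a = samples[idx.min(last)];
            let b = samples[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}

const TARGET_SAMPLE_RATE: u32 = 16_000;

fn main() {
    let source_rate: u32 = 48_000;
    let one_second: Vec<f32> = vec![0.0; source_rate as usize];

    // Old behaviour: target == source, so nothing changes.
    let unchanged = resample_linear(&one_second, source_rate, source_rate);
    assert_eq!(unchanged.len(), 48_000);

    // Fixed behaviour: one second of 48 kHz audio becomes 16,000 samples.
    let resampled = resample_linear(&one_second, source_rate, TARGET_SAMPLE_RATE);
    assert_eq!(resampled.len(), 16_000);
}
```

The practical consequence is that any sample rate reported alongside the decoded bytes now has to be the target rate, not the file's original rate.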

Review & Testing Checklist for Human

Verify 16kHz resampling is correct: The shared decode_audio_to_linear16 now resamples all audio to 16kHz. The original code was NOT resampling (passing source rate was a no-op). Confirm this behavioral change is intended and doesn't break STT providers that expect different sample rates.
Run E2E tests with real STT providers: Use infisical run --env=dev --projectId=87dad7b5-72a6-4791-9228-b3b86b169db1 --path="/stt" -- cargo test -- --ignored to verify that deepgram, argmax, and assemblyai batch transcription still work with the 16kHz resampling
Verify stereo-to-mono mixing is unchanged: The mix_to_mono function was extracted from duplicated code - confirm the mixing logic produces identical results
Decide on unused utilities: TranscriptResponseBuilder, build_transcript_response, parse_json_response, and parse_provider_json are added but not yet used (compiler warnings confirm). Decide if these should be removed or kept for follow-up work

Notes

The new utilities in parsing.rs and some in http.rs are not yet adopted by adapters - they were added as infrastructure for future refactoring
JSON parse errors in http.rs are mapped to Error::AudioProcessing as a workaround to avoid adding a new error variant - this is semantically imprecise
The E2E test macro was added but existing tests weren't migrated to use it yet

Link to Devin run: https://app.devin.ai/sessions/127bbb6142c340ffba9fedd68f22ed9c
Requested by: yujonglee (@yujonglee)

- Add shared audio decoding utilities in adapter/audio.rs
- Add shared HTTP error handling utilities in adapter/http.rs
- Add TranscriptResponseBuilder and build_transcript_response helper in parsing.rs
- Add define_realtime_e2e_tests! macro in test_utils.rs
- Refactor deepgram/batch.rs to use shared audio and HTTP utilities
- Refactor argmax/batch.rs to use shared audio and HTTP utilities
- Refactor assemblyai/batch.rs to use shared audio and HTTP utilities

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>

devin-ai-integration · 2025-12-05T07:04:43Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR that start with 'DevinAI' or '@devin'.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

netlify · 2025-12-05T07:04:46Z

✅ Deploy Preview for hyprnote ready!

Name	Link
🔨 Latest commit	`23ccca4`
🔍 Latest deploy log	https://app.netlify.com/projects/hyprnote/deploys/69328e460628160008331159
😎 Deploy Preview	https://deploy-preview-2130--hyprnote.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

coderabbitai · 2025-12-05T07:05:05Z

Walkthrough

Shared HTTP response handling and audio decoding were extracted into new adapter/http.rs and adapter/audio.rs modules. Adapter batch implementations (argmax, assemblyai, deepgram) were updated to use these helpers. A transcript response builder and a realtime test-generation macro were added; adapter module now exposes audio and http.

Changes

| Cohort / File(s) | Summary |
|---|---|
| HTTP helpers (new) `src/adapter/http.rs` | Added `ensure_success(Response) -> Result<Response, Error>`, `parse_json_response<T>(Response, provider) -> Result<T, Error>`, and `parse_provider_json<T>(raw, provider) -> Option<T>` with tests and centralized status/body error handling. |
| Audio decoding (new) `src/adapter/audio.rs` | Added async `decode_audio_to_linear16(PathBuf) -> Result<(Bytes, u32), Error>` and `decode_audio_to_bytes(PathBuf) -> Result<Bytes, Error>`, plus `mix_to_mono` and tests; uses spawn_blocking, resampling to 16k, and i16 encoding. |
| Adapter batch refactors `src/adapter/argmax/batch.rs`, `src/adapter/assemblyai/batch.rs`, `src/adapter/deepgram/batch.rs` | Replaced per-file HTTP status checks with `ensure_success`; removed local audio decode implementations in favor of importing `decode_audio_to_*`; JSON deserialization moved after the success check. |
| Adapter module surface `src/adapter/mod.rs` | Exposed new public modules: `audio` and `http`. |
| Transcript parsing API `src/adapter/parsing.rs` | Added `build_transcript_response(...) -> StreamResponse` and `TranscriptResponseBuilder` fluent API; updated imports to include `Alternatives, Channel, Metadata, StreamResponse, Word`. |
| Test utilities `src/test_utils.rs` | Added `#[macro_export] define_realtime_e2e_tests!` macro (two variants) to generate tokio-based realtime E2E test templates. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Review audio resampling/mixing/encoding and error mapping in src/adapter/audio.rs.
Verify that ensure_success and parse_json_response handle edge cases (non-UTF8 bodies, large bodies) and are used consistently across batch adapters.
Confirm adapter refactors preserved prior control flow and error types in argmax, assemblyai, and deepgram batch modules.
Check TranscriptResponseBuilder output matches existing contract and macro syntax in test_utils.rs.

Possibly related PRs

Add AssemblyAI adapter for streaming and batch transcription #2073 — touches src/adapter/assemblyai/batch.rs and overlaps response handling changes applied here.
Explicit sample_rate in owhisper client #1651 — modifies audio decoding and sample-rate handling; relates to extracted decode_audio_to_linear16.
feat(owhisper-client): add response parsing utilities #2111 — updates adapter/parsing.rs utilities and builder-like helpers similar to the new transcript builder.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.58% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'refactor(owhisper-client): extract shared utilities for adapters' directly and concisely describes the main objective of the changeset: extracting duplicate code into shared utility modules across STT adapters. |
| Description check | ✅ Passed | The description is comprehensive and relates directly to the changeset, detailing the new utility modules (audio.rs, http.rs, parsing.rs, test_utils.rs), refactored adapters, and behavioral changes. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad5a95d and 23ccca4.

📒 Files selected for processing (1)

owhisper/owhisper-client/src/adapter/audio.rs (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

owhisper/owhisper-client/src/adapter/audio.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: Redirect rules - hyprnote
GitHub Check: fmt
GitHub Check: Header rules - hyprnote
GitHub Check: Pages changed - hyprnote
GitHub Check: Devin

netlify · 2025-12-05T07:05:15Z

✅ Deploy Preview for hyprnote-storybook ready!

Name	Link
🔨 Latest commit	`23ccca4`
🔍 Latest deploy log	https://app.netlify.com/projects/hyprnote-storybook/deploys/69328e469164fd0008da37c3
😎 Deploy Preview	https://deploy-preview-2130--hyprnote-storybook.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

owhisper/owhisper-client/src/test_utils.rs (1)

11-78: Verify that the macro is intended for single invocation per test module.

Both arms of this macro generate test functions with identical names (test_build_single and test_build_dual). If the macro is invoked multiple times within the same module, it will cause duplicate definition errors. Please confirm this is the intended design—that each test module should invoke the macro only once for a specific adapter.

The macro implementation is correct and follows proper hygiene practices with $crate:: paths. The two forms appropriately handle default vs. custom parameters.

Optional: Consider reducing duplication between the two macro arms.

The two forms are nearly identical except for the params line (lines 26 vs 59, and 39 vs 72). Consider whether a single form with an optional params parameter might simplify maintenance, though the current explicit design may be clearer for users.
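One way to remove that duplication is to have the short arm forward to the long arm with default parameters. The sketch below is a simplified illustration, not the PR's macro: `define_adapter_fixture!`, `ListenParams`, and the generated functions are hypothetical names. It also wraps the generated items in a caller-named module, which would sidestep the duplicate-name issue raised above if the macro were invoked more than once in one module.

```rust
/// Hypothetical parameter type, for illustration only.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ListenParams {
    pub languages: Vec<String>,
}

macro_rules! define_adapter_fixture {
    // Short form: forward to the long form with default params.
    ($mod_name:ident, $provider:expr) => {
        define_adapter_fixture!($mod_name, $provider, params: ListenParams::default());
    };
    // Long form: the only arm that actually generates items.
    ($mod_name:ident, $provider:expr, params: $params:expr) => {
        pub mod $mod_name {
            use super::*;

            pub fn provider() -> &'static str {
                $provider
            }

            pub fn params() -> ListenParams {
                $params
            }
        }
    };
}

// Each invocation gets its own module, so generated names cannot collide.
define_adapter_fixture!(deepgram, "deepgram");
define_adapter_fixture!(
    assemblyai,
    "assemblyai",
    params: ListenParams { languages: vec!["en".to_string()] }
);

fn main() {
    println!("{} -> {:?}", deepgram::provider(), deepgram::params());
    println!("{} -> {:?}", assemblyai::provider(), assemblyai::params());
}
```

The trade-off is the one the review already notes: a forwarding arm is less code to maintain, while two explicit arms may be easier to read at the call site.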

Optional: Add documentation for the macro.

Consider adding doc comments explaining:

When to use each form (default params vs custom params)

The purpose of each parameter (adapter type, provider name, env key, base URL)

Expected usage pattern (one invocation per test module)

owhisper/owhisper-client/src/adapter/parsing.rs (1)

80-172: Well-designed fluent builder API.

The TranscriptResponseBuilder provides a clean, ergonomic API with sensible defaults. The fallback to computed timing from words when not explicitly set is a good design choice.

One minor observation: there's no confidence setter on the builder, so it always defaults to 1.0. If this is intentional (confidence is always assumed to be 1.0 for these use cases), this is fine. Otherwise, consider adding a confidence method for completeness.
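A minimal sketch of what a `confidence` setter and the timing fallback could look like. It uses simplified, hypothetical types (`TranscriptBuilder`, `Transcript`, and this `Word`) in place of the real `TranscriptResponseBuilder` and `StreamResponse`, whose fields are not shown on this page.

```rust
/// Hypothetical, simplified word type (not the crate's `Word`).
#[derive(Debug, Clone, PartialEq)]
pub struct Word {
    pub text: String,
    pub start: f64,
    pub end: f64,
}

/// Hypothetical, simplified output type standing in for `StreamResponse`.
#[derive(Debug, Clone, PartialEq)]
pub struct Transcript {
    pub text: String,
    pub start: f64,
    pub duration: f64,
    pub confidence: f64,
    pub words: Vec<Word>,
}

#[derive(Debug, Default)]
pub struct TranscriptBuilder {
    text: String,
    words: Vec<Word>,
    start: Option<f64>,
    duration: Option<f64>,
    confidence: Option<f64>,
}

impl TranscriptBuilder {
    pub fn new(text: impl Into<String>) -> Self {
        Self { text: text.into(), ..Default::default() }
    }

    pub fn words(mut self, words: Vec<Word>) -> Self {
        self.words = words;
        self
    }

    pub fn start(mut self, start: f64) -> Self {
        self.start = Some(start);
        self
    }

    pub fn duration(mut self, duration: f64) -> Self {
        self.duration = Some(duration);
        self
    }

    /// The setter the review suggests; unset confidence still defaults to 1.0.
    pub fn confidence(mut self, confidence: f64) -> Self {
        self.confidence = Some(confidence);
        self
    }

    pub fn build(self) -> Transcript {
        // Fall back to word timings when start/duration were not set explicitly.
        let first_start = self.words.first().map(|w| w.start).unwrap_or(0.0);
        let start = self.start.unwrap_or(first_start);
        let last_end = self.words.last().map(|w| w.end).unwrap_or(start);
        let duration = self.duration.unwrap_or((last_end - start).max(0.0));
        Transcript {
            text: self.text,
            start,
            duration,
            confidence: self.confidence.unwrap_or(1.0),
            words: self.words,
        }
    }
}

fn main() {
    let words = vec![
        Word { text: "hi".into(), start: 1.0, end: 1.5 },
        Word { text: "there".into(), start: 1.5, end: 2.5 },
    ];
    let transcript = TranscriptBuilder::new("hi there")
        .words(words)
        .confidence(0.8)
        .build();
    println!("{transcript:?}");
}
```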

owhisper/owhisper-client/src/adapter/http.rs (2)

26-31: Consider potential PII/sensitive data in logged bodies.

Both parse_json_response and parse_provider_json log the full response body/raw JSON on parse failure. If API responses could contain sensitive information (user data, API keys in error messages, etc.), this might inadvertently log PII.

Consider truncating the logged body or sanitizing it:

     tracing::warn!(
         error = ?e,
         %provider,
    -    body = %text,
    +    body = %text.chars().take(500).collect::<String>(),
         "stt_json_parse_failed"
     );

Also applies to: 44-49

32-35: Consider a more specific error variant for JSON parsing.

Using Error::AudioProcessing for JSON parse errors is semantically confusing since it's not actually an audio processing issue. If the Error enum supports it, consider a more descriptive variant like Error::JsonParsing or similar.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f898bb and ad5a95d.

📒 Files selected for processing (8)

owhisper/owhisper-client/src/adapter/argmax/batch.rs (2 hunks)
owhisper/owhisper-client/src/adapter/assemblyai/batch.rs (4 hunks)
owhisper/owhisper-client/src/adapter/audio.rs (1 hunks)
owhisper/owhisper-client/src/adapter/deepgram/batch.rs (2 hunks)
owhisper/owhisper-client/src/adapter/http.rs (1 hunks)
owhisper/owhisper-client/src/adapter/mod.rs (1 hunks)
owhisper/owhisper-client/src/adapter/parsing.rs (2 hunks)
owhisper/owhisper-client/src/test_utils.rs (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (4)

owhisper/owhisper-client/src/adapter/deepgram/batch.rs (3)

owhisper/owhisper-client/src/adapter/audio.rs (1)

decode_audio_to_linear16 (7-31)

owhisper/owhisper-client/src/adapter/deepgram_compat/mod.rs (1)

build_batch_url (97-145)

owhisper/owhisper-client/src/adapter/http.rs (1)

ensure_success (6-14)

owhisper/owhisper-client/src/adapter/assemblyai/batch.rs (2)

owhisper/owhisper-client/src/adapter/audio.rs (1)

decode_audio_to_bytes (33-36)

owhisper/owhisper-client/src/adapter/http.rs (1)

ensure_success (6-14)

owhisper/owhisper-client/src/adapter/audio.rs (1)

crates/audio-utils/src/lib.rs (3)

f32_to_i16_bytes (66-76)

resample_audio (171-220)

source_from_path (129-135)

owhisper/owhisper-client/src/adapter/parsing.rs (3)

packages/store/src/schema-external.ts (1)

Word (171-171)

owhisper/owhisper-client/src/adapter/assemblyai/live.rs (2)

words (240-243)

words (249-252)

owhisper/owhisper-interface/src/batch.rs (1)

channel (85-89)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Redirect rules - hyprnote-storybook
GitHub Check: Header rules - hyprnote-storybook
GitHub Check: Pages changed - hyprnote-storybook
GitHub Check: Redirect rules - hyprnote
GitHub Check: Header rules - hyprnote
GitHub Check: Pages changed - hyprnote
GitHub Check: Devin
GitHub Check: fmt

🔇 Additional comments (9)

owhisper/owhisper-client/src/adapter/parsing.rs (1)

48-78: LGTM - Clean helper function for transcript response construction.

The function correctly computes timing from words and constructs a well-formed StreamResponse. The hardcoded confidence: 1.0 is consistent with the builder implementation below.

owhisper/owhisper-client/src/adapter/mod.rs (1)

3-3: LGTM - New modules correctly exposed.

The new audio and http modules are appropriately declared as public, enabling their use across adapter implementations.

Also applies to: 8-8

owhisper/owhisper-client/src/adapter/deepgram/batch.rs (1)

62-63: LGTM - Clean refactor to centralized HTTP handling.

The replacement of manual status checking with ensure_success followed by response.json() simplifies the code while maintaining equivalent error handling. This is a good application of DRY principles.

owhisper/owhisper-client/src/adapter/assemblyai/batch.rs (2)

127-128: LGTM - Consistent use of ensure_success across all API calls.

The upload, transcript creation, and polling responses all now use the centralized ensure_success helper, providing uniform error handling throughout the transcription workflow.

176-177: Good integration within the polling loop.

The ensure_success call inside the polling closure maintains the same error semantics while simplifying the code. Non-2xx responses during polling will now be handled consistently.

owhisper/owhisper-client/src/adapter/argmax/batch.rs (1)

60-61: LGTM - Matches the pattern in other batch adapters.

Consistent refactoring to use ensure_success followed by response.json(), aligning with the Deepgram adapter implementation.

owhisper/owhisper-client/src/adapter/http.rs (1)

6-14: LGTM - Clean HTTP status validation.

The ensure_success function correctly checks for 2xx status codes and captures both status and body for error reporting.
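The shape of such a helper can be sketched without an HTTP client. The example below is an assumption-laden illustration: `HttpResponse` is a hypothetical stand-in for the client's `Response` type, and the `UnexpectedStatus` variant name is invented here, since the real `Error` enum is not shown on this page.

```rust
use std::fmt;

/// Hypothetical stand-in for the HTTP client's response type.
#[derive(Debug, Clone, PartialEq)]
pub struct HttpResponse {
    pub status: u16,
    pub body: String,
}

/// Hypothetical error type; the variant name is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum Error {
    UnexpectedStatus { status: u16, body: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::UnexpectedStatus { status, body } => {
                write!(f, "unexpected status {status}: {body}")
            }
        }
    }
}

impl std::error::Error for Error {}

/// Pass 2xx responses through; turn anything else into an error that
/// carries both the status code and the body for diagnostics.
pub fn ensure_success(response: HttpResponse) -> Result<HttpResponse, Error> {
    if (200..300).contains(&response.status) {
        Ok(response)
    } else {
        Err(Error::UnexpectedStatus {
            status: response.status,
            body: response.body,
        })
    }
}

fn main() {
    let ok = HttpResponse { status: 200, body: "{}".into() };
    assert!(ensure_success(ok).is_ok());

    let err = ensure_success(HttpResponse { status: 401, body: "bad key".into() })
        .unwrap_err();
    println!("{err}");
}
```

Returning the response on success lets callers chain the check and then deserialize, which matches the order described for the batch adapters.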

owhisper/owhisper-client/src/adapter/audio.rs (2)

38-53: LGTM - Correct mono mixing implementation.

The mix_to_mono function correctly handles edge cases:

Returns input unchanged for single-channel audio

Properly averages all channels per frame

Handles empty input gracefully

The use of frame.len() as f32 for division is safe since empty frames are skipped via continue.
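The behaviour described above can be sketched as follows. The signature is an assumption (interleaved `f32` samples plus a channel count); the actual `mix_to_mono` in `adapter/audio.rs` may differ.

```rust
/// Sketch of interleaved-to-mono mixing; the signature is assumed.
fn mix_to_mono(samples: Vec<f32>, channels: u16) -> Vec<f32> {
    // Mono (or a degenerate channel count) is returned unchanged.
    if channels <= 1 {
        return samples;
    }
    let channels = channels as usize;
    let mut mono = Vec::with_capacity(samples.len() / channels);
    for frame in samples.chunks(channels) {
        // `chunks` never yields an empty slice, so this guard is purely
        // defensive; it mirrors the skip-empty-frames behaviour noted above
        // and keeps the division below from ever seeing a zero length.
        if frame.is_empty() {
            continue;
        }
        let sum: f32 = frame.iter().sum();
        mono.push(sum / frame.len() as f32);
    }
    mono
}

fn main() {
    // Two stereo frames: (0.5, -0.5) and (1.0, 0.0).
    let stereo = vec![0.5, -0.5, 1.0, 0.0];
    let mono = mix_to_mono(stereo, 2);
    assert_eq!(mono, vec![0.0, 0.5]);
}
```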

59-76: Good test coverage for audio decoding.

Tests verify both decode_audio_to_linear16 and decode_audio_to_bytes produce non-empty output with valid sample rates. The mono mixing tests cover single-channel, stereo, and empty input cases.

owhisper/owhisper-client/src/adapter/audio.rs

Previously, resample_audio was called with the source sample rate, which was a no-op. Now it properly resamples to 16kHz, which is the standard sample rate expected by STT services.

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
