AWS Transcribe STT plugin #528

Merged
dangusev merged 14 commits into main from
feature/aws-transcribe-stt-plugin
May 5, 2026

Conversation

@dangusev
Collaborator

Why

This PR adds a plugin for AWS Transcribe STT, completing the selection of AWS plugins (LLM, TTS, STT, and Realtime).

Changes

  • new aws.TranscribeSTT plugin with turn detection and reconnection handling
  • moved Boto3CredentialsResolver to a shared _credentials module (it used to live in aws_realtime).

@coderabbitai

coderabbitai Bot commented Apr 30, 2026

📝 Walkthrough

Adds TranscribeSTT, an AWS Transcribe streaming STT backend that resamples audio to 16 kHz, manages a duplex stream, handles partial/final transcripts and turn lifecycle, suppresses stale results via a media-time watermark, and implements automatic reconnect with capped exponential backoff and graceful shutdown. Adds Boto3CredentialsResolver to load AWS credentials via boto3.Session. Exposes STT from the aws plugin package, updates realtime to use the new resolver, extends TTS to accept explicit credentials/profiles, adds tests and an example pipeline, updates README, and adds an aws transcribe streaming dependency.




@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
plugins/aws/vision_agents/plugins/aws/_credentials.py (1)

11-18: ⚡ Quick win

Keep this resolver behind a public API.

aws_realtime.py and stt.py now need to import this helper directly, which violates the repo rule against importing private modules outside __init__.py. Re-export Boto3CredentialsResolver from the package, or move it to a public module, before other plugin code depends on it. As per coding guidelines: "Never import from private modules (_foo) outside of the package's own __init__.py."


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2002b825-97b3-44e4-8773-5deb26f1c014

📥 Commits

Reviewing files that changed from the base of the PR and between 4927062 and 81a046a.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • plugins/aws/pyproject.toml
  • plugins/aws/tests/test_aws_stt.py
  • plugins/aws/vision_agents/plugins/aws/__init__.py
  • plugins/aws/vision_agents/plugins/aws/_credentials.py
  • plugins/aws/vision_agents/plugins/aws/aws_realtime.py
  • plugins/aws/vision_agents/plugins/aws/stt.py

Comment thread plugins/aws/vision_agents/plugins/aws/_credentials.py Outdated
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d8543c6c-869a-467b-abf6-f2e179c2c76f

📥 Commits

Reviewing files that changed from the base of the PR and between 81a046a and 7c8a762.

📒 Files selected for processing (3)
  • plugins/aws/tests/test_aws_stt.py
  • plugins/aws/vision_agents/plugins/aws/_credentials.py
  • plugins/aws/vision_agents/plugins/aws/stt.py
✅ Files skipped from review due to trivial changes (1)
  • plugins/aws/vision_agents/plugins/aws/stt.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/aws/tests/test_aws_stt.py

Comment thread plugins/aws/vision_agents/plugins/aws/_credentials.py Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
plugins/aws/vision_agents/plugins/aws/_credentials.py (1)

2-2: ⚡ Quick win

Use str | None instead of Optional[str].

Optional[str] is legacy typing syntax. Replace with str | None per the modern-syntax guideline, and drop Optional from the import.

Proposed fix
-from typing import Any, Optional
+from typing import Any
-    def __init__(self, profile_name: Optional[str] = None) -> None:
+    def __init__(self, profile_name: str | None = None) -> None:

As per coding guidelines: "Use modern syntax: X | Y unions".

Also applies to: 21-21

plugins/aws/vision_agents/plugins/aws/stt.py (3)

137-153: 💤 Low value

Dead branch in rollback.

self._supervisor_task is assigned on the last statement of the try; if anything raises, it can only have raised before that assignment, so self._supervisor_task is guaranteed None in the except. The cancel_and_wait block is unreachable. Drop it to keep the rollback honest, or move supervisor task creation earlier if you do want it covered.


359-400: 💤 Low value

Reconnect storm if _open_stream keeps failing.

On a persistent failure (e.g., AWS down, bad creds after rotation), the except Exception branch immediately re-sets _reconnect_event, but attempt was already incremented before the sleep at the top of the next iteration, so backoff continues to grow, which is good. However, attempt is never capped. In Python, 2.0**attempt raises OverflowError once attempt reaches 1024, before min(...) gets a chance to clamp it, so a long enough outage could raise inside the supervisor loop instead of settling at the cap. Consider clamping attempt itself (e.g., attempt = min(attempt + 1, 30)) so the exponent stays bounded, and to keep logs honest if you ever switch to logging the exponent.
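
A minimal sketch of the clamped-exponent backoff suggested here. The function name reconnect_backoff and the max_exponent parameter are hypothetical and not taken from stt.py; the sketch only shows that bounding the exponent before computing the power keeps the delay finite and independent of how many failures have accumulated.

```python
def reconnect_backoff(
    attempt: int, cap_seconds: float, max_exponent: int = 30
) -> float:
    """Return the delay in seconds before reconnect attempt `attempt` (0-based).

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    if cap_seconds <= 0:
        raise ValueError("cap_seconds must be greater than 0")
    # Clamp the exponent first so the power below is always bounded.
    exponent = min(max(attempt, 0), max_exponent)
    # Then cap the resulting delay at the configured maximum.
    return min(2.0**exponent, cap_seconds)
```

With a 30-second cap, the delays grow as 1, 2, 4, 8, 16 seconds and then stay at 30 no matter how large attempt becomes.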


402-427: ⚡ Quick win

timeout parameter is untyped.

Annotate the parameter to satisfy the project's "type annotations everywhere" rule.

Proposed fix
-    async def _close_streams(self, timeout=5.0):
+    async def _close_streams(self, timeout: float = 5.0) -> None:

As per coding guidelines: "Use type annotations everywhere."


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 72caac98-b7e0-4512-a8b5-e946ed545d8b

📥 Commits

Reviewing files that changed from the base of the PR and between 7c8a762 and 6967f10.

📒 Files selected for processing (2)
  • plugins/aws/vision_agents/plugins/aws/_credentials.py
  • plugins/aws/vision_agents/plugins/aws/stt.py

Comment thread plugins/aws/vision_agents/plugins/aws/stt.py
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py Outdated
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2481afa3-0385-48a3-888f-ffaf40c0daaf

📥 Commits

Reviewing files that changed from the base of the PR and between 6967f10 and 4ff7751.

📒 Files selected for processing (1)
  • plugins/aws/README.md

Comment thread plugins/aws/README.md Outdated
Comment thread plugins/aws/README.md Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a2ef482d-67d1-487b-bf3c-69c9859eadcf

📥 Commits

Reviewing files that changed from the base of the PR and between fb29cca and 77c7f16.

📒 Files selected for processing (2)
  • plugins/aws/README.md
  • plugins/aws/vision_agents/plugins/aws/stt.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/aws/vision_agents/plugins/aws/stt.py

Comment thread plugins/aws/README.md Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cf581245-b228-469e-8507-3a97a377e365

📥 Commits

Reviewing files that changed from the base of the PR and between 77c7f16 and cef2d91.

📒 Files selected for processing (1)
  • plugins/aws/vision_agents/plugins/aws/stt.py

Comment thread plugins/aws/vision_agents/plugins/aws/stt.py
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py
Comment thread plugins/aws/vision_agents/plugins/aws/stt.py Outdated

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (5)
plugins/aws/vision_agents/plugins/aws/stt.py (5)

32-32: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Private-module import still in place.

from ._credentials import Boto3CredentialsResolver violates the project rule against importing from _foo modules outside the package's own __init__.py. Re-export Boto3CredentialsResolver from vision_agents/plugins/aws/__init__.py (or rename _credentials.py to credentials.py) and import via the public surface. The same fix applies to aws_realtime.py.

As per coding guidelines: "Never import from private modules (_foo) outside of the package's own __init__.py. Use the public re-export instead."


69-70: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate max_reconnect_backoff_seconds > 0.

A non-positive value makes await asyncio.sleep(backoff) at Line 383 a no-op and the supervisor will spin reconnects against AWS without any backoff. Reject in __init__.

Proposed fix
         if bool(aws_access_key_id) != bool(aws_secret_access_key):
             raise ValueError(
                 "aws_access_key_id and aws_secret_access_key must be provided together"
             )
+        if max_reconnect_backoff_seconds <= 0:
+            raise ValueError(
+                "max_reconnect_backoff_seconds must be greater than 0"
+            )

As per coding guidelines: "Raise ValueError with a descriptive message for invalid constructor arguments."


196-211: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Provisional stream leaks if await_output() times out or is cancelled.

asyncio.wait_for(_connect(), timeout=timeout) cancels _connect() mid-await_output(), but the already-created _stream is local to the coroutine and never closed. Each timed-out connect leaks an open Transcribe session. Close the stream on cancellation/error inside _connect().

Proposed fix
         async def _connect():
-            _stream = await client.start_stream_transcription(
+            stream = await client.start_stream_transcription(
                 input=self._build_transcription_input()
             )
-            _, _output_stream = await _stream.await_output()
-            return _stream, _output_stream
+            try:
+                _, output_stream = await stream.await_output()
+            except BaseException:
+                try:
+                    await stream.close()
+                except Exception:
+                    logger.warning("Error closing stream during connect rollback", exc_info=True)
+                raise
+            return stream, output_stream

As per coding guidelines: "Clean up resources in finally blocks."
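
The rollback behaviour can be checked in isolation. The sketch below is not the plugin's code: FakeStream is a hypothetical stand-in for the Transcribe stream object, and connect mirrors only the shape of the proposed fix. It shows that when asyncio.wait_for times out and cancels the inner coroutine, the except BaseException branch still runs and closes the stream.

```python
import asyncio


class FakeStream:
    """Hypothetical stand-in for a provisional Transcribe stream."""

    def __init__(self) -> None:
        self.closed = False

    async def await_output(self) -> None:
        # Simulates a handshake that never completes within the timeout.
        await asyncio.sleep(10)

    async def close(self) -> None:
        self.closed = True


async def connect(stream: FakeStream) -> FakeStream:
    try:
        await stream.await_output()
    except BaseException:
        # Reached on cancellation (including timeouts) and ordinary errors.
        await stream.close()
        raise
    return stream


async def demo() -> bool:
    stream = FakeStream()
    try:
        await asyncio.wait_for(connect(stream), timeout=0.01)
    except asyncio.TimeoutError:
        pass
    # Reports whether the provisional stream was closed during rollback.
    return stream.closed
```

Catching BaseException matters here because asyncio.CancelledError does not derive from Exception on current Python versions, so a plain except Exception would skip the cleanup on timeout.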


311-320: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Final-only result skips turn_started.

When AWS emits a final Result directly with no preceding partial (short utterance, or enable_partial_results_stabilization=False), _turn_in_progress is False, so the code emits transcript + turn_ended without ever emitting turn_started. Consumers that pair start/end events get an unbalanced sequence.

Proposed fix
             if result.is_partial:
                 if not self._turn_in_progress:
                     self._turn_in_progress = True
                     self._emit_turn_started_event(participant)
                 self._emit_partial_transcript_event(text, participant, response)
             else:
+                if not self._turn_in_progress:
+                    self._emit_turn_started_event(participant)
                 self._emit_transcript_event(text, participant, response)
                 self._audio_start_time = None
                 self._turn_in_progress = False
                 self._emit_turn_ended_event(participant)
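
The pairing invariant can be sketched as a small state machine. TurnTracker and its event strings are hypothetical and stand in for the plugin's _emit_* methods; the sketch assumes only what the comment describes, namely that every turn_ended should be preceded by a turn_started, whether or not a partial result arrived first.

```python
from dataclasses import dataclass, field


@dataclass
class TurnTracker:
    """Hypothetical model of the turn lifecycle; records events as strings."""

    events: list[str] = field(default_factory=list)
    turn_in_progress: bool = False

    def on_result(self, text: str, is_partial: bool) -> None:
        # Open a turn on the first result of any kind, partial or final.
        if not self.turn_in_progress:
            self.turn_in_progress = True
            self.events.append("turn_started")
        if is_partial:
            self.events.append(f"partial:{text}")
        else:
            self.events.append(f"final:{text}")
            self.turn_in_progress = False
            self.events.append("turn_ended")
```

Feeding a single final result into a fresh tracker yields turn_started, the final transcript, then turn_ended, so consumers that pair start and end events see a balanced sequence.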

385-396: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

_watermark_lock held across reconnect I/O stalls process_audio.

process_audio acquires _watermark_lock to send each frame. The supervisor holds the same lock across _close_streams() (which awaits _recv_task up to 5s) and _open_stream() (up to 10s). During reconnect, every audio chunk on the producer side blocks on the lock instead of being dropped immediately, causing upstream stalls. Split the critical sections: take the lock only to swap watermark state and stream references; do close/open outside the lock.
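
One way to split the critical sections is sketched below. It is an illustration under stated assumptions, not the plugin's implementation: Session, FakeStream, and the sent/dropped counters are hypothetical, and the sleeps stand in for the close and open latency. The lock is held only while swapping the stream reference, so process_audio returns promptly during a reconnect and drops the chunk instead of waiting.

```python
import asyncio


class FakeStream:
    """Hypothetical stream whose close() simulates slow network I/O."""

    async def close(self) -> None:
        await asyncio.sleep(0.05)


class Session:
    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._stream: FakeStream | None = FakeStream()
        self.sent = 0
        self.dropped = 0

    async def process_audio(self, chunk: bytes) -> None:
        # Short critical section: only read the current stream reference.
        async with self._lock:
            stream = self._stream
        if stream is None:
            self.dropped += 1  # reconnect in progress: drop, do not block
            return
        self.sent += 1  # a real implementation would send the chunk here

    async def reconnect(self) -> None:
        # 1) Detach the old stream under the lock; no I/O while holding it.
        async with self._lock:
            old, self._stream = self._stream, None
        # 2) Slow close and open happen outside the lock.
        if old is not None:
            await old.close()
        await asyncio.sleep(0.05)  # simulated connect latency
        new = FakeStream()
        # 3) Publish the new stream under the lock.
        async with self._lock:
            self._stream = new


async def demo() -> tuple[int, int]:
    session = Session()
    task = asyncio.create_task(session.reconnect())
    await asyncio.sleep(0.01)  # let reconnect detach the old stream
    # Must return well before the reconnect finishes.
    await asyncio.wait_for(session.process_audio(b"\x00"), timeout=0.02)
    await task
    await session.process_audio(b"\x00")
    return session.dropped, session.sent
```

The trade-off is that audio arriving during the reconnect window is lost rather than queued, which matches the comment's preference for dropping frames over stalling the producer.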

🧹 Nitpick comments (1)
plugins/aws/vision_agents/plugins/aws/tts.py (1)

77-83: 💤 Low value

Concurrent first-access of client can build two boto3 sessions.

Two coroutines awaiting self.client before _client is assigned will each build a boto3.Session(...).client("polly") in its own thread. Only one survives; the other is leaked. Guard the lazy first-use construction with an asyncio.Lock, or build the client eagerly in __init__.
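
A minimal sketch of the lock-guarded lazy initialization. LazyClient and its factory argument are hypothetical; in the plugin the factory would be whatever blocking call builds the Polly client, which is an assumption here rather than something shown in this thread.

```python
import asyncio
from typing import Any, Callable


class LazyClient:
    """Builds a client at most once, even under concurrent first access."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory  # blocking constructor (hypothetical)
        self._client: Any | None = None
        self._lock = asyncio.Lock()

    async def get(self) -> Any:
        # Fast path: already built, no locking needed.
        if self._client is not None:
            return self._client
        async with self._lock:
            # Re-check under the lock: another coroutine may have built it.
            if self._client is None:
                self._client = await asyncio.to_thread(self._factory)
        return self._client


async def demo() -> int:
    calls = 0

    def factory() -> object:
        nonlocal calls
        calls += 1
        return object()

    lazy = LazyClient(factory)
    clients = await asyncio.gather(*(lazy.get() for _ in range(5)))
    assert all(c is clients[0] for c in clients)
    # Number of times the factory actually ran.
    return calls
```

The second check inside the lock is what prevents the duplicate build: coroutines that were waiting on the lock find the client already assigned and skip the factory call.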


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: db0fad13-1b20-49a8-ae51-d3065a56d5a0

📥 Commits

Reviewing files that changed from the base of the PR and between cef2d91 and 9946dfe.

📒 Files selected for processing (4)
  • plugins/aws/README.md
  • plugins/aws/tests/test_tts.py
  • plugins/aws/vision_agents/plugins/aws/stt.py
  • plugins/aws/vision_agents/plugins/aws/tts.py


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
plugins/aws/vision_agents/plugins/aws/stt.py (2)

93-96: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Also reject non-positive max_reconnect_backoff_seconds.

max_reconnect_backoff_seconds <= 0 makes the supervisor's min(2**attempt, cap) evaluate to ≤0, so asyncio.sleep(backoff) returns immediately and persistent failures spin a hot reconnect loop against AWS.

Proposed fix
         if bool(aws_access_key_id) != bool(aws_secret_access_key):
             raise ValueError(
                 "aws_access_key_id and aws_secret_access_key must be provided together"
             )
+        if max_reconnect_backoff_seconds <= 0:
+            raise ValueError(
+                "max_reconnect_backoff_seconds must be greater than 0"
+            )

As per coding guidelines, "Raise ValueError with a descriptive message for invalid constructor arguments."


326-335: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Final-only result emits turn_ended without a matching turn_started.

When AWS sends a final Result with no preceding partial (very short utterances or enable_partial_results_stabilization=False), _turn_in_progress is False and the code emits transcript + turn_ended without ever emitting turn_started, leaving downstream pair-matchers unbalanced.

Proposed fix
             if result.is_partial:
                 if not self._turn_in_progress:
                     self._turn_in_progress = True
                     self._emit_turn_started_event(participant)
                 self._emit_partial_transcript_event(text, participant, response)
             else:
+                if not self._turn_in_progress:
+                    self._emit_turn_started_event(participant)
                 self._emit_transcript_event(text, participant, response)
                 self._audio_start_time = None
                 self._turn_in_progress = False
                 self._emit_turn_ended_event(participant)

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6e8a7ee6-0554-4fce-877c-a6e36ac4cdea

📥 Commits

Reviewing files that changed from the base of the PR and between 9946dfe and da8d175.

📒 Files selected for processing (1)
  • plugins/aws/vision_agents/plugins/aws/stt.py

Comment thread plugins/aws/vision_agents/plugins/aws/stt.py
@dangusev dangusev merged commit 3a80e7f into main May 5, 2026
6 checks passed
@dangusev dangusev deleted the feature/aws-transcribe-stt-plugin branch May 5, 2026 21:08