@ftsef ftsef commented Jan 20, 2026

Summary by CodeRabbit

  • New Features
    • Added support for selecting US or EU AssemblyAI streaming endpoints. Users can now configure the streaming endpoint during setup (US is the default). The selected endpoint is used for live transcription connections, and documentation now describes endpoint options and usage.


CLAassistant commented Jan 20, 2026

CLA assistant check
All committers have signed the CLA.

Contributor

coderabbitai bot commented Jan 20, 2026

📝 Walkthrough

Walkthrough

Adds an endpoint_url parameter to STT and SpeechStream, stores it as _endpoint_url, and uses it to build the AssemblyAI WebSocket connection instead of a hardcoded URL.

Changes

| Cohort / File(s) | Summary |
|---|---|
| AssemblyAI STT & SpeechStream (livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py) | Added an endpoint_url parameter to STT.__init__ (default wss://streaming.assemblyai.com/v3/ws), stored as _endpoint_url. STT.stream() forwards endpoint_url to SpeechStream. SpeechStream.__init__ now accepts and stores endpoint_url, and _connect_ws() constructs the WebSocket URL from self._endpoint_url instead of a hardcoded value. |
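
The threading of endpoint_url described above can be sketched outside the plugin roughly as follows. This is a simplified stand-in, not the plugin's actual code: the class bodies, the `build_ws_url` helper, the example config keys, and the `eu.example.test` host are assumptions made for illustration. Only the `endpoint_url` parameter, the `_endpoint_url` attribute, and the default URL come from the PR.

```python
from urllib.parse import urlencode

# Default taken from the PR description (US endpoint).
DEFAULT_ENDPOINT_URL = "wss://streaming.assemblyai.com/v3/ws"


class SpeechStream:
    """Hypothetical stand-in for the plugin's SpeechStream."""

    def __init__(self, *, endpoint_url: str) -> None:
        # Required here because only STT.stream() is expected to create streams.
        self._endpoint_url = endpoint_url

    def build_ws_url(self, live_config: dict) -> str:
        """Hypothetical helper: build the connection URL from the stored endpoint."""
        # Same filtering as in the diff: drop None values, stringify booleans.
        filtered_config = {
            k: ("true" if v else "false") if isinstance(v, bool) else v
            for k, v in live_config.items()
            if v is not None
        }
        return f"{self._endpoint_url}?{urlencode(filtered_config)}"


class STT:
    """Hypothetical stand-in for the plugin's STT."""

    def __init__(self, *, endpoint_url: str = DEFAULT_ENDPOINT_URL) -> None:
        self._endpoint_url = endpoint_url

    def stream(self) -> SpeechStream:
        # Forward the configured endpoint to every new stream.
        return SpeechStream(endpoint_url=self._endpoint_url)


# Usage: the default endpoint, then a placeholder (not real) alternative host.
default_stream = STT().stream()
custom_stream = STT(endpoint_url="wss://eu.example.test/v3/ws").stream()
```

Keeping the URL on the STT instance means callers choose the region once, and every stream created afterwards inherits it.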

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant STT
    participant SpeechStream
    participant AssemblyAI_WS

    Client->>STT: request stream(start, optional endpoint_url)
    STT->>SpeechStream: create stream(..., endpoint_url)
    SpeechStream->>AssemblyAI_WS: open WebSocket at endpoint_url
    SpeechStream->>AssemblyAI_WS: send audio frames
    AssemblyAI_WS-->>SpeechStream: deliver transcription events
    SpeechStream-->>STT: forward transcriptions
    STT-->>Client: emit transcription results

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐇 I hopped the nets from US to EU,
A little param to guide my view;
No more hard paths in tunnels deep,
I choose the websocket where secrets sleep. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title directly and concisely describes the main change: adding support for AssemblyAI's EU streaming endpoint option via the new endpoint_url parameter. |


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eba2714 and 380d2df.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.9)
🔇 Additional comments (2)
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py (2)

80-87: LGTM! The endpoint_url parameter and docstring are well-implemented.

The docstring clearly documents the EU endpoint option and the default value. Consider optionally adding basic URL validation (e.g., checking for wss:// scheme) to fail fast on misconfiguration, but this is not critical.
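
One possible shape for the optional fail-fast check suggested above is sketched here. The function name `validate_endpoint_url` and the exact error messages are hypothetical, and whether the plugin should accept only `wss://` is a design choice the PR does not settle.

```python
from urllib.parse import urlsplit


def validate_endpoint_url(endpoint_url: str) -> str:
    """Return endpoint_url unchanged if it looks like a secure WebSocket URL.

    Args:
        endpoint_url: Candidate streaming endpoint, e.g. the PR's default
            wss://streaming.assemblyai.com/v3/ws.

    Raises:
        ValueError: If the scheme is not wss or the host is missing.
    """
    parts = urlsplit(endpoint_url)
    if parts.scheme != "wss":
        raise ValueError(f"endpoint_url must use the wss:// scheme, got {endpoint_url!r}")
    if not parts.netloc:
        raise ValueError(f"endpoint_url is missing a host: {endpoint_url!r}")
    return endpoint_url


# Usage: the PR's default endpoint passes the check unchanged.
checked = validate_endpoint_url("wss://streaming.assemblyai.com/v3/ws")
```

Calling something like this in STT.__init__ would surface a mistyped URL at construction time rather than on the first connection attempt.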


201-208: LGTM!

The endpoint_url parameter is properly added to SpeechStream as a required parameter (appropriate since it's an internal class instantiated by STT.stream()), and stored correctly.

Contributor

@longcw longcw left a comment

Looks good to me! Just one nit:

conn_options: APIConnectOptions,
api_key: str,
http_session: aiohttp.ClientSession,
endpoint_url: str,
Contributor

Can you move this into conn_options?

Author

I'm not quite sure what you mean.
conn_options is a shared object that lives outside the AssemblyAI module, so I can't just add another property to it: https://github.com/livekit/agents/blob/main/livekit-agents/livekit/agents/types.py#L54

Could you suggest what to change, and how?
Thank you!

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py`:
- Around line 80-86: The docstring for the parameter endpoint_url is incomplete;
update the parameter description (in the function/method docstring that includes
endpoint_url) to explicitly state the default endpoint, e.g. "Defaults to the US
endpoint (wss://streaming.assemblyai.com/v3/ws)." Ensure the sentence replaces
the dangling "Defaults to US" phrase and clearly names the default URL.
- Around line 359-365: The URL building currently concatenates
self._endpoint_url and a new query string which breaks if self._endpoint_url
already contains query params; instead parse self._endpoint_url (using
urllib.parse.urlparse), merge its existing query params (urllib.parse.parse_qs)
with the filtered_config dict (ensuring boolean -> "true"/"false" conversion
already in filtered_config), flatten/encode the combined params with
urllib.parse.urlencode(doseq=True), reconstruct the full URL with
urllib.parse.urlunparse, and then call await self._session.ws_connect(url,
headers=headers); update the logic around filtered_config, self._endpoint_url,
urlencode, and the ws_connect call to use the merged-safe URL.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between acabf18 and eba2714.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.9)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.13)
🔇 Additional comments (2)
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py (2)

95-156: Endpoint URL is correctly threaded into new streams.

This cleanly ensures all streams use the configured endpoint.


200-208: SpeechStream stores endpoint URL cleanly.

No concerns with the propagation pattern here.

Comment on lines 359 to 365
         filtered_config = {
             k: ("true" if v else "false") if isinstance(v, bool) else v
             for k, v in live_config.items()
             if v is not None
         }
-        url = f"{ws_url}?{urlencode(filtered_config)}"
+        url = f"{self._endpoint_url}?{urlencode(filtered_config)}"
         ws = await self._session.ws_connect(url, headers=headers)
Contributor

⚠️ Potential issue | 🟡 Minor

Build the WS URL safely if endpoint_url already has query params.

f"{self._endpoint_url}?..." will produce an invalid URL if the caller includes existing query parameters (double ?). Safer to merge queries.

🛠️ Proposed fix
-from urllib.parse import urlencode
+from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
@@
-        url = f"{self._endpoint_url}?{urlencode(filtered_config)}"
+        split = urlsplit(self._endpoint_url)
+        query = dict(parse_qsl(split.query))
+        query.update(filtered_config)
+        url = urlunsplit((split.scheme, split.netloc, split.path, urlencode(query), split.fragment))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        filtered_config = {
-            k: ("true" if v else "false") if isinstance(v, bool) else v
-            for k, v in live_config.items()
-            if v is not None
-        }
-        url = f"{ws_url}?{urlencode(filtered_config)}"
-        url = f"{self._endpoint_url}?{urlencode(filtered_config)}"
-        ws = await self._session.ws_connect(url, headers=headers)
+        filtered_config = {
+            k: ("true" if v else "false") if isinstance(v, bool) else v
+            for k, v in live_config.items()
+            if v is not None
+        }
+        split = urlsplit(self._endpoint_url)
+        query = dict(parse_qsl(split.query))
+        query.update(filtered_config)
+        url = urlunsplit((split.scheme, split.netloc, split.path, urlencode(query), split.fragment))
+        ws = await self._session.ws_connect(url, headers=headers)
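
To see the difference in isolation, the merge logic from the suggestion can be pulled into a small standalone function. This is only a sketch: the name `build_ws_url`, the `region` query parameter, and the `sample_rate` key are hypothetical and used purely to exercise the code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_ws_url(endpoint_url: str, config: dict) -> str:
    """Merge config into any query string already present on endpoint_url."""
    split = urlsplit(endpoint_url)
    # dict() keeps only the last value of a repeated key; config wins on conflicts.
    query = dict(parse_qsl(split.query))
    query.update(config)
    return urlunsplit(
        (split.scheme, split.netloc, split.path, urlencode(query), split.fragment)
    )


# Hypothetical endpoint that already carries a query parameter.
endpoint = "wss://streaming.assemblyai.com/v3/ws?region=eu"
config = {"sample_rate": 16000}

# Plain concatenation yields a second "?", which is the problem being flagged.
naive = f"{endpoint}?{urlencode(config)}"
# Merging keeps a single, well-formed query string.
merged = build_ws_url(endpoint, config)
```

Here `naive` ends in `?region=eu?sample_rate=16000`, while `merged` ends in `?region=eu&sample_rate=16000`. For an endpoint without a query string, both approaches give the same result.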

keyterms_prompt: NotGivenOr[list[str]] = NOT_GIVEN,
http_session: aiohttp.ClientSession | None = None,
buffer_size_seconds: float = 0.05,
endpoint_url: str = "wss://streaming.assemblyai.com/v3/ws",
Member

Do you want this to be a base_url instead? Do you expect the /v3/ws path to be modifiable?
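
For comparison, a base_url variant might look roughly like the sketch below. Everything here is hypothetical: the PR as written takes a full endpoint_url, and the names `DEFAULT_BASE_URL`, `WS_PATH`, and `endpoint_from_base_url` are invented for illustration.

```python
# Hypothetical split of the PR's default URL into host and fixed path.
DEFAULT_BASE_URL = "wss://streaming.assemblyai.com"
WS_PATH = "/v3/ws"


def endpoint_from_base_url(base_url: str = DEFAULT_BASE_URL) -> str:
    """Append the fixed streaming path to a caller-supplied base URL."""
    # Tolerate a trailing slash so both spellings give the same endpoint.
    return base_url.rstrip("/") + WS_PATH


# Usage: the default reproduces the endpoint_url default from the PR.
default_endpoint = endpoint_from_base_url()
```

The trade-off is that base_url keeps the API version under the plugin's control, while endpoint_url lets callers override the whole path.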

4 participants