AssemblyAI added EU streaming endpoint option #4571
base: main
📝 Walkthrough
Adds an
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client
participant STT
participant SpeechStream
participant AssemblyAI_WS
Client->>STT: request stream(start, optional endpoint_url)
STT->>SpeechStream: create stream(..., endpoint_url)
SpeechStream->>AssemblyAI_WS: open WebSocket at endpoint_url
SpeechStream->>AssemblyAI_WS: send audio frames
AssemblyAI_WS-->>SpeechStream: deliver transcription events
SpeechStream-->>STT: forward transcriptions
STT-->>Client: emit transcription results
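The flow in the diagram can be sketched roughly as below. This is a minimal illustration, not the plugin's code: `SketchSTT` and `SketchSpeechStream` are hypothetical stand-ins, and only the default URL is taken from the diff in this PR. The docstring also shows one way to name the default explicitly.

```python
from __future__ import annotations

from dataclasses import dataclass

# Default value as it appears in the diff of this PR.
DEFAULT_ENDPOINT_URL = "wss://streaming.assemblyai.com/v3/ws"


@dataclass
class SketchSpeechStream:
    """Hypothetical stand-in for SpeechStream; it only records where to connect."""

    endpoint_url: str


class SketchSTT:
    """Hypothetical stand-in for the plugin's STT class."""

    def __init__(self, *, endpoint_url: str = DEFAULT_ENDPOINT_URL) -> None:
        """Create the STT wrapper.

        Args:
            endpoint_url: WebSocket URL of the streaming API. Defaults to the US
                endpoint (wss://streaming.assemblyai.com/v3/ws).
        """
        self._endpoint_url = endpoint_url

    def stream(self) -> SketchSpeechStream:
        # Every new stream inherits the endpoint configured on the STT instance.
        return SketchSpeechStream(endpoint_url=self._endpoint_url)
```

A caller would then pass a different URL once, for example `SketchSTT(endpoint_url="wss://eu.example.invalid/v3/ws")` (a placeholder host, not a real endpoint), and every stream created from that instance would use it.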
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
longcw
left a comment
Looks good to me! Just one small nit.
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
Outdated
| conn_options: APIConnectOptions, | ||
| api_key: str, | ||
| http_session: aiohttp.ClientSession, | ||
| endpoint_url: str, |
can you move this to the conn_options?
I'm not quite sure what you mean.
conn_options is a shared object that is not defined in the AssemblyAI module, so I can't just add another property here: https://github.com/livekit/agents/blob/main/livekit-agents/livekit/agents/types.py#L54
Could you suggest how/what to change?
Thank you!
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py`:
- Around line 80-86: The docstring for the parameter endpoint_url is incomplete;
update the parameter description (in the function/method docstring that includes
endpoint_url) to explicitly state the default endpoint, e.g. "Defaults to the US
endpoint (wss://streaming.assemblyai.com/v3/ws)." Ensure the sentence replaces
the dangling "Defaults to US" phrase and clearly names the default URL.
- Around line 359-365: The URL building currently concatenates
self._endpoint_url and a new query string which breaks if self._endpoint_url
already contains query params; instead parse self._endpoint_url (using
urllib.parse.urlparse), merge its existing query params (urllib.parse.parse_qs)
with the filtered_config dict (ensuring boolean -> "true"/"false" conversion
already in filtered_config), flatten/encode the combined params with
urllib.parse.urlencode(doseq=True), reconstruct the full URL with
urllib.parse.urlunparse, and then call await self._session.ws_connect(url,
headers=headers); update the logic around filtered_config, self._endpoint_url,
urlencode, and the ws_connect call to use the merged-safe URL.
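The second item above could be sketched as follows, using the `urllib.parse` functions it names. This is an illustration under assumptions: `build_ws_url` is a hypothetical helper, not a function in the plugin, and the `token` query parameter in the usage note is made up for the example.

```python
from __future__ import annotations

from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def build_ws_url(endpoint_url: str, config: dict[str, object]) -> str:
    """Merge config into any query string already present on endpoint_url."""
    parsed = urlparse(endpoint_url)
    # parse_qs maps each key to a list of values, so repeated keys survive.
    merged: dict[str, object] = dict(parse_qs(parsed.query, keep_blank_values=True))
    for key, value in config.items():
        if value is None:
            continue  # unset options are not sent
        if isinstance(value, bool):
            value = "true" if value else "false"
        merged[key] = value  # config wins over a value already in the URL
    # doseq=True expands list values into repeated key=value pairs.
    return urlunparse(parsed._replace(query=urlencode(merged, doseq=True)))
```

With this helper, `build_ws_url("wss://streaming.assemblyai.com/v3/ws?token=abc", {"sample_rate": 16000})` yields a URL with a single `?` and both parameters, and an endpoint without a query string behaves the same as the plain concatenation.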
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: type-check (3.9)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.13)
🔇 Additional comments (2)
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py (2)
95-156: Endpoint URL is correctly threaded into new streams. This cleanly ensures all streams use the configured endpoint.
200-208: SpeechStream stores endpoint URL cleanly. No concerns with the propagation pattern here.
livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py
| filtered_config = { | ||
| k: ("true" if v else "false") if isinstance(v, bool) else v | ||
| for k, v in live_config.items() | ||
| if v is not None | ||
| } | ||
| url = f"{ws_url}?{urlencode(filtered_config)}" | ||
| url = f"{self._endpoint_url}?{urlencode(filtered_config)}" | ||
| ws = await self._session.ws_connect(url, headers=headers) |
Build the WS URL safely if endpoint_url already has query params.
f"{self._endpoint_url}?..." will produce an invalid URL if the caller includes existing query parameters (double ?). Safer to merge queries.
🛠️ Proposed fix
-from urllib.parse import urlencode
+from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
@@
- url = f"{self._endpoint_url}?{urlencode(filtered_config)}"
+ split = urlsplit(self._endpoint_url)
+ query = dict(parse_qsl(split.query))
+ query.update(filtered_config)
+ url = urlunsplit((split.scheme, split.netloc, split.path, urlencode(query), split.fragment))
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| filtered_config = { | |
| k: ("true" if v else "false") if isinstance(v, bool) else v | |
| for k, v in live_config.items() | |
| if v is not None | |
| } | |
| url = f"{ws_url}?{urlencode(filtered_config)}" | |
| url = f"{self._endpoint_url}?{urlencode(filtered_config)}" | |
| ws = await self._session.ws_connect(url, headers=headers) | |
| filtered_config = { | |
| k: ("true" if v else "false") if isinstance(v, bool) else v | |
| for k, v in live_config.items() | |
| if v is not None | |
| } | |
| split = urlsplit(self._endpoint_url) | |
| query = dict(parse_qsl(split.query)) | |
| query.update(filtered_config) | |
| url = urlunsplit((split.scheme, split.netloc, split.path, urlencode(query), split.fragment)) | |
| ws = await self._session.ws_connect(url, headers=headers) |
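To make the concern about plain concatenation concrete, the short sketch below shows how a standard parser reads the resulting URL. The `region=eu` query parameter is a hypothetical example of a caller-supplied query, and `naive_ws_url` is an illustrative name, not plugin code.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlencode, urlsplit


def naive_ws_url(endpoint_url: str, config: dict[str, object]) -> str:
    """Plain concatenation, as in the line under review."""
    return f"{endpoint_url}?{urlencode(config)}"


# Hypothetical endpoint that already carries a query string.
url = naive_ws_url("wss://streaming.assemblyai.com/v3/ws?region=eu", {"sample_rate": 16000})

# The second "?" is read as part of the first parameter's value,
# so sample_rate never arrives as its own parameter.
params = parse_qsl(urlsplit(url).query)
print(url)
print(params)
```

Here the parser sees one parameter, `region`, whose value swallows the rest of the string, which is the failure mode the merge-based suggestion avoids.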
| keyterms_prompt: NotGivenOr[list[str]] = NOT_GIVEN, | ||
| http_session: aiohttp.ClientSession | None = None, | ||
| buffer_size_seconds: float = 0.05, | ||
| endpoint_url: str = "wss://streaming.assemblyai.com/v3/ws", |
Do you want this to be a base_url instead? Do you expect the /v3/ws path to be modifiable?
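For comparison, a `base_url` variant might look like the sketch below. It is only one possible shape and an assumption on my part: the PR does not settle this question, and `endpoint_from_base_url` and `STREAMING_PATH` are hypothetical names. Only the `/v3/ws` path comes from the default in the diff.

```python
from __future__ import annotations

from urllib.parse import urlsplit, urlunsplit

# Path taken from the default URL in the diff; fixed (not configurable) in this sketch.
STREAMING_PATH = "/v3/ws"


def endpoint_from_base_url(base_url: str) -> str:
    """Append the fixed streaming path to a caller-supplied base URL."""
    split = urlsplit(base_url)
    # Strip a trailing slash so the joined path does not contain "//".
    path = split.path.rstrip("/") + STREAMING_PATH
    return urlunsplit((split.scheme, split.netloc, path, split.query, split.fragment))
```

The trade-off is that callers can change the host but not the API version path, whereas `endpoint_url` as written in the PR leaves the whole URL under the caller's control.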