
feat: blog twilio + vonage + Gladia Solaria + Flask/FastAPI + Python + Go#65

Open
jqueguiner wants to merge 9 commits into main from feat/blogs

@jqueguiner jqueguiner commented May 7, 2025

✨ (twilio-solaria-python-flask): add a new blog post on real-time Twilio call transcription using Flask and Gladia's STT API

📝 (twilio-solaria-python-flask): create .gitignore file to exclude environment and virtual environment files
📝 (twilio-solaria-python-flask): add README.md for project setup and usage instructions
📝 (twilio-solaria-python-flask): add env_setup.txt for environment variable setup instructions
✅ (twilio-solaria-python-flask): add requirements.txt for project dependencies
♻️ (twilio-solaria-python-flask): implement server.py for handling WebSocket connections and transcription logic
📝 (twilio-solaria-python-flask): add TwiML example for configuring Twilio to stream audio to the server

Summary by CodeRabbit

  • New Features
    • Added detailed tutorials and implementations for real-time call transcription using Vonage's Voice APIs with FastAPI and Gladia’s speech-to-text API.
    • Introduced Python FastAPI servers that proxy Vonage audio streams to Gladia’s API with native μ-law and linear PCM audio support.
    • Provided Vonage NCCO examples to configure call audio streaming to transcription proxy endpoints.
  • Documentation
    • Included comprehensive README files and setup instructions for Vonage projects covering environment setup, dependencies, and usage.
    • Added environment setup files and requirements for Vonage Python FastAPI projects.
    • Supplied Vonage NCCO XML configuration examples for integration.

@jqueguiner jqueguiner self-assigned this May 7, 2025
coderabbitai Bot commented May 7, 2025

Walkthrough

Multiple new projects were introduced for real-time transcription of Twilio calls using Gladia's Speech-to-Text API with native μ-law audio support. These include implementations in Python (Flask and FastAPI), Go, JavaScript, and TypeScript. Each project adds comprehensive documentation, environment setup instructions, dependency specifications, WebSocket proxy servers to handle Twilio audio streams, example TwiML configurations, and .gitignore files.

Changes

File(s) Change Summary
blogs/twilio-solaria-python-flask/.gitignore Added .gitignore to exclude .env and .venv from version control.
blogs/twilio-solaria-python-flask/blog.md Added a detailed blog post explaining the architecture, setup, and step-by-step guide for building a Twilio-to-Gladia real-time transcription proxy with Flask and Python. Introduces and documents new Python functions, WebSocket handlers, and the overall workflow.
blogs/twilio-solaria-python-flask/src/README.md Added project README with prerequisites, setup instructions, workflow explanation, and next steps for extending the solution.
blogs/twilio-solaria-python-flask/src/env_setup.txt Added environment setup instructions specifying required and optional environment variables for running the server.
blogs/twilio-solaria-python-flask/src/requirements.txt Added Python dependencies for Flask, Flask-Sock, Websockets, Requests, python-dotenv, and Greenlet with specific versions.
blogs/twilio-solaria-python-flask/src/server.py Added Flask application with WebSocket support to proxy Twilio audio streams to Gladia for transcription. Implements session creation, audio forwarding, transcript handling, health check, and error logging. Introduces multiple new functions and WebSocket routes.
blogs/twilio-solaria-python-flask/src/twiml_example.xml Added example TwiML XML file to configure Twilio to start a media stream to the proxy and continue the call flow by dialing a number.
blogs/twilio-solaria-go/.gitignore Added .gitignore to exclude .env and .venv from version control.
blogs/twilio-solaria-go/blog.md Added a detailed blog post explaining real-time Twilio call transcription using Go and Gladia's API, including session creation, WebSocket proxy server implementation with goroutines, message processing, error handling, and TwiML configuration.
blogs/twilio-solaria-go/src/README.md Added project README with prerequisites, setup instructions, workflow explanation, and next steps for extending the Go solution.
blogs/twilio-solaria-go/src/env_setup.txt Added environment setup instructions for the Go project, including .env file creation and environment variable configuration for Gladia API key and HTTP port.
blogs/twilio-solaria-go/src/go.mod Added Go module file specifying dependencies on Gorilla WebSocket and godotenv packages with Go version 1.21.
blogs/twilio-solaria-go/src/main.go Added Go server implementing a WebSocket proxy bridging Twilio media streams to Gladia's transcription API, including session creation, concurrent message handling with goroutines, health check, and error logging.
blogs/twilio-solaria-go/src/twiml_example.xml Added example TwiML XML file to configure Twilio to start a media stream to the Go server and continue the call flow by dialing a number.
blogs/twilio-solaria-python-fastapi/.gitignore Added .gitignore to exclude .env and .venv from version control.
blogs/twilio-solaria-python-fastapi/blog.md Added a detailed blog post explaining real-time Twilio call transcription using FastAPI and Gladia's API, covering architecture, session creation, WebSocket proxy implementation, TwiML configuration, and testing instructions.
blogs/twilio-solaria-python-fastapi/src/README.md Added project README with prerequisites, setup instructions, workflow explanation, and next steps for extending the FastAPI solution.
blogs/twilio-solaria-python-fastapi/src/env_setup.txt Added environment setup instructions specifying required and optional environment variables for running the FastAPI server.
blogs/twilio-solaria-python-fastapi/src/requirements.txt Added Python dependencies for FastAPI, Uvicorn, Websockets, Requests, python-dotenv, and Greenlet with specific versions.
blogs/twilio-solaria-python-fastapi/src/server.py Added FastAPI server implementing WebSocket proxy for Twilio audio streams to Gladia's transcription API, including session creation, message forwarding, transcript handling, health check, and error logging. Introduces multiple new functions and WebSocket routes.
blogs/twilio-solaria-python-fastapi/src/twiml_example.xml Added example TwiML XML file to configure Twilio to start a media stream to the FastAPI server and continue the call flow by dialing a number.
blogs/twilio-solaria-javascript/.gitignore Added .gitignore to exclude node_modules, log files, environment files, and runtime data files from version control.
blogs/twilio-solaria-javascript/blog.md Added a detailed blog post explaining real-time Twilio call transcription using JavaScript and Gladia's API, including session creation, WebSocket proxy server implementation, message processing, error handling, and TwiML configuration.
blogs/twilio-solaria-javascript/package.json Added package.json defining project metadata, dependencies (dotenv, node-fetch, ws), scripts, and engine requirements for the JavaScript project.
blogs/twilio-solaria-javascript/src/README.md Added project README with prerequisites, setup instructions, workflow explanation, and next steps for extending the JavaScript solution.
blogs/twilio-solaria-javascript/src/env_setup.txt Added environment setup instructions specifying required and optional environment variables for running the JavaScript server.
blogs/twilio-solaria-javascript/src/main.js Added Node.js server implementing a WebSocket proxy bridging Twilio media streams to Gladia's transcription API, including session creation, message forwarding, transcript handling, health check, and error logging. Introduces multiple new functions for session management and WebSocket handling.
blogs/twilio-solaria-javascript/src/twiml_example.xml Added example TwiML XML file to configure Twilio to start a media stream to the JavaScript server and continue the call flow by dialing a number.
blogs/twilio-solaria-typescript/.gitignore Added .gitignore to exclude node_modules, dist, environment files, logs, editor configs, and OS-specific files from version control.
blogs/twilio-solaria-typescript/blog.md Added a detailed blog post explaining real-time Twilio call transcription using TypeScript and Gladia's API, including session creation, WebSocket proxy server implementation, message processing, error handling, TwiML configuration, and testing instructions.
blogs/twilio-solaria-typescript/package.json Added package.json defining project metadata, scripts, dependencies, and devDependencies for the TypeScript project.
blogs/twilio-solaria-typescript/src/README.md Added project README with prerequisites, setup instructions, workflow explanation, and next steps for extending the TypeScript solution.
blogs/twilio-solaria-typescript/src/app/gladiaClient.ts Added TypeScript module to create a Gladia live transcription session with robust error handling and timeout.
blogs/twilio-solaria-typescript/src/app/handlers.ts Added TypeScript module with functions to process Twilio WebSocket messages and handle Gladia transcription messages, including base64 decoding and transcript extraction.
blogs/twilio-solaria-typescript/src/app/server.ts Added TypeScript server implementing a WebSocket proxy bridging Twilio media streams to Gladia's transcription API, with session management, bidirectional message forwarding, error handling, and graceful shutdown.
blogs/twilio-solaria-typescript/src/app/types.ts Added TypeScript interfaces defining data structures for Gladia session, Twilio messages, and Gladia messages.
blogs/twilio-solaria-typescript/src/env_setup.txt Added environment setup instructions specifying required and optional environment variables for running the TypeScript server.
blogs/twilio-solaria-typescript/src/twiml_example.xml Added example TwiML XML file to configure Twilio to start a media stream to the TypeScript server and continue the call flow by dialing a number.
blogs/twilio-solaria-typescript/tsconfig.json Added TypeScript configuration file specifying compiler options targeting ES2022 and NodeNext, with strict type checking and module interoperability.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Twilio
    participant ProxyServer
    participant GladiaAPI

    Caller->>Twilio: Initiates call
    Twilio->>ProxyServer: Opens WebSocket /media, streams base64 μ-law audio
    ProxyServer->>GladiaAPI: Creates transcription session (HTTP POST)
    ProxyServer->>GladiaAPI: Forwards decoded raw audio (WebSocket)
    GladiaAPI-->>ProxyServer: Sends transcription results (WebSocket)
    ProxyServer-->>Console: Prints transcripts

Suggested reviewers

  • tnesztler

Poem

🐇 In the burrow of code, I hop with delight,
Streaming Twilio calls through the day and the night.
Flask, FastAPI, Go, or JavaScript too,
Gladia transcribes with a speed that's true.
Base64 to μ-law, the bytes swiftly flow,
From voice to text, watch the transcripts grow!
Hop, hop, hop — to the future we go! ✨🐰

@jqueguiner jqueguiner requested a review from sboudouk May 7, 2025 22:04
Comment thread blogs/twilio-solaria-python-flask/src/server.py Dismissed
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (10)
blogs/twilio-solaria-python-flask/.gitignore (1)

1-2: Enhance .gitignore with standard Python patterns.
Ignoring .env and .venv is great. To prevent tracking compiled files and cache artifacts, consider adding:

 .env
 .venv
+__pycache__/
+*.py[cod]
+*.egg-info/
+venv/
blogs/twilio-solaria-python-flask/src/env_setup.txt (1)

1-7: Clarify setup order and dotenv loading.
Good instructions on creating the .env. To streamline onboarding, you might remind users to install dependencies first (pip install -r requirements.txt) and note that calling load_dotenv() in server.py (via python-dotenv) will automatically load these variables at runtime.

blogs/twilio-solaria-python-flask/src/README.md (4)

1-4: Add working-directory context.
Since all commands assume you’re in src (where server.py and requirements.txt live), add a note at the top:

+# Navigate to the project source directory
+```bash
+cd blogs/twilio-solaria-python-flask/src
+```

This ensures relative paths resolve correctly.


14-32: Simplify Python environment setup.
The detailed pyenv workflow might be overkill for users with Python 3.8+. Consider replacing it with a standard venv flow:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

This lowers the barrier for contributors who don’t use pyenv.


33-38: Include optional HTTP_PORT example.
Your env_setup.txt shows an optional HTTP_PORT, but the README’s environment step omits it. To maintain consistency, include:

GLADIA_API_KEY=your_gladia_api_key_here
# HTTP_PORT=5001

This helps users customize the server port.


63-69: Emphasize secure WebSocket scheme for ngrok.
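Twilio Media Streams require a wss:// URL. ngrok http exposes an HTTPS endpoint, so state explicitly that the WebSocket URL must use the wss:// scheme on that same host. Note that --bind-tls is ngrok v2 syntax; ngrok v3 replaced it with --scheme. For example: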

ngrok http 5000 --bind-tls=true
# and then connect via wss://<your-ngrok-id>.ngrok.io/media
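
The scheme swap described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the PR: the helper name `to_wss_url` and the default `/media` path are assumptions (the path is taken from the example URL above).

```python
from urllib.parse import urlsplit, urlunsplit


def to_wss_url(public_url: str, path: str = "/media") -> str:
    """Turn the public URL printed by a tunnel into the wss:// URL for TwiML."""
    parts = urlsplit(public_url)
    if parts.scheme not in ("http", "https", "ws", "wss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("public_url must include a host")
    # Always emit wss://, whatever scheme the tunnel printed.
    return urlunsplit(("wss", parts.netloc, path, "", ""))


print(to_wss_url("https://abc123.ngrok.io"))  # wss://abc123.ngrok.io/media
```

Generating the URL this way keeps the README, the TwiML file, and the running tunnel from drifting apart when the ngrok subdomain changes.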
blogs/twilio-solaria-python-flask/blog.md (2)

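181-182: Article before “ngrok”: “an ngrok URL” is already correct
LanguageTool flags “an ngrok URL”, but “ngrok” is pronounced “en-grok”, which starts with a vowel sound, so “an” is the right article and no change is needed. The diff below is a no-op: its removed and added lines are identical.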

- | **Public URL**                            | Expose a WebSocket endpoint with ngrok or a cloud VM.                |
- | **8 kHz, 8-bit μ-law audio**              | Exactly what Twilio streams – and what Gladia now consumes natively. |
+ | **Public URL**                            | Expose a WebSocket endpoint with ngrok or a cloud VM.                |
+ | **8 kHz, 8-bit μ-law audio**              | Exactly what Twilio streams – and what Gladia now consumes natively. |
🧰 Tools
🪛 LanguageTool

[misspelling] ~181-~181: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


172-190: TwiML exposition: bullet punctuation renders incorrectly in Markdown
The leading “- ” inside paragraphs creates “loose punctuation” warnings and renders as plain hyphens instead of list items. Convert the descriptive lines to a proper unordered list for readability.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~174-~174: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~176-~176: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~178-~178: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~181-~181: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~185-~185: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)

blogs/twilio-solaria-python-flask/src/server.py (2)

152-155: Suppress exceptions more cleanly & avoid bare except
Use contextlib.suppress(Exception) or catch specific exceptions to satisfy linters (SIM105/E722) and avoid accidental masking of critical errors.

-        try:
-            await gladia_ws.close()
-        except:
-            pass
+        import contextlib
+        with contextlib.suppress(Exception):
+            await gladia_ws.close()
🧰 Tools
🪛 Ruff (0.8.2)

152-155: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


154-154: Do not use bare except

(E722)
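
The suggested diff places `import contextlib` inside the function body; the import would more conventionally sit at module level. A minimal sketch of the same cleanup, assuming a hypothetical `FlakySocket` stand-in for the Gladia WebSocket (it is not part of the PR):

```python
import asyncio
import contextlib


class FlakySocket:
    """Hypothetical stand-in for a WebSocket whose close() fails."""

    def __init__(self) -> None:
        self.close_attempted = False

    async def close(self) -> None:
        self.close_attempted = True
        raise ConnectionError("peer already gone")


async def close_quietly(ws) -> None:
    # Suppress only Exception subclasses. KeyboardInterrupt and
    # asyncio.CancelledError (a BaseException since Python 3.8) still propagate.
    with contextlib.suppress(Exception):
        await ws.close()


sock = FlakySocket()
asyncio.run(close_quietly(sock))
print(sock.close_attempted)  # True
```

Unlike a bare `except:`, this form does not swallow task cancellation, which matters when the proxy is shutting down.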


160-167: Creating a new event loop per connection is heavy & error-prone
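Spawning an event loop in each thread complicates shutdown and wastes resources. Flask-Sock is a WSGI extension, so it cannot simply be served by an ASGI server such as Hypercorn or Uvicorn. Either run one long-lived background event loop and submit coroutines to it with asyncio.run_coroutine_threadsafe, or move the proxy to an ASGI framework (e.g., FastAPI or Quart) and use the single loop its server provides.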
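
One way to avoid a loop per connection while staying on synchronous handlers is a single background loop shared by every thread. The sketch below is an assumption about how that could look, not the PR's code; `BackgroundLoop` and `forward` are hypothetical names, and `forward` only stands in for "send this frame upstream".

```python
import asyncio
import threading
from typing import Optional


class BackgroundLoop:
    """One asyncio event loop running in a daemon thread for the whole process."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout: Optional[float] = None):
        """Schedule a coroutine on the shared loop and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def stop(self) -> None:
        # Ask the loop to stop from outside its thread, then clean up.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def forward(frame: bytes) -> int:
    # Placeholder for sending the frame to the upstream WebSocket.
    await asyncio.sleep(0)
    return len(frame)


shared = BackgroundLoop()
sent = shared.run(forward(b"\xff" * 160), timeout=5)
shared.stop()
print(sent)  # 160
```

Each WebSocket handler thread would call `shared.run(...)` instead of creating its own loop, so shutdown reduces to a single `stop()` call.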

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9047b33 and 1a888a0.

📒 Files selected for processing (7)
  • blogs/twilio-solaria-python-flask/.gitignore (1 hunks)
  • blogs/twilio-solaria-python-flask/blog.md (1 hunks)
  • blogs/twilio-solaria-python-flask/src/README.md (1 hunks)
  • blogs/twilio-solaria-python-flask/src/env_setup.txt (1 hunks)
  • blogs/twilio-solaria-python-flask/src/requirements.txt (1 hunks)
  • blogs/twilio-solaria-python-flask/src/server.py (1 hunks)
  • blogs/twilio-solaria-python-flask/src/twiml_example.xml (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
blogs/twilio-solaria-python-flask/src/server.py

152-155: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


154-154: Do not use bare except

(E722)


171-174: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


173-173: Do not use bare except

(E722)

🪛 ast-grep (0.31.1)
blogs/twilio-solaria-python-flask/src/server.py

[warning] 183-183: Running flask app with host 0.0.0.0 could expose the server publicly.
Context: app.run(host="0.0.0.0", port=HTTP_PORT, debug=True)
Note: [CWE-668]: Exposure of Resource to Wrong Sphere [OWASP A01:2021]: Broken Access Control [REFERENCES]
https://owasp.org/Top10/A01_2021-Broken_Access_Control

(avoid_app_run_with_bad_host-python)


[warning] 183-183: Detected Flask app with debug=True. Do not deploy to production with this flag enabled as it will leak sensitive information. Instead, consider using Flask configuration variables or setting 'debug' using system environment variables.
Context: app.run(host="0.0.0.0", port=HTTP_PORT, debug=True)
Note: [CWE-489] Active Debug Code. [REFERENCES]
- https://labs.detectify.com/2015/10/02/how-patreon-got-hacked-publicly-exposed-werkzeug-debugger/

(debug-enabled-python)

🪛 GitHub Check: CodeQL
blogs/twilio-solaria-python-flask/src/server.py

[failure] 184-184: Flask app is run in debug mode
A Flask app appears to be run in debug mode. This may allow an attacker to run arbitrary code through the debugger.
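
The ast-grep and CodeQL findings all point at the same call, `app.run(host="0.0.0.0", port=HTTP_PORT, debug=True)`. A minimal sketch of one way to address them is to derive those arguments from the environment with safe defaults. `HTTP_PORT` comes from the PR; `HTTP_HOST`, the default port of 5000, and the helper name `run_options` are assumptions made for illustration.

```python
import os


def run_options(env=os.environ) -> dict:
    """Derive app.run() keyword arguments from environment variables."""
    debug = env.get("FLASK_DEBUG", "0").lower() in ("1", "true", "yes")
    return {
        # Bind to loopback unless the operator opts in to another interface.
        "host": env.get("HTTP_HOST", "127.0.0.1"),
        "port": int(env.get("HTTP_PORT", "5000")),
        # Debug stays off unless explicitly enabled for local development.
        "debug": debug,
    }


print(run_options({}))
```

The server would then start with `app.run(**run_options())`. A tunnel such as ngrok running on the same machine can still reach a loopback-bound server.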

🔇 Additional comments (2)
blogs/twilio-solaria-python-flask/src/requirements.txt (1)

1-6: Verify the Flask package version (✅ verification successful).
You’ve pinned flask==3.1.0. Flask 3.x is a current release series, so the pin is plausible; the script below confirms that this exact version exists on PyPI.


🏁 Script executed:

#!/usr/bin/env bash
# Check if Flask 3.1.0 exists on PyPI
curl -s https://pypi.org/pypi/Flask/json | jq '.releases | has("3.1.0")'

Length of output: 79


Flask 3.1.0 is available on PyPI
Confirmed that flask==3.1.0 exists on PyPI, so the version pin is correct.

blogs/twilio-solaria-python-flask/src/twiml_example.xml (1)

1-9: TwiML example is correct.
The <Start><Stream> and <Dial> verbs align with Twilio Media Streams. Remember to replace the url placeholder (wss://jl.mydomain.com/media) and the phone number with your actual endpoint and destination before deploying.

Comment thread blogs/twilio-solaria-python-flask/src/server.py
Comment thread blogs/twilio-solaria-python-flask/src/server.py
Comment thread blogs/twilio-solaria-python-flask/src/server.py
Comment thread blogs/twilio-solaria-python-flask/src/server.py
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (9)
blogs/twilio-solaria-python-flask/blog.md (9)

1-6: Refine Title and Intro Formatting

Consider replacing the ampersand (&) with “and” for clarity and accessibility in the title. You may also clarify the parenthetical “(μ-law Native)” phrasing to improve readability.


9-18: Enhance Prerequisites Table

It would be helpful to specify exact version requirements for key dependencies (flask-sock, websockets, etc.) and note any required Twilio SDK or CLI versions. This will reduce friction when readers set up the environment.


21-56: Clarify Session Initialization Return Values

The create_session() snippet prints the session ID and returns the WebSocket URL, but readers may wonder how to capture and reuse both values. Consider expanding the example to show assigning data["id"] and the returned URL to variables (or a dict) so they can be used downstream.
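
A minimal sketch of returning both values, under stated assumptions: the endpoint URL and the `X-Gladia-Key` header name are assumptions rather than details confirmed in this thread, the request body uses only the three parameters this review mentions, and `urllib` is swapped in for the blog's `requests` so the block has no third-party dependency. `GladiaSession` and `session_from_response` are hypothetical names.

```python
import json
import urllib.request
from dataclasses import dataclass

GLADIA_LIVE_URL = "https://api.gladia.io/v2/live"  # assumed endpoint


@dataclass(frozen=True)
class GladiaSession:
    id: str
    url: str


def session_from_response(data: dict) -> GladiaSession:
    """Keep both fields of the session payload instead of only the URL."""
    return GladiaSession(id=data["id"], url=data["url"])


def create_session(api_key: str) -> GladiaSession:
    body = json.dumps(
        {"encoding": "wav/ulaw", "bit_depth": 8, "sample_rate": 8000}
    ).encode()
    request = urllib.request.Request(
        GLADIA_LIVE_URL,
        data=body,
        method="POST",
        # Header name is an assumption; check the Gladia API reference.
        headers={"Content-Type": "application/json", "X-Gladia-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return session_from_response(json.load(response))


# Offline demonstration with a made-up payload (no network call).
session = session_from_response({"id": "demo-id", "url": "wss://example.invalid/live"})
print(session.id, session.url)
```

Callers can then keep `session.id` for the post-call lookup and pass `session.url` to the WebSocket client.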


62-92: Improve Mermaid Diagram Clarity

The sequence diagram effectively illustrates flow, but adding labels for the WebSocket URL and the “media” event would boost comprehension. Also verify that your blogging platform supports Mermaid; if not, include a rendered image or fallback code block.


138-143: Persist Gladia Session Details

You call create_session() at startup but don’t store its return value. To handle reconnections or expiration, assign both the id and the URL into the gladia_session dict (or similar) so you can reuse or refresh them as needed.


194-199: Mention TwiML Content-Type Requirement

Add a reminder that the TwiML endpoint must return Content-Type: application/xml for Twilio to correctly parse the instructions. This small detail can prevent runtime errors.
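
As an illustration of the Content-Type point, the sketch below builds the TwiML with the standard library and returns a `(body, status, headers)` tuple, a shape Flask accepts from a view function. The helper names, the example URL, and the phone number are placeholders, not values from the PR.

```python
import xml.etree.ElementTree as ET


def build_twiml(stream_url: str, dial_number: str) -> str:
    """Build <Response><Start><Stream/></Start><Dial>...</Dial></Response>."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": stream_url})
    ET.SubElement(response, "Dial").text = dial_number
    declaration = '<?xml version="1.0" encoding="UTF-8"?>'
    return declaration + ET.tostring(response, encoding="unicode")


def twiml_response(stream_url: str, dial_number: str):
    """Return (body, status, headers) with an explicit XML content type."""
    body = build_twiml(stream_url, dial_number)
    return body, 200, {"Content-Type": "application/xml"}


body, status, headers = twiml_response("wss://example.invalid/media", "+15550100")
print(headers["Content-Type"])
print(body)
```

Building the document with ElementTree also escapes special characters in the URL or number, which hand-written string templates do not.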


231-239: Document .env File Setup

Since .env is git-ignored, explicitly instruct readers to create it with GLADIA_API_KEY (and optional HTTP_PORT) before running pip install and python server.py.
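
A fail-fast check can make a missing .env obvious at startup. The sketch below assumes the variables have already been loaded into the process environment (for example by python-dotenv's `load_dotenv()`); the `Settings` class, the `load_settings` helper, and the default port of 5000 are assumptions for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    gladia_api_key: str
    http_port: int


def load_settings(env=os.environ) -> Settings:
    """Read required and optional settings, failing early if the key is missing."""
    api_key = env.get("GLADIA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GLADIA_API_KEY is not set; create a .env file or export it first"
        )
    return Settings(gladia_api_key=api_key, http_port=int(env.get("HTTP_PORT", "5000")))


print(load_settings({"GLADIA_API_KEY": "demo-key", "HTTP_PORT": "5001"}))
```

Raising at import time surfaces the problem before Twilio ever connects, rather than on the first failed session request.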


260-265: Add Post-Call Transcript Sample

You mention fetching full transcripts via GET /v2/live/:id. Consider including a brief requests.get example demonstrating how to call that endpoint and process the JSON response.
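
A hedged sketch of that lookup follows. The base URL and the `X-Gladia-Key` header name are assumptions, `urllib` stands in for `requests` to keep the block dependency-free, and the response is returned as a plain dict because its exact schema is not shown in this thread.

```python
import json
import urllib.request

GLADIA_LIVE_URL = "https://api.gladia.io/v2/live"  # assumed base URL


def build_result_request(session_id: str, api_key: str) -> urllib.request.Request:
    """Prepare GET /v2/live/:id for a finished session."""
    return urllib.request.Request(
        f"{GLADIA_LIVE_URL}/{session_id}",
        method="GET",
        # Header name is an assumption; check the Gladia API reference.
        headers={"X-Gladia-Key": api_key},
    )


def fetch_result(session_id: str, api_key: str) -> dict:
    """Perform the request and decode the JSON body (requires network access)."""
    request = build_result_request(session_id, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = build_result_request("demo-id", "demo-key")
print(request.get_method(), request.full_url)
```

The session ID passed here is the `id` returned when the live session was created, which is one more reason to keep it alongside the WebSocket URL.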


269-272: Link to Example Repository

At the wrap-up, include a direct link to the GitHub repo or branch so readers can easily clone and run the complete example end-to-end.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a888a0 and 1b2b57e.

📒 Files selected for processing (1)
  • blogs/twilio-solaria-python-flask/blog.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
blogs/twilio-solaria-python-flask/blog.md

[uncategorized] ~208-~208: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~210-~210: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~212-~212: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~215-~215: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~219-~219: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)

Comment thread blogs/twilio-solaria-python-flask/blog.md
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
blogs/twilio-solaria-python-flask/blog.md (1)

105-151: 🛠️ Refactor suggestion

Include handle_websocket implementation and correct session storage

  1. The snippet invokes handle_websocket(ws) but that function’s code isn’t shown—readers need it to understand how Twilio frames are forwarded to Gladia.
  2. create_session() returns a WebSocket URL but isn’t assigned to gladia_session["url"], so the Gladia connection can’t be established later. Please capture and store the returned URL.
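
The core of such a handler is small. As a minimal sketch (not the PR's `handle_websocket`), the function below extracts the raw μ-law bytes from a Twilio Media Streams message; Twilio sends JSON text frames whose `media` events carry a base64 payload. The name `audio_from_twilio` is hypothetical.

```python
import base64
import json
from typing import Optional


def audio_from_twilio(message: str) -> Optional[bytes]:
    """Return raw audio bytes for a "media" event, or None for other events."""
    event = json.loads(message)
    if event.get("event") != "media":
        return None  # "connected", "start", "stop", ... carry no audio
    return base64.b64decode(event["media"]["payload"])


frame = json.dumps(
    {"event": "media", "media": {"payload": base64.b64encode(b"\xff\x7f").decode()}}
)
print(audio_from_twilio(frame))  # b'\xff\x7f'
print(audio_from_twilio('{"event": "start"}'))  # None
```

Inside the handler, each non-None result would be sent to the Gladia WebSocket as a binary frame, using the URL stored from `create_session()`.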
🧹 Nitpick comments (2)
blogs/twilio-solaria-python-flask/blog.md (2)

1-1: Use a single # for the main title and hyphenate “Real-Time”
Currently the title is written as a level-2 heading (##) and uses “Real Time” without a hyphen. For consistency and accessibility, consider:

# How to Transcribe Twilio Calls in Real-Time with Flask, Python & Gladia (μ-law Native)

2-4: Refine the opening paragraph for clarity and flow
The introduction is informative but could be tightened. For example:

Twilio’s Voice Media Streams deliver 8 kHz, 8-bit μ-law audio. Gladia’s real-time STT API ingests it natively—no resampling or decoding required—while maintaining sub-300 ms latency.

This version is slightly shorter and emphasizes the key takeaway.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b2b57e and 5aae0cb.

📒 Files selected for processing (1)
  • blogs/twilio-solaria-python-flask/blog.md (1 hunks)
🔇 Additional comments (10)
blogs/twilio-solaria-python-flask/blog.md (10)

5-5: Approve the /v2/live endpoint description
The explanation of Gladia’s encoding: "wav/ulaw", bit_depth: 8, and sample_rate: 8000 parameters is accurate and succinct.


9-18: Approve prerequisites table
The table clearly conveys all required components (Gladia API key, Twilio account, Python 3.12+, etc.) along with their rationale. Markdown formatting is correct.


58-59: Approve the “Why no resample / decode?” note
This callout effectively emphasizes the performance advantage of forwarding raw μ-law frames.


67-92: Approve system architecture Mermaid diagram
The sequenceDiagram is well-formed, the participants are clear, and the flow accurately represents the end-to-end interaction.


194-204: Approve TwiML <Start><Stream> example
The XML is valid, the comments annotate each element clearly, and it matches Twilio’s requirements for secure WebSocket streams.


206-224: Approve TwiML explanation bullets
Each element (<Response>, <Start>, <Stream>, <Dial>) is described accurately and with appropriate detail.


231-243: Approve “Expose & test” instructions
The shell commands for installing dependencies, running the proxy, and tunneling (ngrok) are clear and ready to copy/paste.


248-256: Approve sample output
The console-log snippet demonstrates the expected live-transcription output, which will help readers validate their setup.


260-266: Approve next steps suggestions
The proposed extensions (add-ons, dual channels, post-call JSON, scaling) are on point and encourage readers to explore further.


269-273: Approve wrap-up and emphasis on simplicity
The closing statement succinctly reinforces that live transcription is now literally “base64-decode and forward.” Great call-to-action.

Comment thread blogs/twilio-solaria-python-flask/blog.md
Comment thread blogs/twilio-solaria-python-flask/blog.md
@jqueguiner jqueguiner changed the title feat: blog twilio + Gladia Solaria + Flask + Python feat: blog twilio + Gladia Solaria + Flask/FastAPI + Python May 7, 2025
jqueguiner added 2 commits May 8, 2025 00:17
…iption of Twilio calls using FastAPI and Gladia

📝 (twilio-solaria-python-fastapi): create README and environment setup instructions for the project
🔧 (twilio-solaria-python-fastapi): add .gitignore file to exclude environment and virtual environment files
✅ (twilio-solaria-python-fastapi): include requirements.txt for project dependencies
♻️ (twilio-solaria-python-fastapi): implement server.py for handling WebSocket connections and transcription logic
🌐 (twilio-solaria-python-fastapi): add TwiML example for configuring Twilio to stream audio to the server
…ilio calls using Go and Gladia, along with environment setup instructions, .gitignore, and necessary Go module files
@jqueguiner jqueguiner changed the title feat: blog twilio + Gladia Solaria + Flask/FastAPI + Python feat: blog twilio + Gladia Solaria + Flask/FastAPI + Python + Go May 8, 2025
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (19)
blogs/twilio-solaria-go/src/README.md (3)

5-11: Improve prerequisites bullet phrasing
The second bullet reads “Twilio account + voice-enabled number.” For clarity and consistency, consider rephrasing to explicitly mention “a Twilio account with a voice-enabled phone number.”

- - **Twilio account + voice-enabled number**
+ - **A Twilio account with a voice-enabled phone number**

29-34: Clarify the location of twiml_example.xml
It may not be obvious where to find twiml_example.xml. Recommend adding a relative path or a link to the file in the repository.

- - Use the contents of `twiml_example.xml` as your TwiML
+ - Use the contents of `src/twiml_example.xml` (or link to its path) as your TwiML

35-40: Enhance technical notes grammar
Missing an article in the first bullet. Consider adding “a” before “standard Go HTTP server.”

- - The server uses standard Go HTTP server with gorilla/websocket for WebSocket support
+ - The server uses a standard Go HTTP server with gorilla/websocket for WebSocket support
🧰 Tools
🪛 LanguageTool

[uncategorized] ~37-~37: You might be missing the article “a” here.
Context: ... ## Technical Notes - The server uses standard Go HTTP server with gorilla/websocket f...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

blogs/twilio-solaria-python-fastapi/src/server.py (3)

114-128: Tight 1 ms polling loop can peg a CPU core

Calling await asyncio.wait_for(gladia_ws.recv(), 0.001) inside a while True loop results in up to 1,000 polling iterations per second when no data is available, needlessly burning CPU.

Consider either:

  1. Running a separate listener task that does a blocking await gladia_ws.recv() and await websocket.send_text() (if needed), or
  2. Increasing the timeout to a sensible value (e.g. 0.1 – 0.25 s) and breaking the inner while after one successful receive.

This keeps latency low while preventing a busy-wait scenario.
[performance]
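A minimal sketch of option 1, assuming a stand-in socket class (`FakeGladiaSocket`, hypothetical) and a hypothetical `{"text": ...}` message shape rather than Gladia's real payload. The listener task awaits `recv()` directly, so it is suspended while idle instead of waking every millisecond:

```python
import asyncio
import json


class FakeGladiaSocket:
    """Hypothetical stand-in for the Gladia WebSocket; only recv() is modelled."""

    def __init__(self, messages):
        self._queue = asyncio.Queue()
        for message in messages:
            self._queue.put_nowait(message)

    async def recv(self):
        message = await self._queue.get()
        if message is None:  # sentinel used here to simulate the socket closing
            raise ConnectionError("socket closed")
        return message


async def listen_for_transcripts(gladia_ws, on_transcript):
    """Await recv() directly instead of polling with a 1 ms timeout."""
    try:
        while True:
            raw = await gladia_ws.recv()  # task is suspended while no data arrives
            on_transcript(json.loads(raw))
    except ConnectionError:
        pass  # the socket closed, so the listener ends


async def demo():
    ws = FakeGladiaSocket(['{"text": "hello"}', '{"text": "world"}', None])
    received = []
    # In the proxy this task would run alongside the Twilio -> Gladia forwarding loop.
    listener = asyncio.create_task(listen_for_transcripts(ws, received.append))
    await listener
    return received
```

In the real server the same pattern would use the actual Gladia connection and catch the WebSocket library's own connection-closed exception.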


151-154: Replace bare except … pass with contextlib.suppress or explicit exception handling

A blanket except: masks all exceptions, including KeyboardInterrupt and bugs you’d really want to know about. Ruff (SIM105 / E722) already flags this.

-try:
-    await gladia_ws.close()
-except:
-    pass
+from contextlib import suppress
+
+with suppress(Exception):
+    await gladia_ws.close()
🧰 Tools
🪛 Ruff (0.8.2)

151-154: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


153-153: Do not use bare except

(E722)


71-78: Prefer logger over print() for transcript output

Using logger.info() keeps output format consistent, respects log levels, and allows redirection to structured log sinks.

-        print(f"📝 Transcript: {transcript}")
+        logger.info("📝 Transcript: %s", transcript)
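To illustrate the point about log levels and redirection, here is a small sketch using only the standard `logging` module. The logger name and the `capture` helper are assumptions made for the demonstration, not part of the reviewed server:

```python
import io
import logging

logger = logging.getLogger("transcription-demo")  # hypothetical logger name


def report_transcript(transcript: str) -> None:
    # %-style arguments are formatted lazily, only if INFO is enabled.
    logger.info("📝 Transcript: %s", transcript)


def capture(level: int, transcript: str) -> str:
    """Route the logger to an in-memory sink and return what was written."""
    sink = io.StringIO()
    handler = logging.StreamHandler(sink)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        report_transcript(transcript)
    finally:
        logger.removeHandler(handler)
    return sink.getvalue()
```

With the level set to `logging.INFO` the transcript reaches the sink; with `logging.WARNING` it is filtered out, which `print()` cannot do.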
blogs/twilio-solaria-python-fastapi/blog.md (1)

248-251: Minor wording nit – “an ngrok URL” → “a ngrok URL”

The word “ngrok” begins with a consonant sound, so the indefinite article should be “a”.

-  - The domain should be your public domain (e.g., an ngrok URL or a custom domain).
+  - The domain should be your public domain (e.g., a ngrok URL or a custom domain).
🧰 Tools
🪛 LanguageTool

[misspelling] ~250-~250: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)

blogs/twilio-solaria-go/src/main.go (6)

73-76: Add error handling for w.Write() call

The healthCheck function doesn't check the error returned by w.Write(). Even though write errors are rare in this context, it's good practice to check them.

func healthCheck(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
-	w.Write([]byte(`{"status":"ok","service":"twilio-gladia-transcription"}`))
+	_, err := w.Write([]byte(`{"status":"ok","service":"twilio-gladia-transcription"}`))
+	if err != nil {
+		log.Printf("Error writing health check response: %v", err)
+	}
}
🧰 Tools
🪛 golangci-lint (1.64.8)

75-75: Error return value of w.Write is not checked

(errcheck)


211-215: Consider restricting WebSocket origin if appropriate

The WebSocket upgrader allows connections from any origin (CheckOrigin always returns true). While this might be necessary for your use case with Twilio, consider if a more restrictive policy would be appropriate for security.

// Configure WebSocket upgrader
upgrader := websocket.Upgrader{
-	CheckOrigin: func(r *http.Request) bool { return true },
+	CheckOrigin: func(r *http.Request) bool {
+		// Allow Twilio domains and your allowed domains
+		origin := r.Header.Get("Origin")
+		allowedOrigins := []string{"https://your-domain.com", "https://twilio.com"}
+		for _, allowed := range allowedOrigins {
+			if allowed == origin {
+				return true
+			}
+		}
+		// For development or if you absolutely need to allow all origins
+		// return true
+		log.Printf("Rejected WebSocket connection from origin: %s", origin)
+		return false
+	},
	ReadBufferSize:  1024,
	WriteBufferSize: 1024,
}

64-69: Consider verifying response structure before using

The function assumes the API response will always contain the expected fields. Consider adding validation to ensure the response contains the required fields before using them.

var data GladiaSession
if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
	return GladiaSession{}, fmt.Errorf("failed to decode response: %w", err)
}
+// Validate response data
+if data.ID == "" || data.URL == "" {
+	return GladiaSession{}, fmt.Errorf("invalid session data: missing ID or URL")
+}
log.Printf("🛰  Gladia session ID: %s", data.ID)
return data, nil

239-241: Add error handling for w.Write() call

Similar to the healthCheck function, this code doesn't check the error returned by w.Write().

// For regular HTTP requests to root, return a simple info page
w.Header().Set("Content-Type", "text/plain")
-w.Write([]byte("Twilio-Gladia Transcription Server\n\nAvailable endpoints:\n- /media (WebSocket): Connect Twilio Media Streams\n- /health (HTTP): Health check endpoint"))
+_, err := w.Write([]byte("Twilio-Gladia Transcription Server\n\nAvailable endpoints:\n- /media (WebSocket): Connect Twilio Media Streams\n- /health (HTTP): Health check endpoint"))
+if err != nil {
+	log.Printf("Error writing response: %v", err)
+}
🧰 Tools
🪛 golangci-lint (1.64.8)

240-240: Error return value of w.Write is not checked

(errcheck)


98-116: Consider adding metrics for audio processing

This function processes audio data but doesn't track metrics like the number of audio frames processed or the amount of data. Adding simple counters could help with monitoring the system's performance.

// At the top of the file, with other var declarations
var (
	gladiaAPIKey string
	session      GladiaSession
+	stats struct {
+		audioFramesProcessed int
+		audioBytesSent       int64
+		sync.Mutex
+	}
)

// In the processMessage function
func processMessage(message []byte, gladiaConn *websocket.Conn) {
	// ...existing code...
	if err := gladiaConn.WriteMessage(websocket.BinaryMessage, mulaw); err != nil {
		log.Printf("Error sending to Gladia: %v", err)
	}
+	// Update metrics
+	stats.Lock()
+	stats.audioFramesProcessed++
+	stats.audioBytesSent += int64(len(mulaw))
+	stats.Unlock()
}

// Add a stats endpoint in main()
+http.HandleFunc("/stats", func(w http.ResponseWriter, r *http.Request) {
+	w.Header().Set("Content-Type", "application/json")
+	stats.Lock()
+	data, err := json.Marshal(map[string]interface{}{
+		"audio_frames_processed": stats.audioFramesProcessed,
+		"audio_bytes_sent": stats.audioBytesSent,
+	})
+	stats.Unlock()
+	if err != nil {
+		http.Error(w, err.Error(), http.StatusInternalServerError)
+		return
+	}
+	_, err = w.Write(data)
+	if err != nil {
+		log.Printf("Error writing stats response: %v", err)
+	}
+})

119-131: Consider adding transcript persistence option

Currently, transcripts are only logged. Consider adding an option to persist them (database, file, or webhook) for later retrieval or processing.

If you decide to implement this, you could add configuration through environment variables and implement a simple interface for different storage backends.

// Example interface for transcript storage
type TranscriptStore interface {
    Save(transcript string) error
}

// Example implementation for file storage
type FileStore struct {
    file *os.File
}

func NewFileStore(path string) (*FileStore, error) {
    f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
    if err != nil {
        return nil, err
    }
    return &FileStore{file: f}, nil
}

func (fs *FileStore) Save(transcript string) error {
    _, err := fmt.Fprintf(fs.file, "%s\t%s\n", time.Now().Format(time.RFC3339), transcript)
    return err
}

Then modify handleGladia to use the store when a transcript is finalized.

blogs/twilio-solaria-go/blog.md (6)

130-198: Provide full proxy implementation context
The code snippets illustrate message processing but omit the HTTP route registration and WebSocket upgrade needed to accept Twilio connections. Readers may be unsure how to wire these pieces together. Consider adding a minimal example using http.HandleFunc("/media", ...) with websocket.Upgrader.


184-197: Align function signature with usage
handleGladia returns the final transcript string, but its caller in handleWebSocket ignores this value. Either remove the return value (and log inside) or have the caller utilize it (e.g., broadcast transcripts or send to a channel).


199-247: Enhance cancellation and error propagation
Each goroutine exits on an I/O error but doesn’t signal the other, which can leave wg.Wait() blocked. Consider using a context.Context with cancellation or closing one WebSocket connection upon error to ensure both loops terminate cleanly.


248-275: Show complete server startup
After loading env vars and initializing the session, the snippet doesn’t show how to register routes or start the HTTP server. For a runnable example, append something like:

 func main() {
     // … existing setup …
+    http.HandleFunc("/media", func(w http.ResponseWriter, r *http.Request) {
+        upgrader := websocket.Upgrader{}
+        conn, err := upgrader.Upgrade(w, r, nil)
+        if err != nil {
+            log.Printf("WebSocket upgrade error: %v", err)
+            return
+        }
+        go handleWebSocket(conn)
+    })
+
+    log.Printf("🚀 Starting server on 0.0.0.0:%s", port)
+    if err := http.ListenAndServe(":"+port, nil); err != nil {
+        log.Fatal("Server failed:", err)
+    }
 }

This completes the end-to-end example.


298-316: Nitpick: standardize list formatting
Some bullet items are run onto the end of the previous item, separated only by two spaces, which can render inconsistently across Markdown parsers. Consider removing the trailing spaces and starting each item on its own line:

- ... call flow.  - `<Start>`: ...
+ ... call flow.
+ - `<Start>`: ...

This will improve cross-platform readability.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~300-~300: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~302-~302: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~304-~304: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~307-~307: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~311-~311: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)


321-341: Recommend initializing Go modules
Before running go build, users should initialize a Go module if they haven’t:

go mod init github.com/your-org/your-project
go mod tidy

Including this helps avoid module resolution errors.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5aae0cb and a34b99b.

⛔ Files ignored due to path filters (1)
  • blogs/twilio-solaria-go/src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • blogs/twilio-solaria-go/.gitignore (1 hunks)
  • blogs/twilio-solaria-go/blog.md (1 hunks)
  • blogs/twilio-solaria-go/src/README.md (1 hunks)
  • blogs/twilio-solaria-go/src/env_setup.txt (1 hunks)
  • blogs/twilio-solaria-go/src/go.mod (1 hunks)
  • blogs/twilio-solaria-go/src/main.go (1 hunks)
  • blogs/twilio-solaria-go/src/twiml_example.xml (1 hunks)
  • blogs/twilio-solaria-python-fastapi/.gitignore (1 hunks)
  • blogs/twilio-solaria-python-fastapi/blog.md (1 hunks)
  • blogs/twilio-solaria-python-fastapi/src/README.md (1 hunks)
  • blogs/twilio-solaria-python-fastapi/src/env_setup.txt (1 hunks)
  • blogs/twilio-solaria-python-fastapi/src/requirements.txt (1 hunks)
  • blogs/twilio-solaria-python-fastapi/src/server.py (1 hunks)
  • blogs/twilio-solaria-python-fastapi/src/twiml_example.xml (1 hunks)
✅ Files skipped from review due to trivial changes (9)
  • blogs/twilio-solaria-go/.gitignore
  • blogs/twilio-solaria-python-fastapi/src/requirements.txt
  • blogs/twilio-solaria-python-fastapi/src/env_setup.txt
  • blogs/twilio-solaria-go/src/env_setup.txt
  • blogs/twilio-solaria-go/src/go.mod
  • blogs/twilio-solaria-python-fastapi/src/twiml_example.xml
  • blogs/twilio-solaria-go/src/twiml_example.xml
  • blogs/twilio-solaria-python-fastapi/.gitignore
  • blogs/twilio-solaria-python-fastapi/src/README.md
🧰 Additional context used
🪛 LanguageTool
blogs/twilio-solaria-python-fastapi/blog.md

[uncategorized] ~243-~243: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~245-~245: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~247-~247: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~250-~250: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~254-~254: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)

🔇 Additional comments (14)
blogs/twilio-solaria-go/src/README.md (8)

1-4: Clear and descriptive title/overview
The title and introduction succinctly describe the project’s purpose and scope.


12-21: Setup: Go dependencies instructions look good
The instructions for installing Go and downloading dependencies are clear and accurate.


22-28: Environment variables setup is clear
Creating a .env next to main.go with the required keys is well explained.


41-56: Build and run instructions are solid
The steps for building and running the server (with and without specifying a port) are clear.


58-68: Exposing the server via ngrok is well explained
Instructions for exposing your local port with ngrok (including custom domains) are accurate.


70-76: Twilio webhook update and test steps are clear
Updating the TwiML URL and testing the call flow is straightforward and complete.


77-84: How it works section is informative
The workflow description accurately captures the end-to-end streaming and transcription process.


85-90: Next steps suggestions add value
Future enhancements are well scoped and provide a clear roadmap for expanding functionality.

blogs/twilio-solaria-go/src/main.go (1)

195-199: LGTM: Good fallback to default port

Good practice to provide a default port when the environment variable is not set.

blogs/twilio-solaria-go/blog.md (5)

1-2: Inconsistent PR description vs. file content
The PR title and objectives mention a Python/Flask + FastAPI sample, but this blog post covers a Go implementation under twilio-solaria-go. Please verify whether this file belongs in this PR or if the PR description needs updating to include the Go sample.

Likely an incorrect or invalid review comment.


87-117: Clear system architecture diagram
The Mermaid sequence diagram accurately represents the Twilio → GoServer → Gladia flow and greatly aids comprehension. Well done!


284-296: TwiML example is accurate and clear
The XML snippet properly demonstrates how to start Twilio Media Streams and matches the /media endpoint. It’s ready for readers to copy and use.


356-362: Useful next steps outlined
The suggestions for add-ons, dual-channel audio, post-call JSON retrieval, and scaling are practical and give readers clear directions for extending the solution. Nice work!


365-369: Strong wrap-up
The conclusion succinctly highlights the “base64-decode and forward” advantage and reinforces the ease and performance of the approach. It’s an engaging way to close the post.

Comment on lines +97 to +104
@app.websocket("/{remaining_path:path}")
async def catch_all_websocket(websocket: WebSocket, remaining_path: str):
    """Catch-all handler for WebSocket connections."""
    await websocket.accept()
    logger.info(f"🔌 Catch-all WebSocket connected to /{remaining_path}")

    await handle_websocket(websocket)

⚠️ Potential issue

Catch-all WebSocket route exposes the service to unintended traffic

/{remaining_path:path} happily accepts any WebSocket path and forwards raw bytes to Gladia:

• Anyone on the internet can push arbitrary data to your Gladia quota.
• Malformed frames may crash the proxy.
• Twilio only connects to the exact /media endpoint.

Recommendation: delete the catch-all or restrict it (e.g., to authenticated admin clients).
[security]

Comment on lines +63 to +68
# Create initial Gladia session
try:
    create_session()
except Exception as e:
    logger.error("Failed to create initial Gladia session: %s", e)
    raise

🛠️ Refactor suggestion

Create a fresh Gladia session per call to avoid cross-call mixing & token-expiry issues

create_session() is invoked only once at startup, so every concurrent Twilio call re-uses the same session ID & WebSocket URL.
• Audio from different callers will be multiplexed into a single Gladia session, producing garbled transcripts.
• Gladia session tokens are short-lived; once the token expires every new call will fail to connect.

Move the session creation inside handle_websocket() (or a dedicated connection factory) so that each call gets its own isolated, fresh session.

@@ async def handle_websocket(websocket: WebSocket):
-    # Connect to Gladia for this connection
-    try:
-        gladia_ws = await websockets.connect(gladia_session["url"])
-        logger.info(f"Connected to Gladia session {gladia_session['id']}")
+    # Create a dedicated Gladia session for this call
+    try:
+        session_url = create_session()
+        gladia_ws = await websockets.connect(session_url)
+        logger.info("Connected to a dedicated Gladia session for this call")

Committable suggestion skipped: line range outside the PR's diff.
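
The per-call pattern can be sketched in isolation. This is an illustration under stated assumptions: `create_session` below is a hypothetical stand-in that fabricates IDs and URLs locally instead of calling Gladia, and `handle_call` is not the server's real handler.

```python
import asyncio
import itertools

_session_ids = itertools.count(1)


def create_session() -> dict:
    """Hypothetical stand-in for the blocking POST that opens a Gladia session."""
    n = next(_session_ids)
    return {"id": f"session-{n}", "url": f"wss://example.invalid/live/{n}"}


async def handle_call(call_sid: str) -> tuple:
    # One session per call; the blocking request runs in a worker thread
    # so the event loop keeps serving other calls in the meantime.
    session = await asyncio.to_thread(create_session)
    # A real handler would now connect to session["url"] and relay audio.
    return call_sid, session["id"], session["url"]


async def demo() -> list:
    # Two concurrent calls, each ending up with its own session.
    return await asyncio.gather(handle_call("call-A"), handle_call("call-B"))
```

Because each handler owns its session dictionary, nothing call-specific lives in module-level state, which is what removes the cross-call mixing described above.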

Comment on lines +153 to +180
	// Twilio -> Gladia
	go func() {
		defer wg.Done()
		for {
			_, msg, err := twilioConn.ReadMessage()
			if err != nil {
				log.Printf("Error reading from Twilio: %v", err)
				return
			}
			processMessage(msg, gladiaConn)
		}
	}()

	// Gladia -> transcripts
	go func() {
		defer wg.Done()
		for {
			_, msg, err := gladiaConn.ReadMessage()
			if err != nil {
				log.Printf("Error reading from Gladia: %v", err)
				return
			}
			handleGladia(msg)
		}
	}()

	wg.Wait()
}

🛠️ Refactor suggestion

Use context for cancellation between goroutines

The current implementation creates two goroutines that run independently. If one fails, the other will continue running until it encounters its own error. Consider using a context with cancellation to coordinate the shutdown of both goroutines when either one fails.

func handleWebSocket(twilioConn *websocket.Conn) {
	clientInfo := twilioConn.RemoteAddr().String()
	log.Printf("🔌 Twilio WebSocket connected from %s", clientInfo)
	
	defer twilioConn.Close()
	
	// Connect to Gladia
	dialer := websocket.DefaultDialer
	gladiaConn, _, err := dialer.Dial(session.URL, nil)
	if err != nil {
		log.Printf("Failed to connect to Gladia: %v", err)
		return
	}
	defer gladiaConn.Close()
	log.Printf("Connected to Gladia session %s", session.ID)

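+	ctx, cancel := context.WithCancel(context.Background())
+	defer cancel()
+	// ReadMessage blocks and never observes ctx, so close both sockets on
+	// cancellation to unblock whichever goroutine is still reading.
+	go func() {
+		<-ctx.Done()
+		twilioConn.Close()
+		gladiaConn.Close()
+	}()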
	
	var wg sync.WaitGroup
	wg.Add(2)

	// Twilio -> Gladia
	go func() {
		defer wg.Done()
+		defer cancel() // Signal other goroutine to stop when this one exits
		for {
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
			_, msg, err := twilioConn.ReadMessage()
			if err != nil {
				log.Printf("Error reading from Twilio: %v", err)
				return
			}
			processMessage(msg, gladiaConn)
		}
	}()

	// Gladia -> transcripts
	go func() {
		defer wg.Done()
+		defer cancel() // Signal other goroutine to stop when this one exits
		for {
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
			_, msg, err := gladiaConn.ReadMessage()
			if err != nil {
				log.Printf("Error reading from Gladia: %v", err)
				return
			}
			handleGladia(msg)
		}
	}()

	wg.Wait()
}

Don't forget to add "context" to your imports:

import (
	"bytes"
+	"context"
	"encoding/base64"
	// ...
)

Comment on lines +46 to +56
	req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
	if err != nil {
		return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
	}
	req.Header.Set("X-Gladia-Key", gladiaAPIKey)
	req.Header.Set("Content-Type", "application/json")
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return GladiaSession{}, fmt.Errorf("session init request failed: %w", err)
	}

🛠️ Refactor suggestion

Add retry logic for Gladia session initialization

The current implementation makes a single attempt to create a Gladia session. Consider adding retry logic to handle temporary network issues or API unavailability.

func createSession() (GladiaSession, error) {
+	maxRetries := 3
+	retryDelay := 2 * time.Second
+
	payload := map[string]interface{}{ // μ-law, 8-bit, 8 kHz, mono
		"encoding":    "wav/ulaw",
		"bit_depth":   8,
		"sample_rate": 8000,
		"channels":    1,
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return GladiaSession{}, fmt.Errorf("failed to marshal payload: %w", err)
	}
-	req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
-	if err != nil {
-		return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
-	}
-	req.Header.Set("X-Gladia-Key", gladiaAPIKey)
-	req.Header.Set("Content-Type", "application/json")
-	client := &http.Client{Timeout: 10 * time.Second}
-	resp, err := client.Do(req)
-	if err != nil {
-		return GladiaSession{}, fmt.Errorf("session init request failed: %w", err)
-	}
+
+	var lastErr error
+	for attempt := 1; attempt <= maxRetries; attempt++ {
+		req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
+		if err != nil {
+			return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
+		}
+		req.Header.Set("X-Gladia-Key", gladiaAPIKey)
+		req.Header.Set("Content-Type", "application/json")
+		client := &http.Client{Timeout: 10 * time.Second}
+		resp, err := client.Do(req)
+		if err != nil {
+			lastErr = err
+			log.Printf("Attempt %d: session init request failed: %v", attempt, err)
+			if attempt < maxRetries {
+				time.Sleep(retryDelay)
+				// Exponential backoff
+				retryDelay *= 2
+				continue
+			}
+			return GladiaSession{}, fmt.Errorf("session init request failed after %d attempts: %w", maxRetries, err)
+		}
+
+		if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+			bodyBytes, _ := io.ReadAll(resp.Body)
+			// Close per attempt; a defer inside the loop would only run at function return.
+			resp.Body.Close()
+			lastErr = fmt.Errorf("bad status code: %d - %s", resp.StatusCode, string(bodyBytes))
+			log.Printf("Attempt %d: %v", attempt, lastErr)
+			if attempt < maxRetries {
+				time.Sleep(retryDelay)
+				// Exponential backoff
+				retryDelay *= 2
+				continue
+			}
+			return GladiaSession{}, lastErr
+		}
+		// The success path returns below, so a defer is safe from here on.
+		defer resp.Body.Close()
+
+		var data GladiaSession
+		if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
+			return GladiaSession{}, fmt.Errorf("failed to decode response: %w", err)
+		}
+		log.Printf("🛰  Gladia session ID: %s", data.ID)
+		return data, nil
+	}
+	
+	return GladiaSession{}, lastErr
}

Comment on lines +21 to +81
### 1 — Initiate a Gladia live session

```go
import (
"bytes"
"encoding/json"
"fmt"
"io"
"log"
"net/http"
"time"
)

const (
gladiaInitURL = "https://api.gladia.io/v2/live"
)

// GladiaSession stores session information
type GladiaSession struct {
ID string `json:"id"`
URL string `json:"url"`
}

// createSession initializes a Gladia real-time transcription session and returns the WebSocket URL.
func createSession() (GladiaSession, error) {
payload := map[string]interface{}{ // μ-law, 8-bit, 8 kHz, mono
"encoding": "wav/ulaw",
"bit_depth": 8,
"sample_rate": 8000,
"channels": 1,
}
body, err := json.Marshal(payload)
if err != nil {
return GladiaSession{}, fmt.Errorf("failed to marshal payload: %w", err)
}
req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
if err != nil {
return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
}
req.Header.Set("X-Gladia-Key", gladiaAPIKey)
req.Header.Set("Content-Type", "application/json")
client := &http.Client{Timeout: 10 * time.Second}
resp, err := client.Do(req)
if err != nil {
return GladiaSession{}, fmt.Errorf("session init request failed: %w", err)
}
defer resp.Body.Close()

if resp.StatusCode < 200 || resp.StatusCode >= 300 {
bodyBytes, _ := io.ReadAll(resp.Body)
return GladiaSession{}, fmt.Errorf("bad status code: %d - %s", resp.StatusCode, string(bodyBytes))
}

var data GladiaSession
if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
return GladiaSession{}, fmt.Errorf("failed to decode response: %w", err)
}
log.Printf("🛰 Gladia session ID: %s", data.ID)
return data, nil
}
```

🛠️ Refactor suggestion

Missing context in createSession snippet
The example references gladiaAPIKey and session package-level variables that aren’t declared in the snippet, which may confuse readers and lead to compilation errors. Also, using a generic map[string]interface{} for the payload reduces type safety.

Consider adding these declarations and a strongly typed request struct:

 import (
     "bytes"
     "encoding/json"
     "fmt"
     "io"
     "log"
     "net/http"
     "time"
 )

+// Package-level variables for API key and session
+var (
+    gladiaAPIKey string
+    session      GladiaSession
+)

 // createSession initializes a Gladia real-time transcription session...
 func createSession() (GladiaSession, error) {
-    payload := map[string]interface{}{ // μ-law, 8-bit, 8 kHz, mono
-        "encoding":    "wav/ulaw",
-        "bit_depth":   8,
-        "sample_rate": 8000,
-        "channels":    1,
-    }
+    // Use a typed struct for payload
+    type initPayload struct {
+        Encoding   string `json:"encoding"`
+        BitDepth   int    `json:"bit_depth"`
+        SampleRate int    `json:"sample_rate"`
+        Channels   int    `json:"channels"`
+    }
+    payload := initPayload{
+        Encoding:   "wav/ulaw",
+        BitDepth:   8,
+        SampleRate: 8000,
+        Channels:   1,
+    }
     body, err := json.Marshal(payload)

…on of Twilio calls using Gladia, including environment setup instructions, .gitignore, package.json, and necessary JavaScript files for server implementation and TwiML configuration

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (9)
blogs/twilio-solaria-javascript/src/main.js (6)

16-17: Consider using const for immutable variables.

The variable gladiaAPIKey is initialized once and never reassigned, making it a good candidate for using const instead of let. This helps prevent accidental reassignment and follows JavaScript best practices.

-let gladiaAPIKey = process.env.GLADIA_API_KEY;
+const gladiaAPIKey = process.env.GLADIA_API_KEY;

29-51: Consider adding retry logic for API initialization.

The API initialization is a critical step, but there's no retry mechanism if it fails due to temporary network issues. In a production environment, adding retry logic with exponential backoff would improve resilience.

You could implement a simple retry mechanism like this:

async function createSession() {
  // μ-law, 8-bit, 8 kHz, mono
  const payload = {
    encoding: 'wav/ulaw',
    bit_depth: 8,
    sample_rate: 8000,
    channels: 1
  };

+ let retries = 3;
+ const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
  
+ while (retries > 0) {
    try {
      const response = await fetch(GLADIA_INIT_URL, {
        method: 'POST',
        headers: {
          'X-Gladia-Key': gladiaAPIKey,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify(payload),
        timeout: 10000
      });

      if (!response.ok) {
        const errorBody = await response.text();
        throw new Error(`Bad status code: ${response.status} - ${errorBody}`);
      }

      const data = await response.json();
      console.log(`🛰 Gladia session ID: ${data.id}`);
      return data;
    } catch (error) {
+     retries--;
+     if (retries === 0) {
        throw new Error(`Failed to create session: ${error.message}`);
+     }
+     console.log(`Session creation failed, retrying (${retries} attempts left): ${error.message}`);
+     await delay(2000 * 2 ** (2 - retries)); // Exponential backoff: 2 s, then 4 s
    }
+ }
}

105-105: Message handling should include rate limiting.

The current implementation forwards all messages without any rate limiting, which could potentially lead to overwhelming the Gladia API in high-volume scenarios.

Consider implementing a simple token bucket rate limiter to ensure the system remains stable under heavy load.
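
A token bucket along these lines could serve as a starting point. This is a minimal sketch, not taken from the PR: the `TokenBucket` class, its parameters, and the injectable clock are hypothetical names introduced for illustration. Dropping audio frames degrades transcription, so a limiter like this is probably better used as a safety valve than as routine flow control.

```javascript
// Minimal token bucket (hypothetical helper, not part of main.js).
// Holds at most `capacity` tokens and refills at `refillPerSec` tokens per second.
class TokenBucket {
  constructor(capacity, refillPerSec, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;          // injectable clock, which makes the class easy to test
    this.tokens = capacity;  // start with a full bucket
    this.last = now();
  }

  // Consumes one token and returns true if one is available; returns false otherwise.
  tryRemove() {
    const t = this.now();
    const elapsedSec = (t - this.last) / 1000;
    this.last = t;
    // Refill in proportion to elapsed time, never exceeding capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

In `handleWebSocket`, one bucket per connection could be created and `bucket.tryRemove()` called before forwarding each frame to Gladia, with frames logged or dropped when it returns `false`.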


152-159: Add content security policy headers for HTTP responses.

For added security, consider adding Content-Security-Policy headers to protect against XSS attacks, especially for the health check endpoint.

      if (req.url === '/health') {
-       res.writeHead(200, { 'Content-Type': 'application/json' });
+       res.writeHead(200, { 
+         'Content-Type': 'application/json',
+         'Content-Security-Policy': "default-src 'self'",
+         'X-Content-Type-Options': 'nosniff'
+       });
        res.end(JSON.stringify({ 
          status: 'ok', 
          service: 'twilio-gladia-transcription' 
        }));

164-166: Enhance security headers for default HTTP responses.

Similar to the health check endpoint, add security headers to other HTTP responses as well.

      // Default message for HTTP requests
-     res.writeHead(200, { 'Content-Type': 'text/plain' });
+     res.writeHead(200, { 
+       'Content-Type': 'text/plain',
+       'Content-Security-Policy': "default-src 'self'",
+       'X-Content-Type-Options': 'nosniff'
+     });
      res.end('Twilio-Gladia Transcription Server\n\nAvailable endpoints:\n- /media (WebSocket): Connect Twilio Media Streams\n- /health (HTTP): Health check endpoint');

136-186: Add graceful shutdown handling.

The server should handle process signals (SIGTERM, SIGINT) to ensure graceful shutdown, properly closing WebSocket connections and cleaning up resources.

Add the following code to the main function to handle graceful shutdowns:

    // Start the server
    server.listen(port, () => {
      console.log(`🚀 Starting server on 0.0.0.0:${port}`);
    });
    
+   // Handle graceful shutdown
+   const shutdown = (signal) => {
+     console.log(`Received ${signal}. Shutting down gracefully...`);
+     
+     // Close the HTTP server
+     server.close(() => {
+       console.log('HTTP server closed.');
+     });
+     
+     // Close all WebSocket connections
+     wss.clients.forEach((client) => {
+       client.close(1000, 'Server shutting down');
+     });
+     
+     // Exit after a timeout
+     setTimeout(() => {
+       console.log('Exiting process...');
+       process.exit(0);
+     }, 3000);
+   };
+   
+   // Register signal handlers
+   process.on('SIGTERM', () => shutdown('SIGTERM'));
+   process.on('SIGINT', () => shutdown('SIGINT'));
    
  } catch (error) {
    console.error(`Server initialization failed: ${error}`);
    process.exit(1);
  }
blogs/twilio-solaria-javascript/blog.md (3)

207-218: Include security considerations for TwiML endpoint.

The TwiML example is correct, but there's no mention of securing the endpoint to prevent unauthorized configuration of Twilio numbers.

Consider adding a brief section on securing your TwiML endpoint with Twilio's request validation to ensure only legitimate Twilio requests can configure call handling.
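
Such a section could illustrate the check with a sketch like the one below. It follows Twilio's documented signing scheme as I understand it (HMAC-SHA1 over the request URL followed by the sorted POST parameters, keyed with the account auth token, base64-encoded). The function names are hypothetical, and in production the validator shipped with the official `twilio` helper library is likely the better choice.

```javascript
const crypto = require('crypto');

// Computes the value expected in the X-Twilio-Signature header:
// base64(HMAC-SHA1(authToken, url + key1 + value1 + key2 + value2 ...)),
// with POST parameters sorted by key (case-sensitive).
function computeTwilioSignature(authToken, url, params = {}) {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return crypto
    .createHmac('sha1', authToken)
    .update(Buffer.from(data, 'utf-8'))
    .digest('base64');
}

// Compares the received header against the expected signature in constant time,
// so response timing does not reveal how many characters matched.
function isValidTwilioRequest(authToken, signatureHeader, url, params = {}) {
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params));
  const received = Buffer.from(signatureHeader || '');
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}
```

The `url` argument must be the exact public URL Twilio requested (including scheme and query string), which matters when the server sits behind a tunnel or proxy.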


220-236: Fix punctuation in TwiML explanation list.

There are some punctuation issues in this section with loose punctuation marks at the beginning of list items.

-  - `<Response>`: The root element of any TwiML document. It contains all the TwiML instructions for handling the call.
+* `<Response>`: The root element of any TwiML document. It contains all the TwiML instructions for handling the call.

-  - `<Start>`: This element initiates Twilio's Media Streams feature, which allows streaming of audio in real-time while the call is in progress. It tells Twilio to begin capturing and streaming media before executing the rest of the call flow.
+* `<Start>`: This element initiates Twilio's Media Streams feature, which allows streaming of audio in real-time while the call is in progress. It tells Twilio to begin capturing and streaming media before executing the rest of the call flow.

-  - `<Stream>`: A child element of `<Start>` that configures the media stream:
+* `<Stream>`: A child element of `<Start>` that configures the media stream:
🧰 Tools
🪛 LanguageTool

[uncategorized] ~221-~221: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~223-~223: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~225-~225: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~228-~228: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~232-~232: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)


228-228: Correct article usage.

The phrase "an ngrok URL" uses the wrong article for "ngrok" which starts with a consonant sound.

-  - The domain should be your public domain (e.g., an ngrok URL or a custom domain).
+  - The domain should be your public domain (e.g., a ngrok URL or a custom domain).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a34b99b and 00c16f8.

⛔ Files ignored due to path filters (1)
  • blogs/twilio-solaria-javascript/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • blogs/twilio-solaria-javascript/.gitignore (1 hunks)
  • blogs/twilio-solaria-javascript/blog.md (1 hunks)
  • blogs/twilio-solaria-javascript/package.json (1 hunks)
  • blogs/twilio-solaria-javascript/src/README.md (1 hunks)
  • blogs/twilio-solaria-javascript/src/env_setup.txt (1 hunks)
  • blogs/twilio-solaria-javascript/src/main.js (1 hunks)
  • blogs/twilio-solaria-javascript/src/twiml_example.xml (1 hunks)
✅ Files skipped from review due to trivial changes (5)
  • blogs/twilio-solaria-javascript/src/env_setup.txt
  • blogs/twilio-solaria-javascript/.gitignore
  • blogs/twilio-solaria-javascript/src/twiml_example.xml
  • blogs/twilio-solaria-javascript/package.json
  • blogs/twilio-solaria-javascript/src/README.md

🔇 Additional comments (8)
blogs/twilio-solaria-javascript/src/main.js (2)

30-38: Code looks good and follows proper API configuration.

The payload is correctly configured to handle Twilio's μ-law audio format (8-bit, 8 kHz, mono), which is essential for proper transcription.
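
As a quick sanity check on those parameters, the arithmetic can be sketched as follows. The helper names are hypothetical; only the 8 kHz, 8-bit, mono values come from the payload above.

```javascript
// Audio parameters from the session payload: μ-law, 8-bit, 8 kHz, mono.
const SAMPLE_RATE = 8000;   // samples per second
const BYTES_PER_SAMPLE = 1; // 8-bit μ-law
const CHANNELS = 1;

// Raw audio throughput in bytes per second for one call leg.
function bytesPerSecond() {
  return SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS;
}

// Duration in milliseconds of a base64-encoded μ-law chunk.
function chunkDurationMs(base64Payload) {
  const byteLength = Buffer.from(base64Payload, 'base64').length;
  return (byteLength * 1000) / bytesPerSecond();
}
```

With these values, one second of audio is 8,000 bytes before base64 encoding, so a 160-byte chunk corresponds to 20 ms of speech.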


189-192: LGTM: Proper error handling.

Good implementation of a global error handler to catch unhandled promise rejections and exit gracefully.

blogs/twilio-solaria-javascript/blog.md (6)

1-3: Great headline and introduction.

The headline and introduction effectively communicate the value proposition: Gladia's native μ-law support allows for direct processing of Twilio audio without conversion steps.


9-18: Well-structured prerequisites table.

The prerequisites table clearly outlines what users need and why each component is necessary. This helps readers ensure they have everything ready before attempting implementation.


23-61: Code example matches implementation perfectly.

The createSession function example correctly shows how to initialize a Gladia session with the appropriate μ-law audio parameters, matching the implementation in the main.js file.


71-97: Excellent system architecture diagram.

The Mermaid sequence diagram clearly illustrates the data flow between all components (Caller, Twilio, JavaScript Server, and Gladia API), making the system architecture easy to understand.


244-262: Great installation and deployment instructions.

The command examples are clear, concise, and cover development, production, and tunneling scenarios, making it easy for users to get started.


279-284: Excellent "Next steps" section.

The next steps section provides valuable guidance for users to extend the basic implementation with advanced features like diarization, dual-channel support, and scaling considerations.

Comment on lines +167 to +175

// Create WebSocket server
const wss = new WebSocket.Server({ server });

// Handle WebSocket connections
wss.on('connection', (ws, req) => {
console.log(`🔌 WebSocket connected to ${req.url}`);
handleWebSocket(ws);
});

🛠️ Refactor suggestion

Add WebSocket connection validation.

The server accepts any WebSocket connection without validation. Consider adding basic validation to ensure connections are coming from Twilio or authorized sources.

    // Create WebSocket server
    const wss = new WebSocket.Server({ server });
    
    // Handle WebSocket connections
    wss.on('connection', (ws, req) => {
      console.log(`🔌 WebSocket connected to ${req.url}`);
+     
+     // Basic validation - check for Twilio headers or implement token-based auth
+     const twilioSignature = req.headers['x-twilio-signature'];
+     if (!twilioSignature) {
+       console.log('Warning: Connection without Twilio signature detected');
+       // In production, you might want to reject unauthorized connections
+       // ws.close(1008, 'Unauthorized');
+       // return;
+     }
+     
      handleWebSocket(ws, req);
    });

Comment on lines +93 to +94
const clientInfo = twilioConn._socket.remoteAddress;
console.log(`🔌 Twilio WebSocket connected from ${clientInfo}`);

🛠️ Refactor suggestion

Avoid using internal socket properties.

Using _socket.remoteAddress accesses internal WebSocket implementation details, which may change across library versions. Consider using req.socket.remoteAddress from the request object passed to the connection handler instead (req.connection is a deprecated alias for req.socket in current Node.js).

-function handleWebSocket(twilioConn) {
-  const clientInfo = twilioConn._socket.remoteAddress;
+function handleWebSocket(twilioConn, req) {
+  const clientInfo = req.socket.remoteAddress;

Then update the call in the main connection handler (line 174):

-      handleWebSocket(ws);
+      handleWebSocket(ws, req);


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
blogs/twilio-solaria-typescript/src/app/handlers.ts (1)

45-62: Consider handling partial transcripts for more interactive applications.

The function currently only logs and returns final transcripts. For applications requiring real-time feedback, you might want to also process partial transcripts.

export function handleGladia(message: Buffer): string {
  try {
    // Parse the message from Gladia
    const msg: GladiaMessage = JSON.parse(message.toString());
    
    // Check if this is a final transcript
    if (msg.type === 'transcript' && msg.data?.is_final) {
      const transcript = msg.data.utterance.text;
      console.log(`📝 Transcript: ${transcript}`);
      return transcript;
+   } else if (msg.type === 'transcript' && msg.data?.utterance?.text) {
+     // Handle partial transcripts
+     const partialTranscript = msg.data.utterance.text;
+     console.log(`🔄 Partial: ${partialTranscript}`);
+     return partialTranscript;
    }
    
    return '';
  } catch (error) {
    console.error(`Error parsing Gladia message: ${error}`);
    return '';
  }
}
blogs/twilio-solaria-typescript/src/app/server.ts (1)

63-120: Consider implementing WebSocket reconnection logic.

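The server currently makes a single attempt to connect to Gladia and does not recover if that connection fails or is interrupted. For production use, retrying the initial connection with exponential backoff, as in the diff below, would improve resilience; reconnecting after a drop in the middle of a call would need additional handling on top of this.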

wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
  const clientInfo = req.socket.remoteAddress || 'unknown';
  console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);

  try {
+   // Function to establish Gladia connection with retry logic
+   const connectToGladia = async (retries = 3, delay = 1000): Promise<WebSocket> => {
+     let lastError: Error | undefined;
+     
+     for (let attempt = 1; attempt <= retries; attempt++) {
+       try {
+         // Connect to Gladia
+         const gladiaConn = new WebSocket(session.url);
+         
+         // Wait for connection to open
+         await new Promise<void>((resolve, reject) => {
+           gladiaConn.on('open', () => {
+             console.log(`Connected to Gladia session ${session.id} (attempt ${attempt}/${retries})`);
+             resolve();
+           });
+           gladiaConn.on('error', reject);
+         });
+         
+         return gladiaConn;
+       } catch (error) {
+         lastError = error as Error;
+         console.error(`Connection attempt ${attempt}/${retries} failed: ${error}`);
+         
+         if (attempt < retries) {
+           console.log(`Retrying in ${delay}ms...`);
+           await new Promise(resolve => setTimeout(resolve, delay));
+           // Exponential backoff
+           delay *= 2;
+         }
+       }
+     }
+     
+     throw lastError || new Error('Failed to connect to Gladia after multiple attempts');
+   };

-   // Connect to Gladia
-   const gladiaConn = new WebSocket(session.url);
-   
-   // Handle connection errors
-   gladiaConn.on('error', (error) => {
-     console.error(`Error with Gladia connection: ${error}`);
-     twilioConn.close();
-   });
-
-   // Wait for Gladia connection to open
-   await new Promise<void>((resolve, reject) => {
-     gladiaConn.on('open', () => {
-       console.log(`Connected to Gladia session ${session.id}`);
-       resolve();
-     });
-     gladiaConn.on('error', reject);
-   });

+   // Establish connection with retry logic
+   const gladiaConn = await connectToGladia();
+   
+   // Handle connection errors
+   gladiaConn.on('error', (error) => {
+     console.error(`Error with Gladia connection: ${error}`);
+     twilioConn.close();
+   });
blogs/twilio-solaria-typescript/blog.md (1)

320-336: Fix formatting in markdown list.

The list items have loose punctuation marks that should be fixed for better rendering.

Let's examine each element in this TwiML configuration:

- `<Response>`: The root element of any TwiML document. It contains all the TwiML instructions for handling the call.

- `<Start>`: This element initiates Twilio's Media Streams feature, which allows streaming of audio in real-time while the call is in progress. It tells Twilio to begin capturing and streaming media before executing the rest of the call flow.

- `<Stream>`: A child element of `<Start>` that configures the media stream:
  - `url` attribute: Specifies the WebSocket endpoint where Twilio will send the audio data.
  - The URL must use secure WebSockets (`wss://`).
-  - The domain should be your public domain (e.g., an ngrok URL or a custom domain).
+  - The domain should be your public domain (e.g., a ngrok URL or a custom domain).
  - The path (`/media`) must match the WebSocket route in your TypeScript application.
  - Each call will create a new WebSocket connection to this endpoint.

- `<Dial>`: After starting the media stream, this element connects the caller to another phone number. During this connection:
🧰 Tools
🪛 LanguageTool

[uncategorized] ~320-~320: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~322-~322: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~324-~324: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~327-~327: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~331-~331: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00c16f8 and 6837317.

⛔ Files ignored due to path filters (1)
  • blogs/twilio-solaria-typescript/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (11)
  • blogs/twilio-solaria-typescript/.gitignore (1 hunks)
  • blogs/twilio-solaria-typescript/blog.md (1 hunks)
  • blogs/twilio-solaria-typescript/package.json (1 hunks)
  • blogs/twilio-solaria-typescript/src/README.md (1 hunks)
  • blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1 hunks)
  • blogs/twilio-solaria-typescript/src/app/handlers.ts (1 hunks)
  • blogs/twilio-solaria-typescript/src/app/server.ts (1 hunks)
  • blogs/twilio-solaria-typescript/src/app/types.ts (1 hunks)
  • blogs/twilio-solaria-typescript/src/env_setup.txt (1 hunks)
  • blogs/twilio-solaria-typescript/src/twiml_example.xml (1 hunks)
  • blogs/twilio-solaria-typescript/tsconfig.json (1 hunks)
✅ Files skipped from review due to trivial changes (7)
  • blogs/twilio-solaria-typescript/src/twiml_example.xml
  • blogs/twilio-solaria-typescript/tsconfig.json
  • blogs/twilio-solaria-typescript/package.json
  • blogs/twilio-solaria-typescript/src/env_setup.txt
  • blogs/twilio-solaria-typescript/src/app/types.ts
  • blogs/twilio-solaria-typescript/.gitignore
  • blogs/twilio-solaria-typescript/src/README.md
🧰 Additional context used
🧬 Code Graph Analysis (3)
blogs/twilio-solaria-typescript/src/app/handlers.ts (2)
blogs/twilio-solaria-javascript/src/main.js (3)
  • gladiaConn (97-97)
  • mulaw (63-63)
  • transcript (80-80)
blogs/twilio-solaria-typescript/src/app/types.ts (2)
  • TwilioMessage (8-13)
  • GladiaMessage (15-23)
blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1)
blogs/twilio-solaria-typescript/src/app/types.ts (1)
  • GladiaSession (3-6)
blogs/twilio-solaria-typescript/src/app/server.ts (3)
blogs/twilio-solaria-typescript/src/app/types.ts (1)
  • GladiaSession (3-6)
blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1)
  • createSession (12-87)
blogs/twilio-solaria-typescript/src/app/handlers.ts (2)
  • processMessage (9-38)
  • handleGladia (45-62)
🪛 LanguageTool
blogs/twilio-solaria-typescript/blog.md

[uncategorized] ~320-~320: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~322-~322: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~324-~324: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~327-~327: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~331-~331: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...

(UNLIKELY_OPENING_PUNCTUATION)

🔇 Additional comments (4)
blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1)

12-87: LGTM! Well-implemented session creation with robust error handling.

The createSession function is well-structured with comprehensive error handling for HTTP status codes, request errors, timeouts, and JSON parsing. The audio encoding parameters correctly match Twilio's μ-law format (8 kHz, 8-bit, mono).

blogs/twilio-solaria-typescript/src/app/handlers.ts (1)

9-38: LGTM! Effective processing of Twilio messages with proper error handling.

The function correctly parses JSON messages, filters for media events, validates payloads, and efficiently decodes base64 μ-law audio before binary transmission to Gladia.

blogs/twilio-solaria-typescript/src/app/server.ts (1)

18-140: Well-structured server with clean WebSocket handling.

The main function orchestrates the application flow nicely with proper error handling and logging.

blogs/twilio-solaria-typescript/blog.md (1)

127-146: Good use of a sequence diagram to illustrate the architecture.

The Mermaid diagram effectively illustrates the data flow and interactions between system components, making it easy to understand the overall architecture.

Comment on lines +63 to +120
wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
  const clientInfo = req.socket.remoteAddress || 'unknown';
  console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);

  try {
    // Connect to Gladia
    const gladiaConn = new WebSocket(session.url);

    // Handle connection errors
    gladiaConn.on('error', (error) => {
      console.error(`Error with Gladia connection: ${error}`);
      twilioConn.close();
    });

    // Wait for Gladia connection to open
    await new Promise<void>((resolve, reject) => {
      gladiaConn.on('open', () => {
        console.log(`Connected to Gladia session ${session.id}`);
        resolve();
      });
      gladiaConn.on('error', reject);
    });

    // Handle messages from Twilio
    twilioConn.on('message', (message: Buffer) => {
      try {
        processMessage(message, gladiaConn);
      } catch (error) {
        console.error(`Error processing Twilio message: ${error}`);
      }
    });

    // Handle messages from Gladia
    gladiaConn.on('message', (message: Buffer) => {
      try {
        handleGladia(message);
      } catch (error) {
        console.error(`Error processing Gladia message: ${error}`);
      }
    });

    // Handle Twilio connection close
    twilioConn.on('close', () => {
      console.log(`Twilio connection closed from ${clientInfo}`);
      gladiaConn.close();
    });

    // Handle Gladia connection close
    gladiaConn.on('close', () => {
      console.log('Gladia connection closed');
      twilioConn.close();
    });

  } catch (error) {
    console.error(`Failed to establish connection to Gladia: ${error}`);
    twilioConn.close();
  }
});

🛠️ Refactor suggestion

Add path validation for WebSocket connections.

The server accepts WebSocket connections on any path. To prevent unexpected connections, validate that req.url matches the expected path ('/media').

wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
  const clientInfo = req.socket.remoteAddress || 'unknown';
  console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);

+  // Validate that the WebSocket connection is to the expected path
+  if (req.url !== '/media') {
+    console.error(`Rejecting connection to unexpected path: ${req.url}`);
+    twilioConn.close(1008, 'Path not supported');
+    return;
+  }

  try {
    // Connect to Gladia
    const gladiaConn = new WebSocket(session.url);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
  const clientInfo = req.socket.remoteAddress || 'unknown';
  console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);

  // Validate that the WebSocket connection is to the expected path
  if (req.url !== '/media') {
    console.error(`Rejecting connection to unexpected path: ${req.url}`);
    twilioConn.close(1008, 'Path not supported');
    return;
  }

  try {
    // Connect to Gladia
    const gladiaConn = new WebSocket(session.url);

    // Handle connection errors
    gladiaConn.on('error', (error) => {
      console.error(`Error with Gladia connection: ${error}`);
      twilioConn.close();
    });

    // Wait for Gladia connection to open
    await new Promise<void>((resolve, reject) => {
      gladiaConn.on('open', () => {
        console.log(`Connected to Gladia session ${session.id}`);
        resolve();
      });
      gladiaConn.on('error', reject);
    });

    // Handle messages from Twilio
    twilioConn.on('message', (message: Buffer) => {
      try {
        processMessage(message, gladiaConn);
      } catch (error) {
        console.error(`Error processing Twilio message: ${error}`);
      }
    });

    // Handle messages from Gladia
    gladiaConn.on('message', (message: Buffer) => {
      try {
        handleGladia(message);
      } catch (error) {
        console.error(`Error processing Gladia message: ${error}`);
      }
    });

    // Handle Twilio connection close
    twilioConn.on('close', () => {
      console.log(`Twilio connection closed from ${clientInfo}`);
      gladiaConn.close();
    });

    // Handle Gladia connection close
    gladiaConn.on('close', () => {
      console.log('Gladia connection closed');
      twilioConn.close();
    });

  } catch (error) {
    console.error(`Failed to establish connection to Gladia: ${error}`);
    twilioConn.close();
  }
});

Comment on lines +381 to +385
* **Add-ons** – enable diarization, sentiment, keywords, etc., by including the flags when you create the session.
* **Dual-channel** – Twilio can stream separate channels; Gladia preserves them so you always know who's speaking.
* **Post-call JSON** – store the session `id` and hit `GET /v2/live/:id` for the full, punctuated transcript when the call ends.
* **Scale it** – TypeScript/Node.js's event-driven, non-blocking I/O model makes it easy to scale for high loads. Consider deploying with a load balancer for horizontal scaling.

🛠️ Refactor suggestion

Add security considerations to the Next Steps section.

Consider adding security recommendations for production deployment, such as using TLS/HTTPS, implementing authentication for the WebSocket endpoint, and securely managing API keys.

### 5 — Next steps

* **Add-ons** – enable diarization, sentiment, keywords, etc., by including the flags when you create the session.
* **Dual-channel** – Twilio can stream separate channels; Gladia preserves them so you always know who's speaking.
* **Post-call JSON** – store the session `id` and hit `GET /v2/live/:id` for the full, punctuated transcript when the call ends.
* **Scale it** – TypeScript/Node.js's event-driven, non-blocking I/O model makes it easy to scale for high loads. Consider deploying with a load balancer for horizontal scaling.
+ * **Security** – For production, ensure you're using TLS/HTTPS, implement authentication for your WebSocket endpoint, and securely manage API keys using environment variables or a secret management solution.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
### 5 — Next steps
* **Add-ons** – enable diarization, sentiment, keywords, etc., by including the flags when you create the session.
* **Dual-channel** – Twilio can stream separate channels; Gladia preserves them so you always know who's speaking.
* **Post-call JSON** – store the session `id` and hit `GET /v2/live/:id` for the full, punctuated transcript when the call ends.
* **Scale it** – TypeScript/Node.js's event-driven, non-blocking I/O model makes it easy to scale for high loads. Consider deploying with a load balancer for horizontal scaling.
* **Security** – For production, ensure you're using TLS/HTTPS, implement authentication for your WebSocket endpoint, and securely manage API keys using environment variables or a secret management solution.

…iption of Vonage calls using Gladia, including environment setup instructions, .gitignore, and necessary Python files for server implementation and NCCO configuration
@jqueguiner jqueguiner changed the title feat: blog twilio + Gladia Solaria + Flask/FastAPI + Python + Go feat: blog twilio + vonage + Gladia Solaria + Flask/FastAPI + Python + Go May 9, 2025
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (9)
blogs/vonage-solaria-python-fastapi/src/server.py (2)

10-11: Remove the unused HTTPException import

HTTPException is imported but never referenced anywhere in the module. Keeping unused symbols creates noise and can hide real-world linting issues.

-from fastapi import FastAPI, WebSocket, Response, HTTPException, Request
+from fastapi import FastAPI, WebSocket, Response, Request
🧰 Tools
🪛 Ruff (0.8.2)

10-10: fastapi.HTTPException imported but unused

Remove unused import: fastapi.HTTPException

(F401)


67-67: Drop superfluous f-prefixes

Both lines are plain strings without interpolation; the f adds no value and triggers Ruff (F541).

-logger.debug(f"Rate limiting: waiting before retrying Gladia session creation")
+logger.debug("Rate limiting: waiting before retrying Gladia session creation")

-logger.debug(f"Serving NCCO JSON")
+logger.debug("Serving NCCO JSON")

Also applies to: 222-222

🧰 Tools
🪛 Ruff (0.8.2)

67-67: f-string without any placeholders

Remove extraneous f prefix

(F541)

blogs/vonage-solaria-python-fastapi/blog.md (2)

57-58: Minor punctuation improvement

Add a comma before “or” to separate the independent clauses.

-Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly or you can configure Vonage to send other formats.
+Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly, or you can configure Vonage to send other formats.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~57-~57: Use a comma before ‘or’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...efault. Gladia can process this directly or you can configure Vonage to send other ...

(COMMA_COMPOUND_SENTENCE)


248-257: Markdown list rendering issue

The leading hyphens render as bullets, but the colon after each element name triggers LanguageTool's "loose punctuation" warning and reads as a hanging colon. Replace the colon with a dash for cleaner output:

-  - `<ncco>`: The root element of any Vonage NCCO document. It contains all the instructions for handling the call.
+* `<ncco>` – The root element of any Vonage NCCO document. It contains all the instructions for handling the call.

(Apply to the other list items in this block.)

🧰 Tools
🪛 LanguageTool

[uncategorized] ~248-~248: Loose punctuation mark.
Context: ... in this NCCO configuration: - <ncco>: The root element of any Vonage NCCO doc...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~250-~250: Loose punctuation mark.
Context: ... for handling the call. - <websocket>: This element configures a WebSocket con...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~253-~253: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~256-~256: Loose punctuation mark.
Context: ...on to this endpoint. - <contentType>: Specifies the audio format that Vonage ...

(UNLIKELY_OPENING_PUNCTUATION)

blogs/vonage-solaria-python-fastapi/src/README.md (2)

34-38: Specify a language for fenced code blocks

Tools such as GitHub’s renderer and syntax highlighters benefit from an explicit language.

-```
+```bash
 GLADIA_API_KEY=your_gladia_api_key_here
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

35-35: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-45: Avoid bare URLs to improve readability

Embed the URL in markdown syntax:

-In your answer URL configuration, use the content of `vonage_example.xml` as your NCCO (Nexmo Call Control Object) (https://jl.gladia.dev/media)
+In your answer URL configuration, use the content of `vonage_example.xml` as your NCCO (Nexmo Call Control Object) (<https://jl.gladia.dev/media>)
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

43-43: Bare URL used

(MD034, no-bare-urls)

blogs/telnyx-solaria-python-fastapi/blog.md (1)

11-18: Minor copy-editing for clarity & grammar

A few small punctuation tweaks will improve readability and avoid LanguageTool warnings (“COMMA_COMPOUND_SENTENCE”, “EN_A_VS_AN”, etc.):

-Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly or you can configure Vonage to send other formats.
+Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly, or you can configure Vonage to send other formats.

and

-... expose a WebSocket endpoint with ngrok or a cloud VM.
+... expose a WebSocket endpoint with ngrok or a cloud VM.

Likewise, replace “an ngrok URL” with “a ngrok URL”.
These are purely stylistic; no functional impact.

blogs/telnyx-solaria-python-fastapi/src/server.py (2)

146-154: Slow transcript polling & potential starvation

Reading transcripts only inside the message-receive loop (while True … receive_text()) risks delaying or missing Gladia messages that arrive while no inbound audio is flowing. Consider spawning a second task with asyncio.create_task that continuously awaits gladia_ws.recv() and queues or prints the results, decoupling ingress from egress.
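
A minimal sketch of that decoupled layout, under stated assumptions: `pump`, `bridge`, and `demo` are hypothetical names, and `asyncio.Queue` objects stand in for the two sockets (the real handler would await `receive_text()` and `gladia_ws.recv()` instead of `Queue.get()`). It is meant to illustrate the task structure, not to mirror the PR's code.

```python
import asyncio


async def pump(source: asyncio.Queue, handle) -> None:
    """Forward items from source to handle until a None sentinel arrives."""
    while True:
        item = await source.get()
        if item is None:
            break
        handle(item)


async def bridge(audio_in: asyncio.Queue, transcripts_in: asyncio.Queue):
    sent, printed = [], []
    # Two independent tasks: neither direction waits on the other's traffic.
    uplink = asyncio.create_task(pump(audio_in, sent.append))
    downlink = asyncio.create_task(pump(transcripts_in, printed.append))
    await asyncio.gather(uplink, downlink)
    return sent, printed


async def demo():
    audio, transcripts = asyncio.Queue(), asyncio.Queue()
    # The transcript is queued before any audio arrives and is still delivered.
    for item in ("hello", None):
        transcripts.put_nowait(item)
    for item in (b"\x00\x01", None):
        audio.put_nowait(item)
    return await bridge(audio, transcripts)
```

Calling `asyncio.run(demo())` drains both directions independently, which is the property the single-loop version lacks.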


159-165: Replace bare except … pass with contextlib.suppress

Static analysis (SIM105 / E722) flags the silent swallow below. Suppressing specific exceptions is clearer:

+import contextlib
 ...
-        try:
-            await gladia_ws.close()
-        except:
-            pass
+        with contextlib.suppress(Exception):
+            await gladia_ws.close()

Unlike a bare except, this lets KeyboardInterrupt, SystemExit, and asyncio.CancelledError propagate.

🧰 Tools
🪛 Ruff (0.8.2)

162-165: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


164-164: Do not use bare except

(E722)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6837317 and 79e242b.

📒 Files selected for processing (15)
  • .gitignore (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/.gitignore (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/blog.md (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/src/README.md (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/src/env_setup.txt (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/src/requirements.txt (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/src/server.py (1 hunks)
  • blogs/telnyx-solaria-python-fastapi/src/vonage_example.xml (1 hunks)
  • blogs/vonage-solaria-python-fastapi/.gitignore (1 hunks)
  • blogs/vonage-solaria-python-fastapi/blog.md (1 hunks)
  • blogs/vonage-solaria-python-fastapi/src/README.md (1 hunks)
  • blogs/vonage-solaria-python-fastapi/src/env_setup.txt (1 hunks)
  • blogs/vonage-solaria-python-fastapi/src/requirements.txt (1 hunks)
  • blogs/vonage-solaria-python-fastapi/src/server.py (1 hunks)
  • blogs/vonage-solaria-python-fastapi/src/vonage_example.xml (1 hunks)
✅ Files skipped from review due to trivial changes (10)
  • blogs/telnyx-solaria-python-fastapi/.gitignore
  • blogs/vonage-solaria-python-fastapi/.gitignore
  • .gitignore
  • blogs/vonage-solaria-python-fastapi/src/env_setup.txt
  • blogs/telnyx-solaria-python-fastapi/src/env_setup.txt
  • blogs/vonage-solaria-python-fastapi/src/vonage_example.xml
  • blogs/telnyx-solaria-python-fastapi/src/requirements.txt
  • blogs/telnyx-solaria-python-fastapi/src/vonage_example.xml
  • blogs/vonage-solaria-python-fastapi/src/requirements.txt
  • blogs/telnyx-solaria-python-fastapi/src/README.md
🧰 Additional context used
🪛 LanguageTool
blogs/telnyx-solaria-python-fastapi/blog.md

[uncategorized] ~57-~57: Use a comma before ‘or’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...efault. Gladia can process this directly or you can configure Vonage to send other ...

(COMMA_COMPOUND_SENTENCE)


[uncategorized] ~248-~248: Loose punctuation mark.
Context: ... in this NCCO configuration: - <ncco>: The root element of any Vonage NCCO doc...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~250-~250: Loose punctuation mark.
Context: ... for handling the call. - <websocket>: This element configures a WebSocket con...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~253-~253: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~256-~256: Loose punctuation mark.
Context: ...on to this endpoint. - <contentType>: Specifies the audio format that Vonage ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~258-~258: Loose punctuation mark.
Context: ...CM at 8kHz in this case). - <connect>: This element connects the caller to ano...

(UNLIKELY_OPENING_PUNCTUATION)

blogs/vonage-solaria-python-fastapi/blog.md

[uncategorized] ~57-~57: Use a comma before ‘or’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...efault. Gladia can process this directly or you can configure Vonage to send other ...

(COMMA_COMPOUND_SENTENCE)


[uncategorized] ~248-~248: Loose punctuation mark.
Context: ... in this NCCO configuration: - <ncco>: The root element of any Vonage NCCO doc...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~250-~250: Loose punctuation mark.
Context: ... for handling the call. - <websocket>: This element configures a WebSocket con...

(UNLIKELY_OPENING_PUNCTUATION)


[misspelling] ~253-~253: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...

(EN_A_VS_AN)


[uncategorized] ~256-~256: Loose punctuation mark.
Context: ...on to this endpoint. - <contentType>: Specifies the audio format that Vonage ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~258-~258: Loose punctuation mark.
Context: ...CM at 8kHz in this case). - <connect>: This element connects the caller to ano...

(UNLIKELY_OPENING_PUNCTUATION)

🪛 Ruff (0.8.2)
blogs/telnyx-solaria-python-fastapi/src/server.py

162-165: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


164-164: Do not use bare except

(E722)

blogs/vonage-solaria-python-fastapi/src/server.py

10-10: fastapi.HTTPException imported but unused

Remove unused import: fastapi.HTTPException

(F401)


67-67: f-string without any placeholders

Remove extraneous f prefix

(F541)


85-85: Dictionary key literal "sample_rate" repeated

Remove repeated key literal "sample_rate"

(F601)


136-136: Do not use bare except

(E722)


222-222: f-string without any placeholders

Remove extraneous f prefix

(F541)

🪛 markdownlint-cli2 (0.17.2)
blogs/vonage-solaria-python-fastapi/src/README.md

35-35: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: Bare URL used

(MD034, no-bare-urls)

🔇 Additional comments (1)
blogs/telnyx-solaria-python-fastapi/src/server.py (1)

168-174: Environment variable guard won’t trigger under Gunicorn/Uvicorn

The if __name__ == "__main__": block is bypassed when the app is run with uvicorn module:app. Move the API-key check to module import time (or FastAPI startup event) to fail fast in every deployment mode.
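
One way to fail fast regardless of how the app is launched is to validate the variable when the module is imported. The sketch below assumes a hypothetical `require_env` helper; only the variable name `GLADIA_API_KEY` comes from the project's README.

```python
import os


def require_env(name: str) -> str:
    """Return the variable's value, or raise so the process fails at startup."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the server")
    return value


# In the server module this would run at import time, for example:
# GLADIA_API_KEY = require_env("GLADIA_API_KEY")
```

Because the check runs on import, `uvicorn module:app` hits it just as `python server.py` would.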

Comment on lines +133 to +137
try:
    resp_json = r.json()
    logger.debug(f"Gladia API response: {json.dumps(resp_json)}")
except:
    logger.warning(f"Could not parse Gladia API response as JSON: {r.text[:200]}")

🛠️ Refactor suggestion

Replace bare except: with a specific exception

A bare except: also swallows KeyboardInterrupt and SystemExit and obscures real bugs. requests.Response.json() raises a ValueError subclass (json.JSONDecodeError) for invalid JSON, so catching those is enough.

-        except:
-            logger.warning(f"Could not parse Gladia API response as JSON: {r.text[:200]}")
+        except (ValueError, json.JSONDecodeError):
+            logger.warning("Could not parse Gladia API response as JSON: %s", r.text[:200])
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
    resp_json = r.json()
    logger.debug(f"Gladia API response: {json.dumps(resp_json)}")
except (ValueError, json.JSONDecodeError):
    logger.warning("Could not parse Gladia API response as JSON: %s", r.text[:200])
🧰 Tools
🪛 Ruff (0.8.2)

136-136: Do not use bare except

(E722)
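
For reference, a small standard-library sketch of the narrowed handler. `parse_json_or_none` is a hypothetical helper and uses `json.loads` rather than a `requests` response, so it only illustrates the exception hierarchy: `json.JSONDecodeError` is a `ValueError` subclass, so a single `except ValueError` already covers it.

```python
import json
import logging

logger = logging.getLogger("gladia")


def parse_json_or_none(text: str):
    """Parse text as JSON; log and return None when it is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError inherits from ValueError
        logger.warning("Could not parse response as JSON: %s", text[:200])
        return None
```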

Comment on lines +83 to +87
"sample_rate": 8000,
"encoding": "wav/pcm",
"sample_rate": 8000,
"bit_depth": 16,
"channels": 1,

⚠️ Potential issue

Fix duplicate key and ordering in payload dict

"sample_rate" is declared twice. Only the last occurrence is retained by Python, which can lead to silent mis-configuration. Also, keeping "encoding" first improves readability.

-        "sample_rate": 8000,
-        "encoding": "wav/pcm",
-        "sample_rate": 8000,
+        "encoding": "wav/pcm",
+        "sample_rate": 8000,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"encoding": "wav/pcm",
"sample_rate": 8000,
"bit_depth": 16,
"channels": 1,
🧰 Tools
🪛 Ruff (0.8.2)

85-85: Dictionary key literal "sample_rate" repeated

Remove repeated key literal "sample_rate"

(F601)
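
The duplicate-key behaviour flagged in this thread is easy to reproduce. In the sketch below the two `sample_rate` values are deliberately different (a hypothetical variation on the reviewed payload) to show which one survives, and `reject_duplicate_keys` is an illustrative helper for catching the same mistake in JSON input.

```python
import json

# Python keeps only the last value for a repeated key, without any warning.
payload = {
    "sample_rate": 16000,  # silently discarded
    "encoding": "wav/pcm",
    "sample_rate": 8000,   # this value wins
}


def reject_duplicate_keys(pairs):
    """object_pairs_hook for json.loads that raises on a repeated key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen
```

Passing `object_pairs_hook=reject_duplicate_keys` to `json.loads` turns a silent overwrite into an explicit error when configuration is loaded from a file.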

Comment on lines +64 to +79
current_time = time.time()
time_since_last_attempt = current_time - gladia_session["last_init_attempt"]
if time_since_last_attempt < 2 and gladia_session["last_init_attempt"] > 0:
    logger.debug(f"Rate limiting: waiting before retrying Gladia session creation")
    time.sleep(2 - time_since_last_attempt)

gladia_session["last_init_attempt"] = time.time()

# If we've tried too many times recently, back off
if gladia_session["retry_count"] >= MAX_RETRIES:
    delay = min(MAX_RETRY_DELAY, INITIAL_RETRY_DELAY * (2 ** (gladia_session["retry_count"] - MAX_RETRIES)))
    # Add jitter
    delay = delay * (0.5 + random.random())
    logger.warning(f"Too many Gladia session creation attempts. Backing off for {delay:.2f} seconds")
    time.sleep(delay)

🛠️ Refactor suggestion

Avoid blocking the event-loop with time.sleep

create_session() is invoked from async contexts (handle_websocket) but calls time.sleep, blocking all other coroutines during the wait. Make create_session() a coroutine (declare it with async def and await it from its callers) and replace both time.sleep calls with asyncio.sleep:

 ...
-        time.sleep(2 - time_since_last_attempt)
+        await asyncio.sleep(2 - time_since_last_attempt)
 ...
-        time.sleep(delay)
+        await asyncio.sleep(delay)

If create_session() has to stay synchronous, run it off the event loop from the async caller instead, e.g. await asyncio.to_thread(create_session).
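
A non-blocking version of the backoff wait could look roughly like this. It is a sketch under assumptions: `backoff_delay`, `wait_before_retry`, and `demo` are hypothetical names, and the delay constants are shrunk to milliseconds so the demo finishes quickly; only the jitter formula follows the reviewed code.

```python
import asyncio
import random

INITIAL_RETRY_DELAY = 0.01  # seconds; tiny values so the demo runs quickly
MAX_RETRY_DELAY = 0.05


def backoff_delay(attempt: int) -> float:
    """Exponential backoff, capped at MAX_RETRY_DELAY, then jittered by 0.5x-1.5x."""
    base = min(MAX_RETRY_DELAY, INITIAL_RETRY_DELAY * (2 ** attempt))
    return base * (0.5 + random.random())


async def wait_before_retry(attempt: int) -> float:
    delay = backoff_delay(attempt)
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking it
    return delay


async def demo() -> int:
    ticks = 0

    async def ticker():
        # Stands in for other coroutines (e.g. other calls) sharing the loop.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.001)

    task = asyncio.create_task(ticker())
    await wait_before_retry(attempt=2)
    task.cancel()
    return ticks
```

The ticker keeps advancing while `wait_before_retry` is sleeping; with `time.sleep` in its place the count would stay at zero until the wait ended.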

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.8.2)

67-67: f-string without any placeholders

Remove extraneous f prefix

(F541)

Comment on lines +63 to +69
# Create initial Gladia session
try:
    create_session()
except Exception as e:
    logger.error("Failed to create initial Gladia session: %s", e)
    raise

🛠️ Refactor suggestion

One Gladia session per process ⇒ cross-call audio mixing

create_session() is run once at startup and its URL is reused for every incoming WebSocket. If two callers connect concurrently, their audio is pushed into the same Gladia session, producing blended transcripts.

Refactor so each handle_websocket() invocation spins up its own Gladia session:

-# Create initial Gladia session
-try:
-    create_session()
-except Exception as e:
-    ...
+async def get_fresh_gladia_ws():
+    url = create_session()
+    return await websockets.connect(url)

…and use await get_fresh_gladia_ws() inside handle_websocket.
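
As a rough illustration of the one-session-per-call idea, the sketch below stubs out `create_session()` (the real one performs an HTTP request) and uses a hypothetical `handle_call` in place of `handle_websocket`. The URL format is invented for the demo. It also moves the blocking helper to a worker thread with `asyncio.to_thread`, which is a design choice of this sketch rather than something the PR does.

```python
import asyncio
import itertools

_session_ids = itertools.count(1)


def create_session() -> str:
    """Stand-in for the blocking HTTP call that returns a Gladia WebSocket URL."""
    return f"wss://example.invalid/live/{next(_session_ids)}"  # hypothetical URL


async def handle_call(caller: str) -> tuple[str, str]:
    # One session per call: the blocking helper runs in a worker thread so the
    # event loop keeps serving other callers while the request is in flight.
    url = await asyncio.to_thread(create_session)
    # The real handler would now connect to `url` and stream this caller's audio.
    return caller, url


async def demo():
    return await asyncio.gather(handle_call("alice"), handle_call("bob"))
```

Each concurrent caller ends up with a distinct session URL, so their audio can no longer be blended into one transcript.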

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def get_fresh_gladia_ws():
    url = create_session()
    return await websockets.connect(url)

Comment on lines +34 to +41
"""Initialize a Gladia real-time transcription session."""
payload = {
    "encoding": "wav/ulaw",  # μ-law!
    "bit_depth": 8,          # 8-bit μ-law
    "sample_rate": 8000,     # matches Vonage
    "channels": 1,
}

⚠️ Potential issue

Encoding mismatch will break recognition

create_session() registers the session with encoding="wav/ulaw", but later the code accepts frames labelled audio/l16;rate=8000 (PCM). Forwarding raw L16 bytes to a μ-law session yields garbled speech.

Either:

  1. Change the payload to "encoding": "wav/pcm" with "bit_depth": 16 (to match L16), or
  2. Convert the incoming L16 stream to μ-law before sending.

Example quick fix (option 1):

     payload = {
-        "encoding": "wav/ulaw",  # μ-law!
-        "bit_depth": 8,          # 8-bit μ-law
+        "encoding": "wav/pcm",   # Linear PCM 16-bit
+        "bit_depth": 16,         # 16-bit samples (L16)

Failing to align the formats will silently degrade transcription quality.
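
One way to keep the two sides aligned is to derive the session payload from the content type the stream announces. The helper below is a hypothetical sketch: the name `gladia_audio_config` and the decision to reject anything other than L16 are assumptions, while the `wav/pcm` encoding value and the `audio/l16;rate=8000` label are taken from this thread.

```python
def gladia_audio_config(content_type: str) -> dict:
    """Derive Gladia session audio settings from a Vonage-style content type."""
    media, _, params = content_type.partition(";")
    # Parse "key=value" parameters such as "rate=8000".
    options = dict(p.strip().split("=", 1) for p in params.split(";") if "=" in p)
    rate = int(options.get("rate", 8000))
    if media.strip().lower() == "audio/l16":
        # L16 is 16-bit linear PCM, so bit depth must be 16, not 8.
        return {"encoding": "wav/pcm", "bit_depth": 16, "sample_rate": rate, "channels": 1}
    raise ValueError(f"unsupported content type: {content_type!r}")
```

Raising on unknown formats makes a mismatch visible at session creation instead of surfacing later as garbled transcripts.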

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"""Initialize a Gladia real-time transcription session."""
payload = {
    "encoding": "wav/pcm",  # Linear PCM 16-bit
    "bit_depth": 16,        # 16-bit samples (L16)
    "sample_rate": 8000,    # matches Vonage
    "channels": 1,
}
