feat: blog twilio + vonage + Gladia Solaria + Flask/FastAPI + Python + Go #65
jqueguiner wants to merge 9 commits into
…lio call transcription using Flask and Gladia's STT API
📝 (twilio-solaria-python-flask): create .gitignore file to exclude environment and virtual environment files
📝 (twilio-solaria-python-flask): add README.md for project setup and usage instructions
📝 (twilio-solaria-python-flask): add env_setup.txt for environment variable setup instructions
✅ (twilio-solaria-python-flask): add requirements.txt for project dependencies
♻️ (twilio-solaria-python-flask): implement server.py for handling WebSocket connections and transcription logic
📝 (twilio-solaria-python-flask): add TwiML example for configuring Twilio to stream audio to the server
Walkthrough
Multiple new projects were introduced for real-time transcription of Twilio calls using Gladia's Speech-to-Text API with native μ-law audio support. These include implementations in Python (Flask and FastAPI), Go, JavaScript, and TypeScript. Each project adds comprehensive documentation, environment setup instructions, dependency specifications, WebSocket proxy servers to handle Twilio audio streams, example TwiML configurations, and
Changes
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant Twilio
participant ProxyServer
participant GladiaAPI
Caller->>Twilio: Initiates call
Twilio->>ProxyServer: Opens WebSocket /media, streams base64 μ-law audio
ProxyServer->>GladiaAPI: Creates transcription session (HTTP POST)
ProxyServer->>GladiaAPI: Forwards decoded raw audio (WebSocket)
GladiaAPI-->>ProxyServer: Sends transcription results (WebSocket)
ProxyServer-->>Console: Prints/displays transcripts
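The "streams base64 μ-law audio" and "forwards decoded raw audio" steps in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's `server.py`: the helper name `extract_mulaw` is hypothetical, and the frame shape (an `event` field plus `media.payload`) is assumed to follow Twilio's Media Streams message format.

```python
import base64
import json
from typing import Optional


def extract_mulaw(message: str) -> Optional[bytes]:
    """Return raw mu-law bytes from one Twilio frame, or None for non-audio events."""
    event = json.loads(message)
    if event.get("event") != "media":
        # Frames such as "connected", "start" and "stop" carry no audio payload.
        return None
    # The payload is base64 text; decoding yields the bytes to forward upstream.
    return base64.b64decode(event["media"]["payload"])
```

The proxy would then send the returned bytes as a binary WebSocket message to the Gladia session, skipping frames for which the helper returns `None`.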
…flow in the application
Actionable comments posted: 4
🧹 Nitpick comments (10)
blogs/twilio-solaria-python-flask/.gitignore (1)
1-2: Enhance .gitignore with standard Python patterns.
Ignoring .env and .venv is great. To prevent tracking compiled files and cache artifacts, consider adding:
     .env
     .venv
    +__pycache__/
    +*.py[cod]
    +*.egg-info/
    +venv/
blogs/twilio-solaria-python-flask/src/env_setup.txt (1)
1-7: Clarify setup order and dotenv loading.
Good instructions on creating the .env. To streamline onboarding, you might remind users to install dependencies first (pip install -r requirements.txt) and note that calling load_dotenv() in server.py (via python-dotenv) will automatically load these variables at runtime.
blogs/twilio-solaria-python-flask/src/README.md (4)
1-4: Add working-directory context.
Since all commands assume you’re in src (where server.py and requirements.txt live), add a note at the top:
    +# Navigate to the project source directory
    +```bash
    +cd blogs/twilio-solaria-python-flask/src
    +```
This ensures relative paths resolve correctly.
14-32: Simplify Python environment setup.
The detailed pyenv workflow might be overkill for users with Python 3.8+. Consider replacing it with a standard venv flow:
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
This lowers the barrier for contributors who don’t use pyenv.
33-38: Include optional HTTP_PORT example.
Your env_setup.txt shows an optional HTTP_PORT, but the README’s environment step omits it. To maintain consistency, include:
    GLADIA_API_KEY=your_gladia_api_key_here
    # HTTP_PORT=5001
This helps users customize the server port.
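For illustration, reading these two variables at startup might look like the sketch below. It uses only the standard library, so it assumes the values are already in the process environment (for example after `load_dotenv()` has run). The function name `load_config` is hypothetical, and the fallback of 5001 simply mirrors the commented example above rather than a confirmed default.

```python
import os
from typing import Mapping


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the settings the server needs, failing early on a missing key."""
    api_key = env.get("GLADIA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GLADIA_API_KEY is not set; add it to your .env file")
    # HTTP_PORT is optional; 5001 is an assumed fallback for this sketch.
    port = int(env.get("HTTP_PORT", "5001"))
    return {"api_key": api_key, "port": port}
```

Passing the environment in as a mapping keeps the function easy to exercise without touching real environment variables.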
63-69: Emphasize secure WebSocket scheme for ngrok.
Twilio media streams require wss://. While ngrok http exposes an HTTPS endpoint, clarify the WebSocket URL should use wss://. For example:
    ngrok http 5000 --bind-tls=true
    # and then connect via wss://<your-ngrok-id>.ngrok.io/media
blogs/twilio-solaria-python-flask/blog.md (2)
181-182: Fix article grammar: “a ngrok URL”, not “an ngrok URL”
“ngrok” begins with a consonant sound, so the correct indefinite article is “a”.
    - | **Public URL** | Expose a WebSocket endpoint with ngrok or a cloud VM. |
    - | **8 kHz, 8-bit μ-law audio** | Exactly what Twilio streams – and what Gladia now consumes natively. |
    + | **Public URL** | Expose a WebSocket endpoint with ngrok or a cloud VM. |
    + | **8 kHz, 8-bit μ-law audio** | Exactly what Twilio streams – and what Gladia now consumes natively. |
🧰 Tools
🪛 LanguageTool
[misspelling] ~181-~181: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
172-190: TwiML exposition: bullet punctuation renders incorrectly in Markdown
The leading “- ” inside paragraphs creates “loose punctuation” warnings and renders as plain hyphens instead of list items. Convert the descriptive lines to a proper unordered list for readability.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~174-~174: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~176-~176: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~178-~178: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~181-~181: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~185-~185: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
blogs/twilio-solaria-python-flask/src/server.py (2)
152-155: Suppress exceptions more cleanly & avoid bare except
Use contextlib.suppress(Exception) or catch specific exceptions to satisfy linters (SIM105/E722) and avoid accidental masking of critical errors.
    - try:
    -     await gladia_ws.close()
    - except:
    -     pass
    + import contextlib
    + with contextlib.suppress(Exception):
    +     await gladia_ws.close()
🧰 Tools
🪛 Ruff (0.8.2)
152-155: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
154-154: Do not use bare except
(E722)
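A runnable sketch of the "catch specific exceptions" variant of the 152-155 suggestion is shown below. `FlakySocket` is a hypothetical stand-in for `gladia_ws`, and the choice of `ConnectionError`/`OSError` is an assumption for this example; the exception types a real WebSocket client raises on close may differ.

```python
import asyncio
import contextlib


class FlakySocket:
    """Hypothetical stand-in for gladia_ws whose close() always fails."""

    async def close(self) -> None:
        raise ConnectionError("already closed")


async def shutdown(ws) -> str:
    # Only connection-level errors are ignored; anything else still propagates.
    with contextlib.suppress(ConnectionError, OSError):
        await ws.close()
    return "closed"
```

Narrowing the suppressed types keeps genuine bugs (for example an `AttributeError` from a wrong object) visible, which is the concern behind the E722 warning.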
160-167: Creating a new event loop per connection is heavy & error-prone
Spawning an event loop in each thread complicates shutdown and resource usage. Prefer running Flask-Sock with an ASGI server (e.g., Hypercorn/Uvicorn) and use the single loop it provides.
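Short of moving to an ASGI server as recommended above, one alternative way to avoid a loop per connection is to start a single background loop once and submit coroutines to it from each handler thread. The sketch below shows that swapped-in pattern using only the standard library; `run_in_shared_loop` and `forward` are hypothetical names, and this is not what the PR's `server.py` does.

```python
import asyncio
import threading

# One event loop for the whole process, running in a daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


def run_in_shared_loop(coro, timeout: float = 5.0):
    """Schedule a coroutine on the shared loop and block until it finishes."""
    future = asyncio.run_coroutine_threadsafe(coro, _loop)
    return future.result(timeout)


async def forward(frame: bytes) -> int:
    """Placeholder for the async work done per audio frame."""
    await asyncio.sleep(0)
    return len(frame)
```

A synchronous handler thread could then call `run_in_shared_loop(forward(frame))` instead of creating and tearing down its own loop.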
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
blogs/twilio-solaria-python-flask/.gitignore (1 hunks)
blogs/twilio-solaria-python-flask/blog.md (1 hunks)
blogs/twilio-solaria-python-flask/src/README.md (1 hunks)
blogs/twilio-solaria-python-flask/src/env_setup.txt (1 hunks)
blogs/twilio-solaria-python-flask/src/requirements.txt (1 hunks)
blogs/twilio-solaria-python-flask/src/server.py (1 hunks)
blogs/twilio-solaria-python-flask/src/twiml_example.xml (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
blogs/twilio-solaria-python-flask/src/server.py
152-155: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
154-154: Do not use bare except
(E722)
171-174: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
173-173: Do not use bare except
(E722)
🪛 ast-grep (0.31.1)
blogs/twilio-solaria-python-flask/src/server.py
[warning] 183-183: Running flask app with host 0.0.0.0 could expose the server publicly.
Context: app.run(host="0.0.0.0", port=HTTP_PORT, debug=True)
Note: [CWE-668]: Exposure of Resource to Wrong Sphere [OWASP A01:2021]: Broken Access Control [REFERENCES]
https://owasp.org/Top10/A01_2021-Broken_Access_Control
(avoid_app_run_with_bad_host-python)
[warning] 183-183: Detected Flask app with debug=True. Do not deploy to production with this flag enabled as it will leak sensitive information. Instead, consider using Flask configuration variables or setting 'debug' using system environment variables.
Context: app.run(host="0.0.0.0", port=HTTP_PORT, debug=True)
Note: [CWE-489] Active Debug Code. [REFERENCES]
- https://labs.detectify.com/2015/10/02/how-patreon-got-hacked-publicly-exposed-werkzeug-debugger/
(debug-enabled-python)
🪛 GitHub Check: CodeQL
blogs/twilio-solaria-python-flask/src/server.py
[failure] 184-184: Flask app is run in debug mode
A Flask app appears to be run in debug mode. This may allow an attacker to run arbitrary code through the debugger.
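One way to address both warnings is to derive the host and debug flag from the environment rather than hard-coding them. The sketch below is an assumption-laden illustration: `run_settings` and the `HTTP_HOST` variable are hypothetical, `FLASK_DEBUG` is read manually here, and the 5001 fallback port is assumed.

```python
import os
from typing import Mapping


def run_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for app.run() with safe defaults."""
    # Debug stays off unless explicitly enabled for local development.
    debug = env.get("FLASK_DEBUG", "0").lower() in {"1", "true", "yes"}
    # Bind to loopback by default; a tunnel such as ngrok can still reach it.
    host = env.get("HTTP_HOST", "127.0.0.1")
    port = int(env.get("HTTP_PORT", "5001"))
    return {"host": host, "port": port, "debug": debug}
```

With this in place the entry point could call `app.run(**run_settings())`, so neither `debug=True` nor `0.0.0.0` appears in the committed code.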
🔇 Additional comments (2)
blogs/twilio-solaria-python-flask/src/requirements.txt (1)
1-6: Verify the Flask package version.
✅ Verification successful
You’ve pinned flask==3.1.0, but Flask’s stable releases are still in the 2.x series (e.g., 2.3.x). Please confirm that version 3.1.0 exists on PyPI or update to a valid release such as flask==2.3.4.
🏁 Script executed:
    #!/usr/bin/env bash
    # Check if Flask 3.1.0 exists on PyPI
    curl -s https://pypi.org/pypi/Flask/json | jq '.releases | has("3.1.0")'
Length of output: 79
Flask 3.1.0 is available on PyPI
Confirmed that flask==3.1.0 exists on PyPI, so the version pin is correct.
blogs/twilio-solaria-python-flask/src/twiml_example.xml (1)
1-9: TwiML example is correct.
The <Start><Stream> and <Dial> verbs align with Twilio Media Streams. Remember to replace the url placeholder (wss://jl.mydomain.com/media) and the phone number with your actual endpoint and destination before deploying.
Actionable comments posted: 1
🧹 Nitpick comments (9)
blogs/twilio-solaria-python-flask/blog.md (9)
1-6: Refine Title and Intro Formatting
Consider replacing the ampersand (&) with “and” for clarity and accessibility in the title. You may also clarify the parenthetical “(μ-law Native)” phrasing to improve readability.
9-18: Enhance Prerequisites Table
It would be helpful to specify exact version requirements for key dependencies (flask-sock, websockets, etc.) and note any required Twilio SDK or CLI versions. This will reduce friction when readers set up the environment.
21-56: Clarify Session Initialization Return Values
The create_session() snippet prints the session ID and returns the WebSocket URL, but readers may wonder how to capture and reuse both values. Consider expanding the example to show assigning data["id"] and the returned URL to variables (or a dict) so they can be used downstream.
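A sketch of a `create_session()` that returns both values might look like this. The payload fields come from the blog's own description of `/v2/live`; the host `api.gladia.io`, the `X-Gladia-Key` header name, and the helper names are assumptions for illustration and should be checked against Gladia's documentation. `urllib` is used instead of `requests` only to keep the example dependency-free.

```python
import json
import urllib.request
from typing import NamedTuple

GLADIA_LIVE_URL = "https://api.gladia.io/v2/live"  # assumed endpoint host


class GladiaSession(NamedTuple):
    id: str
    url: str


def build_session_request(api_key: str) -> urllib.request.Request:
    """Prepare the POST that asks for an 8 kHz, 8-bit mu-law live session."""
    payload = {"encoding": "wav/ulaw", "bit_depth": 8, "sample_rate": 8000}
    return urllib.request.Request(
        GLADIA_LIVE_URL,
        data=json.dumps(payload).encode(),
        # Header name is an assumption in this sketch.
        headers={"Content-Type": "application/json", "X-Gladia-Key": api_key},
        method="POST",
    )


def parse_session(data: dict) -> GladiaSession:
    """Keep both the session ID and the WebSocket URL for later use."""
    return GladiaSession(id=data["id"], url=data["url"])


def create_session(api_key: str) -> GladiaSession:
    with urllib.request.urlopen(build_session_request(api_key), timeout=10) as resp:
        return parse_session(json.load(resp))
```

Callers can then unpack the result (`session_id, ws_url = create_session(key)`) or store it whole, which also covers the later comment about persisting session details.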
62-92: Improve Mermaid Diagram Clarity
The sequence diagram effectively illustrates flow, but adding labels for the WebSocket URL and the “media” event would boost comprehension. Also verify that your blogging platform supports Mermaid; if not, include a rendered image or fallback code block.
138-143: Persist Gladia Session Details
You call create_session() at startup but don’t store its return value. To handle reconnections or expiration, assign both the id and the URL into the gladia_session dict (or similar) so you can reuse or refresh them as needed.
194-199: Mention TwiML Content-Type Requirement
Add a reminder that the TwiML endpoint must return Content-Type: application/xml for Twilio to correctly parse the instructions. This small detail can prevent runtime errors.
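As a rough illustration of this point, a helper that builds the TwiML body and pairs it with the header could look like the following. The function name `twiml_response` and the example URL and number are hypothetical; the element layout follows the `<Start><Stream>` plus `<Dial>` structure described in this review.

```python
import xml.etree.ElementTree as ET


def twiml_response(stream_url: str, dial_number: str):
    """Return (body, headers) for a TwiML document that starts a media stream."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", url=stream_url)
    ET.SubElement(response, "Dial").text = dial_number
    body = ET.tostring(response, encoding="unicode")
    # The explicit XML content type is what lets Twilio parse the instructions.
    return body, {"Content-Type": "application/xml"}
```

In a Flask view the tuple can be returned as `body, 200, headers`.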
231-239: Document .env File Setup
Since .env is git-ignored, explicitly instruct readers to create it with GLADIA_API_KEY (and optional HTTP_PORT) before running pip install and python server.py.
260-265: Add Post-Call Transcript Sample
You mention fetching full transcripts via GET /v2/live/:id. Consider including a brief requests.get example demonstrating how to call that endpoint and process the JSON response.
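A minimal sketch of such a call is given below. It uses `urllib` from the standard library rather than `requests.get` so that it runs without extra dependencies; the host, the `X-Gladia-Key` header name, and both function names are assumptions, and the shape of the returned JSON is deliberately not interpreted here.

```python
import json
import urllib.request

GLADIA_API_BASE = "https://api.gladia.io"  # assumed host for this sketch


def transcript_request(session_id: str, api_key: str) -> urllib.request.Request:
    """Prepare GET /v2/live/:id for a finished session."""
    return urllib.request.Request(
        f"{GLADIA_API_BASE}/v2/live/{session_id}",
        headers={"X-Gladia-Key": api_key},  # header name is an assumption
        method="GET",
    )


def fetch_transcript(session_id: str, api_key: str) -> dict:
    """Download the post-call result and decode it as JSON."""
    with urllib.request.urlopen(transcript_request(session_id, api_key), timeout=10) as resp:
        return json.load(resp)
```

The session ID passed in is the `id` value returned when the live session was created.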
269-272: Link to Example Repository
At the wrap-up, include a direct link to the GitHub repo or branch so readers can easily clone and run the complete example end-to-end.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
blogs/twilio-solaria-python-flask/blog.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
blogs/twilio-solaria-python-flask/blog.md
[uncategorized] ~208-~208: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~210-~210: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~212-~212: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~215-~215: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~219-~219: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
Actionable comments posted: 2
♻️ Duplicate comments (1)
blogs/twilio-solaria-python-flask/blog.md (1)
105-151: 🛠️ Refactor suggestion
Include handle_websocket implementation and correct session storage
- The snippet invokes handle_websocket(ws) but that function’s code isn’t shown; readers need it to understand how Twilio frames are forwarded to Gladia.
- create_session() returns a WebSocket URL but isn’t assigned to gladia_session["url"], so the Gladia connection can’t be established later. Please capture and store the returned URL.
🧹 Nitpick comments (2)
blogs/twilio-solaria-python-flask/blog.md (2)
1-1: Use a single # for the main title and hyphenate “Real-Time”
Currently the title is written as a level-2 heading (##) and uses “Real Time” without a hyphen. For consistency and accessibility, consider:
    # How to Transcribe Twilio Calls in Real-Time with Flask, Python & Gladia (μ-law Native)
2-4: Refine the opening paragraph for clarity and flow
The introduction is informative but could be tightened. For example:
    Twilio’s Voice Media Streams deliver 8 kHz, 8-bit μ-law audio. Gladia’s real-time STT API ingests it natively (no resampling or decoding required) while maintaining sub-300 ms latency.
This version is slightly shorter and emphasizes the key takeaway.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
blogs/twilio-solaria-python-flask/blog.md (1 hunks)
🔇 Additional comments (10)
blogs/twilio-solaria-python-flask/blog.md (10)
5-5: Approve the /v2/live endpoint description
The explanation of Gladia’s encoding: "wav/ulaw", bit_depth: 8, and sample_rate: 8000 parameters is accurate and succinct.
9-18: Approve prerequisites table
The table clearly conveys all required components (Gladia API key, Twilio account, Python 3.12+, etc.) along with their rationale. Markdown formatting is correct.
58-59: Approve the “Why no resample / decode?” note
This callout effectively emphasizes the performance advantage of forwarding raw μ-law frames.
67-92: Approve system architecture Mermaid diagram
The sequenceDiagram is well-formed, the participants are clear, and the flow accurately represents the end-to-end interaction.
194-204: Approve TwiML <Start><Stream> example
The XML is valid, the comments annotate each element clearly, and it matches Twilio’s requirements for secure WebSocket streams.
206-224: Approve TwiML explanation bullets
Each element (<Response>, <Start>, <Stream>, <Dial>) is described accurately and with appropriate detail.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~208-~208: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~210-~210: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~212-~212: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~215-~215: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~219-~219: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
231-243: Approve “Expose & test” instructions
The shell commands for installing dependencies, running the proxy, and tunneling (ngrok) are clear and ready to copy/paste.
248-256: Approve sample output
The console-log snippet demonstrates the expected live-transcription output, which will help readers validate their setup.
260-266: Approve next steps suggestions
The proposed extensions (add-ons, dual channels, post-call JSON, scaling) are on point and encourage readers to explore further.
269-273: Approve wrap-up and emphasis on simplicity
The closing statement succinctly reinforces that live transcription is now literally “base64-decode and forward.” Great call-to-action.
…iption of Twilio calls using FastAPI and Gladia
📝 (twilio-solaria-python-fastapi): create README and environment setup instructions for the project
🔧 (twilio-solaria-python-fastapi): add .gitignore file to exclude environment and virtual environment files
✅ (twilio-solaria-python-fastapi): include requirements.txt for project dependencies
♻️ (twilio-solaria-python-fastapi): implement server.py for handling WebSocket connections and transcription logic
🌐 (twilio-solaria-python-fastapi): add TwiML example for configuring Twilio to stream audio to the server
…ilio calls using Go and Gladia, along with environment setup instructions, .gitignore, and necessary Go module files
Actionable comments posted: 5
🧹 Nitpick comments (19)
blogs/twilio-solaria-go/src/README.md (3)
5-11: Improve prerequisites bullet phrasing
The second bullet reads “Twilio account + voice-enabled number.” For clarity and consistency, consider rephrasing to explicitly mention “a Twilio account with a voice-enabled phone number.”
    - - **Twilio account + voice-enabled number**
    + - **A Twilio account with a voice-enabled phone number**
29-34: Clarify the location of twiml_example.xml
It may not be obvious where to find twiml_example.xml. Recommend adding a relative path or a link to the file in the repository.
    - - Use the contents of `twiml_example.xml` as your TwiML
    + - Use the contents of `src/twiml_example.xml` (or link to its path) as your TwiML
35-40: Enhance technical notes grammar
Missing an article in the first bullet. Consider adding “a” before “standard Go HTTP server.”
    - - The server uses standard Go HTTP server with gorilla/websocket for WebSocket support
    + - The server uses a standard Go HTTP server with gorilla/websocket for WebSocket support
🧰 Tools
🪛 LanguageTool
[uncategorized] ~37-~37: You might be missing the article “a” here.
Context: ... ## Technical Notes - The server uses standard Go HTTP server with gorilla/websocket f...
(AI_EN_LECTOR_MISSING_DETERMINER_A)
blogs/twilio-solaria-python-fastapi/src/server.py (3)
114-128: Tight 1 ms polling loop can peg a CPU core
await asyncio.wait_for(gladia_ws.recv(), 0.001) inside a while True loop results in 1 000 polling iterations per second when no data is available, needlessly burning CPU.
Consider either:
- Running a separate listener task that does a blocking await gladia_ws.recv() and await websocket.send_text() (if needed), or
- Increasing the timeout to a sensible value (e.g. 0.1 – 0.25 s) and breaking the inner while after one successful receive.
This keeps latency low while preventing a busy-wait scenario.
[performance]
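The first option, a dedicated listener task, can be sketched as below. `FakeGladiaSocket` is a hypothetical stand-in that only mimics the `recv()` shape of `gladia_ws`, and the `None` sentinel used to end the loop is specific to this example; a real client would instead stop when the connection closes.

```python
import asyncio


class FakeGladiaSocket:
    """Hypothetical stand-in exposing an awaitable recv(), like gladia_ws."""

    def __init__(self, messages):
        self._queue = asyncio.Queue()
        for message in messages:
            self._queue.put_nowait(message)

    async def recv(self):
        return await self._queue.get()


async def listen(ws, on_message) -> None:
    # recv() suspends the task until data arrives: no timeout, no busy-wait.
    while True:
        message = await ws.recv()
        if message is None:  # sentinel used only by this sketch
            break
        on_message(message)


async def demo() -> list:
    received = []
    ws = FakeGladiaSocket(["hello", "world", None])
    # The listener runs as its own task, alongside the Twilio receive loop.
    listener = asyncio.create_task(listen(ws, received.append))
    await listener
    return received
```

Because the task sleeps inside `recv()`, the event loop stays free to service the Twilio socket, and transcripts are still handled as soon as they arrive.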
151-154: Replace bare except … pass with contextlib.suppress or explicit exception handling
A blanket except: masks all exceptions, including KeyboardInterrupt and bugs you’d really want to know about. Ruff (SIM105 / E722) already flags this.
    -from contextlib import suppress
    -with suppress(Exception):
    -    await gladia_ws.close()
    +from contextlib import suppress
    +
    +with suppress(Exception):
    +    await gladia_ws.close()
🧰 Tools
🪛 Ruff (0.8.2)
151-154: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
153-153: Do not use bare except
(E722)
71-78: Prefer logger over print() for transcript output
Using logger.info() keeps output format consistent, respects log levels, and allows redirection to structured log sinks.
    - print(f"📝 Transcript: {transcript}")
    + logger.info("📝 Transcript: %s", transcript)
blogs/twilio-solaria-python-fastapi/blog.md (1)
248-251: Minor wording nit – “an ngrok URL” → “a ngrok URL”
The word “ngrok” begins with a consonant sound, so the indefinite article should be “a”.
    - - The domain should be your public domain (e.g., an ngrok URL or a custom domain).
    + - The domain should be your public domain (e.g., a ngrok URL or a custom domain).
🧰 Tools
🪛 LanguageTool
[misspelling] ~250-~250: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
blogs/twilio-solaria-go/src/main.go (6)
73-76: Add error handling for w.Write() call
The healthCheck function doesn't check the error returned by w.Write(). Even though write errors are rare in this context, it's good practice to check them.
     func healthCheck(w http.ResponseWriter, r *http.Request) {
         w.Header().Set("Content-Type", "application/json")
    -    w.Write([]byte(`{"status":"ok","service":"twilio-gladia-transcription"}`))
    +    _, err := w.Write([]byte(`{"status":"ok","service":"twilio-gladia-transcription"}`))
    +    if err != nil {
    +        log.Printf("Error writing health check response: %v", err)
    +    }
     }
🧰 Tools
🪛 golangci-lint (1.64.8)
75-75: Error return value of w.Write is not checked
(errcheck)
211-215: Consider restricting WebSocket origin if appropriate
The WebSocket upgrader allows connections from any origin (CheckOrigin always returns true). While this might be necessary for your use case with Twilio, consider if a more restrictive policy would be appropriate for security.
     // Configure WebSocket upgrader
     upgrader := websocket.Upgrader{
    -    CheckOrigin: func(r *http.Request) bool { return true },
    +    CheckOrigin: func(r *http.Request) bool {
    +        // Allow Twilio domains and your allowed domains
    +        origin := r.Header.Get("Origin")
    +        allowedOrigins := []string{"https://your-domain.com", "https://twilio.com"}
    +        for _, allowed := range allowedOrigins {
    +            if allowed == origin {
    +                return true
    +            }
    +        }
    +        // For development or if you absolutely need to allow all origins
    +        // return true
    +        log.Printf("Rejected WebSocket connection from origin: %s", origin)
    +        return false
    +    },
         ReadBufferSize:  1024,
         WriteBufferSize: 1024,
     }
64-69: Consider verifying response structure before using
The function assumes the API response will always contain the expected fields. Consider adding validation to ensure the response contains the required fields before using them.
     var data GladiaSession
     if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
         return GladiaSession{}, fmt.Errorf("failed to decode response: %w", err)
     }
    +// Validate response data
    +if data.ID == "" || data.URL == "" {
    +    return GladiaSession{}, fmt.Errorf("invalid session data: missing ID or URL")
    +}
     log.Printf("🛰 Gladia session ID: %s", data.ID)
     return data, nil
239-241: Add error handling for w.Write() call
Similar to the healthCheck function, this code doesn't check the error returned by w.Write().
     // For regular HTTP requests to root, return a simple info page
     w.Header().Set("Content-Type", "text/plain")
    -w.Write([]byte("Twilio-Gladia Transcription Server\n\nAvailable endpoints:\n- /media (WebSocket): Connect Twilio Media Streams\n- /health (HTTP): Health check endpoint"))
    +_, err := w.Write([]byte("Twilio-Gladia Transcription Server\n\nAvailable endpoints:\n- /media (WebSocket): Connect Twilio Media Streams\n- /health (HTTP): Health check endpoint"))
    +if err != nil {
    +    log.Printf("Error writing response: %v", err)
    +}
🧰 Tools
🪛 golangci-lint (1.64.8)
240-240: Error return value of w.Write is not checked
(errcheck)
98-116: Consider adding metrics for audio processing
This function processes audio data but doesn't track metrics like the number of audio frames processed or the amount of data. Adding simple counters could help with monitoring the system's performance.
     // At the top of the file, with other var declarations
     var (
         gladiaAPIKey string
         session      GladiaSession
    +    stats struct {
    +        audioFramesProcessed int
    +        audioBytesSent       int64
    +        sync.Mutex
    +    }
     )

     // In the processMessage function
     func processMessage(message []byte, gladiaConn *websocket.Conn) {
         // ...existing code...
         if err := gladiaConn.WriteMessage(websocket.BinaryMessage, mulaw); err != nil {
             log.Printf("Error sending to Gladia: %v", err)
         }
    +    // Update metrics
    +    stats.Lock()
    +    stats.audioFramesProcessed++
    +    stats.audioBytesSent += int64(len(mulaw))
    +    stats.Unlock()
     }

     // Add a stats endpoint in main()
    +http.HandleFunc("/stats", func(w http.ResponseWriter, r *http.Request) {
    +    w.Header().Set("Content-Type", "application/json")
    +    stats.Lock()
    +    data, err := json.Marshal(map[string]interface{}{
    +        "audio_frames_processed": stats.audioFramesProcessed,
    +        "audio_bytes_sent":       stats.audioBytesSent,
    +    })
    +    stats.Unlock()
    +    if err != nil {
    +        http.Error(w, err.Error(), http.StatusInternalServerError)
    +        return
    +    }
    +    _, err = w.Write(data)
    +    if err != nil {
    +        log.Printf("Error writing stats response: %v", err)
    +    }
    +})
119-131: Consider adding transcript persistence option
Currently, transcripts are only logged. Consider adding an option to persist them (database, file, or webhook) for later retrieval or processing.
If you decide to implement this, you could add configuration through environment variables and implement a simple interface for different storage backends.
    // Example interface for transcript storage
    type TranscriptStore interface {
        Save(transcript string) error
    }

    // Example implementation for file storage
    type FileStore struct {
        file *os.File
    }

    func NewFileStore(path string) (*FileStore, error) {
        f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
        if err != nil {
            return nil, err
        }
        return &FileStore{file: f}, nil
    }

    func (fs *FileStore) Save(transcript string) error {
        _, err := fmt.Fprintf(fs.file, "%s\t%s\n", time.Now().Format(time.RFC3339), transcript)
        return err
    }
Then modify handleGladia to use the store when a transcript is finalized.
blogs/twilio-solaria-go/blog.md (6)
130-198: Provide full proxy implementation context
The code snippets illustrate message processing but omit the HTTP route registration and WebSocket upgrade needed to accept Twilio connections. Readers may be unsure how to wire these pieces together. Consider adding a minimal example using http.HandleFunc("/media", ...) with websocket.Upgrader.
184-197: Align function signature with usage
handleGladia returns the final transcript string, but its caller in handleWebSocket ignores this value. Either remove the return value (and log inside) or have the caller utilize it (e.g., broadcast transcripts or send to a channel).
199-247: Enhance cancellation and error propagation
Each goroutine exits on an I/O error but doesn’t signal the other, which can leave wg.Wait() blocked. Consider using a context.Context with cancellation or closing one WebSocket connection upon error to ensure both loops terminate cleanly.
248-275: Show complete server startup
After loading env vars and initializing the session, the snippet doesn’t show how to register routes or start the HTTP server. For a runnable example, append something like:
     func main() {
         // … existing setup …
    +    http.HandleFunc("/media", func(w http.ResponseWriter, r *http.Request) {
    +        upgrader := websocket.Upgrader{}
    +        conn, err := upgrader.Upgrade(w, r, nil)
    +        if err != nil {
    +            log.Printf("WebSocket upgrade error: %v", err)
    +            return
    +        }
    +        go handleWebSocket(conn)
    +    })
    +
    +    log.Printf("🚀 Starting server on 0.0.0.0:%s", port)
    +    if err := http.ListenAndServe(":"+port, nil); err != nil {
    +        log.Fatal("Server failed:", err)
    +    }
     }
This completes the end-to-end example.
298-316: Nitpick: standardize list formatting
Some bullet items have two spaces before the marker, which can render inconsistently across Markdown parsers. Consider removing trailing spaces and using a single space before each -:
    - ... call flow. - `<Start>`: ...
    + ... call flow.
    + - `<Start>`: ...
This will improve cross-platform readability.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~300-~300: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~302-~302: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~304-~304: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~307-~307: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~311-~311: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
321-341: Recommend initializing Go modules
Before running `go build`, users should initialize a Go module if they haven’t:
go mod init github.com/your-org/your-project
go mod tidy
Including this helps avoid module resolution errors.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
blogs/twilio-solaria-go/src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
- blogs/twilio-solaria-go/.gitignore (1 hunks)
- blogs/twilio-solaria-go/blog.md (1 hunks)
- blogs/twilio-solaria-go/src/README.md (1 hunks)
- blogs/twilio-solaria-go/src/env_setup.txt (1 hunks)
- blogs/twilio-solaria-go/src/go.mod (1 hunks)
- blogs/twilio-solaria-go/src/main.go (1 hunks)
- blogs/twilio-solaria-go/src/twiml_example.xml (1 hunks)
- blogs/twilio-solaria-python-fastapi/.gitignore (1 hunks)
- blogs/twilio-solaria-python-fastapi/blog.md (1 hunks)
- blogs/twilio-solaria-python-fastapi/src/README.md (1 hunks)
- blogs/twilio-solaria-python-fastapi/src/env_setup.txt (1 hunks)
- blogs/twilio-solaria-python-fastapi/src/requirements.txt (1 hunks)
- blogs/twilio-solaria-python-fastapi/src/server.py (1 hunks)
- blogs/twilio-solaria-python-fastapi/src/twiml_example.xml (1 hunks)
✅ Files skipped from review due to trivial changes (9)
- blogs/twilio-solaria-go/.gitignore
- blogs/twilio-solaria-python-fastapi/src/requirements.txt
- blogs/twilio-solaria-python-fastapi/src/env_setup.txt
- blogs/twilio-solaria-go/src/env_setup.txt
- blogs/twilio-solaria-go/src/go.mod
- blogs/twilio-solaria-python-fastapi/src/twiml_example.xml
- blogs/twilio-solaria-go/src/twiml_example.xml
- blogs/twilio-solaria-python-fastapi/.gitignore
- blogs/twilio-solaria-python-fastapi/src/README.md
🧰 Additional context used
🪛 golangci-lint (1.64.8)
blogs/twilio-solaria-go/src/main.go
75-75: Error return value of w.Write is not checked
(errcheck)
240-240: Error return value of w.Write is not checked
(errcheck)
🪛 Ruff (0.8.2)
blogs/twilio-solaria-python-fastapi/src/server.py
151-154: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
153-153: Do not use bare except
(E722)
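Both Ruff findings point at the same cleanup block in `server.py`. A minimal sketch of the suggested shape follows, assuming the suppressed call is a best-effort close on a connection object. The `FlakyConnection` class and `close_quietly` helper are hypothetical stand-ins, not code from the PR, and the sketch narrows the suppressed type instead of swallowing every exception:

```python
import contextlib


class FlakyConnection:
    """Hypothetical stand-in for a socket whose close() may fail."""

    def __init__(self, fail: bool) -> None:
        self.fail = fail
        self.closed = False

    def close(self) -> None:
        if self.fail:
            raise ConnectionError("peer already gone")
        self.closed = True


def close_quietly(conn: FlakyConnection) -> None:
    # Same effect as try/except/pass, but explicit about which error is ignored.
    with contextlib.suppress(ConnectionError):
        conn.close()
```

Naming the exception type addresses E722 as well, since nothing is caught by a bare `except`.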
🪛 LanguageTool
blogs/twilio-solaria-go/blog.md
[uncategorized] ~300-~300: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~302-~302: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~304-~304: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~307-~307: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~311-~311: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
blogs/twilio-solaria-python-fastapi/blog.md
[uncategorized] ~243-~243: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~245-~245: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~247-~247: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~250-~250: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~254-~254: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
blogs/twilio-solaria-go/src/README.md
[uncategorized] ~37-~37: You might be missing the article “a” here.
Context: ... ## Technical Notes - The server uses standard Go HTTP server with gorilla/websocket f...
(AI_EN_LECTOR_MISSING_DETERMINER_A)
🔇 Additional comments (14)
blogs/twilio-solaria-go/src/README.md (8)
1-4: Clear and descriptive title/overview
The title and introduction succinctly describe the project’s purpose and scope.
12-21: Setup: Go dependencies instructions look good
The instructions for installing Go and downloading dependencies are clear and accurate.
22-28: Environment variables setup is clear
Creating a `.env` next to `main.go` with the required keys is well explained.
41-56: Build and run instructions are solid
The steps for building and running the server (with and without specifying a port) are clear.
58-68: Exposing the server via ngrok is well explained
Instructions for exposing your local port with ngrok (including custom domains) are accurate.
70-76: Twilio webhook update and test steps are clear
Updating the TwiML URL and testing the call flow is straightforward and complete.
77-84: How it works section is informative
The workflow description accurately captures the end-to-end streaming and transcription process.
85-90: Next steps suggestions add value
Future enhancements are well scoped and provide a clear roadmap for expanding functionality.
blogs/twilio-solaria-go/src/main.go (1)
195-199: LGTM: Good fallback to default port
Good practice to provide a default port when the environment variable is not set.
blogs/twilio-solaria-go/blog.md (5)
1-2: Inconsistent PR description vs. file content
The PR title and objectives mention a Python/Flask + FastAPI sample, but this blog post covers a Go implementation under `twilio-solaria-go`. Please verify whether this file belongs in this PR or if the PR description needs updating to include the Go sample.
Likely an incorrect or invalid review comment.
87-117: Clear system architecture diagram
The Mermaid sequence diagram accurately represents the Twilio → GoServer → Gladia flow and greatly aids comprehension. Well done!
284-296: TwiML example is accurate and clear
The XML snippet properly demonstrates how to start Twilio Media Streams and matches the `/media` endpoint. It’s ready for readers to copy and use.
356-362: Useful next steps outlined
The suggestions for add-ons, dual-channel audio, post-call JSON retrieval, and scaling are practical and give readers clear directions for extending the solution. Nice work!
365-369: Strong wrap-up
The conclusion succinctly highlights the “base64-decode and forward” advantage and reinforces the ease and performance of the approach. It’s an engaging way to close the post.
@app.websocket("/{remaining_path:path}")
async def catch_all_websocket(websocket: WebSocket, remaining_path: str):
    """Catch-all handler for WebSocket connections."""
    await websocket.accept()
    logger.info(f"🔌 Catch-all WebSocket connected to /{remaining_path}")

    await handle_websocket(websocket)
Catch-all WebSocket route exposes the service to unintended traffic
/{remaining_path:path} happily accepts any WebSocket path and forwards raw bytes to Gladia:
• Anyone on the internet can push arbitrary data to your Gladia quota.
• Malformed frames may crash the proxy.
• Twilio only connects to the exact /media endpoint.
Recommendation: delete the catch-all or restrict it (e.g., to authenticated admin clients).
[security]
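One way to restrict the route is an explicit allow-list checked before any bytes are proxied. The sketch below is framework-free and only illustrates the decision; `ALLOWED_WS_PATHS` and `close_code_for` are hypothetical names, and the single `/media` entry is taken from the comment above rather than from the server code:

```python
from typing import Optional

# Assumption: Twilio only ever connects to /media, as stated in the review comment.
ALLOWED_WS_PATHS = frozenset({"/media"})
POLICY_VIOLATION = 1008  # WebSocket close code for a policy violation (RFC 6455)


def close_code_for(path: str) -> Optional[int]:
    """Return None to accept the connection, or a close code to reject it."""
    normalized = "/" + path.strip("/")
    if normalized in ALLOWED_WS_PATHS:
        return None
    return POLICY_VIOLATION
```

A handler would call this first and close the socket with the returned code instead of forwarding audio to Gladia.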
# Create initial Gladia session
try:
    create_session()
except Exception as e:
    logger.error("Failed to create initial Gladia session: %s", e)
    raise
🛠️ Refactor suggestion
Create a fresh Gladia session per call to avoid cross-call mixing & token-expiry issues
create_session() is invoked only once at startup, so every concurrent Twilio call re-uses the same session ID & WebSocket URL.
• Audio from different callers will be multiplexed into a single Gladia session, producing garbled transcripts.
• Gladia session tokens are short-lived; once the token expires every new call will fail to connect.
Move the session creation inside handle_websocket() (or a dedicated connection factory) so that each call gets its own isolated, fresh session.
@@ async def handle_websocket(websocket: WebSocket):
- # Connect to Gladia for this connection
- try:
- gladia_ws = await websockets.connect(gladia_session["url"])
- logger.info(f"Connected to Gladia session {gladia_session['id']}")
+ # Create a dedicated Gladia session for this call
+ try:
+ session_url = create_session()
+ gladia_ws = await websockets.connect(session_url)
+ logger.info(f"Connected to new Gladia session {gladia_session['id']}")
Committable suggestion skipped: line range outside the PR's diff.
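The per-call idea can be sketched without any network code. In the example below the session factory is injected, so each call handler receives its own session and nothing is read from a shared global (the diff above still logs the global `gladia_session['id']`). `handle_call` and `make_fake_factory` are hypothetical names, and the fake factory merely stands in for the HTTP POST that creates a Gladia session:

```python
import asyncio
import itertools
from typing import Awaitable, Callable, Dict, List

Session = Dict[str, str]


async def handle_call(
    create_session: Callable[[], Awaitable[Session]],
    call_sid: str,
) -> Session:
    """Create a dedicated session for one call and return what was used."""
    session = await create_session()  # fresh id and URL for this call only
    # A real handler would now connect to session["url"] and proxy audio.
    return {"call": call_sid, **session}


def make_fake_factory() -> Callable[[], Awaitable[Session]]:
    """Hypothetical stand-in for the request that creates a Gladia session."""
    counter = itertools.count(1)

    async def create() -> Session:
        n = next(counter)
        return {"id": f"session-{n}", "url": f"wss://example.invalid/live/{n}"}

    return create


async def demo() -> List[Session]:
    factory = make_fake_factory()
    # Two concurrent calls end up with two distinct sessions.
    return await asyncio.gather(
        handle_call(factory, "call-a"),
        handle_call(factory, "call-b"),
    )
```

Keeping the session local to the handler also means a failed session creation affects only the call that triggered it.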
// Twilio -> Gladia
go func() {
    defer wg.Done()
    for {
        _, msg, err := twilioConn.ReadMessage()
        if err != nil {
            log.Printf("Error reading from Twilio: %v", err)
            return
        }
        processMessage(msg, gladiaConn)
    }
}()

// Gladia -> transcripts
go func() {
    defer wg.Done()
    for {
        _, msg, err := gladiaConn.ReadMessage()
        if err != nil {
            log.Printf("Error reading from Gladia: %v", err)
            return
        }
        handleGladia(msg)
    }
}()

wg.Wait()
}
🛠️ Refactor suggestion
Use context for cancellation between goroutines
The current implementation creates two goroutines that run independently. If one fails, the other will continue running until it encounters its own error. Consider using a context with cancellation to coordinate the shutdown of both goroutines when either one fails.
func handleWebSocket(twilioConn *websocket.Conn) {
clientInfo := twilioConn.RemoteAddr().String()
log.Printf("🔌 Twilio WebSocket connected from %s", clientInfo)
defer twilioConn.Close()
// Connect to Gladia
dialer := websocket.DefaultDialer
gladiaConn, _, err := dialer.Dial(session.URL, nil)
if err != nil {
log.Printf("Failed to connect to Gladia: %v", err)
return
}
defer gladiaConn.Close()
log.Printf("Connected to Gladia session %s", session.ID)
+ ctx, cancel := context.WithCancel(context.Background())
+ defer cancel()
var wg sync.WaitGroup
wg.Add(2)
// Twilio -> Gladia
go func() {
defer wg.Done()
+ defer cancel() // Signal other goroutine to stop when this one exits
for {
+ select {
+ case <-ctx.Done():
+ return
+ default:
+ }
_, msg, err := twilioConn.ReadMessage()
if err != nil {
log.Printf("Error reading from Twilio: %v", err)
return
}
processMessage(msg, gladiaConn)
}
}()
// Gladia -> transcripts
go func() {
defer wg.Done()
+ defer cancel() // Signal other goroutine to stop when this one exits
for {
+ select {
+ case <-ctx.Done():
+ return
+ default:
+ }
_, msg, err := gladiaConn.ReadMessage()
if err != nil {
log.Printf("Error reading from Gladia: %v", err)
return
}
handleGladia(msg)
}
}()
wg.Wait()
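}
Note that `ReadMessage` blocks, so the `ctx.Done()` check only runs between messages; closing both connections when either loop exits is what actually unblocks the other read.
Don't forget to add "context" to your imports: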
import (
"bytes"
+ "context"
"encoding/base64"
// ...
)
req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
if err != nil {
    return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
}
req.Header.Set("X-Gladia-Key", gladiaAPIKey)
req.Header.Set("Content-Type", "application/json")
client := &http.Client{Timeout: 10 * time.Second}
resp, err := client.Do(req)
if err != nil {
    return GladiaSession{}, fmt.Errorf("session init request failed: %w", err)
}
🛠️ Refactor suggestion
Add retry logic for Gladia session initialization
The current implementation makes a single attempt to create a Gladia session. Consider adding retry logic to handle temporary network issues or API unavailability.
func createSession() (GladiaSession, error) {
+ maxRetries := 3
+ retryDelay := 2 * time.Second
+
payload := map[string]interface{}{ // μ-law, 8-bit, 8 kHz, mono
"encoding": "wav/ulaw",
"bit_depth": 8,
"sample_rate": 8000,
"channels": 1,
}
body, err := json.Marshal(payload)
if err != nil {
return GladiaSession{}, fmt.Errorf("failed to marshal payload: %w", err)
}
- req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
- if err != nil {
- return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
- }
- req.Header.Set("X-Gladia-Key", gladiaAPIKey)
- req.Header.Set("Content-Type", "application/json")
- client := &http.Client{Timeout: 10 * time.Second}
- resp, err := client.Do(req)
- if err != nil {
- return GladiaSession{}, fmt.Errorf("session init request failed: %w", err)
- }
+
+ var lastErr error
+ for attempt := 1; attempt <= maxRetries; attempt++ {
+ req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
+ if err != nil {
+ return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
+ }
+ req.Header.Set("X-Gladia-Key", gladiaAPIKey)
+ req.Header.Set("Content-Type", "application/json")
+ client := &http.Client{Timeout: 10 * time.Second}
+ resp, err := client.Do(req)
+ if err != nil {
+ lastErr = err
+ log.Printf("Attempt %d: session init request failed: %v", attempt, err)
+ if attempt < maxRetries {
+ time.Sleep(retryDelay)
+ // Exponential backoff
+ retryDelay *= 2
+ continue
+ }
+ return GladiaSession{}, fmt.Errorf("session init request failed after %d attempts: %w", maxRetries, err)
+ }
+ defer resp.Body.Close()
+
+ if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+ bodyBytes, _ := io.ReadAll(resp.Body)
+ lastErr = fmt.Errorf("bad status code: %d - %s", resp.StatusCode, string(bodyBytes))
+ log.Printf("Attempt %d: %v", attempt, lastErr)
+ if attempt < maxRetries {
+ time.Sleep(retryDelay)
+ // Exponential backoff
+ retryDelay *= 2
+ continue
+ }
+ return GladiaSession{}, lastErr
+ }
+
+ var data GladiaSession
+ if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
+ return GladiaSession{}, fmt.Errorf("failed to decode response: %w", err)
+ }
+ log.Printf("🛰 Gladia session ID: %s", data.ID)
+ return data, nil
+ }
+
+ return GladiaSession{}, lastErr
}
### 1 — Initiate a Gladia live session

```go
import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
    "time"
)

const (
    gladiaInitURL = "https://api.gladia.io/v2/live"
)

// GladiaSession stores session information
type GladiaSession struct {
    ID  string `json:"id"`
    URL string `json:"url"`
}

// createSession initializes a Gladia real-time transcription session and returns the WebSocket URL.
func createSession() (GladiaSession, error) {
    payload := map[string]interface{}{ // μ-law, 8-bit, 8 kHz, mono
        "encoding":    "wav/ulaw",
        "bit_depth":   8,
        "sample_rate": 8000,
        "channels":    1,
    }
    body, err := json.Marshal(payload)
    if err != nil {
        return GladiaSession{}, fmt.Errorf("failed to marshal payload: %w", err)
    }
    req, err := http.NewRequest("POST", gladiaInitURL, bytes.NewReader(body))
    if err != nil {
        return GladiaSession{}, fmt.Errorf("failed to create request: %w", err)
    }
    req.Header.Set("X-Gladia-Key", gladiaAPIKey)
    req.Header.Set("Content-Type", "application/json")
    client := &http.Client{Timeout: 10 * time.Second}
    resp, err := client.Do(req)
    if err != nil {
        return GladiaSession{}, fmt.Errorf("session init request failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode < 200 || resp.StatusCode >= 300 {
        bodyBytes, _ := io.ReadAll(resp.Body)
        return GladiaSession{}, fmt.Errorf("bad status code: %d - %s", resp.StatusCode, string(bodyBytes))
    }

    var data GladiaSession
    if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
        return GladiaSession{}, fmt.Errorf("failed to decode response: %w", err)
    }
    log.Printf("🛰 Gladia session ID: %s", data.ID)
    return data, nil
}
```
🛠️ Refactor suggestion
Missing context in createSession snippet
The example references gladiaAPIKey and session package-level variables that aren’t declared in the snippet, which may confuse readers and lead to compilation errors. Also, using a generic map[string]interface{} for the payload reduces type safety.
Consider adding these declarations and a strongly typed request struct:
import (
"bytes"
"encoding/json"
"fmt"
"io"
"log"
"net/http"
"time"
)
+// Package-level variables for API key and session
+var (
+ gladiaAPIKey string
+ session GladiaSession
+)
// createSession initializes a Gladia real-time transcription session...
func createSession() (GladiaSession, error) {
- payload := map[string]interface{}{ // μ-law, 8-bit, 8 kHz, mono
- "encoding": "wav/ulaw",
- "bit_depth": 8,
- "sample_rate": 8000,
- "channels": 1,
- }
+ // Use a typed struct for payload
+ type initPayload struct {
+ Encoding string `json:"encoding"`
+ BitDepth int `json:"bit_depth"`
+ SampleRate int `json:"sample_rate"`
+ Channels int `json:"channels"`
+ }
+ payload := initPayload{
+ Encoding: "wav/ulaw",
+ BitDepth: 8,
+ SampleRate: 8000,
+ Channels: 1,
+ }
body, err := json.Marshal(payload)
…on of Twilio calls using Gladia, including environment setup instructions, .gitignore, package.json, and necessary JavaScript files for server implementation and TwiML configuration
Actionable comments posted: 2
🧹 Nitpick comments (9)
blogs/twilio-solaria-javascript/src/main.js (6)
16-17: Consider using const for immutable variables.
The variable `gladiaAPIKey` is initialized once and never reassigned, making it a good candidate for using `const` instead of `let`. This helps prevent accidental reassignment and follows JavaScript best practices.
-let gladiaAPIKey = process.env.GLADIA_API_KEY;
+const gladiaAPIKey = process.env.GLADIA_API_KEY;
29-51: Consider adding retry logic for API initialization.
The API initialization is a critical step, but there's no retry mechanism if it fails due to temporary network issues. In a production environment, adding retry logic with exponential backoff would improve resilience.
You could implement a simple retry mechanism like this:
 async function createSession() {
   // μ-law, 8-bit, 8 kHz, mono
   const payload = {
     encoding: 'wav/ulaw',
     bit_depth: 8,
     sample_rate: 8000,
     channels: 1
   };
+  let retries = 3;
+  const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
+  while (retries > 0) {
   try {
     const response = await fetch(GLADIA_INIT_URL, {
       method: 'POST',
       headers: {
         'X-Gladia-Key': gladiaAPIKey,
         'Content-Type': 'application/json'
       },
       body: JSON.stringify(payload),
       timeout: 10000
     });
     if (!response.ok) {
       const errorBody = await response.text();
       throw new Error(`Bad status code: ${response.status} - ${errorBody}`);
     }
     const data = await response.json();
     console.log(`🛰 Gladia session ID: ${data.id}`);
     return data;
   } catch (error) {
+    retries--;
+    if (retries === 0) {
     throw new Error(`Failed to create session: ${error.message}`);
+    }
+    console.log(`Session creation failed, retrying (${retries} attempts left): ${error.message}`);
+    await delay(2000 * (4 - retries)); // Exponential backoff
   }
+  }
 }
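For comparison, here is a language-neutral sketch of the retry shape in Python. Unlike the diff above, whose delay grows linearly (4 s, then 6 s) despite the comment, this version doubles the wait after each failure. `retry_with_backoff` and its parameters are hypothetical names, and the injectable `sleep` exists only so the behaviour can be checked without waiting:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds, doubling the wait after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:  # a real client would catch only transient errors
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 2 s, 4 s, 8 s, ...
    raise RuntimeError("max_attempts must be at least 1")
```

The session-creation call would be passed in as `operation`, which keeps the retry policy separate from the request code.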
105-105: Message handling should include rate limiting.
The current implementation forwards all messages without any rate limiting, which could potentially lead to overwhelming the Gladia API in high-volume scenarios.
Consider implementing a simple token bucket rate limiter to ensure the system remains stable under heavy load.
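A token bucket can be sketched in a few lines. The version below is a generic illustration, not code from the PR: `TokenBucket`, its `rate` and `capacity` parameters, and the injectable `clock` are all hypothetical names. The bucket refills continuously at `rate` tokens per second up to `capacity`, and each forwarded message spends one token:

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, then a steady `rate` of messages per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last check, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For an audio stream, dropping frames when `allow()` returns false would degrade the transcript, so the limit would need to sit comfortably above the normal frame rate.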
152-159: Add content security policy headers for HTTP responses.
For added security, consider adding Content-Security-Policy headers to protect against XSS attacks, especially for the health check endpoint.
 if (req.url === '/health') {
-  res.writeHead(200, { 'Content-Type': 'application/json' });
+  res.writeHead(200, {
+    'Content-Type': 'application/json',
+    'Content-Security-Policy': "default-src 'self'",
+    'X-Content-Type-Options': 'nosniff'
+  });
   res.end(JSON.stringify({ status: 'ok', service: 'twilio-gladia-transcription' }));
164-166: Enhance security headers for default HTTP responses.
Similar to the health check endpoint, add security headers to other HTTP responses as well.
 // Default message for HTTP requests
-res.writeHead(200, { 'Content-Type': 'text/plain' });
+res.writeHead(200, {
+  'Content-Type': 'text/plain',
+  'Content-Security-Policy': "default-src 'self'",
+  'X-Content-Type-Options': 'nosniff'
+});
 res.end('Twilio-Gladia Transcription Server\n\nAvailable endpoints:\n- /media (WebSocket): Connect Twilio Media Streams\n- /health (HTTP): Health check endpoint');
136-186: Add graceful shutdown handling.
The server should handle process signals (SIGTERM, SIGINT) to ensure graceful shutdown, properly closing WebSocket connections and cleaning up resources.
Add the following code to the main function to handle graceful shutdowns:
 // Start the server
 server.listen(port, () => {
   console.log(`🚀 Starting server on 0.0.0.0:${port}`);
 });
+ // Handle graceful shutdown
+ const shutdown = (signal) => {
+   console.log(`Received ${signal}. Shutting down gracefully...`);
+
+   // Close the HTTP server
+   server.close(() => {
+     console.log('HTTP server closed.');
+   });
+
+   // Close all WebSocket connections
+   wss.clients.forEach((client) => {
+     client.close(1000, 'Server shutting down');
+   });
+
+   // Exit after a timeout
+   setTimeout(() => {
+     console.log('Exiting process...');
+     process.exit(0);
+   }, 3000);
+ };
+
+ // Register signal handlers
+ process.on('SIGTERM', () => shutdown('SIGTERM'));
+ process.on('SIGINT', () => shutdown('SIGINT'));
 } catch (error) {
   console.error(`Server initialization failed: ${error}`);
   process.exit(1);
 }
blogs/twilio-solaria-javascript/blog.md (3)
207-218: Include security considerations for TwiML endpoint.
The TwiML example is correct, but there's no mention of securing the endpoint to prevent unauthorized configuration of Twilio numbers.
Consider adding a brief section on securing your TwiML endpoint with Twilio's request validation to ensure only legitimate Twilio requests can configure call handling.
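As a rough illustration of what request validation involves, the sketch below follows Twilio's documented scheme for form-encoded webhooks as I understand it: concatenate the full request URL with each POST parameter name and value (sorted by name), sign the result with HMAC-SHA1 using the account auth token, base64-encode it, and compare it with the `X-Twilio-Signature` header. The function names are hypothetical, and in practice the request validator shipped with Twilio's official helper libraries is the safer choice:

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Build the expected signature for a form-encoded Twilio webhook."""
    # URL first, then every parameter as name + value, ordered by name.
    data = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), data.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(
    auth_token: str,
    url: str,
    params: Mapping[str, str],
    signature: str,
) -> bool:
    """Compare the header value with the expected signature in constant time."""
    expected = compute_signature(auth_token, url, params)
    return hmac.compare_digest(expected, signature)
```

The URL must match exactly what Twilio requested, including scheme and query string, which matters when the server sits behind a tunnel or proxy.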
220-236: Fix punctuation in TwiML explanation list.
There are some punctuation issues in this section with loose punctuation marks at the beginning of list items.
- - `<Response>`: The root element of any TwiML document. It contains all the TwiML instructions for handling the call.
+* `<Response>`: The root element of any TwiML document. It contains all the TwiML instructions for handling the call.
- - `<Start>`: This element initiates Twilio's Media Streams feature, which allows streaming of audio in real-time while the call is in progress. It tells Twilio to begin capturing and streaming media before executing the rest of the call flow.
+* `<Start>`: This element initiates Twilio's Media Streams feature, which allows streaming of audio in real-time while the call is in progress. It tells Twilio to begin capturing and streaming media before executing the rest of the call flow.
- - `<Stream>`: A child element of `<Start>` that configures the media stream:
+* `<Stream>`: A child element of `<Start>` that configures the media stream:
228-228: Correct article usage.The phrase "an ngrok URL" uses the wrong article for "ngrok" which starts with a consonant sound.
- - The domain should be your public domain (e.g., an ngrok URL or a custom domain). + - The domain should be your public domain (e.g., a ngrok URL or a custom domain).🧰 Tools
🪛 LanguageTool
[misspelling] ~228-~228: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...(EN_A_VS_AN)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- blogs/twilio-solaria-javascript/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
- blogs/twilio-solaria-javascript/.gitignore (1 hunks)
- blogs/twilio-solaria-javascript/blog.md (1 hunks)
- blogs/twilio-solaria-javascript/package.json (1 hunks)
- blogs/twilio-solaria-javascript/src/README.md (1 hunks)
- blogs/twilio-solaria-javascript/src/env_setup.txt (1 hunks)
- blogs/twilio-solaria-javascript/src/main.js (1 hunks)
- blogs/twilio-solaria-javascript/src/twiml_example.xml (1 hunks)
✅ Files skipped from review due to trivial changes (5)
- blogs/twilio-solaria-javascript/src/env_setup.txt
- blogs/twilio-solaria-javascript/.gitignore
- blogs/twilio-solaria-javascript/src/twiml_example.xml
- blogs/twilio-solaria-javascript/package.json
- blogs/twilio-solaria-javascript/src/README.md
🔇 Additional comments (8)
blogs/twilio-solaria-javascript/src/main.js (2)
30-38: Code looks good and follows proper API configuration.
The payload is correctly configured to handle Twilio's μ-law audio format (8-bit, 8 kHz, mono), which is essential for proper transcription.
189-192: LGTM: Proper error handling.
Good implementation of a global error handler to catch unhandled promise rejections and exit gracefully.
blogs/twilio-solaria-javascript/blog.md (6)
1-3: Great headline and introduction.
The headline and introduction effectively communicate the value proposition: Gladia's native μ-law support allows for direct processing of Twilio audio without conversion steps.
9-18: Well-structured prerequisites table.
The prerequisites table clearly outlines what users need and why each component is necessary. This helps readers ensure they have everything ready before attempting implementation.
23-61: Code example matches implementation perfectly.
The createSession function example correctly shows how to initialize a Gladia session with the appropriate μ-law audio parameters, matching the implementation in the main.js file.
71-97: Excellent system architecture diagram.
The Mermaid sequence diagram clearly illustrates the data flow between all components (Caller, Twilio, JavaScript Server, and Gladia API), making the system architecture easy to understand.
244-262: Great installation and deployment instructions.
The command examples are clear, concise, and cover development, production, and tunneling scenarios, making it easy for users to get started.
279-284: Excellent "Next steps" section.
The next steps section provides valuable guidance for users to extend the basic implementation with advanced features like diarization, dual-channel support, and scaling considerations.
// Create WebSocket server
const wss = new WebSocket.Server({ server });

// Handle WebSocket connections
wss.on('connection', (ws, req) => {
  console.log(`🔌 WebSocket connected to ${req.url}`);
  handleWebSocket(ws);
});
🛠️ Refactor suggestion
Add WebSocket connection validation.
The server accepts any WebSocket connection without validation. Consider adding basic validation to ensure connections are coming from Twilio or authorized sources.
 // Create WebSocket server
 const wss = new WebSocket.Server({ server });
 // Handle WebSocket connections
 wss.on('connection', (ws, req) => {
   console.log(`🔌 WebSocket connected to ${req.url}`);
+
+  // Basic validation - check for Twilio headers or implement token-based auth
+  const twilioSignature = req.headers['x-twilio-signature'];
+  if (!twilioSignature) {
+    console.log('Warning: Connection without Twilio signature detected');
+    // In production, you might want to reject unauthorized connections
+    // ws.close(1008, 'Unauthorized');
+    // return;
+  }
+
-  handleWebSocket(ws);
+  handleWebSocket(ws, req);
 });
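The suggestion above only checks that the header is present. A stricter check would recompute the signature and compare it. The following is a minimal sketch, not Twilio's official validator: it assumes the handshake is signed like a Twilio webhook with no POST parameters (base64 HMAC-SHA1 of the exact public URL, keyed with the account auth token). The helper names `expectedSignature` and `isTrustedHandshake` are hypothetical, and the scheme should be confirmed against Twilio's Media Streams documentation before relying on it.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: recompute the signature expected for `url`.
// Assumption: HMAC-SHA1 over the exact public URL, keyed with the auth token,
// base64-encoded (the webhook scheme with no POST parameters).
function expectedSignature(authToken, url) {
  return crypto.createHmac('sha1', authToken).update(url, 'utf8').digest('base64');
}

// Hypothetical helper: compare the received header value in constant time.
function isTrustedHandshake(authToken, url, headerValue) {
  if (typeof headerValue !== 'string') return false;
  const expected = Buffer.from(expectedSignature(authToken, url));
  const received = Buffer.from(headerValue);
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}
```

Inside the `connection` handler this could be called with `req.headers['x-twilio-signature']` and the public `wss://` URL configured in the TwiML, closing the socket with code 1008 when it returns `false`.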
const clientInfo = twilioConn._socket.remoteAddress;
console.log(`🔌 Twilio WebSocket connected from ${clientInfo}`);
🛠️ Refactor suggestion
Avoid using internal socket properties.
Using _socket.remoteAddress accesses internal WebSocket implementation details which may change across library versions. Consider using the standard req.connection.remoteAddress passed from the connection handler instead.
-function handleWebSocket(twilioConn) {
-  const clientInfo = twilioConn._socket.remoteAddress;
+function handleWebSocket(twilioConn, req) {
+  const clientInfo = req.connection.remoteAddress;
Then update the call in the main connection handler (line 174):
-  handleWebSocket(ws);
+  handleWebSocket(ws, req);
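As a side note on the property choice: in current Node.js versions `req.connection` is a deprecated alias of `req.socket`, so a variant of this suggestion could read the address from `req.socket` instead. The sketch below is one possible shape, with a hypothetical helper name `clientAddress`; the `X-Forwarded-For` fallback is an assumption for deployments behind a tunnel or reverse proxy and is off by default because the header is otherwise client-controlled.

```javascript
// Hypothetical helper: best-effort client address for logging.
// `req` is the http.IncomingMessage passed to the 'connection' handler.
function clientAddress(req, trustProxy = false) {
  // Only trust X-Forwarded-For when the server sits behind a known proxy.
  const forwarded = req.headers && req.headers['x-forwarded-for'];
  if (trustProxy && typeof forwarded === 'string' && forwarded.length > 0) {
    // The left-most entry is the original client.
    return forwarded.split(',')[0].trim();
  }
  // req.socket is the documented property; req.connection is its deprecated alias.
  return (req.socket && req.socket.remoteAddress) || 'unknown';
}
```

Keeping the lookup in one function means the log line and any later allow-list check read the address the same way.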
Actionable comments posted: 2
🧹 Nitpick comments (3)
blogs/twilio-solaria-typescript/src/app/handlers.ts (1)
45-62: Consider handling partial transcripts for more interactive applications.
The function currently only logs and returns final transcripts. For applications requiring real-time feedback, you might want to also process partial transcripts.
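One way to keep partial and final results distinguishable, rather than returning both as a bare string, is to tag each result. This is a standalone sketch in plain JavaScript, separate from the reviewer's diff; `classifyTranscript` is a hypothetical name, and the message fields (`type`, `data.is_final`, `data.utterance.text`) are taken from the handler under review, not from an independent reading of the Gladia API.

```javascript
// Hypothetical helper: split Gladia messages into final and partial transcripts.
function classifyTranscript(raw) {
  let msg;
  try {
    msg = JSON.parse(raw.toString());
  } catch (error) {
    // Malformed JSON is ignored instead of crashing the stream.
    return { kind: 'ignored', text: '' };
  }
  if (!msg || msg.type !== 'transcript') return { kind: 'ignored', text: '' };
  const text = msg.data && msg.data.utterance && msg.data.utterance.text;
  if (typeof text !== 'string' || text.length === 0) return { kind: 'ignored', text: '' };
  return { kind: msg.data.is_final ? 'final' : 'partial', text };
}
```

A caller can then overwrite the current display line for `partial` results and append a new line for `final` ones.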
 export function handleGladia(message: Buffer): string {
   try {
     // Parse the message from Gladia
     const msg: GladiaMessage = JSON.parse(message.toString());
     // Check if this is a final transcript
     if (msg.type === 'transcript' && msg.data?.is_final) {
       const transcript = msg.data.utterance.text;
       console.log(`📝 Transcript: ${transcript}`);
       return transcript;
+    } else if (msg.type === 'transcript' && msg.data?.utterance?.text) {
+      // Handle partial transcripts
+      const partialTranscript = msg.data.utterance.text;
+      console.log(`🔄 Partial: ${partialTranscript}`);
+      return partialTranscript;
     }
     return '';
   } catch (error) {
     console.error(`Error parsing Gladia message: ${error}`);
     return '';
   }
 }
blogs/twilio-solaria-typescript/src/app/server.ts (1)
63-120: Consider implementing WebSocket reconnection logic.
The server currently doesn't attempt to reconnect if the Gladia connection is interrupted. For production use, implementing reconnection logic would improve resilience.
 wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
   const clientInfo = req.socket.remoteAddress || 'unknown';
   console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);
   try {
+    // Function to establish Gladia connection with retry logic
+    const connectToGladia = async (retries = 3, delay = 1000): Promise<WebSocket> => {
+      let lastError: Error | undefined;
+
+      for (let attempt = 1; attempt <= retries; attempt++) {
+        try {
+          // Connect to Gladia
+          const gladiaConn = new WebSocket(session.url);
+
+          // Wait for connection to open
+          await new Promise<void>((resolve, reject) => {
+            gladiaConn.on('open', () => {
+              console.log(`Connected to Gladia session ${session.id} (attempt ${attempt}/${retries})`);
+              resolve();
+            });
+            gladiaConn.on('error', reject);
+          });
+
+          return gladiaConn;
+        } catch (error) {
+          lastError = error as Error;
+          console.error(`Connection attempt ${attempt}/${retries} failed: ${error}`);
+
+          if (attempt < retries) {
+            console.log(`Retrying in ${delay}ms...`);
+            await new Promise(resolve => setTimeout(resolve, delay));
+            // Exponential backoff
+            delay *= 2;
+          }
+        }
+      }
+
+      throw lastError || new Error('Failed to connect to Gladia after multiple attempts');
+    };
-    // Connect to Gladia
-    const gladiaConn = new WebSocket(session.url);
-
-    // Handle connection errors
-    gladiaConn.on('error', (error) => {
-      console.error(`Error with Gladia connection: ${error}`);
-      twilioConn.close();
-    });
-
-    // Wait for Gladia connection to open
-    await new Promise<void>((resolve, reject) => {
-      gladiaConn.on('open', () => {
-        console.log(`Connected to Gladia session ${session.id}`);
-        resolve();
-      });
-      gladiaConn.on('error', reject);
-    });
+    // Establish connection with retry logic
+    const gladiaConn = await connectToGladia();
+
+    // Handle connection errors
+    gladiaConn.on('error', (error) => {
+      console.error(`Error with Gladia connection: ${error}`);
+      twilioConn.close();
+    });
blogs/twilio-solaria-typescript/blog.md (1)
320-336: Fix formatting in markdown list.
The list items have loose punctuation marks that should be fixed for better rendering.
 Let's examine each element in this TwiML configuration:
 - `<Response>`: The root element of any TwiML document. It contains all the TwiML instructions for handling the call.
 - `<Start>`: This element initiates Twilio's Media Streams feature, which allows streaming of audio in real-time while the call is in progress. It tells Twilio to begin capturing and streaming media before executing the rest of the call flow.
 - `<Stream>`: A child element of `<Start>` that configures the media stream:
   - `url` attribute: Specifies the WebSocket endpoint where Twilio will send the audio data.
   - The URL must use secure WebSockets (`wss://`).
-  - The domain should be your public domain (e.g., an ngrok URL or a custom domain).
+  - The domain should be your public domain (e.g., a ngrok URL or a custom domain).
   - The path (`/media`) must match the WebSocket route in your TypeScript application.
   - Each call will create a new WebSocket connection to this endpoint.
 - `<Dial>`: After starting the media stream, this element connects the caller to another phone number. During this connection:
🧰 Tools
🪛 LanguageTool
[uncategorized] ~320-~320: Loose punctuation mark.
Context: ...his TwiML configuration: - <Response>: The root element of any TwiML document....
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~322-~322: Loose punctuation mark.
Context: ...ions for handling the call. - <Start>: This element initiates Twilio's Media S...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~324-~324: Loose punctuation mark.
Context: ...the rest of the call flow. - <Stream>: A child element of <Start> that confi...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~327-~327: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~331-~331: Loose punctuation mark.
Context: ...connection to this endpoint. - <Dial>: After starting the media stream, this e...
(UNLIKELY_OPENING_PUNCTUATION)
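For the reconnection nitpick on `server.ts` above, the retry loop can be separated from the WebSocket code so the backoff schedule is easy to test. This is a minimal sketch in plain JavaScript under stated assumptions: `backoffDelays` and `withRetry` are hypothetical names, `connect` is any function returning a promise (for example one that opens the Gladia socket and resolves on `open`), and it only covers retrying the initial connection, not resuming a session that drops mid-call.

```javascript
// Delays (ms) to wait between attempts: base, 2*base, 4*base, ...
function backoffDelays(retries, baseMs) {
  const delays = [];
  for (let attempt = 1; attempt < retries; attempt++) {
    delays.push(baseMs * 2 ** (attempt - 1));
  }
  return delays;
}

// Hypothetical helper: run `connect` up to `retries` times, sleeping between
// failures. `sleep` is injectable so tests do not have to wait.
async function withRetry(connect, retries = 3, baseMs = 1000,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))) {
  const delays = backoffDelays(retries, baseMs);
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await connect(attempt);
    } catch (error) {
      lastError = error;
      // No sleep after the last attempt; the error is rethrown below.
      if (attempt < retries) await sleep(delays[attempt - 1]);
    }
  }
  throw lastError || new Error('connect was never attempted');
}
```

With three attempts and a 1000 ms base, the waits between attempts are 1000 ms and 2000 ms, which matches the doubling `delay` in the suggested diff.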
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- blogs/twilio-solaria-typescript/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (11)
- blogs/twilio-solaria-typescript/.gitignore (1 hunks)
- blogs/twilio-solaria-typescript/blog.md (1 hunks)
- blogs/twilio-solaria-typescript/package.json (1 hunks)
- blogs/twilio-solaria-typescript/src/README.md (1 hunks)
- blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1 hunks)
- blogs/twilio-solaria-typescript/src/app/handlers.ts (1 hunks)
- blogs/twilio-solaria-typescript/src/app/server.ts (1 hunks)
- blogs/twilio-solaria-typescript/src/app/types.ts (1 hunks)
- blogs/twilio-solaria-typescript/src/env_setup.txt (1 hunks)
- blogs/twilio-solaria-typescript/src/twiml_example.xml (1 hunks)
- blogs/twilio-solaria-typescript/tsconfig.json (1 hunks)
✅ Files skipped from review due to trivial changes (7)
- blogs/twilio-solaria-typescript/src/twiml_example.xml
- blogs/twilio-solaria-typescript/tsconfig.json
- blogs/twilio-solaria-typescript/package.json
- blogs/twilio-solaria-typescript/src/env_setup.txt
- blogs/twilio-solaria-typescript/src/app/types.ts
- blogs/twilio-solaria-typescript/.gitignore
- blogs/twilio-solaria-typescript/src/README.md
🧰 Additional context used
🧬 Code Graph Analysis (3)
blogs/twilio-solaria-typescript/src/app/handlers.ts (2)
  blogs/twilio-solaria-javascript/src/main.js (3)
    - gladiaConn (97-97)
    - mulaw (63-63)
    - transcript (80-80)
  blogs/twilio-solaria-typescript/src/app/types.ts (2)
    - TwilioMessage (8-13)
    - GladiaMessage (15-23)
blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1)
  blogs/twilio-solaria-typescript/src/app/types.ts (1)
    - GladiaSession (3-6)
blogs/twilio-solaria-typescript/src/app/server.ts (3)
  blogs/twilio-solaria-typescript/src/app/types.ts (1)
    - GladiaSession (3-6)
  blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1)
    - createSession (12-87)
  blogs/twilio-solaria-typescript/src/app/handlers.ts (2)
    - processMessage (9-38)
    - handleGladia (45-62)
🔇 Additional comments (4)
blogs/twilio-solaria-typescript/src/app/gladiaClient.ts (1)
12-87: LGTM! Well-implemented session creation with robust error handling.
The createSession function is well-structured with comprehensive error handling for HTTP status codes, request errors, timeouts, and JSON parsing. The audio encoding parameters correctly match Twilio's μ-law format (8 kHz, 8-bit, mono).
blogs/twilio-solaria-typescript/src/app/handlers.ts (1)
9-38: LGTM! Effective processing of Twilio messages with proper error handling.
The function correctly parses JSON messages, filters for media events, validates payloads, and efficiently decodes base64 μ-law audio before binary transmission to Gladia.
blogs/twilio-solaria-typescript/src/app/server.ts (1)
18-140: Well-structured server with clean WebSocket handling.
The main function orchestrates the application flow nicely with proper error handling and logging.
blogs/twilio-solaria-typescript/blog.md (1)
127-146: Good use of a sequence diagram to illustrate the architecture.
The Mermaid diagram effectively illustrates the data flow and interactions between system components, making it easy to understand the overall architecture.
wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
  const clientInfo = req.socket.remoteAddress || 'unknown';
  console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);

  try {
    // Connect to Gladia
    const gladiaConn = new WebSocket(session.url);

    // Handle connection errors
    gladiaConn.on('error', (error) => {
      console.error(`Error with Gladia connection: ${error}`);
      twilioConn.close();
    });

    // Wait for Gladia connection to open
    await new Promise<void>((resolve, reject) => {
      gladiaConn.on('open', () => {
        console.log(`Connected to Gladia session ${session.id}`);
        resolve();
      });
      gladiaConn.on('error', reject);
    });

    // Handle messages from Twilio
    twilioConn.on('message', (message: Buffer) => {
      try {
        processMessage(message, gladiaConn);
      } catch (error) {
        console.error(`Error processing Twilio message: ${error}`);
      }
    });

    // Handle messages from Gladia
    gladiaConn.on('message', (message: Buffer) => {
      try {
        handleGladia(message);
      } catch (error) {
        console.error(`Error processing Gladia message: ${error}`);
      }
    });

    // Handle Twilio connection close
    twilioConn.on('close', () => {
      console.log(`Twilio connection closed from ${clientInfo}`);
      gladiaConn.close();
    });

    // Handle Gladia connection close
    gladiaConn.on('close', () => {
      console.log('Gladia connection closed');
      twilioConn.close();
    });

  } catch (error) {
    console.error(`Failed to establish connection to Gladia: ${error}`);
    twilioConn.close();
  }
});
🛠️ Refactor suggestion
Add path validation for WebSocket connections.
The server accepts WebSocket connections on any path. To prevent unexpected connections, validate that req.url matches the expected path ('/media').
 wss.on('connection', async (twilioConn: WebSocket, req: http.IncomingMessage) => {
   const clientInfo = req.socket.remoteAddress || 'unknown';
   console.log(`🔌 Twilio WebSocket connected from ${clientInfo} on path ${req.url}`);
+  // Validate that the WebSocket connection is to the expected path
+  if (req.url !== '/media') {
+    console.error(`Rejecting connection to unexpected path: ${req.url}`);
+    twilioConn.close(1008, 'Path not supported');
+    return;
+  }
   try {
     // Connect to Gladia
     const gladiaConn = new WebSocket(session.url);
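A strict `req.url !== '/media'` comparison would also reject `/media/` or `/media?x=1`, should a proxy or tunnel ever rewrite the request that way. The sketch below shows a slightly more tolerant check in plain JavaScript; `isMediaPath` is a hypothetical helper name, and whether such variants ever reach the server in practice is an assumption, not something the PR establishes.

```javascript
// Hypothetical helper: true when the handshake targets the expected route,
// ignoring any query string or trailing slash.
function isMediaPath(reqUrl, expected = '/media') {
  if (typeof reqUrl !== 'string') return false;
  // req.url holds the path plus an optional query string; keep only the path.
  const pathname = reqUrl.split('?')[0];
  // Tolerate a trailing slash, but never reduce '/' to an empty string.
  const trimmed = pathname.length > 1 ? pathname.replace(/\/+$/, '') : pathname;
  return trimmed === expected;
}
```

In the handler, `if (!isMediaPath(req.url)) { twilioConn.close(1008, 'Path not supported'); return; }` would replace the strict comparison.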
* **Add-ons** – enable diarization, sentiment, keywords, etc., by including the flags when you create the session.
* **Dual-channel** – Twilio can stream separate channels; Gladia preserves them so you always know who's speaking.
* **Post-call JSON** – store the session `id` and hit `GET /v2/live/:id` for the full, punctuated transcript when the call ends.
* **Scale it** – TypeScript/Node.js's event-driven, non-blocking I/O model makes it easy to scale for high loads. Consider deploying with a load balancer for horizontal scaling.
🛠️ Refactor suggestion
Add security considerations to the Next Steps section.
Consider adding security recommendations for production deployment, such as using TLS/HTTPS, implementing authentication for the WebSocket endpoint, and securely managing API keys.
### 5 — Next steps
* **Add-ons** – enable diarization, sentiment, keywords, etc., by including the flags when you create the session.
* **Dual-channel** – Twilio can stream separate channels; Gladia preserves them so you always know who's speaking.
* **Post-call JSON** – store the session `id` and hit `GET /v2/live/:id` for the full, punctuated transcript when the call ends.
* **Scale it** – TypeScript/Node.js's event-driven, non-blocking I/O model makes it easy to scale for high loads. Consider deploying with a load balancer for horizontal scaling.
+ * **Security** – For production, ensure you're using TLS/HTTPS, implement authentication for your WebSocket endpoint, and securely manage API keys using environment variables or a secret management solution.
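To make the "securely manage API keys" part of this suggestion concrete, a small sketch of reading the key from the environment and failing fast when it is missing could look like the following. `requireEnv` and `maskSecret` are hypothetical helper names; `GLADIA_API_KEY` is the variable name used in the project's environment setup.

```javascript
// Hypothetical helper: read a required secret from the environment and fail
// fast when it is missing, so the key never needs to live in source control.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value.trim();
}

// Hypothetical helper: mask a secret before it reaches the logs.
function maskSecret(secret) {
  return secret.length <= 4
    ? '****'
    : `${'*'.repeat(secret.length - 4)}${secret.slice(-4)}`;
}
```

Calling `requireEnv('GLADIA_API_KEY')` once at startup surfaces a missing key before the first call arrives, and `maskSecret` keeps the full value out of log output.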
…iption of Vonage calls using Gladia, including environment setup instructions, .gitignore, and necessary Python files for server implementation and NCCO configuration
Actionable comments posted: 5
🧹 Nitpick comments (9)
blogs/vonage-solaria-python-fastapi/src/server.py (2)
10-11: Remove the unused HTTPException import
HTTPException is imported but never referenced anywhere in the module. Keeping unused symbols creates noise and can hide real-world linting issues.
-from fastapi import FastAPI, WebSocket, Response, HTTPException, Request
+from fastapi import FastAPI, WebSocket, Response, Request
🧰 Tools
🪛 Ruff (0.8.2)
10-10: fastapi.HTTPException imported but unused
Remove unused import: fastapi.HTTPException
(F401)
67-67: Drop superfluous f-prefixes
Both lines are plain strings without interpolation; the f adds no value and triggers Ruff (F541).
-logger.debug(f"Rate limiting: waiting before retrying Gladia session creation")
+logger.debug("Rate limiting: waiting before retrying Gladia session creation")
-logger.debug(f"Serving NCCO JSON")
+logger.debug("Serving NCCO JSON")
Also applies to: 222-222
🧰 Tools
🪛 Ruff (0.8.2)
67-67: f-string without any placeholders
Remove extraneous f prefix
(F541)
blogs/vonage-solaria-python-fastapi/blog.md (2)
57-58: Minor punctuation improvement
Add a comma before “or” to separate the independent clauses.
-Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly or you can configure Vonage to send other formats.
+Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly, or you can configure Vonage to send other formats.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~57-~57: Use a comma before ‘or’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...efault. Gladia can process this directly or you can configure Vonage to send other ...
(COMMA_COMPOUND_SENTENCE)
248-257: Markdown list rendering issue
The leading hyphens are rendered as bullets, but the preceding “:” causes markdown-lint warnings and produces a hanging colon. Drop the extra punctuation for cleaner output:
- - `<ncco>`: The root element of any Vonage NCCO document. It contains all the instructions for handling the call.
+* `<ncco>` – The root element of any Vonage NCCO document. It contains all the instructions for handling the call.
(Apply to the other list items in this block.)
🧰 Tools
🪛 LanguageTool
[uncategorized] ~248-~248: Loose punctuation mark.
Context: ... in this NCCO configuration: - <ncco>: The root element of any Vonage NCCO doc...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~250-~250: Loose punctuation mark.
Context: ... for handling the call. - <websocket>: This element configures a WebSocket con...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~253-~253: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~256-~256: Loose punctuation mark.
Context: ...on to this endpoint. - <contentType>: Specifies the audio format that Vonage ...
(UNLIKELY_OPENING_PUNCTUATION)
blogs/vonage-solaria-python-fastapi/src/README.md (2)
34-38: Specify a language for fenced code blocks
Tools such as GitHub’s renderer and syntax highlighters benefit from an explicit language.
-```
+```bash
 GLADIA_API_KEY=your_gladia_api_key_here
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
35-35: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
43-45: Avoid bare URLs to improve readability
Embed the URL in markdown syntax:
-In your answer URL configuration, use the content of `vonage_example.xml` as your NCCO (Nexmo Call Control Object) (https://jl.gladia.dev/media)
+In your answer URL configuration, use the content of `vonage_example.xml` as your NCCO (Nexmo Call Control Object) (<https://jl.gladia.dev/media>)
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
43-43: Bare URL used
(MD034, no-bare-urls)
blogs/telnyx-solaria-python-fastapi/blog.md (1)
11-18: Minor copy-editing for clarity & grammar
A few small punctuation tweaks will improve readability and avoid LanguageTool warnings (“COMMA_COMPOUND_SENTENCE”, “EN_A_VS_AN”, etc.):
-Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly or you can configure Vonage to send other formats.
+Vonage WebSockets typically send L16 PCM audio by default. Gladia can process this directly, or you can configure Vonage to send other formats.
and
-... expose a WebSocket endpoint with ngrok or a cloud VM.
+... expose a WebSocket endpoint with ngrok or a cloud VM.
Likewise, replace “an ngrok URL” with “a ngrok URL”.
These are purely stylistic; no functional impact.
blogs/telnyx-solaria-python-fastapi/src/server.py (2)
146-154: Slow transcript polling & potential starvation
Reading transcripts only inside the message-receive loop (while True … receive_text()) risks missing Gladia messages that arrive during network idle periods. Consider spawning a second asyncio.create_task that continuously runs await gladia_ws.recv() and queues/prints results, decoupling ingress and egress.
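A minimal sketch of the decoupled reader suggested above, assuming the Gladia socket exposes an awaitable recv(). FakeGladiaWS, pump_transcripts and demo are hypothetical names used only for illustration, and EOFError stands in for whatever exception the real WebSocket library raises on close:

```python
import asyncio


class FakeGladiaWS:
    """Hypothetical stand-in for the Gladia WebSocket; yields canned messages."""

    def __init__(self, messages):
        self._messages = list(messages)

    async def recv(self):
        await asyncio.sleep(0)  # give other tasks a chance to run
        if not self._messages:
            raise EOFError("connection closed")  # stand-in for a real close error
        return self._messages.pop(0)


async def pump_transcripts(ws, queue):
    """Read from Gladia continuously, independent of the caller-audio loop."""
    try:
        while True:
            queue.put_nowait(await ws.recv())
    except EOFError:
        queue.put_nowait(None)  # sentinel: no more transcripts


async def demo():
    queue = asyncio.Queue()
    reader = asyncio.create_task(
        pump_transcripts(FakeGladiaWS(["hello", "world"]), queue)
    )
    # ... the audio-forwarding loop would run here, unaffected by the reader ...
    await reader
    transcripts = []
    while (item := await queue.get()) is not None:
        transcripts.append(item)
    return transcripts
```

In the real handler the reader task would be created right after the Gladia connection opens and cancelled in the cleanup path, so transcripts keep flowing even when the caller side is silent.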
159-165: Replace bare except … pass with contextlib.suppress
Static analysis (SIM105 / E722) flags the silent swallow below. Suppressing specific exceptions is clearer:
-import contextlib
-...
- try:
- await gladia_ws.close()
- except:
- pass
+import contextlib
+...
+ with contextlib.suppress(Exception):
+ await gladia_ws.close()
Avoiding bare except prevents masking unexpected errors.
🧰 Tools
🪛 Ruff (0.8.2)
162-165: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
164-164: Do not use bare except
(E722)
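For illustration, a small self-contained sketch of the same pattern with a narrower exception list. FlakySocket and shutdown are hypothetical names; which exceptions the real gladia_ws.close() can raise depends on the WebSocket library in use, so the tuple below is an assumption:

```python
import asyncio
import contextlib


class FlakySocket:
    """Hypothetical socket whose close() fails because the peer is already gone."""

    async def close(self):
        raise ConnectionResetError("peer already gone")


async def shutdown(ws):
    # Only connection-level failures are ignored; anything else still propagates.
    with contextlib.suppress(ConnectionError, OSError):
        await ws.close()
    return "closed"
```

Unlike a bare except, neither this form nor contextlib.suppress(Exception) swallows KeyboardInterrupt or asyncio.CancelledError, since both derive from BaseException on current Python versions.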
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- .gitignore (1 hunks)
- blogs/telnyx-solaria-python-fastapi/.gitignore (1 hunks)
- blogs/telnyx-solaria-python-fastapi/blog.md (1 hunks)
- blogs/telnyx-solaria-python-fastapi/src/README.md (1 hunks)
- blogs/telnyx-solaria-python-fastapi/src/env_setup.txt (1 hunks)
- blogs/telnyx-solaria-python-fastapi/src/requirements.txt (1 hunks)
- blogs/telnyx-solaria-python-fastapi/src/server.py (1 hunks)
- blogs/telnyx-solaria-python-fastapi/src/vonage_example.xml (1 hunks)
- blogs/vonage-solaria-python-fastapi/.gitignore (1 hunks)
- blogs/vonage-solaria-python-fastapi/blog.md (1 hunks)
- blogs/vonage-solaria-python-fastapi/src/README.md (1 hunks)
- blogs/vonage-solaria-python-fastapi/src/env_setup.txt (1 hunks)
- blogs/vonage-solaria-python-fastapi/src/requirements.txt (1 hunks)
- blogs/vonage-solaria-python-fastapi/src/server.py (1 hunks)
- blogs/vonage-solaria-python-fastapi/src/vonage_example.xml (1 hunks)
✅ Files skipped from review due to trivial changes (10)
- blogs/telnyx-solaria-python-fastapi/.gitignore
- blogs/vonage-solaria-python-fastapi/.gitignore
- .gitignore
- blogs/vonage-solaria-python-fastapi/src/env_setup.txt
- blogs/telnyx-solaria-python-fastapi/src/env_setup.txt
- blogs/vonage-solaria-python-fastapi/src/vonage_example.xml
- blogs/telnyx-solaria-python-fastapi/src/requirements.txt
- blogs/telnyx-solaria-python-fastapi/src/vonage_example.xml
- blogs/vonage-solaria-python-fastapi/src/requirements.txt
- blogs/telnyx-solaria-python-fastapi/src/README.md
🧰 Additional context used
🪛 LanguageTool
blogs/telnyx-solaria-python-fastapi/blog.md
[uncategorized] ~57-~57: Use a comma before ‘or’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...efault. Gladia can process this directly or you can configure Vonage to send other ...
(COMMA_COMPOUND_SENTENCE)
[uncategorized] ~248-~248: Loose punctuation mark.
Context: ... in this NCCO configuration: - <ncco>: The root element of any Vonage NCCO doc...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~250-~250: Loose punctuation mark.
Context: ... for handling the call. - <websocket>: This element configures a WebSocket con...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~253-~253: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~256-~256: Loose punctuation mark.
Context: ...on to this endpoint. - <contentType>: Specifies the audio format that Vonage ...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~258-~258: Loose punctuation mark.
Context: ...CM at 8kHz in this case). - <connect>: This element connects the caller to ano...
(UNLIKELY_OPENING_PUNCTUATION)
blogs/vonage-solaria-python-fastapi/blog.md
[uncategorized] ~57-~57: Use a comma before ‘or’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...efault. Gladia can process this directly or you can configure Vonage to send other ...
(COMMA_COMPOUND_SENTENCE)
[uncategorized] ~248-~248: Loose punctuation mark.
Context: ... in this NCCO configuration: - <ncco>: The root element of any Vonage NCCO doc...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~250-~250: Loose punctuation mark.
Context: ... for handling the call. - <websocket>: This element configures a WebSocket con...
(UNLIKELY_OPENING_PUNCTUATION)
[misspelling] ~253-~253: Use “a” instead of ‘an’ if the following word doesn’t start with a vowel sound, e.g. ‘a sentence’, ‘a university’.
Context: ...ain should be your public domain (e.g., an ngrok URL or a custom domain). - The ...
(EN_A_VS_AN)
[uncategorized] ~256-~256: Loose punctuation mark.
Context: ...on to this endpoint. - <contentType>: Specifies the audio format that Vonage ...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~258-~258: Loose punctuation mark.
Context: ...CM at 8kHz in this case). - <connect>: This element connects the caller to ano...
(UNLIKELY_OPENING_PUNCTUATION)
🪛 Ruff (0.8.2)
blogs/telnyx-solaria-python-fastapi/src/server.py
162-165: Use contextlib.suppress(Exception) instead of try-except-pass
Replace with contextlib.suppress(Exception)
(SIM105)
164-164: Do not use bare except
(E722)
blogs/vonage-solaria-python-fastapi/src/server.py
10-10: fastapi.HTTPException imported but unused
Remove unused import: fastapi.HTTPException
(F401)
67-67: f-string without any placeholders
Remove extraneous f prefix
(F541)
85-85: Dictionary key literal "sample_rate" repeated
Remove repeated key literal "sample_rate"
(F601)
136-136: Do not use bare except
(E722)
222-222: f-string without any placeholders
Remove extraneous f prefix
(F541)
🪛 markdownlint-cli2 (0.17.2)
blogs/vonage-solaria-python-fastapi/src/README.md
35-35: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
43-43: Bare URL used
null
(MD034, no-bare-urls)
🔇 Additional comments (1)
blogs/telnyx-solaria-python-fastapi/src/server.py (1)
168-174: Environment variable guard won’t trigger under Gunicorn/Uvicorn
The if __name__ == "__main__": block is bypassed when the app is run with uvicorn module:app. Move the API-key check to module import time (or FastAPI startup event) to fail fast in every deployment mode.
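One way to make the check run in every deployment mode is a small helper called at module import time. This is a sketch only: require_env is a hypothetical name, and the optional environ parameter exists purely to make the helper easy to test:

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or fail fast if it is unset."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the server")
    return value
```

Placed near the top of server.py as `GLADIA_API_KEY = require_env("GLADIA_API_KEY")`, the check would run whether the module is executed directly or imported by uvicorn or gunicorn.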
| try: | ||
| resp_json = r.json() | ||
| logger.debug(f"Gladia API response: {json.dumps(resp_json)}") | ||
| except: | ||
| logger.warning(f"Could not parse Gladia API response as JSON: {r.text[:200]}") |
🛠️ Refactor suggestion
Replace bare except: with a specific exception
Catching everything obscures real bugs and kills tracebacks. requests.Response.json() raises ValueError / json.JSONDecodeError for invalid JSON.
- except:
- logger.warning(f"Could not parse Gladia API response as JSON: {r.text[:200]}")
+ except (ValueError, json.JSONDecodeError):
+ logger.warning("Could not parse Gladia API response as JSON: %s", r.text[:200])
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| resp_json = r.json() | |
| logger.debug(f"Gladia API response: {json.dumps(resp_json)}") | |
| except: | |
| logger.warning(f"Could not parse Gladia API response as JSON: {r.text[:200]}") | |
| try: | |
| resp_json = r.json() | |
| logger.debug(f"Gladia API response: {json.dumps(resp_json)}") | |
| except (ValueError, json.JSONDecodeError): | |
| logger.warning("Could not parse Gladia API response as JSON: %s", r.text[:200]) |
🧰 Tools
🪛 Ruff (0.8.2)
136-136: Do not use bare except
(E722)
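As a side note on the exception choice above: json.JSONDecodeError is itself a subclass of ValueError, so catching ValueError alone covers both. A minimal sketch using only the standard library (parse_json_body is a hypothetical helper, and it parses a plain string instead of a requests response):

```python
import json


def parse_json_body(text):
    """Return the decoded JSON value, or None when the body is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError derives from ValueError
        return None
```

Other failures, such as a network error raised before the body is read, are deliberately left to propagate.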
| "sample_rate": 8000, | ||
| "encoding": "wav/pcm", | ||
| "sample_rate": 8000, | ||
| "bit_depth": 16, | ||
| "channels": 1, |
Fix duplicate key and ordering in payload dict
"sample_rate" is declared twice. Only the last occurrence is retained by Python, which can lead to silent mis-configuration. Also, keeping "encoding" first improves readability.
- "sample_rate": 8000,
- "encoding": "wav/pcm",
- "sample_rate": 8000,
+ "encoding": "wav/pcm",
+ "sample_rate": 8000,
📝 Committable suggestion
| "sample_rate": 8000, | |
| "encoding": "wav/pcm", | |
| "sample_rate": 8000, | |
| "bit_depth": 16, | |
| "channels": 1, | |
| "encoding": "wav/pcm", | |
| "sample_rate": 8000, | |
| "bit_depth": 16, | |
| "channels": 1, |
🧰 Tools
🪛 Ruff (0.8.2)
85-85: Dictionary key literal "sample_rate" repeated
Remove repeated key literal "sample_rate"
(F601)
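A quick demonstration of why the repeated key matters. The first value is deliberately set to 16000 here (an illustrative value, not taken from the PR) so the overwrite is visible:

```python
payload = {
    "sample_rate": 16000,  # illustrative first value, silently overwritten below
    "encoding": "wav/pcm",
    "sample_rate": 8000,   # the last occurrence wins
}

# The dict ends up with two keys, and the key keeps its first position.
key_order = list(payload)
```

Python raises no error for the duplicate, which is why a linter rule such as Ruff's F601 is the practical safeguard.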
| current_time = time.time() | ||
| time_since_last_attempt = current_time - gladia_session["last_init_attempt"] | ||
| if time_since_last_attempt < 2 and gladia_session["last_init_attempt"] > 0: | ||
| logger.debug(f"Rate limiting: waiting before retrying Gladia session creation") | ||
| time.sleep(2 - time_since_last_attempt) | ||
|
|
||
| gladia_session["last_init_attempt"] = time.time() | ||
|
|
||
| # If we've tried too many times recently, back off | ||
| if gladia_session["retry_count"] >= MAX_RETRIES: | ||
| delay = min(MAX_RETRY_DELAY, INITIAL_RETRY_DELAY * (2 ** (gladia_session["retry_count"] - MAX_RETRIES))) | ||
| # Add jitter | ||
| delay = delay * (0.5 + random.random()) | ||
| logger.warning(f"Too many Gladia session creation attempts. Backing off for {delay:.2f} seconds") | ||
| time.sleep(delay) | ||
|
|
🛠️ Refactor suggestion
Avoid blocking the event-loop with time.sleep
create_session() is invoked from async contexts (handle_websocket) but calls time.sleep, which blocks all other coroutines during the wait. Make create_session() a coroutine (async def, awaited by its callers) and replace the blocking sleeps with asyncio.sleep:
- time.sleep(2 - time_since_last_attempt)
+ await asyncio.sleep(2 - time_since_last_attempt)
...
- time.sleep(delay)
+ await asyncio.sleep(delay)
If create_session() has to stay synchronous, call it from the handler with await asyncio.to_thread(create_session) so that its sleeps run off the event loop.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
67-67: f-string without any placeholders
Remove extraneous f prefix
(F541)
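A sketch of the back-off logic quoted above, written as a coroutine so the wait yields to the event loop. The three constants are assumed values (the PR's actual settings are not shown in this excerpt), and backoff_delay / wait_before_retry are hypothetical helper names:

```python
import asyncio
import random

INITIAL_RETRY_DELAY = 1.0  # seconds; assumed value
MAX_RETRY_DELAY = 30.0     # seconds; assumed value
MAX_RETRIES = 3            # assumed value


def backoff_delay(retry_count, jitter):
    """Exponential back-off capped at MAX_RETRY_DELAY, scaled by 0.5 + jitter."""
    exponent = max(0, retry_count - MAX_RETRIES)
    base = min(MAX_RETRY_DELAY, INITIAL_RETRY_DELAY * (2 ** exponent))
    return base * (0.5 + jitter)


async def wait_before_retry(retry_count):
    """Sleep without blocking other coroutines; returns the delay that was used."""
    delay = backoff_delay(retry_count, random.random())
    await asyncio.sleep(delay)
    return delay
```

Keeping the delay computation in a pure function makes it testable without actually sleeping, and the only awaiting code is the single asyncio.sleep call.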
| # Create initial Gladia session | ||
| try: | ||
| create_session() | ||
| except Exception as e: | ||
| logger.error("Failed to create initial Gladia session: %s", e) | ||
| raise | ||
|
|
🛠️ Refactor suggestion
One Gladia session per process ⇒ cross-call audio mixing
create_session() is run once at startup and its URL is reused for every incoming WebSocket. If two callers connect concurrently, their audio is pushed into the same Gladia session, producing blended transcripts.
Refactor so each handle_websocket() invocation spins up its own Gladia session:
-# Create initial Gladia session
-try:
- create_session()
-except Exception as e:
- ...
+async def get_fresh_gladia_ws():
+ url = create_session()
+ return await websockets.connect(url)
…and use await get_fresh_gladia_ws() inside handle_websocket.
📝 Committable suggestion
| # Create initial Gladia session | |
| try: | |
| create_session() | |
| except Exception as e: | |
| logger.error("Failed to create initial Gladia session: %s", e) | |
| raise | |
| async def get_fresh_gladia_ws(): | |
| url = create_session() | |
| return await websockets.connect(url) |
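To illustrate the per-call isolation this refactor is after, here is a sketch with a fake session in place of the real Gladia WebSocket. FakeGladiaSession, handle_call and two_concurrent_calls are hypothetical names; in the real server, open_session would wrap create_session() and websockets.connect():

```python
import asyncio


class FakeGladiaSession:
    """Hypothetical stand-in for one Gladia WebSocket; records what it receives."""

    def __init__(self):
        self.received = []

    async def send(self, chunk):
        self.received.append(chunk)

    async def close(self):
        pass


async def open_fake_session():
    return FakeGladiaSession()


async def handle_call(frames, open_session):
    session = await open_session()  # fresh session owned by this call only
    try:
        for chunk in frames:
            await session.send(chunk)
            await asyncio.sleep(0)  # yield so concurrent calls interleave
    finally:
        await session.close()
    return session


async def two_concurrent_calls():
    return await asyncio.gather(
        handle_call([b"a1", b"a2"], open_fake_session),
        handle_call([b"b1", b"b2"], open_fake_session),
    )
```

Because each handler owns its session object, the two callers' audio never lands in the same transcript even though the calls interleave.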
| """Initialize a Gladia real-time transcription session.""" | ||
| payload = { | ||
| "encoding": "wav/ulaw", # μ-law! | ||
| "bit_depth": 8, # 8-bit μ-law | ||
| "sample_rate": 8000, # matches Vonage | ||
| "channels": 1, | ||
| } | ||
|
|
Encoding mismatch will break recognition
create_session() registers the session with encoding="wav/ulaw", but later the code accepts frames labelled audio/l16;rate=8000 (PCM). Forwarding raw L16 bytes to a μ-law session yields garbled speech.
Either:
- Change the payload to "encoding": "wav/pcm" with "bit_depth": 16 (to match L16), or
- Convert the incoming L16 stream to μ-law before sending.
Example quick fix (option 1):
- payload = {
- "encoding": "wav/ulaw", # μ-law!
- "bit_depth": 8, # 8-bit μ-law
+ payload = {
+ "encoding": "wav/pcm", # Linear PCM 16-bit
+ "bit_depth": 16, # 16-bit samples
Failing to align the formats will silently degrade transcription quality.
📝 Committable suggestion
📝 Committable suggestion
| """Initialize a Gladia real-time transcription session.""" | |
| payload = { | |
| "encoding": "wav/ulaw", # μ-law! | |
| "bit_depth": 8, # 8-bit μ-law | |
| "sample_rate": 8000, # matches Vonage | |
| "channels": 1, | |
| } | |
| """Initialize a Gladia real-time transcription session.""" | |
| payload = { | |
| "encoding": "wav/pcm", # Linear PCM 16-bit | |
| "bit_depth": 16, # 16-bit linear PCM | |
| "sample_rate": 8000, # matches Vonage | |
| "channels": 1, | |
| } |
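One way to keep the two sides aligned is to derive the session payload from the content type the stream announces, instead of hard-coding it. The sketch below handles only the audio/l16 case mentioned in this comment; gladia_audio_config is a hypothetical helper, and the payload field names are copied from the snippet under review, so the accepted values should be checked against Gladia's API documentation:

```python
def gladia_audio_config(content_type):
    """Map a content type such as 'audio/l16;rate=8000' to a session payload."""
    media, _, params = content_type.partition(";")
    if media.strip().lower() != "audio/l16":
        raise ValueError(f"unsupported content type: {content_type!r}")

    rate = None
    for param in params.split(";"):
        key, _, value = param.partition("=")
        if key.strip().lower() == "rate":
            rate = int(value)
    if rate is None:
        raise ValueError(f"missing rate parameter in {content_type!r}")

    # L16 is 16-bit linear PCM, so encoding and bit depth must both say so.
    return {
        "encoding": "wav/pcm",
        "bit_depth": 16,
        "sample_rate": rate,
        "channels": 1,
    }
```

Rejecting unknown content types up front surfaces a format mismatch as an error at session creation instead of as garbled transcripts later.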
✨ (twilio-solaria-python-flask): add a new blog post on real-time Twilio call transcription using Flask and Gladia's STT API
📝 (twilio-solaria-python-flask): create .gitignore file to exclude environment and virtual environment files
📝 (twilio-solaria-python-flask): add README.md for project setup and usage instructions
📝 (twilio-solaria-python-flask): add env_setup.txt for environment variable setup instructions
✅ (twilio-solaria-python-flask): add requirements.txt for project dependencies
♻️ (twilio-solaria-python-flask): implement server.py for handling WebSocket connections and transcription logic
📝 (twilio-solaria-python-flask): add TwiML example for configuring Twilio to stream audio to the server
Summary by CodeRabbit