Conversation
Inworld TTS streams default to MP3, but the plugin only decodes the first
chunk via av.open() — subsequent mid-stream MP3 chunks fail to parse
('Invalid data found when processing input'). The bug is silent for short
inputs that fit in one chunk but breaks any reply long enough to span
multiple chunks.
Setting audioConfig.audioEncoding=LINEAR16 makes Inworld return each chunk
as a self-contained RIFF WAV, which the existing per-chunk decode path
handles cleanly.
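A minimal sketch of why self-contained WAV chunks suit a per-chunk decode path. The plugin decodes with av.open(); this sketch substitutes the standard-library wave module as a stand-in, and the helper names make_wav_chunk and decode_wav_chunk are hypothetical, not the plugin's API.

```python
import io
import wave


def make_wav_chunk(pcm: bytes, sample_rate: int = 48000) -> bytes:
    """Wrap raw 16-bit mono PCM in a RIFF WAV container.

    Stand-in for one streamed LINEAR16 chunk; the sample rate is an assumption.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 16-bit samples
        writer.setframerate(sample_rate)
        writer.writeframes(pcm)
    return buf.getvalue()


def decode_wav_chunk(chunk: bytes) -> tuple[int, bytes]:
    """Decode one chunk with no state from earlier chunks."""
    with wave.open(io.BytesIO(chunk), "rb") as reader:
        return reader.getframerate(), reader.readframes(reader.getnframes())


# Every chunk carries its own header, so each one decodes independently.
chunks = [make_wav_chunk(b"\x01\x00" * 4), make_wav_chunk(b"\x02\x00" * 4)]
decoded = [decode_wav_chunk(c) for c in chunks]
pcm = b"".join(samples for _, samples in decoded)
```

Because no decoder state is shared between chunks, the second and later chunks parse the same way as the first.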
Inworld TTS v2 (currently in pre-release) is API-compatible with v1 — same
streaming endpoint, same payload format, plus a new per-chunk 'usage' field
({processedCharactersCount, modelId}). Adding it to the Literal lets users
opt in; flipping the default exposes the new model to anyone constructing
inworld.TTS() without arguments.
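A hedged sketch of the opt-in versus default distinction. The model IDs are the ones named in the review of this PR; the TTS class below is a hypothetical stand-in for inworld.TTS, and the runtime check is an illustrative addition, not something the plugin is known to do.

```python
from typing import Literal, get_args

# Model IDs as listed in the review comment on this PR; the plugin's list may differ.
InworldModel = Literal[
    "inworld-tts-1",
    "inworld-tts-1-max",
    "inworld-tts-1.5-mini",
    "inworld-tts-1.5-max",
    "inworld-tts-2",
]


class TTS:
    """Hypothetical stand-in for inworld.TTS; only model_id handling is shown."""

    def __init__(self, model_id: InworldModel = "inworld-tts-2") -> None:
        # Literal is only enforced by static type checkers, so this sketch
        # also validates at runtime (an assumption, not the plugin's behavior).
        if model_id not in get_args(InworldModel):
            raise ValueError(f"unknown model_id: {model_id!r}")
        self.model_id = model_id


default_tts = TTS()  # picks up the new default
pinned_tts = TTS(model_id="inworld-tts-1.5-max")  # callers can still pin an older model
```

Callers who want to stay on a v1 model after the default flips would need to pass model_id explicitly, as pinned_tts does.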
The example's default Gemini model was gemini-3.1-pro-preview, which has ~3-5s latency per turn, slow enough to break the conversational feel of an expressive-TTS demo. Flash-lite consistently lands replies in <1s while still picking the right Inworld steering tags ([whisper], [laugh], [sigh], [shout]) from the audio guide.
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Changelog updated for Inworld TTS v2. Built-in HTTP session routes moved under /calls/{call_id}/...
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/inworld/vision_agents/plugins/inworld/tts.py (1)
52-53: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
model_id docstring is stale after changing the default.
The docstring still says the default is "inworld-tts-1.5-max", but the code default is now "inworld-tts-2" (Line 43). Update the options/default text to match runtime behavior.
Proposed fix
-    model_id: The model ID to use for synthesis. Options: "inworld-tts-1.5-max",
-        "inworld-tts-1.5-mini" (default: "inworld-tts-1.5-max").
+    model_id: The model ID to use for synthesis. Options: "inworld-tts-1.5-max",
+        "inworld-tts-1.5-mini", "inworld-tts-1", "inworld-tts-1-max",
+        "inworld-tts-2" (default: "inworld-tts-2").
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0893cced-59fe-49f2-9093-9b57706e28de
📒 Files selected for processing (3)
- CHANGELOG.md
- plugins/inworld/example/inworld_tts_example.py
- plugins/inworld/vision_agents/plugins/inworld/tts.py
…arah, rewrite audio guide for TTS-2
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 72a09d48-73d3-4f61-bdb6-09c81df3a2bf
📒 Files selected for processing (3)
- plugins/inworld/README.md
- plugins/inworld/example/inworld-audio-guide.md
- plugins/inworld/vision_agents/plugins/inworld/tts.py
✅ Files skipped from review due to trivial changes (1)
- plugins/inworld/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/inworld/vision_agents/plugins/inworld/tts.py
```
[say sadly with deliberate pauses in a low voice and hushed style] I'm sorry, that didn't work.
```
Add language identifiers to fenced code blocks.
These fences trigger markdownlint MD040. Add a language (for example text) to each fenced block.
Proposed fix
-```
+```text
[say sadly with deliberate pauses in a low voice and hushed style] I'm sorry, that didn't work.
@@
-```
+```text
[say warmly and a little excited] I'd be glad to help with that. [breathe] Here's what you need to know...
@@
-```
+```text
[say sadly with deliberate pauses in a low voice] Unfortunately, that's not possible. [sigh] Let me explain why...
@@
-```
+```text
[say excitedly with a high pitch and fast pace] Oh, that's fascinating — I just realized something important.
@@
-```
+```text
[say slowly and thoughtfully] Let me think about this... [breathe] Yes, I believe the solution is...
@@
-```
+```text
[clear throat] [say crisply with a measured pace] Actually, there's been a misunderstanding. Let me clarify...
@@
-```
+```text
[whisper in a hushed style] Between you and me, the real answer is simpler than it looks.
</details>
Also applies to: 56-58, 61-63, 66-68, 71-73, 76-78, 81-83
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 24-24: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/inworld/example/inworld_tts_example.py (1)
10-10: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Remove stale Smart Turn claim from module docstring.
Line 10 says Smart Turn is part of this example, but the plugin import and turn_detection wiring were removed. Update the feature list to match current behavior.
Proposed fix
-- Smart Turn for turn detection
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9d744a9f-77d8-4ff1-b4ca-d83cf5817024
📒 Files selected for processing (1)
plugins/inworld/example/inworld_tts_example.py
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/inworld/example/inworld_tts_example.py (1)
6-10: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Stale docstring: remove the "Smart Turn" line.
smart_turn was removed from both imports and the agent config, but the module docstring still lists it as a component.
Proposed fix
This example creates an agent that uses:
- Inworld AI for text-to-speech (TTS)
- Stream for edge/real-time communication
- Deepgram for speech-to-text (STT)
-- Smart Turn for turn detection
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b52d7323-dacb-41d6-bfe2-a0d21e51484d
📒 Files selected for processing (1)
plugins/inworld/example/inworld_tts_example.py
Why
The Inworld TTS plugin only decoded the first streaming chunk via av.open(). Inworld's default audio encoding is MP3, where mid-stream chunks aren't self-contained: they fail to parse with 'Invalid data found when processing input'. The bug was silent for short replies that fit in one chunk and only surfaced once a reply was long enough to span multiple chunks, so the demo "worked" right up until it didn't.
Forcing audioConfig.audioEncoding=LINEAR16 makes Inworld emit each chunk as a self-contained RIFF WAV, which the existing per-chunk decode path already handles cleanly. No decoder rewrite needed.
While in here: Inworld TTS v2 (currently in pre-release, API-compatible with v1) is added to the model Literal and made the default, so anyone constructing inworld.TTS() without arguments now gets the newer model. The example's LLM is also swapped from gemini-3.1-pro-preview to gemini-3.1-flash-lite-preview because the pro variant's ~3-5s per-turn latency drowned out the expressive-TTS demo; flash-lite lands replies in <1s and still picks the right Inworld steering tags ([whisper], [laugh], [sigh], [shout]) from the audio guide.
Changes
- Force LINEAR16 audio encoding in the streaming TTS payload
- Add inworld-tts-2 to the model Literal and use it as the default
- Switch the example LLM to gemini-3.1-flash-lite-preview
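For reviewers, a rough sketch of what the payload change amounts to. Only audioConfig.audioEncoding=LINEAR16 comes from this PR; the other field names (text, voiceId, modelId), the function name build_stream_payload, and the voice value are assumptions for illustration and may not match the plugin's actual request body.

```python
import json
from typing import Any


def build_stream_payload(
    text: str, voice_id: str, model_id: str = "inworld-tts-2"
) -> dict[str, Any]:
    """Hypothetical helper: build a streaming TTS request body."""
    return {
        "text": text,          # assumed field name
        "voiceId": voice_id,   # assumed field name
        "modelId": model_id,   # assumed field name
        # From this PR: request LINEAR16 so each streamed chunk is a
        # self-contained RIFF WAV instead of a fragment of an MP3 stream.
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


payload = build_stream_payload("Hello there", "example-voice")
body = json.dumps(payload)
```

Setting the encoding in one place keeps the existing per-chunk decode path unchanged.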