feat: Add WebSocket support for real-time TTS streaming with multi-user capability #356
Robinbinu wants to merge 10 commits into KoljaB:master
…improve client feedback
…proved voice retrieval
…cket support for concurrent users
Hi @KoljaB. What's new:
Perfect for LLM integrations: each user gets their own queue, so multiple conversations can run simultaneously. Works with the OpenAI, Kokoro, Azure, and ElevenLabs engines. Everything is backward compatible.
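The per-user queue idea can be illustrated with a minimal asyncio sketch. This is not the PR's code: `synthesize`, `handle_user`, and the user names are hypothetical stand-ins, and the fake engine simply echoes the text back as bytes. It only shows how giving each connection its own queue keeps one user's requests from mixing with another's.

```python
import asyncio
from typing import Dict, List


async def synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a TTS engine call; returns fake audio bytes."""
    await asyncio.sleep(0)  # yield control, as a real engine call would
    return text.encode("utf-8")


async def handle_user(user_id: str, texts: List[str],
                      results: Dict[str, List[bytes]]) -> None:
    """Process one user's requests through a queue owned by that user only."""
    queue: asyncio.Queue = asyncio.Queue()
    for text in texts:
        queue.put_nowait(text)
    queue.put_nowait(None)  # sentinel: no more text from this user

    results[user_id] = []
    while True:
        item = await queue.get()
        if item is None:
            break
        results[user_id].append(await synthesize(item))


async def main() -> Dict[str, List[bytes]]:
    results: Dict[str, List[bytes]] = {}
    # Each user runs as a separate task, so conversations can interleave.
    await asyncio.gather(
        handle_user("alice", ["Hello", "world"], results),
        handle_user("bob", ["Hi"], results),
    )
    return results
```

Running `asyncio.run(main())` returns each user's audio chunks in the order that user submitted them, regardless of how the two tasks interleave.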
Hi @KoljaB,
Thank you so much. This looks like a great PR. I'm in Spain for the next two months, so I can't really look into it for a while.
Hi @KoljaB, please take your time. While testing, I found a few bugs in how different audio formats from different providers were handled. They are fixed in the latest commit, #57bbfcb.
Kyutai Labs' Pocket TTS - lightweight 100M parameter model with:
- CPU-optimized inference (~6x real-time performance)
- Voice cloning via WAV files
- ~200ms latency to first audio chunk
- 8 built-in voices
Install with: pip install pocket-tts
Summary
Implements a WebSocket endpoint for real-time text-to-speech streaming, enabling bidirectional communication and support for multiple concurrent users. Includes a complete demo client and enhanced web UI with mode switching.
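For readers who want to try the endpoint, here is a rough client sketch using the `websockets` package. The message protocol is an assumption, not taken from the PR: it supposes the client sends plain text, the server answers with binary audio frames plus optional text notices, and the server closes the connection when it is done. The helper names `split_frames` and `stream_tts` are hypothetical; only the `/ws` path comes from this PR.

```python
import asyncio
from typing import Iterable, List, Tuple, Union

Frame = Union[str, bytes]


def split_frames(frames: Iterable[Frame]) -> Tuple[bytes, List[str]]:
    """Separate binary audio frames from text (status or error) frames."""
    audio = bytearray()
    notices: List[str] = []
    for frame in frames:
        if isinstance(frame, bytes):
            audio.extend(frame)
        else:
            notices.append(frame)
    return bytes(audio), notices


async def stream_tts(uri: str, text: str) -> Tuple[bytes, List[str]]:
    """Send one text message and gather frames until the server closes the socket."""
    import websockets  # third-party dependency: pip install websockets

    received: List[Frame] = []
    async with websockets.connect(uri) as ws:
        await ws.send(text)  # assumption: plain text in, audio frames out
        async for frame in ws:  # ends when the connection is closed
            received.append(frame)
    return split_frames(received)
```

A call such as `asyncio.run(stream_tts("ws://localhost:8000/ws", "Hello there"))` would exercise it, where the host and port are placeholders. If the server keeps the connection open between requests, the loop needs an explicit end-of-stream signal instead of waiting for the socket to close.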
Key Features
WebSocket Endpoint (/ws)
Enhanced Web Interface
WebSocket Client Demo (websocket_client.py)
Engine Improvements
Technical Details
Engine Compatibility
✅ WebSocket-compatible: OpenAI, Kokoro, Azure, ElevenLabs
❌ Not compatible: System engine (pyttsx3) - displays clear error message
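The compatibility check could look roughly like the sketch below. The engine identifiers, the function name `websocket_error_for`, and the JSON error shape are all assumptions made for illustration; the PR only states which engines are supported and that the system engine produces a clear error message.

```python
import json
from typing import Optional

# Assumed engine identifiers; the real server may name them differently.
WEBSOCKET_COMPATIBLE = {"openai", "kokoro", "azure", "elevenlabs"}


def websocket_error_for(engine_name: str) -> Optional[str]:
    """Return a JSON error payload if the engine cannot stream over
    WebSocket, or None if the engine is supported."""
    if engine_name.lower() in WEBSOCKET_COMPATIBLE:
        return None
    return json.dumps({
        "type": "error",
        "message": f"Engine '{engine_name}' does not support WebSocket streaming.",
    })
```

Checking this once when the connection opens lets the server reject an unsupported engine before any text is queued.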
Dependencies
- websockets: WebSocket client/server
- pyaudio: Audio playback in demo client
Files Changed
- async_server.py: WebSocket endpoint, engine tracking, UI enhancements
- static/tts.js: WebSocket client logic, mode switching, auto-send
- websocket_client.py: Python demo client (new)
- README.md: Updated documentation
Breaking Changes
None - all changes are additive and backward compatible.
Testing
Tested with OpenAI and Kokoro engines. WebSocket mode successfully handles: