We love Hume's emotionally expressive TTS, especially with instant_mode. But for next-generation AI applications such as animated avatars and voice assistants, we'd like to see support for true real-time WebSocket streaming. This would let developers send text incrementally and receive playable audio in small chunks (≤1s) at ultra-low latency, enabling lifelike back-and-forth conversation without waiting for a full MP3 render. The current HTTP-based streaming works well, but a socket-based approach would take Hume to the next level for live interaction and animation sync. Please consider this!
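
To make the request concrete, here is a minimal sketch of what such a client could look like. Everything below is hypothetical: the `wss://` endpoint, the message schema (`input_text`, `flush`, `audio_chunk`, `done`), and the base64 audio payload are assumptions invented for illustration, not an existing Hume API. The sketch uses Python's `websockets` library to show the interleaving we're after: text going out while audio chunks come back on the same connection.

```python
# Hypothetical WebSocket TTS client. The endpoint and message schema
# below are invented for illustration; no such Hume API exists today.
import asyncio
import base64
import json

import websockets  # pip install websockets

TTS_URL = "wss://api.hume.ai/v0/tts/stream/websocket"  # made-up URL

async def speak(text_fragments):
    async with websockets.connect(TTS_URL) as ws:

        async def send_text():
            # Stream text incrementally, e.g. as an LLM generates tokens.
            for fragment in text_fragments:
                await ws.send(json.dumps({"type": "input_text", "text": fragment}))
            # Signal that no more text is coming (hypothetical message).
            await ws.send(json.dumps({"type": "flush"}))

        async def receive_audio():
            # Receive small (≤1s) audio chunks as soon as they're synthesized.
            async for message in ws:
                event = json.loads(message)
                if event.get("type") == "audio_chunk":
                    pcm = base64.b64decode(event["data"])
                    play(pcm)  # hand off to an audio device / avatar driver
                elif event.get("type") == "done":
                    break

        # Run sending and receiving concurrently over one socket.
        await asyncio.gather(send_text(), receive_audio())

def play(pcm_bytes):
    # Placeholder: feed raw audio into a playback buffer or lip-sync engine.
    print(f"received {len(pcm_bytes)} bytes of audio")

asyncio.run(speak(["Hello there, ", "how can I help ", "you today?"]))
```

The key property this enables, which HTTP request/response can't easily match, is full-duplex overlap: synthesis of the first words can begin and playback can start before the last words have even been written, which is exactly what live conversation and frame-accurate animation sync need.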