Motivation
Omi currently relies on Whisper/Deepgram for speech-to-text. SenseVoice (8.3K+ stars) offers significant advantages for a wearable AI device:
- 5x faster than Whisper large-v3 (non-autoregressive architecture, 234M params)
- 50+ languages with automatic language detection
- Built-in emotion detection (happy, sad, angry, neutral) — valuable for an AI companion
- Audio event detection (laughter, applause, music, crying) — enriches memory context
- Low latency — critical for real-time wearable experiences
Why this matters for Omi
- Battery life: Faster inference = less compute = longer battery on wearable hardware
- Richer context: Emotion + audio events give the AI companion deeper understanding of conversations
- Self-hosted option: Users concerned about privacy can run SenseVoice locally (works on CPU)
- Cost reduction: For cloud processing, faster inference means less compute time per request and therefore lower costs
Integration
SenseVoice provides an OpenAI-compatible API server:
pip install funasr
funasr-server --device cuda
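# Drop-in replacement for the Whisper API
import openai
# The OpenAI client requires an api_key; a placeholder is assumed to be enough for a local server
client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
with open("audio.wav", "rb") as audio_file:
    result = client.audio.transcriptions.create(
        model="iic/SenseVoiceSmall",
        file=audio_file,
    )
print(result.text)  # transcription with emotion tags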
Or use the Python API directly:
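from funasr import AutoModel
model = AutoModel(model="iic/SenseVoiceSmall")
result = model.generate(input="audio.wav")
# Returns a list with one entry per input; each entry's "text" field
# carries the transcription plus language, emotion, and audio event tags
print(result[0]["text"])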
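To use the language, emotion, and audio event information as memory context, the tagged output has to be split into structured fields. The following is a minimal sketch, assuming the tags arrive inline in the form `<|en|><|HAPPY|><|Laughter|><|withitn|>text`. The label sets, the `RichTranscript` class, and the `parse_rich_text` helper are hypothetical names for illustration and are not part of funasr; the model's real tag vocabulary may differ.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed inline tag format: "<|en|><|HAPPY|><|Laughter|><|withitn|>some text"
TAG_PATTERN = re.compile(r"<\|([^|]*)\|>")

# Hypothetical label sets for illustration; adjust to the tags the model actually emits.
LANGUAGES = {"zh", "en", "yue", "ja", "ko"}
EMOTIONS = {"HAPPY", "SAD", "ANGRY", "NEUTRAL"}
EVENTS = {"Laughter", "Applause", "BGM", "Cry"}


@dataclass
class RichTranscript:
    text: str
    language: Optional[str] = None
    emotion: Optional[str] = None
    events: List[str] = field(default_factory=list)


def parse_rich_text(raw: str) -> RichTranscript:
    """Split a tagged transcription into plain text and structured fields."""
    tags = TAG_PATTERN.findall(raw)
    parsed = RichTranscript(text=TAG_PATTERN.sub("", raw).strip())
    for tag in tags:
        if tag in LANGUAGES and parsed.language is None:
            parsed.language = tag
        elif tag in EMOTIONS and parsed.emotion is None:
            parsed.emotion = tag
        elif tag in EVENTS:
            parsed.events.append(tag)
        # Any other tag (for example a formatting flag) is ignored.
    return parsed


if __name__ == "__main__":
    sample = "<|en|><|HAPPY|><|Laughter|><|withitn|>That was great!"
    print(parse_rich_text(sample))
```

Keeping the parser separate from the model call means it can be unit-tested without downloading model weights, and it works the same way whether the text comes from the Python API or from the HTTP server.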
Benchmarks
| Model | Params | Speed (RTF) | CER (AISHELL-1) | Languages |
|---|---|---|---|---|
| Whisper-large-v3 | 1.5B | 1.0x | 8.4% | 99 |
| SenseVoice-Small | 234M | 5x | 3.6% | 50+ |
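For anyone who wants to check these numbers on their own audio, here is a rough sketch of the two metrics. It assumes the usual definitions: real-time factor (RTF) is processing time divided by audio duration, so lower is better, and character error rate (CER) is character-level edit distance divided by reference length. The speed column above looks like a speedup relative to Whisper rather than a raw RTF, so the sketch derives a speedup from two RTF values. All function names are hypothetical helpers, not part of any library.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance over characters, using a single rolling row."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def speedup(baseline_rtf: float, candidate_rtf: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_rtf / candidate_rtf


if __name__ == "__main__":
    # Made-up timings for illustration only, not measured results.
    baseline = real_time_factor(processing_seconds=10.0, audio_seconds=10.0)
    candidate = real_time_factor(processing_seconds=2.0, audio_seconds=10.0)
    print("speedup:", speedup(baseline, candidate))
    print("CER:", character_error_rate("abcd", "abxd"))
```

To compare models fairly, time both on the same hardware and the same audio files, and normalize punctuation and casing the same way before computing CER.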
References
- pip install funasr (1M+ monthly downloads)
Happy to help with integration if there's interest.