Summary
Users on a third‑party (custom) STT provider are correctly exempt from Omi's transcription limits: they transcribe on their own provider and Omi only stores the resulting text. In the live /v4/listen handler, use_custom_stt short‑circuits the STT path and skips both the transcription credit check and usage recording. However, there is no equivalent limit or accounting on the LLM / post‑processing side for these users.
That means everything downstream of transcription (conversation structuring/summarization, memory extraction, and other LLM post‑processing) still runs on Omi's infrastructure and is billed to our LLM spend, with no plan cap or metering for custom‑STT users. A heavy custom‑STT user can therefore generate effectively unbounded Omi LLM cost while contributing no transcription revenue and bypassing the usual usage gates.
Why this matters
- Custom STT was designed so the user bears the STT cost. The LLM cost was not similarly considered.
- These users skip the transcription usage meter entirely, so the existing fair‑use / plan‑limit machinery never sees their activity and never gates the LLM work that follows.
- This is a standing, always‑on cost with no upper bound per user.
What to investigate / decide
- Quantify current LLM spend attributable to custom‑STT (and BYOK, i.e. bring‑your‑own‑key) users.
- Decide policy: should LLM post‑processing for non‑paying custom‑STT users be metered, capped, or otherwise accounted for?
- If yes, add an LLM‑side usage gate/accounting analogous to the transcription credit check, so custom‑STT sessions still count toward (or are bounded by) a plan limit for the LLM work they trigger.
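One possible shape for such an LLM‑side gate is sketched below. This is a minimal illustration only, not existing Omi code: LlmUsageMeter, has_llm_credits, record_usage, process_conversation, the monthly token limit, and the 4‑characters‑per‑token estimate are all hypothetical names and assumptions. The point it shows is that the check runs before post‑processing regardless of use_custom_stt, so custom‑STT sessions are still bounded.

```python
from dataclasses import dataclass, field


@dataclass
class LlmUsageMeter:
    """Hypothetical in-memory per-user LLM token meter (not part of the Omi codebase)."""

    monthly_token_limit: int
    used_tokens: dict[str, int] = field(default_factory=dict)

    def has_llm_credits(self, uid: str, estimated_tokens: int) -> bool:
        # Analogous in spirit to has_transcription_credits, but for LLM work.
        return self.used_tokens.get(uid, 0) + estimated_tokens <= self.monthly_token_limit

    def record_usage(self, uid: str, tokens: int) -> None:
        self.used_tokens[uid] = self.used_tokens.get(uid, 0) + tokens


def process_conversation(meter: LlmUsageMeter, uid: str, transcript: str, use_custom_stt: bool) -> str:
    """Gate LLM post-processing; use_custom_stt deliberately does NOT bypass the check."""
    # Crude token estimate (assumption): roughly 4 characters per token.
    estimated_tokens = max(1, len(transcript) // 4)
    if not meter.has_llm_credits(uid, estimated_tokens):
        return "skipped"
    # Summarization, memory extraction, etc. would run here.
    meter.record_usage(uid, estimated_tokens)
    return "processed"
```

Whether an over‑limit session should be skipped, deferred, or processed with a cheaper model is the policy question above; the sketch only shows where the check and the usage recording would sit.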
Pointers
- Live custom‑STT exemption: backend/routers/transcribe.py (use_custom_stt → credits bypassed, usage recording skipped).
- Transcription credit/limit logic: backend/utils/subscription.py (has_transcription_credits). No LLM‑side counterpart gates conversation post‑processing.
Flagged as a follow‑up to the offline‑sync change for custom‑STT users (separate PR).