Custom STT (BYOK) users: uncapped LLM/post-processing spend (no limit or accounting) #7690

@mdmohsin7

Summary

Users on a third‑party (custom) STT provider are correctly exempt from Omi's transcription limits — they transcribe on their own provider and Omi only stores the resulting text (use_custom_stt short‑circuits the STT path and skips transcription credit checks + usage recording in the live /v4/listen handler). However, there is no equivalent limit or accounting on the LLM / post‑processing side for these users.

That means everything downstream of transcription (conversation structuring/summarization, memory extraction, and other LLM post‑processing) still runs on Omi's infrastructure and at Omi's LLM expense, with no plan cap or metering for custom‑STT users. A heavy custom‑STT user can therefore generate effectively unbounded LLM cost for Omi while contributing no transcription revenue and bypassing the usual usage gates.
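
To make the gap concrete, here is a minimal sketch of the asymmetry described above. It is not the actual handler code: `Session`, `handle_segment`, `run_llm_post_processing`, and the counters are hypothetical stand-ins. Only `use_custom_stt` and `has_transcription_credits` are names taken from the codebase, and the latter is stubbed here.

```python
from dataclasses import dataclass


@dataclass
class Session:
    uid: str
    use_custom_stt: bool
    # Hypothetical counters standing in for real usage records.
    transcription_seconds_recorded: int = 0
    llm_calls: int = 0


def has_transcription_credits(uid: str) -> bool:
    # Stub (assumption): the real check lives in backend/utils/subscription.py.
    return True


def run_llm_post_processing(session: Session) -> None:
    # Stand-in for conversation structuring, summarization, memory extraction.
    session.llm_calls += 1


def handle_segment(session: Session, seconds: int) -> bool:
    """Process one audio segment; return True if it was processed."""
    if not session.use_custom_stt:
        # Omi-hosted STT: gated by credits and metered.
        if not has_transcription_credits(session.uid):
            return False
        session.transcription_seconds_recorded += seconds
    # Custom STT skips the credit check and usage recording (intended).
    # Both paths reach LLM post-processing, and nothing gates or meters it.
    run_llm_post_processing(session)
    return True
```

In this sketch a custom‑STT session can call `handle_segment` any number of times: its recorded transcription usage stays at zero while `llm_calls` keeps growing.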

Why this matters

  • Custom STT was designed so the user bears the STT cost. The LLM cost was not similarly considered.
  • These users skip the transcription usage meter entirely, so the existing fair‑use / plan‑limit machinery never sees their activity and never gates the LLM work that follows.
  • This is a standing, always‑on cost with no upper bound per user.

What to investigate / decide

  1. Quantify current LLM spend attributable to custom‑STT (and BYOK) users.
  2. Decide policy: should LLM post‑processing for non‑paying custom‑STT users be metered, capped, or otherwise accounted for?
  3. If yes, add an LLM‑side usage gate/accounting analogous to the transcription credit check, so custom‑STT sessions still count toward (or are bounded by) a plan limit for the LLM work they trigger.
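
One possible shape for item 3, as a rough sketch only. Everything here is an assumption for discussion: `LlmPlan`, `LlmUsageMeter`, `has_llm_credits`, and `post_process` are hypothetical names, the in‑memory store stands in for whatever persistence is chosen, and a monthly token budget is just one candidate unit for the limit.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LlmPlan:
    # Hypothetical monthly token allowance; None means uncapped.
    monthly_token_limit: Optional[int]


class LlmUsageMeter:
    """In-memory stand-in for a persistent per-user usage store."""

    def __init__(self) -> None:
        # Keyed by (uid, period), where period is e.g. "2025-01".
        self._tokens = defaultdict(int)

    def record(self, uid: str, period: str, tokens: int) -> None:
        self._tokens[(uid, period)] += tokens

    def used(self, uid: str, period: str) -> int:
        return self._tokens.get((uid, period), 0)


def has_llm_credits(meter: LlmUsageMeter, plan: LlmPlan, uid: str, period: str) -> bool:
    """LLM-side analogue of the transcription credit check (hypothetical)."""
    if plan.monthly_token_limit is None:
        return True
    return meter.used(uid, period) < plan.monthly_token_limit


def post_process(meter: LlmUsageMeter, plan: LlmPlan, uid: str, period: str, tokens: int) -> bool:
    """Gate, run, then account for one post-processing job.

    The check does not look at use_custom_stt at all, so custom-STT
    sessions are bounded the same way as everyone else.
    """
    if not has_llm_credits(meter, plan, uid, period):
        return False
    # ... the actual LLM calls would run here ...
    meter.record(uid, period, tokens)
    return True
```

Recording usage even while the limit is `None` would also provide the per‑user data needed for item 1 before any policy is enforced. Because the check runs before the job, a user can overshoot the limit by at most one job.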

Pointers

  • Live custom‑STT exemption: backend/routers/transcribe.py (use_custom_stt → credits bypassed, usage recording skipped).
  • Transcription credit/limit logic: backend/utils/subscription.py (has_transcription_credits). No LLM‑side counterpart gates conversation post‑processing.

Flagged as a follow‑up to the offline‑sync change for custom‑STT users (separate PR).

Metadata

    Labels

    bug: Something isn't working
    p2: Priority: Important (score 14-21)
