
UsageSummary: per-model LLM/STT/TTS usage for cost calculation #4564

@MSameerAbbas

Description


Feature Type

Would make my life easier

Feature Description

Today, UsageSummary aggregates only grand totals across all LLM/STT/TTS usage, which
drops model identity. When multiple models or providers are used in one session
(fallbacks, mixed providers, Realtime vs. Chat LLMs), dataclasses.asdict(summary) does
not carry enough information to compute costs per model.

Request: refactor UsageSummary so it exposes per-model usage buckets for LLM, STT, and
TTS, each including the model_name (and ideally model_provider) alongside the usage
totals. This should make dataclasses.asdict() sufficient for downstream cost accounting
without needing to re-aggregate metrics events.

Suggested shape (example, not prescriptive):

from dataclasses import dataclass, field


@dataclass
class LLMUsageTotals:
    model_name: str
    model_provider: str | None = None
    prompt_tokens: int = 0
    prompt_cached_tokens: int = 0
    completion_tokens: int = 0
    input_text_tokens: int = 0
    input_cached_text_tokens: int = 0
    input_image_tokens: int = 0
    input_cached_image_tokens: int = 0
    input_audio_tokens: int = 0
    input_cached_audio_tokens: int = 0
    output_text_tokens: int = 0
    output_image_tokens: int = 0
    output_audio_tokens: int = 0


@dataclass
class TTSUsageTotals:
    model_name: str
    model_provider: str | None = None
    characters_count: int = 0
    audio_duration: float = 0.0


@dataclass
class STTUsageTotals:
    model_name: str
    model_provider: str | None = None
    audio_duration: float = 0.0


@dataclass
class UsageSummary:
    llm: list[LLMUsageTotals] = field(default_factory=list)
    tts: list[TTSUsageTotals] = field(default_factory=list)
    stt: list[STTUsageTotals] = field(default_factory=list)

Collector logic would bucket metrics by (model_provider, model_name) using
metrics.metadata, with a fallback such as "unknown" when metadata is missing.
LLMMetrics and RealtimeModelMetrics should both roll up into the LLM totals.

Example asdict() output (shape only):

{
  "llm": [
    {
      "model_name": "gpt-4.1-mini",
      "model_provider": "openai",
      "prompt_tokens": 1200,
      "completion_tokens": 340,
      "input_text_tokens": 1200,
      "output_text_tokens": 340
    }
  ],
  "tts": [
    {
      "model_name": "sonic-3",
      "model_provider": "cartesia",
      "characters_count": 540,
      "audio_duration": 12.4
    }
  ],
  "stt": [
    {
      "model_name": "nova-2",
      "model_provider": "deepgram",
      "audio_duration": 8.2
    }
  ]
}
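With this shape, downstream cost accounting reduces to a join against a pricing table.
A minimal sketch, assuming a dict in the asdict() shape above; the price figures and
the `LLM_PRICES` table are made up for illustration:

```python
# Hypothetical per-token prices in USD; real rates would come from a pricing table.
LLM_PRICES = {
    ("openai", "gpt-4.1-mini"): {
        "prompt": 0.40 / 1_000_000,
        "completion": 1.60 / 1_000_000,
    },
}


def llm_cost(summary: dict) -> float:
    """Compute LLM cost from a dataclasses.asdict()-style UsageSummary dict."""
    total = 0.0
    for bucket in summary.get("llm", []):
        rates = LLM_PRICES.get((bucket.get("model_provider"), bucket["model_name"]))
        if rates is None:
            continue  # unknown model: skip (or raise, depending on policy)
        total += bucket.get("prompt_tokens", 0) * rates["prompt"]
        total += bucket.get("completion_tokens", 0) * rates["completion"]
    return total
```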

Workarounds / Alternatives

  • Subscribe to metrics_collected events and aggregate per-model usage manually based
    on metrics.metadata.
  • Parse structured logs from metrics.log_metrics and build per-model totals outside
    the SDK.
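The first workaround can be sketched as a plain fold over collected metric payloads.
Here `events` stands in for the objects delivered via metrics_collected, represented as
dicts; the field names (`metadata`, `audio_duration`) mirror the issue text but are
illustrative, not the SDK's exact event schema:

```python
from collections import defaultdict


def aggregate_stt_seconds(events: list[dict]) -> dict[tuple[str, str], float]:
    """Fold metric payloads into per-(provider, model) STT audio seconds."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for ev in events:
        md = ev.get("metadata") or {}
        key = (md.get("model_provider", "unknown"), md.get("model_name", "unknown"))
        totals[key] += ev.get("audio_duration", 0.0)
    return dict(totals)
```

This is exactly the re-aggregation the feature request would make unnecessary.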

Additional Context

Potential ripple effects and files to update:

  • livekit/agents/metrics/usage_collector.py: new per-model dataclasses and bucketing logic.
  • livekit/agents/metrics/__init__.py: export new usage summary types (public API).
  • examples/**: update any usage summary logging or documentation samples if they assume
    flat llm_*, stt_*, tts_* fields.

Metadata


Assignees

No one assigned

    Labels

    enhancement (New feature or request)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
