
Conversation

@davidzhao davidzhao (Member) commented Jan 20, 2026

Exposing this as AgentSession.usage, ensuring collection is performed for each model & provider pair.

This also updates the UsageSummary fields to be more consistent:

  • prompt_tokens -> input_tokens
  • completion_tokens -> output_tokens

Summary by CodeRabbit

  • New Features

    • Added per-provider and per-model usage aggregation for detailed metrics tracking.
    • Introduced new telemetry attributes for measuring latency: time-to-first-token, time-to-first-byte, and end-to-end latency.
    • Session-level usage tracking now directly accessible from agent sessions.
  • Refactor

    • Streamlined usage collection across example applications.
    • Enhanced propagation of model and provider information throughout voice infrastructure.


@davidzhao davidzhao requested a review from a team January 20, 2026 07:34
@davidzhao davidzhao changed the base branch from main to chenghaomou/v1.4.0 January 20, 2026 07:37
@coderabbitai coderabbitai bot (Contributor) commented Jan 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The pull request enhances usage metrics tracking by introducing per-provider/model aggregation in UsageCollector, adds model and provider information propagation through the voice agent pipeline for improved instrumentation, includes new telemetry constants for latency tracking, and simplifies example code by replacing collector-based logging with direct session.usage access.

Changes

Cohort / File(s): Summary

Example Usage Logging Simplification
Files: examples/bank-ivr/ivr_navigator_agent.py, examples/bank-ivr/ivr_system_agent.py, examples/dtmf/basic_dtmf_agent.py, examples/survey/survey_agent.py, examples/voice_agents/multi_agent.py, examples/voice_agents/zapier_mcp_integration.py
Summary: Removed UsageCollector instantiation and replaced collector-based usage summary logging with direct session.usage logging; simplified metrics handling in shutdown callbacks.

Enhanced Voice Agent Tool Support
Files: examples/voice_agents/basic_agent.py
Summary: Introduced the EndCallTool import and added it to the agent tools; removed UsageCollector usage; updated logging to use session.usage directly.

Usage Tracking Infrastructure
Files: livekit-agents/livekit/agents/voice/agent_session.py, livekit-agents/livekit/agents/job.py, livekit-agents/livekit/agents/voice/report.py
Summary: Added an internal UsageCollector to AgentSession for metrics collection; exposed a usage property returning per-model/provider summaries; added a usage field to SessionReport; updated session report construction to include usage data.

UsageCollector Refactor
Files: livekit-agents/livekit/agents/metrics/usage_collector.py
Summary: Restructured UsageSummary with new provider and model fields; replaced llm_prompt_tokens/llm_completion_tokens with llm_input_tokens/llm_output_tokens; refactored UsageCollector to aggregate metrics per (provider, model) tuple; added backwards-compatible property aliases; changed get_summary() to return a list of UsageSummary; added helper methods _get_summary(), _extract_provider_model(), to_dict(), and __repr__().

Telemetry Constants
Files: livekit-agents/livekit/agents/telemetry/trace_types.py
Summary: Added new public constants ATTR_LLM_NODE_TTFT, ATTR_TTS_NODE_TTFB, ATTR_E2E_LATENCY, and ATTR_GEN_AI_PROVIDER_NAME for enhanced tracing.

Telemetry Integration
Files: livekit-agents/livekit/agents/telemetry/traces.py
Summary: Updated _to_proto_chat_item to use preserving_proto_field_name=True; enhanced session report logging to include the optional usage field as a list of dicts when present.

Voice Agent Metrics and Tracing
Files: livekit-agents/livekit/agents/voice/agent_activity.py
Summary: Added propagation of stt_model, stt_provider, model, and provider parameters through STT/TTS/LLM inference calls; moved the e2e_latency calculation to a local variable and updated the span with a latency attribute.

Audio Recognition Enhancement
Files: livekit-agents/livekit/agents/voice/audio_recognition.py
Summary: Added optional stt_model and stt_provider parameters to the constructor, stored as instance attributes; augmented the user turn span with model and provider trace attributes when present.

LLM and TTS Inference Enhancement
Files: livekit-agents/livekit/agents/voice/generation.py
Summary: Added optional model and provider parameters to perform_llm_inference, _llm_inference_task, perform_tts_inference, and _tts_inference_task; propagated them through to tracing spans with GEN_AI_REQUEST_MODEL and GEN_AI_PROVIDER_NAME attributes; added a TTFB timing annotation on TTS node spans.

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent Activity
    participant Session as AgentSession
    participant Collector as UsageCollector
    participant Metrics as Metrics Event
    participant Report as SessionReport
    
    Agent->>Agent: Emit Metrics (LLM/TTS/STT)
    Metrics->>Session: MetricsCollectedEvent
    Session->>Collector: collect(metrics)
    Collector->>Collector: Extract provider/model
    Collector->>Collector: Aggregate to per-(provider,model) summary
    Agent->>Session: Request session.usage
    Session->>Collector: get_summary()
    Collector->>Session: Return list[UsageSummary]
    Session->>Report: Populate usage field
    Report->>Report: to_dict() - serialize usage list

sequenceDiagram
    participant Agent as Voice Agent
    participant Activity as Agent Activity
    participant Gen as Generation Functions
    participant Span as OpenTelemetry Span
    participant Metrics as Metrics Collector
    
    Agent->>Activity: Start with model/provider config
    Activity->>Gen: Call perform_llm_inference(model, provider)
    Gen->>Span: Set GEN_AI_REQUEST_MODEL attribute
    Gen->>Span: Set GEN_AI_PROVIDER_NAME attribute
    Gen->>Gen: Execute LLM inference
    Gen->>Metrics: Emit LLMMetrics with provider/model
    Metrics->>Activity: Complete inference
    Activity->>Activity: Calculate e2e_latency
    Activity->>Span: Set ATTR_E2E_LATENCY
    Activity->>Gen: Call perform_tts_inference(model, provider)
    Gen->>Span: Set TTS model/provider attributes
    Gen->>Gen: Stream TTS output with TTfB timing
    Gen->>Span: Annotate ATTR_TTS_NODE_TTFB

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hops of joy through metrics flow,
Per-model traces steal the show!
Provider names now neatly tied,
E2E latency won't hide!
Usage sums leap through the sky,
Cleaner logs say "goodbye, goodbye!" 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 10.34%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title 'automatically collect session usage' accurately describes the main objective of the PR, which is to expose session usage via AgentSession.usage and ensure automatic collection for each model & provider pair.



Comment @coderabbitai help to get the list of available commands and usage tips.

@davidzhao davidzhao (Member, Author) commented:

@codex review

@chatgpt-codex-connector commented:

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@longcw longcw (Contributor) commented Jan 20, 2026

@coderabbitai review

@coderabbitai coderabbitai bot (Contributor) commented Jan 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/voice/agent_session.py`:
- Around line 370-378: The _usage_collector is only created once in the
constructor, so usage metrics accumulate across session restarts; reset it when
a session starts/restarts by reassigning self._usage_collector =
UsageCollector() at the beginning of the session start/restart routine (the
method that initializes a run for the AgentSession), so MetricsCollectedEvent
handling in emit continues to collect only for the current run (refer to
_usage_collector, emit, and MetricsCollectedEvent to locate the logic).
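
A minimal sketch of the suggested fix, assuming start() is the method that begins a run for the AgentSession (the actual method and attribute names may differ):

from livekit.agents.metrics import UsageCollector

class AgentSession:
    def __init__(self) -> None:
        self._usage_collector = UsageCollector()

    async def start(self) -> None:
        # Re-create the collector on every start/restart so `usage` only
        # reflects metrics collected during the current run.
        self._usage_collector = UsageCollector()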
🧹 Nitpick comments (2)
examples/voice_agents/zapier_mcp_integration.py (1)

64-66: Consider adding usage logging for consistency.

Unlike other example agents in this PR (e.g., basic_agent.py, multi_agent.py, ivr_navigator_agent.py), this file doesn't log session.usage at shutdown. If this is intentional to keep the MCP example minimal, this is fine. Otherwise, consider adding:

async def log_usage():
    logger.info(f"Usage: {session.usage}")

ctx.add_shutdown_callback(log_usage)
livekit-agents/livekit/agents/metrics/usage_collector.py (1)

143-145: Consider deterministic ordering for usage summaries.

dict.values() order follows insertion, which can vary by ingestion order; sorting can keep logs/reports stable.

♻️ Proposed tweak
-        return [deepcopy(s) for s in self._summaries.values()]
+        return [deepcopy(self._summaries[key]) for key in sorted(self._summaries)]
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6ec0cc and 3ccfdcd.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
  • examples/bank-ivr/ivr_navigator_agent.py
  • examples/bank-ivr/ivr_system_agent.py
  • examples/dtmf/basic_dtmf_agent.py
  • examples/survey/survey_agent.py
  • examples/voice_agents/basic_agent.py
  • examples/voice_agents/multi_agent.py
  • examples/voice_agents/zapier_mcp_integration.py
  • livekit-agents/livekit/agents/job.py
  • livekit-agents/livekit/agents/metrics/usage_collector.py
  • livekit-agents/livekit/agents/telemetry/trace_types.py
  • livekit-agents/livekit/agents/telemetry/traces.py
  • livekit-agents/livekit/agents/voice/agent_activity.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/audio_recognition.py
  • livekit-agents/livekit/agents/voice/generation.py
  • livekit-agents/livekit/agents/voice/report.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/audio_recognition.py
  • livekit-agents/livekit/agents/telemetry/trace_types.py
  • examples/bank-ivr/ivr_system_agent.py
  • examples/dtmf/basic_dtmf_agent.py
  • examples/voice_agents/multi_agent.py
  • livekit-agents/livekit/agents/job.py
  • livekit-agents/livekit/agents/telemetry/traces.py
  • livekit-agents/livekit/agents/voice/report.py
  • examples/voice_agents/zapier_mcp_integration.py
  • livekit-agents/livekit/agents/voice/agent_activity.py
  • examples/voice_agents/basic_agent.py
  • livekit-agents/livekit/agents/metrics/usage_collector.py
  • examples/survey/survey_agent.py
  • examples/bank-ivr/ivr_navigator_agent.py
  • livekit-agents/livekit/agents/voice/generation.py
🧠 Learnings (1)
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns

Applied to files:

  • livekit-agents/livekit/agents/voice/agent_activity.py
🧬 Code graph analysis (13)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/metrics/usage_collector.py (4)
  • UsageCollector (67-145)
  • UsageSummary (9-64)
  • collect (94-141)
  • get_summary (143-145)
livekit-agents/livekit/agents/voice/events.py (1)
  • MetricsCollectedEvent (145-148)
examples/bank-ivr/ivr_system_agent.py (2)
livekit-agents/livekit/agents/voice/agent_activity.py (1)
  • session (252-253)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
examples/dtmf/basic_dtmf_agent.py (1)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
examples/voice_agents/multi_agent.py (1)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
livekit-agents/livekit/agents/job.py (1)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
livekit-agents/livekit/agents/telemetry/traces.py (3)
livekit-agents/livekit/agents/metrics/usage_collector.py (1)
  • to_dict (58-60)
livekit-agents/livekit/agents/voice/report.py (1)
  • to_dict (33-66)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
livekit-agents/livekit/agents/voice/report.py (2)
livekit-agents/livekit/agents/metrics/usage_collector.py (2)
  • UsageSummary (9-64)
  • to_dict (58-60)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
livekit-agents/livekit/agents/voice/agent_activity.py (4)
livekit-agents/livekit/agents/voice/audio_recognition.py (2)
  • model (54-55)
  • provider (58-59)
livekit-agents/livekit/agents/stt/stream_adapter.py (2)
  • model (38-39)
  • provider (42-43)
livekit-agents/livekit/agents/tts/tts.py (2)
  • model (89-98)
  • provider (101-110)
livekit-agents/livekit/agents/llm/llm.py (2)
  • model (99-108)
  • provider (111-120)
examples/voice_agents/basic_agent.py (2)
livekit-agents/livekit/agents/voice/agent_session.py (2)
  • tools (440-441)
  • usage (444-446)
livekit-agents/livekit/agents/beta/tools/end_call.py (1)
  • EndCallTool (6-28)
livekit-agents/livekit/agents/metrics/usage_collector.py (1)
livekit-agents/livekit/agents/metrics/base.py (4)
  • LLMMetrics (13-27)
  • STTMetrics (30-41)
  • TTSMetrics (44-57)
  • RealtimeModelMetrics (91-132)
examples/survey/survey_agent.py (3)
livekit-agents/livekit/agents/llm/tool_context.py (1)
  • info (142-143)
livekit-agents/livekit/agents/voice/agent_activity.py (1)
  • session (252-253)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
examples/bank-ivr/ivr_navigator_agent.py (2)
livekit-agents/livekit/agents/voice/agent_activity.py (1)
  • session (252-253)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • usage (444-446)
livekit-agents/livekit/agents/voice/generation.py (2)
livekit-agents/livekit/agents/llm/llm.py (2)
  • model (99-108)
  • provider (111-120)
livekit-agents/livekit/agents/telemetry/traces.py (1)
  • start_as_current_span (62-64)
🔇 Additional comments (33)
examples/bank-ivr/ivr_navigator_agent.py (1)

116-117: LGTM!

The change correctly simplifies usage logging by leveraging the new session.usage property, which internally delegates to the UsageCollector. This aligns with the PR's goal of exposing usage summaries directly from AgentSession.
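
For reference, a sketch of what such a delegating property might look like (not the actual implementation in agent_session.py; names assumed from the walkthrough):

from livekit.agents.metrics import UsageCollector, UsageSummary

class AgentSession:
    def __init__(self) -> None:
        self._usage_collector = UsageCollector()

    @property
    def usage(self) -> list[UsageSummary]:
        # One summary per (provider, model) pair observed during the session.
        return self._usage_collector.get_summary()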

livekit-agents/livekit/agents/telemetry/traces.py (2)

340-340: LGTM!

Using preserving_proto_field_name=True maintains consistency with snake_case naming conventions used elsewhere in the codebase and ensures field names in the serialized output match the canonical protobuf definitions.
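
As a standalone illustration of what the flag changes (using a protobuf well-known type, not the repo's chat-item message):

from google.protobuf import api_pb2, source_context_pb2
from google.protobuf.json_format import MessageToDict

msg = api_pb2.Api(
    name="example.Service",
    source_context=source_context_pb2.SourceContext(file_name="example.proto"),
)

# Default behavior converts field names to lowerCamelCase:
print(MessageToDict(msg))
# {'name': 'example.Service', 'sourceContext': {'fileName': 'example.proto'}}

# With preserving_proto_field_name=True the snake_case names are kept:
print(MessageToDict(msg, preserving_proto_field_name=True))
# {'name': 'example.Service', 'source_context': {'file_name': 'example.proto'}}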


381-381: LGTM!

The usage serialization correctly handles the optional report.usage field by conditionally converting each UsageSummary to a dict, consistent with the to_dict() method defined in usage_collector.py.

examples/voice_agents/multi_agent.py (1)

159-160: LGTM!

The simplified usage logging via session.usage is consistent with the pattern applied across all example agents in this PR.

examples/bank-ivr/ivr_system_agent.py (1)

652-653: LGTM!

The usage logging correctly leverages session.usage. Using the %s format specifier is preferred for logging because it defers string formatting until the record is actually emitted.
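
A small, self-contained illustration of the difference (placeholder names, not the example's actual code):

import logging

logger = logging.getLogger("agent")
usage = {"llm_input_tokens": 123}  # stand-in for session.usage

# %s style: interpolation is deferred until the record is actually emitted,
# so str(usage) is never computed when DEBUG is filtered out.
logger.debug("Usage: %s", usage)

# f-string style: the message is formatted eagerly, even if DEBUG is disabled.
logger.debug(f"Usage: {usage}")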

examples/voice_agents/basic_agent.py (3)

18-18: LGTM!

The EndCallTool import enables the agent to gracefully terminate calls when the user indicates they're done.


39-39: LGTM!

Adding EndCallTool() to the agent's tools list allows the LLM to end calls when appropriate (e.g., when the user says goodbye), improving the user experience.


115-116: LGTM!

The simplified usage logging via session.usage is consistent with the PR's objective to expose usage summaries directly from AgentSession.

examples/dtmf/basic_dtmf_agent.py (1)

149-150: LGTM: direct session usage logging.

This is consistent with the new session-level usage reporting.

livekit-agents/livekit/agents/telemetry/trace_types.py (1)

60-63: LGTM: new trace attribute constants are clear and consistent.

Also applies to: 69-69

livekit-agents/livekit/agents/metrics/usage_collector.py (4)

10-25: LGTM: provider/model fields and token renames are well integrated.


33-56: LGTM: backward-compatible token aliases preserve old API usage.
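
A sketch of what such an alias can look like (field names taken from the walkthrough above; the real dataclass has more fields):

from dataclasses import dataclass

@dataclass
class UsageSummary:
    llm_input_tokens: int = 0
    llm_output_tokens: int = 0

    @property
    def llm_prompt_tokens(self) -> int:
        # Backwards-compatible alias for the renamed field.
        return self.llm_input_tokens

    @property
    def llm_completion_tokens(self) -> int:
        return self.llm_output_tokens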


58-64: LGTM: compact serialization helpers are clear.


76-141: LGTM: per-provider/model aggregation logic is solid.
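
The aggregation approach, reduced to a simplified sketch (not the actual implementation; only LLM token counts shown):

from dataclasses import dataclass

@dataclass
class Summary:
    provider: str
    model: str
    llm_input_tokens: int = 0
    llm_output_tokens: int = 0

class Collector:
    def __init__(self) -> None:
        # One running summary per (provider, model) pair.
        self._summaries: dict[tuple[str, str], Summary] = {}

    def collect(self, provider: str, model: str, input_tokens: int, output_tokens: int) -> None:
        key = (provider, model)
        summary = self._summaries.setdefault(key, Summary(provider=provider, model=model))
        summary.llm_input_tokens += input_tokens
        summary.llm_output_tokens += output_tokens

    def get_summary(self) -> list[Summary]:
        return list(self._summaries.values())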

livekit-agents/livekit/agents/job.py (1)

266-278: LGTM: session usage is now included in SessionReport.

examples/survey/survey_agent.py (1)

349-350: LGTM: direct session usage logging.

livekit-agents/livekit/agents/voice/agent_session.py (2)

31-55: LGTM: usage collector wiring and metrics event import are in place.


443-446: LGTM: session.usage provides a clean, typed access point.

livekit-agents/livekit/agents/voice/report.py (2)

30-31: LGTM!

The new usage field is well-typed with proper Optional semantics and includes a descriptive docstring consistent with the other timestamp fields in this dataclass.


65-71: LGTM!

The serialization logic correctly delegates to UsageSummary.to_dict() for each summary item. The conditional check in to_dict is a reasonable optimization to avoid the method call when usage is None.

livekit-agents/livekit/agents/voice/audio_recognition.py (2)

111-112: LGTM!

The new stt_model and stt_provider parameters are properly typed as optional strings and correctly stored as instance attributes, following the existing parameter patterns in this constructor.

Also applies to: 126-127


1010-1018: LGTM!

The span attributes are conditionally set only when the values are present, which is the correct approach. Using the ATTR_GEN_AI_REQUEST_MODEL and ATTR_GEN_AI_PROVIDER_NAME constants ensures consistency with the broader telemetry implementation.
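
The conditional-attribute pattern, sketched with the OpenTelemetry API (constant values and function name are hypothetical):

from typing import Optional

from opentelemetry import trace

ATTR_GEN_AI_REQUEST_MODEL = "gen_ai.request.model"  # assumed attribute keys
ATTR_GEN_AI_PROVIDER_NAME = "gen_ai.provider.name"

tracer = trace.get_tracer("example")

def annotate_user_turn(stt_model: Optional[str], stt_provider: Optional[str]) -> None:
    with tracer.start_as_current_span("user_turn") as span:
        # Only set attributes when values are known, so spans never carry
        # empty or None attribute values.
        if stt_model:
            span.set_attribute(ATTR_GEN_AI_REQUEST_MODEL, stt_model)
        if stt_provider:
            span.set_attribute(ATTR_GEN_AI_PROVIDER_NAME, stt_provider)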

livekit-agents/livekit/agents/voice/agent_activity.py (5)

609-610: LGTM!

The STT model and provider are correctly extracted from self.stt with proper null-safety checks, enabling telemetry to capture STT metadata for user turn spans.


1941-1942: LGTM!

The model and provider parameters are consistently propagated to both LLM and TTS inference calls using the same null-safe access pattern across all call sites.

Also applies to: 1959-1960


2106-2108: LGTM!

The refactor to extract e2e_latency into a local variable before assigning it to both the metrics dictionary and span attribute is clean. This ensures the span telemetry captures this important latency metric.
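
Roughly, the described pattern (names other than the new ATTR_E2E_LATENCY constant are hypothetical):

import time

from opentelemetry import trace

ATTR_E2E_LATENCY = "e2e_latency"  # assumed value of the constant

def record_turn_latency(start_time: float, metrics: dict) -> None:
    # Compute once, then reuse for both the metrics dict and the span attribute.
    e2e_latency = time.perf_counter() - start_time
    metrics["e2e_latency"] = e2e_latency
    trace.get_current_span().set_attribute(ATTR_E2E_LATENCY, e2e_latency)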


2473-2474: LGTM!

TTS model/provider propagation in the realtime generation path is consistent with other call sites.


1757-1758: LGTM!

TTS model/provider propagation in the _tts_task_impl path follows the established pattern.

livekit-agents/livekit/agents/voice/generation.py (6)

63-64: LGTM!

The new model and provider parameters in perform_llm_inference are properly typed and correctly forwarded to the internal inference task.

Also applies to: 70-70


92-93: LGTM!

The attribute dictionary pattern is clean and efficient. Conditionally adding model and provider only when present ensures spans don't contain misleading empty or null attribute values.

Also applies to: 102-114


185-186: LGTM!

Recording ATTR_LLM_NODE_TTFT (time to first token) on the span provides valuable latency instrumentation for LLM inference.
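
A simplified sketch of measuring time-to-first-token while consuming a stream and recording it on the current span (constant value and function shape assumed):

import time
from typing import AsyncIterator

from opentelemetry import trace

ATTR_LLM_NODE_TTFT = "llm_node.ttft"  # assumed value of the constant

async def consume_llm_stream(chunks: AsyncIterator[str]) -> str:
    start = time.perf_counter()
    ttft_recorded = False
    text = ""
    async for chunk in chunks:
        if not ttft_recorded:
            # Latency from request start to the first streamed token.
            ttft = time.perf_counter() - start
            trace.get_current_span().set_attribute(ATTR_LLM_NODE_TTFT, ttft)
            ttft_recorded = True
        text += chunk
    return text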


203-204: LGTM!

The model and provider parameters in perform_tts_inference mirror the LLM inference pattern, maintaining consistency across the inference functions.

Also applies to: 211-211


232-233: LGTM!

Model and provider span attributes are correctly set at the start of TTS node inference, enabling proper attribution of TTS operations in telemetry.

Also applies to: 240-245


268-269: LGTM!

Recording ATTR_TTS_NODE_TTFB (time to first byte) on the span provides valuable latency instrumentation for TTS inference. The span is retrieved again here because this code executes inside the streaming loop after the first audio frame is received.


@longcw longcw (Contributor) left a comment

lgtm!

Comment on lines 19 to 23
llm_input_audio_tokens: int = 0
llm_input_cached_audio_tokens: int = 0
llm_input_text_tokens: int = 0
llm_input_cached_text_tokens: int = 0
llm_input_image_tokens: int = 0
@davidzhao davidzhao (Member, Author) commented:

it's probably a good idea to remove the llm_ from the prefixes.. since we'll be reporting a UsageSummary per model.

perhaps including a model_type field to help differentiate
