Nova Sonic 2.0 Proactive Speech (SYSTEM_SPEECH) and Cross-Modal Interactive Input #4574

@tonyfruzza

Feature Type

I cannot use LiveKit without it

Feature Description

Amazon Nova Sonic 2.0 introduces two powerful new capabilities that are not currently supported in the AWS realtime plugin:

  1. SYSTEM_SPEECH Role - A new role that allows the system to inject content that the assistant will speak aloud. Unlike the existing SYSTEM role (which provides silent instructions), SYSTEM_SPEECH content is vocalized by the assistant.

  2. Cross-Modal Interactive Input - The ability to inject text messages during an active voice session using interactive: true, enabling mixed audio and text input in the same conversation.
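To make the two capabilities concrete, here is a minimal sketch of the events a client might send for each. The contentStart / textInput / contentEnd shape and the field names are assumed from the Nova Sonic bidirectional streaming format and should be verified against the AWS documentation. `build_text_events` is a hypothetical helper, not an existing plugin function, and the `interactive` value used for SYSTEM_SPEECH is a guess.

```python
import json
import uuid


def build_text_events(prompt_name: str, text: str, role: str, interactive: bool) -> list[str]:
    """Build contentStart/textInput/contentEnd events for one text block.

    Hypothetical helper: field names are assumed from the Nova Sonic
    event-stream format and are not confirmed by this issue.
    """
    content_name = str(uuid.uuid4())
    events = [
        {"event": {"contentStart": {
            "promptName": prompt_name,
            "contentName": content_name,
            "type": "TEXT",
            "interactive": interactive,
            "role": role,
            "textInputConfiguration": {"mediaType": "text/plain"},
        }}},
        {"event": {"textInput": {
            "promptName": prompt_name,
            "contentName": content_name,
            "content": text,
        }}},
        {"event": {"contentEnd": {
            "promptName": prompt_name,
            "contentName": content_name,
        }}},
    ]
    return [json.dumps(e) for e in events]


# Proactive speech: content the assistant should speak aloud.
# interactive=False here is an assumption, not documented behavior.
speech_events = build_text_events(
    "prompt-1", "Your meeting starts in 5 minutes", "SYSTEM_SPEECH", interactive=False
)

# Cross-modal input: user text injected during an active voice session.
text_events = build_text_events(
    "prompt-1", "What's the weather in Seattle?", "USER", interactive=True
)
```

In this sketch the two features differ only in the `role` and `interactive` fields of the contentStart event, which suggests both could share one code path in the plugin.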

Use Cases

Proactive Speech (SYSTEM_SPEECH):

  • Proactively informing users of events or notifications without waiting for user input
  • Injecting context for the assistant to announce (e.g., "Your meeting starts in 5 minutes")
  • Triggering assistant speech based on external events or tool results

Cross-Modal Interactive Input:

  • Injecting text-based context mid-conversation (e.g., data from a database query)
  • Sending structured input alongside voice
  • Providing tool results or external data as text during voice sessions

AWS Documentation References

Proposed API

from livekit.plugins.aws.experimental.realtime import RealtimeModel, RealtimeSession


async def main() -> None:
    # Create a Nova Sonic 2.0 session
    model = RealtimeModel.with_nova_sonic_2()
    session = model.session()

    # Proactive speech: the assistant speaks this aloud (proposed method)
    await session.inject_system_speech("You have a new message from John.")

    # Cross-modal text input during a voice session (proposed method)
    await session.send_text_input("What's the weather in Seattle?")

Implementation Notes

  • These features are Nova Sonic 2.0 only (amazon.nova-2-sonic-v1:0)
  • The existing ROLE type needs to be extended to include SYSTEM_SPEECH
  • Version gating should prevent usage with Nova Sonic 1.0
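The role extension and version gating described above could be sketched roughly as follows. The set of existing role names, the constant names, `UnsupportedFeatureError`, and `check_role_supported` are all assumptions made for illustration and are not taken from the plugin source. Only the Nova Sonic 2.0 model ID comes from this issue.

```python
from typing import Literal

# Assumed existing roles plus the new SYSTEM_SPEECH role proposed here.
Role = Literal["USER", "ASSISTANT", "TOOL", "SYSTEM", "SYSTEM_SPEECH"]

NOVA_SONIC_1 = "amazon.nova-sonic-v1:0"
NOVA_SONIC_2 = "amazon.nova-2-sonic-v1:0"

# Roles that only Nova Sonic 2.0 is expected to accept.
_V2_ONLY_ROLES = frozenset({"SYSTEM_SPEECH"})


class UnsupportedFeatureError(RuntimeError):
    """Raised when a Nova Sonic 2.0-only feature is used with another model."""


def check_role_supported(model_id: str, role: str) -> None:
    """Raise if `role` needs Nova Sonic 2.0 but `model_id` is a different model."""
    if role in _V2_ONLY_ROLES and model_id != NOVA_SONIC_2:
        raise UnsupportedFeatureError(
            f"role {role!r} requires {NOVA_SONIC_2}, got {model_id!r}"
        )
```

Failing fast on the client side would give users a clear error instead of an opaque validation failure from the service; the same check could guard cross-modal text input.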

Environment

  • Plugin: livekit-plugins-aws
  • Module: livekit.plugins.aws.experimental.realtime
  • Nova Sonic version: 2.0 (amazon.nova-2-sonic-v1:0)

Workarounds / Alternatives

No response

Additional Context

No response

Labels

enhancement (New feature or request)
