Summary
Teach users how to add voice capabilities to their agents using speech-to-text and text-to-speech sidecars. Voice support enables hands-free interaction while keeping the core agent pipeline text-based for reliability and debuggability.
Course Section Outline
- Architecture overview — STT/TTS sidecars flanking the text-based agent pipeline
- Deploying Granite Speech or Faster-Whisper for speech-to-text
- Deploying Kokoro-FastAPI for text-to-speech synthesis
- Configuring voice endpoints in agent.yaml
- Gateway routing for audio requests — content type negotiation and streaming
- UI microphone capture, audio playback, and push-to-talk integration
- FIPS considerations for media transport and audio codec selection
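
As a rough illustration of the architecture in the first outline item, the sketch below shows one voice turn with the STT and TTS steps wrapped around an agent call that only ever sees text. Everything in it is hypothetical: `handle_voice_turn`, `VoiceTurn`, and the three stub functions are placeholder names for this example and are not taken from the templates. In a real deployment the stubs would be replaced by calls to the STT sidecar, the agent, and the TTS sidecar.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """Result of one voice interaction (hypothetical structure)."""
    transcript: str     # text produced by the STT sidecar
    reply_text: str     # text produced by the agent
    reply_audio: bytes  # audio produced by the TTS sidecar


def handle_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    run_agent: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> VoiceTurn:
    """Run one turn: audio -> text -> agent -> text -> audio.

    The agent only receives and returns text, so its inputs and
    outputs can be logged and replayed without any audio tooling.
    """
    transcript = transcribe(audio_in)
    reply_text = run_agent(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-ins for the real services (assumptions for this sketch only).
def fake_stt(audio: bytes) -> str:
    # Pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def fake_agent(text: str) -> str:
    return f"You asked: {text}"


def fake_tts(text: str) -> bytes:
    # Pretend the synthesized audio is just the encoded text.
    return text.encode("utf-8")


turn = handle_voice_turn(b"what is FIPS?", fake_stt, fake_agent, fake_tts)
print(turn.transcript)
print(turn.reply_text)
```

Passing the three stages in as callables keeps the sketch independent of any particular STT or TTS engine, which mirrors the sidecar idea: either side can be swapped without touching the agent.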
Lab Exercise
Deploy STT and TTS sidecars alongside an existing agent. Configure the gateway to route audio. Use the UI to record a voice question, observe the transcription, read the agent's text response, and hear the TTS playback. Test the complete voice conversation flow end-to-end.
Companion Issues
Companion issues filed on fips-agents/agent-template, fips-agents/gateway-template, fips-agents/ui-template, and fips-agents/fips-agents-cli.
Size
M