39 changes: 39 additions & 0 deletions recipes/python/speech-to-text/v1/multichannel/README.md
@@ -0,0 +1,39 @@
# Multichannel Transcription (Speech-to-Text v1)

Transcribe each audio channel independently, producing a separate transcript per channel.

## What it does

When `multichannel=True` is set, Deepgram processes each audio channel as a separate stream. This is ideal for stereo recordings where each speaker occupies a different channel (e.g., call-centre recordings with the agent on one channel and the caller on the other). Each channel gets its own transcript in the response, at `response.results.channels[i]`.
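
To make the per-channel layout concrete, here is a minimal offline sketch that walks a response-shaped dictionary and prints one line per channel. It assumes the raw JSON mirrors the attribute path used in `example.py` (`results.channels[i].alternatives[0].transcript`). The helper name `channel_transcripts`, the `labels` argument, and the sample payload are hypothetical and not part of the Deepgram SDK.

```python
def channel_transcripts(response: dict, labels=None, limit: int = 150) -> list[str]:
    """Return one formatted line per channel from a response-shaped dict."""
    channels = (response.get("results") or {}).get("channels") or []
    lines = []
    for i, channel in enumerate(channels):
        alternatives = channel.get("alternatives") or []
        # Use the first alternative if present, otherwise an empty transcript.
        transcript = alternatives[0].get("transcript", "") if alternatives else ""
        # Fall back to the numeric index when no label is supplied for this channel.
        name = labels[i] if labels and i < len(labels) else f"Channel {i}"
        lines.append(f"{name}: {transcript[:limit]}")
    return lines


# Hypothetical two-channel payload for illustration; not real API output.
sample = {
    "results": {
        "channels": [
            {"alternatives": [{"transcript": "Thanks for calling, how can I help?"}]},
            {"alternatives": [{"transcript": "Hi, I have a billing question."}]},
        ]
    }
}

for line in channel_transcripts(sample, labels=["Agent", "Caller"]):
    print(line)
```

Passing `labels` is one way to map channel indices to speakers when the recording setup is known in advance. Without it, the helper falls back to the `Channel {i}` prefix used by the recipe.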

## Key parameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| `multichannel` | `True` | Transcribe each audio channel independently |
| `model` | `"nova-3"` | Transcription model |
| `smart_format` | `True` | Format numbers, dates, etc. |

## Example output

```
Channel 0: Yeah, as much as it's worth celebrating the 50th anniversary of the spacewalk...
```

## Prerequisites

- Python 3.10+
- Set the `DEEPGRAM_API_KEY` environment variable
- Install: `pip install -r recipes/python/requirements.txt`

## Run

```bash
python example.py
```

## Test

```bash
pytest example_test.py -v
```
38 changes: 38 additions & 0 deletions recipes/python/speech-to-text/v1/multichannel/example.py
@@ -0,0 +1,38 @@
"""
Recipe: Multichannel Transcription (Speech-to-Text v1)
======================================================
Demonstrates the `multichannel` feature, which transcribes each audio
channel independently.

Without multichannel: all audio is mixed into a single channel[0].
With multichannel: each channel gets its own transcript in
response.results.channels[i].

This is useful for stereo recordings where each speaker is on a
separate channel (e.g., call-centre audio with agent on left and
caller on right).
"""

from deepgram import DeepgramClient

AUDIO_URL = "https://dpgr.am/spacewalk.wav"


def main():
    client = DeepgramClient()  # reads DEEPGRAM_API_KEY from environment

    response = client.listen.v1.media.transcribe_url(
        url=AUDIO_URL,
        model="nova-3",
        smart_format=True,
        multichannel=True,  # <-- THIS transcribes each channel separately.
    )

    if response.results and response.results.channels:
        for i, channel in enumerate(response.results.channels):
            transcript = channel.alternatives[0].transcript
            print(f"Channel {i}: {transcript[:150]}")


if __name__ == "__main__":
    main()
17 changes: 17 additions & 0 deletions recipes/python/speech-to-text/v1/multichannel/example_test.py
@@ -0,0 +1,17 @@
import subprocess
import sys
from pathlib import Path


def test_example_runs():
    """Runs the multichannel example and verifies it produces output."""
    example = Path(__file__).parent / "example.py"
    result = subprocess.run(
        # Use the interpreter running pytest so the same environment is used.
        [sys.executable, str(example)],
        capture_output=True,
        text=True,
        timeout=60,
    )
    assert result.returncode == 0, (
        f"Example failed\nSTDOUT: {result.stdout}\nSTDERR: {result.stderr}"
    )
    assert result.stdout.strip(), "Example produced no output"
    assert "Channel" in result.stdout, "Expected channel labels in output"