Speech Segmentation / Speaker Diarization

Offline speaker diarization using ONNX models (pyannote-segmentation-3.0).

Models

Model	Description
`model.onnx`	Base segmentation model from onnx-community/pyannote-segmentation-3.0
`model_with_embedding.onnx`	Extended version with speaker embeddings as an additional output (generated via `speech_embedding_export.py`)

Usage

Basic diarization

Outputs detected speakers with timestamps and confidence scores:

uv run python speech_diarizer.py

Automatically downloads the model and sample audio (mlk.wav), then prints segments like:

  SPEAKER_01      0.37s -    2.84s  (conf=0.951)
  SPEAKER_02      2.84s -    5.21s  (conf=0.876)
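
If the printed segments need to be consumed by another script, they can be parsed back into structured records. The following is a minimal sketch that assumes the line format shown above stays exactly as printed; `Segment`, `parse_segments`, and `speaking_time` are hypothetical helper names, not part of this repository.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    speaker: str
    start: float       # seconds
    end: float         # seconds
    confidence: float


# Matches lines such as "  SPEAKER_01      0.37s -    2.84s  (conf=0.951)".
LINE_RE = re.compile(
    r"^\s*(?P<speaker>\S+)\s+(?P<start>\d+(?:\.\d+)?)s\s+-\s+"
    r"(?P<end>\d+(?:\.\d+)?)s\s+\(conf=(?P<conf>\d+(?:\.\d+)?)\)\s*$"
)


def parse_segments(text: str) -> list[Segment]:
    """Parse diarizer output text, skipping lines that are not segments."""
    segments = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            segments.append(
                Segment(m["speaker"], float(m["start"]), float(m["end"]), float(m["conf"]))
            )
    return segments


def speaking_time(segments: list[Segment]) -> dict[str, float]:
    """Sum segment durations (in seconds) per speaker label."""
    totals: dict[str, float] = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + (seg.end - seg.start)
    return totals
```

Feeding the captured stdout of `speech_diarizer.py` to `parse_segments` would then give a list that can be filtered by confidence or aggregated per speaker.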

Diarization with embeddings

Extract per-segment speaker embeddings alongside timestamps:

uv run python speech_embedding.py

The output includes the embedding dimensions for each segment. The embeddings themselves are useful for downstream clustering or speaker verification.
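
As an illustration of the verification use case, two segment embeddings can be compared with cosine similarity. This is a sketch only: the repository does not specify a similarity measure, and the `threshold` value below is a hypothetical placeholder that would need tuning on real data.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def same_speaker(a: Sequence[float], b: Sequence[float], threshold: float = 0.7) -> bool:
    """Assumed decision rule: similarity at or above `threshold` means same speaker."""
    return cosine_similarity(a, b) >= threshold
```

For clustering, the same similarity function could serve as the affinity between segments, with any standard clustering method applied on top.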

Export model with embeddings

Re-exports the base ONNX model so that the LeakyRelu activation (used as the speaker embedding) is exposed as an additional graph output:

uv run python speech_embedding_export.py
# Produces: model_with_embedding.onnx

Setup

Dependencies are managed via uv:

pip install uv          # if not already installed
uv sync                 # installs dependencies from pyproject.toml
