This English-only model is more accurate than Whisper and much faster:

<img width="1164" height="588" alt="Image" src="https://github.com/user-attachments/assets/29360ef2-217e-4e1e-bef6-3653fd505916" />

([Open ASR Leaderboard](https://huggingface.co/spaces/hf-audio/open_asr_leaderboard))

Model: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2

Example implementations in Python:

- https://github.com/SridharSampath/parakeet-asr-demo (CUDA)
- https://github.com/senstella/parakeet-mlx (Apple MLX)
- https://github.com/jfgonsalves/parakeet-diarized (includes pyannote diarization)