NeMoASR

Automatic speech recognition with speaker diarisation.

Based on:

Requirements

Python 3.12+

Setup

Linux:

sudo apt install ffmpeg
conda create -n nemoasr python=3.12
conda activate nemoasr
pip install git+https://github.com/HanBnrd/NeMoASR.git

macOS:

brew install ffmpeg
conda create -n nemoasr python=3.12
conda activate nemoasr
pip install git+https://github.com/HanBnrd/NeMoASR.git

Update NeMoASR

pip install --upgrade git+https://github.com/HanBnrd/NeMoASR.git

Usage

To transcribe a WAV or MPEG file:

nemoasr myfile.mp3

Note: the first run may take a while, as the models need to be downloaded.
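To transcribe several recordings, the command above can be driven from a short script. This is a minimal sketch, assuming `nemoasr` is on the PATH of the active environment. The helper names `build_command` and `transcribe_all` and the extension list are hypothetical and not part of NeMoASR.

```python
import subprocess
from pathlib import Path

# Assumption: file extensions for the WAV and MPEG inputs mentioned above.
AUDIO_SUFFIXES = {".wav", ".mp3"}


def build_command(audio_path):
    """Build the argument list for one `nemoasr` invocation."""
    return ["nemoasr", str(audio_path)]


def transcribe_all(folder):
    """Run `nemoasr` on each audio file in `folder`, one file at a time."""
    for audio_path in sorted(Path(folder).iterdir()):
        if audio_path.suffix.lower() in AUDIO_SUFFIXES:
            # check=True stops the batch if a transcription fails.
            subprocess.run(build_command(audio_path), check=True)
```

Calling `transcribe_all("recordings")` would then process every matching file in a hypothetical `recordings` folder in alphabetical order.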

The default configuration cuts long audio files into chunks of at most 7 minutes, which should work well on machines with limited RAM or VRAM. The chunk duration can be adjusted if needed. For example, with more RAM or VRAM:

nemoasr myfile.mp3 --max-duration=12

This cuts a long audio file into chunks of at most 12 minutes.
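To get a feel for how the maximum chunk duration affects the number of chunks, the arithmetic can be sketched as follows. This assumes simple fixed-length splitting with a shorter final chunk. NeMoASR's actual chunking logic may differ, and `chunk_bounds` is a hypothetical helper, not part of the package.

```python
import math


def chunk_bounds(total_seconds, max_minutes=7):
    """Return (start, end) times in seconds for consecutive chunks.

    Assumption: fixed-length chunks, with a shorter last chunk if needed.
    """
    if total_seconds < 0 or max_minutes <= 0:
        raise ValueError("duration must be non-negative and max_minutes positive")
    max_seconds = max_minutes * 60
    n_chunks = math.ceil(total_seconds / max_seconds)
    return [
        (i * max_seconds, min((i + 1) * max_seconds, total_seconds))
        for i in range(n_chunks)
    ]


# A 30-minute recording with the default 7-minute limit: 5 chunks.
print(chunk_bounds(30 * 60))
# The same recording with a 12-minute limit: 3 chunks.
print(chunk_bounds(30 * 60, max_minutes=12))
```

Under this assumption, a larger limit means fewer, longer chunks, so each chunk needs more memory.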
