ASR-service for transcription and diarization of meeting audio recordings

📣 Introduction

The service produces a transcript and a summary of audio recordings from meetings. The transcript is broken down into semantic parts, each with timestamps, speaker information, and an abstract. A web service is also included for testing and demonstrating how the whole ASR system works.
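To make the output described above more concrete, here is a minimal sketch of how one semantic part of a transcript could be represented and printed. The names `Utterance`, `Part`, `format_timestamp`, and `render_part` are hypothetical and chosen only for illustration; the service's actual data model may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    """One speaker turn (hypothetical structure, not the service's own)."""
    speaker: str
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str


@dataclass
class Part:
    """One semantic part of the transcript together with its abstract."""
    abstract: str
    utterances: List[Utterance] = field(default_factory=list)

    @property
    def start(self) -> float:
        # A part starts where its earliest utterance starts.
        return min(u.start for u in self.utterances)

    @property
    def end(self) -> float:
        # A part ends where its latest utterance ends.
        return max(u.end for u in self.utterances)


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_part(part: Part) -> str:
    """Render a part as plain text: time range, abstract, then the turns."""
    header = f"[{format_timestamp(part.start)} - {format_timestamp(part.end)}]"
    lines = [f"{header} {part.abstract}"]
    for u in part.utterances:
        lines.append(f"  {format_timestamp(u.start)} {u.speaker}: {u.text}")
    return "\n".join(lines)
```

For example, a part holding two turns from `SPEAKER_00` and `SPEAKER_01` would render as a header line with the time range and abstract, followed by one indented line per turn.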

⚙ Technology stack

🛠 Installation

The whole project was developed and tested with Python 3.10.

  1. Clone the repository: git clone https://github.com/DefinitelyNik/ASR-service.git
  2. Install Whisper: pip install git+https://github.com/openai/whisper.git or visit their repo and follow the instructions there
  3. Install the Pyannote diarization model from their repo or Hugging Face page
  4. Install the sberbank-ai/ruRoberta-large model (it has no Hugging Face page at the moment, so you can try another word embedding model instead)
  5. Install the cointegrated/rut5-base-absum model from its Hugging Face page
  6. Install PyTorch from its website (tested with CUDA 11.8, but other versions should work as well)
  7. Install the other dependencies:
pip install Flask librosa numpy python-dotenv matplotlib scikit-learn transformers
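After going through the steps above, a quick way to see whether anything is still missing is to check that each package can be found by Python. The sketch below uses only the standard library. The module names in `REQUIRED_MODULES` are assumptions about the import names of the packages listed above (for example, scikit-learn is imported as `sklearn` and pyannote as `pyannote.audio`), and `find_missing` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Import names (not pip names) assumed for the packages from the steps above.
REQUIRED_MODULES = [
    "whisper",
    "pyannote.audio",
    "torch",
    "flask",
    "librosa",
    "numpy",
    "dotenv",
    "matplotlib",
    "sklearn",
    "transformers",
]


def find_missing(modules):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a module is malformed.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that the packages are installed. It does not download or validate the model weights, which happens separately when the models are first loaded.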