ASR service for transcription and diarization of meeting audio recordings

📣 Introduction

This service produces a transcript and a summary of audio recordings from meetings. The transcript is split into semantic parts, each with timestamps, speaker information, and a short abstract. A web service is also included for testing and demonstrating how the whole ASR system works.
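As an illustration of the output described above, here is a minimal sketch of how one semantic part of a transcript could be represented and printed. The class names `Utterance` and `SemanticPart` and the helpers `format_timestamp` and `render_part` are hypothetical and are not taken from the project's code; the actual data layout may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    """One stretch of speech by a single speaker (hypothetical structure)."""
    start: float   # seconds from the beginning of the recording
    end: float     # seconds from the beginning of the recording
    speaker: str   # diarization label, e.g. "SPEAKER_00"
    text: str      # transcribed text


@dataclass
class SemanticPart:
    """A group of utterances on one topic plus its abstract (hypothetical)."""
    abstract: str
    utterances: list[Utterance] = field(default_factory=list)


def format_timestamp(seconds: float) -> str:
    """Convert seconds to HH:MM:SS, dropping the fractional part."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_part(part: SemanticPart) -> str:
    """Render a part as a header line followed by one line per utterance."""
    start = format_timestamp(part.utterances[0].start)
    end = format_timestamp(part.utterances[-1].end)
    lines = [f"[{start} - {end}] {part.abstract}"]
    for u in part.utterances:
        lines.append(f"  {u.speaker}: {u.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    part = SemanticPart(
        abstract="Release date discussion",
        utterances=[
            Utterance(0.0, 4.2, "SPEAKER_00", "When do we ship?"),
            Utterance(4.2, 9.8, "SPEAKER_01", "Probably next Friday."),
        ],
    )
    print(render_part(part))
```

Keeping the abstract on the part and the speaker on each utterance mirrors the description: summaries are per semantic part, while speaker labels and timestamps are per stretch of speech.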

⚙ Technology stack

OpenAI Whisper models for transcription
Pyannote model for diarization
sberbank-ai/ruRoberta-large model for word embeddings
cointegrated/rut5-base-absum model for summarization
Flask framework for the web interface
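Combining transcription with diarization requires matching each transcribed segment to a speaker. The sketch below shows one common way to do this: pick the speaker turn with the largest time overlap. It is an assumption about how the two outputs could be joined, not the project's confirmed method. The function names `overlap` and `assign_speakers` are hypothetical. The input shapes are plain Python data: segments are dictionaries with `start`, `end`, and `text` keys (the shape of entries in the `segments` list of a Whisper transcription result), and turns are `(start, end, speaker)` tuples that would have to be extracted from the diarization output beforehand.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: list[dict], turns: list[tuple]) -> list[dict]:
    """Label each transcript segment with the speaker whose turn overlaps it most.

    segments: dicts with "start", "end", "text" (times in seconds).
    turns: (start, end, speaker) tuples (assumed pre-extracted from diarization).
    Segments that overlap no turn are labeled "UNKNOWN".
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(seg["start"], seg["end"], t_start, t_end)
            if shared > best_overlap:
                best_overlap, best_speaker = shared, speaker
        # Copy the segment so the caller's input is not modified.
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


if __name__ == "__main__":
    segments = [
        {"start": 0.0, "end": 4.0, "text": "Let's begin."},
        {"start": 4.0, "end": 9.0, "text": "Sure, first item."},
    ]
    turns = [(0.0, 5.0, "SPEAKER_00"), (5.0, 10.0, "SPEAKER_01")]
    for seg in assign_speakers(segments, turns):
        print(seg["speaker"], seg["text"])
```

Maximum overlap is simple and tolerant of small boundary disagreements between the two models, but a segment that spans two speakers is still assigned to only one of them.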

🛠 Installation

The whole project was developed and tested with Python 3.10.

Clone the repository: git clone https://github.com/DefinitelyNik/ASR-service.git
Install Whisper: pip install git+https://github.com/openai/whisper.git, or visit their repo and follow the instructions there
Install the Pyannote diarization model from their repo or Hugging Face page
Install the sberbank-ai/ruRoberta-large model (it has no Hugging Face page at the moment, so you can try another word embedding model instead)
Install the cointegrated/rut5-base-absum model from its Hugging Face page
Install PyTorch from their website (tested with CUDA 11.8, but other versions should work as well)
Install other dependencies:

pip install Flask librosa numpy python-dotenv matplotlib scikit-learn transformers
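After installing, it can help to confirm that everything is importable. The sketch below is a small check script, not part of the project: the helper `missing_modules` is hypothetical, and the module list is an assumption based on the packages named in the steps above. Note that some import names differ from the pip package names (for example, scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names assumed from the installation steps; adjust to your setup.
REQUIRED_MODULES = [
    "flask", "librosa", "numpy", "dotenv", "matplotlib",
    "sklearn", "transformers", "torch", "whisper", "pyannote.audio",
]


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent
            # or a module is in an inconsistent state.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates modules without importing them, so it runs quickly and does not load any models.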
