alexeysherbakov/youtube-transcript-extractor

YouTube-Transcript-Extractor

This project provides a Python script that fetches the transcript of a YouTube video. The script takes a YouTube URL as input, retrieves the video's transcript, punctuates it through an external API, and saves the result as a text file. It uses the youtube_transcript_api package for transcript retrieval and the Bark Punctuator API for punctuation.
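The retrieval step might look roughly like the sketch below. It is not the repository's actual code: `extract_video_id` and `fetch_raw_transcript` are hypothetical helper names, and the fetch call assumes a pre-1.0 release of youtube_transcript_api, where `get_transcript` is a static method returning a list of dicts with a `"text"` key (newer releases use an instance `fetch` method instead). The punctuation call is left out because the Bark Punctuator API's interface is not described here.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> str:
    """Return the video ID from a common YouTube URL form (hypothetical helper)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    video_id = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            video_id = parse_qs(parsed.query).get("v", [""])[0]
        else:
            # Embed and Shorts URLs carry the ID as the second path segment.
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in {"embed", "shorts"}:
                video_id = parts[1]
    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return video_id


def fetch_raw_transcript(video_id: str) -> str:
    """Fetch the transcript and join its segments into one unpunctuated string."""
    # Imported lazily so the URL helper works without the package installed.
    from youtube_transcript_api import YouTubeTranscriptApi

    # Assumption: pre-1.0 API, where get_transcript is a static method.
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    return " ".join(seg["text"].strip() for seg in segments)
```

Keeping the URL parsing separate from the network call makes the parsing easy to check on its own, without fetching anything.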

To make the transcript easier to read and edit, the script also processes it with the Natural Language Toolkit (NLTK). The formatted, punctuated transcript is saved as a text file, ready for further use such as analysis or translation.
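One way such formatting could work is to split the punctuated text into sentences and group them into wrapped paragraphs. The sketch below is an assumption about the approach, not the script's actual logic: `format_transcript`, its parameters, and the five-sentence paragraph size are all illustrative. By default it uses NLTK's `sent_tokenize`, which needs the punkt data to be installed.

```python
import textwrap
from typing import Callable, List, Optional


def format_transcript(
    text: str,
    sentences_per_paragraph: int = 5,
    width: int = 80,
    split_sentences: Optional[Callable[[str], List[str]]] = None,
) -> str:
    """Group sentences into wrapped paragraphs separated by blank lines."""
    if split_sentences is None:
        # Imported lazily; requires nltk and its punkt tokenizer data.
        from nltk.tokenize import sent_tokenize
        split_sentences = sent_tokenize

    # Drop empty fragments and surrounding whitespace.
    sentences = [s.strip() for s in split_sentences(text) if s.strip()]

    paragraphs = []
    for start in range(0, len(sentences), sentences_per_paragraph):
        chunk = " ".join(sentences[start:start + sentences_per_paragraph])
        # Wrap each paragraph to the requested line width.
        paragraphs.append(textwrap.fill(chunk, width=width))
    return "\n\n".join(paragraphs)
```

The result can then be written out with something like `pathlib.Path("transcript.txt").write_text(formatted, encoding="utf-8")`, where the file name is only a placeholder.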

Installation

Use the package manager pip to install:

requests
youtube_transcript_api
nltk==3.8.1

NLTK's tokenizer data must also be downloaded before the tokenizers can be used. Run the following Python code once:

import nltk
nltk.download('punkt')
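If the download should happen only when the data is missing, a small guard can check for it first. This is a sketch, with `ensure_punkt` as a hypothetical helper name; it relies on `nltk.data.find` raising `LookupError` when a resource is not installed, and on the resource path `tokenizers/punkt` used by the pinned nltk 3.8.1.

```python
def ensure_punkt() -> None:
    """Download the punkt tokenizer data only if it is not already installed."""
    # Imported lazily so defining this helper does not require nltk.
    import nltk

    try:
        # Raises LookupError when the resource cannot be found locally.
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        nltk.download("punkt")
```

Calling `ensure_punkt()` at the start of the script avoids a manual setup step on a fresh machine.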
