WhiteeRabbit/VidPhrase

VidPhrase

Search phrases inside YouTube videos, uploaded video and audio files using subtitles, comments, descriptions, and AI-powered semantic search.

✨ Features • 🧠 How It Works • 🛠️ Tech Stack • ⚙️ Setup • ⚠️ Cookies Helper • ▶️ Usage • 📁 Structure


✨ Features

  • 🔎 Search phrases inside YouTube videos
  • 🎬 Search inside uploaded video files
  • 🎧 Search inside uploaded audio files
  • 📝 Search using YouTube subtitles
  • 💬 Search inside YouTube comments
  • 📄 Search inside video descriptions
  • 🧠 AI-powered semantic search
  • 🌍 Multi-language subtitle support
  • 🎙️ Local transcription using Faster-Whisper
  • 📥 Download subtitles as text files
  • ⚡ Fast and lightweight Flask web interface

🧠 How It Works

VidPhrase supports multiple search methods.

🔍 Exact & Fuzzy Search

Search directly inside:

  • YouTube subtitles
  • YouTube comments
  • Video descriptions
  • Uploaded video transcriptions
  • Uploaded audio transcriptions

Even if the phrase contains typos or slightly different wording, fuzzy matching can still return relevant results.
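As a rough illustration of fuzzy phrase matching, the sketch below scores each subtitle line against the query by sliding a word window across it. It uses `difflib` from the standard library as a stand-in; VidPhrase lists thefuzz in its tech stack, so its real scoring may differ. The `fuzzy_find` helper, the `(start_seconds, text)` segment format, and the 0.8 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def best_window_score(query: str, text: str) -> float:
    """Return the best similarity (0..1) between query and any word window of text."""
    q_words = query.lower().split()
    t_words = text.lower().split()
    if not q_words or not t_words:
        return 0.0
    size = min(len(q_words), len(t_words))
    q = " ".join(q_words)
    best = 0.0
    # Slide a window with as many words as the query across the text.
    for start in range(len(t_words) - size + 1):
        window = " ".join(t_words[start:start + size])
        best = max(best, SequenceMatcher(None, q, window).ratio())
    return best


def fuzzy_find(query, segments, threshold=0.8):
    """Return (start_seconds, text, score) for segments that roughly contain query."""
    hits = []
    for start, text in segments:
        score = best_window_score(query, text)
        if score >= threshold:
            hits.append((start, text, round(score, 2)))
    # Best matches first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical subtitle segments: (start time in seconds, text).
segments = [
    (12.0, "welcome back to the channel"),
    (47.5, "today we talk about machine learning models"),
    (90.0, "thanks for watching"),
]
results = fuzzy_find("machin lerning", segments)
```

Here the misspelled query still matches the segment at 47.5 seconds, while the unrelated lines fall below the threshold. Lowering the threshold returns more approximate matches at the cost of more noise.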

🤖 AI Semantic Search

VidPhrase can search by meaning, not only by exact keywords.

The AI can recognize:

  • synonyms and paraphrases
  • abbreviations and acronyms
  • broader and narrower concepts
  • technologies, products, services, and brands related to the query
  • explanations and examples that imply the same idea
  • surrounding context even when the exact words never appear

Example

Query

artificial intelligence

Possible matches:

  • AI
  • Machine Learning
  • Deep Learning
  • Neural Networks
  • LLMs
  • ChatGPT
  • GPT
  • Gemini
  • Qwen

Query

cloud technologies

Possible matches:

  • AWS
  • Amazon Web Services
  • Google Cloud
  • Azure
  • Kubernetes
  • Docker
  • Containers
  • Serverless
  • Cloud Infrastructure
  • Virtual Machines

This allows you to discover moments that are conceptually related to your query instead of being limited to exact keyword matching.
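One way to picture semantic search is as a prompt that hands the transcript lines to a language model and asks which ones relate to the query. The sketch below is a guess at that shape, not VidPhrase's actual prompt or code: `build_semantic_prompt`, `semantic_find`, and the pluggable `ask_model` callable are hypothetical names, and the JSON reply format is an assumption.

```python
import json


def build_semantic_prompt(query, segments):
    """Build a prompt asking a model which transcript lines relate to the query."""
    numbered = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(segments))
    return (
        "You are given numbered transcript lines from a video.\n"
        f"Query: {query}\n"
        "Return a JSON list of the line numbers that are conceptually related "
        "to the query (synonyms, acronyms, related products), or [] if none.\n\n"
        f"{numbered}"
    )


def semantic_find(query, segments, ask_model):
    """Return the (start_seconds, text) segments the model marks as related."""
    reply = ask_model(build_semantic_prompt(query, segments))
    try:
        indices = json.loads(reply)
    except json.JSONDecodeError:
        return []  # The model replied with something other than JSON.
    if not isinstance(indices, list):
        return []
    # Ignore anything that is not a valid index into segments.
    return [segments[i] for i in indices if isinstance(i, int) and 0 <= i < len(segments)]


# Hypothetical transcript and a fake model so the sketch runs offline.
transcript = [(5.0, "we deployed it on AWS"), (30.0, "then we had lunch")]
matches = semantic_find("cloud technologies", transcript, lambda prompt: "[0]")
```

With the Gemini client from the setup section, `ask_model` could be a small wrapper around `client.models.generate_content(...)` that returns the response text; which model name to pass is left open here.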


🛠️ Tech Stack

  • 🐍 Python
  • 🌐 Flask
  • 📺 yt-dlp
  • 🎙️ faster-whisper
  • 🔍 thefuzz
  • 🤖 Google GenAI
  • 🍪 browser-cookie3
  • 🔧 Werkzeug

⚙️ Setup

1️⃣ Install Python

Windows

Download Python from:

https://www.python.org/downloads/

During installation, make sure to enable:

  • ✅ Add Python to PATH
  • ✅ Install pip

Ubuntu / Debian

sudo apt update
sudo apt install python3 python3-pip

2️⃣ Verify Installation

Windows:

python --version
pip --version

Linux:

python3 --version
pip3 --version

3️⃣ Clone the Repository

git clone https://github.com/WhiteeRabbit/VidPhrase.git
cd VidPhrase

4️⃣ Create a Virtual Environment (Recommended)

python -m venv venv

Windows

venv\Scripts\activate

Linux / macOS

source venv/bin/activate

5️⃣ Install Dependencies

pip install -r requirements.txt

or

pip3 install -r requirements.txt

6️⃣ Configure Gemini API

VidPhrase uses Google Gemini for semantic search.

Open app.py and find this line:

client = genai.Client(api_key="GEMINI_API_TOKEN")

Replace the placeholder with your own API key:

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

Without a valid Gemini API key, AI Semantic Search will not work.
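If you would rather not keep the key in source control, one option is to read it from an environment variable. This is a minimal sketch of that alternative, not how app.py is written: the `GEMINI_API_KEY` variable name and the `load_gemini_key` and `make_client` helpers are assumptions for illustration.

```python
import os


def load_gemini_key(env_var="GEMINI_API_KEY"):
    """Read the Gemini API key from the environment, failing early if missing."""
    key = os.environ.get(env_var, "").strip()
    # Treat the README placeholders as "not configured".
    if not key or key in {"GEMINI_API_TOKEN", "YOUR_GEMINI_API_KEY"}:
        raise RuntimeError(f"Set {env_var} to a valid Gemini API key before starting VidPhrase.")
    return key


def make_client():
    """Create the client with the same call app.py uses, but without a hardcoded key."""
    from google import genai  # Imported lazily so this file loads without the package.
    return genai.Client(api_key=load_gemini_key())
```

Before running the app you would then set the variable in your shell, for example with `export GEMINI_API_KEY=...` on Linux or `set GEMINI_API_KEY=...` in the Windows command prompt.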


⚠️ Cookies Helper

YouTube sometimes rate-limits requests, and yt-dlp may then fail with:

HTTP Error 429: Too Many Requests

If this happens, generate fresh browser cookies by running:

python3 cookie_fetch_profiles.py

The script automatically extracts YouTube cookies from available Chrome and Firefox profiles and saves them inside the cookies/ directory.

After generating the cookies, restart VidPhrase and try again.
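To show how saved cookie files might be handed to yt-dlp, the sketch below picks one file from the cookies/ directory and puts it into an options dictionary. It assumes the files are in the Netscape cookies.txt format that yt-dlp's `cookiefile` option expects and that they follow the `cookies_*.txt` naming shown in the project structure. `pick_cookie_file` and `build_ydl_opts` are hypothetical helpers, and the way app.py actually selects cookies is not documented here.

```python
import glob
import os
import random


def pick_cookie_file(cookies_dir="cookies"):
    """Return one cookies_*.txt path from cookies_dir, or None if there are none."""
    candidates = sorted(glob.glob(os.path.join(cookies_dir, "cookies_*.txt")))
    # Picking at random spreads requests across the saved browser profiles.
    return random.choice(candidates) if candidates else None


def build_ydl_opts(cookies_dir="cookies"):
    """Build a yt-dlp options dict that uses a cookie file when one is available."""
    opts = {"quiet": True, "skip_download": True}
    cookie_file = pick_cookie_file(cookies_dir)
    if cookie_file:
        opts["cookiefile"] = cookie_file  # Path to a Netscape-format cookie file.
    return opts
```

The resulting dictionary can be passed to `yt_dlp.YoutubeDL(build_ydl_opts())`. When the directory is empty the options simply omit `cookiefile`, so requests go out without cookies.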


7️⃣ Run the Application

Windows:

python app.py

Linux:

python3 app.py

Open:

http://127.0.0.1:9005

▶️ Usage

  1. Open VidPhrase in your browser.
  2. Paste a YouTube link or upload a video/audio file.
  3. Enter the phrase you want to find.
  4. Choose the search method:
       • 📝 Subtitle Search
       • 💬 Comment & Description Search
       • 🎙️ Whisper Transcription Search
       • 🧠 AI Semantic Search
  5. Browse the results.
  6. Jump directly to the relevant moment and continue from the surrounding context.
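Jumping to a moment in a YouTube video comes down to adding a start time to the link. The sketch below shows one way to turn a match's start time into a readable timestamp and a link with the `t` parameter set. `format_timestamp` and `link_at` are hypothetical helper names, and VidPhrase's own result links may be built differently.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def format_timestamp(seconds):
    """Format seconds as H:MM:SS, or M:SS when under an hour."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def link_at(url, seconds):
    """Return url with its t parameter set so playback starts at seconds."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query["t"] = [f"{int(seconds)}s"]  # Replaces any existing start time.
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


# VIDEO_ID is a placeholder, not a real video.
label = format_timestamp(83.4)
link = link_at("https://www.youtube.com/watch?v=VIDEO_ID", 83.4)
```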

📁 Project Structure

VidPhrase/
├── app.py
├── cookie_fetch_profiles.py
├── requirements.txt
├── cookies/
│   ├── cookies_1.txt
│   ├── cookies_2.txt
│   ├── cookies_3.txt
│   ├── cookies_4.txt
│   └── cookies_fire.txt
├── templates/
├── static/
└── README.md

📝 Notes

The following imports come from Python's standard library and do not need to be installed separately:

  • re
  • os
  • json
  • glob
  • tempfile
  • uuid
  • shutil
  • warnings
  • io.BytesIO

  • Subtitle search uses fuzzy matching and may return closely related phrases.
  • AI search is designed to find contextual and conceptual matches, not only exact words.
  • Local video and audio files are automatically transcribed using Faster-Whisper.
  • The application runs on port 9005 by default.

Made with ❤️ to make searching through videos and audio effortless.
