Conference Video Summarizer

This project automates the extraction and summarization of conference slide presentations. It uses Selenium to navigate and capture slides, OCR (Tesseract) to extract text, and Anthropic’s Claude API to generate concise 2–3 paragraph summaries of the presentations. This is based on my internship work at RelationalAI.

Features

Automatically loads conference presentations (tested on ICML virtual conference site).
Extracts text from slides with Tesseract OCR.
Detects slides containing useful graphics not captured in text.
Sends text + images to Claude (Anthropic API) for high-level summaries.
Processes every slideshow detected in a conference in a single run.
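
The text-plus-image step above can be sketched roughly as follows. This is a minimal illustration, not the code in summarize.py: the helper names (ocr_slide, looks_graphical, build_content), the word-count heuristic for spotting graphic-heavy slides, the 15-word threshold, and the prompt wording are all assumptions. The content-block layout follows the Anthropic Messages API format for base64 images.

```python
import base64


def ocr_slide(image_path):
    """Run Tesseract on one slide screenshot and return the raw text."""
    # Imported lazily so the pure helpers below work without OCR installed.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def looks_graphical(ocr_text, min_words=15):
    """Hypothetical heuristic: very little OCR text suggests a figure-heavy slide."""
    return len(ocr_text.split()) < min_words


def build_content(slide_texts, slide_pngs):
    """Build Messages API content blocks from OCR text and PNG bytes.

    slide_texts: list of OCR strings, one per slide.
    slide_pngs: list of PNG bytes (or None), aligned with slide_texts.
    Only slides flagged by looks_graphical are attached as images.
    """
    blocks = []
    for index, (text, png) in enumerate(zip(slide_texts, slide_pngs), start=1):
        blocks.append({"type": "text", "text": f"Slide {index}:\n{text.strip()}"})
        if png is not None and looks_graphical(text):
            blocks.append({
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": base64.b64encode(png).decode("ascii"),
                },
            })
    # Assumed prompt wording; the real script may phrase this differently.
    blocks.append({
        "type": "text",
        "text": "Summarize this presentation in 2-3 paragraphs.",
    })
    return blocks
```

The returned list can be passed as the content of a single user message to client.messages.create(...); the model name and max_tokens value depend on your account and are left out here.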

Requirements

Python Dependencies

Install all required dependencies with:

pip install selenium pillow pytesseract openai anthropic

System Dependencies

Google Chrome
ChromeDriver (must match your Chrome version and be in your PATH)
Tesseract OCR
- macOS: brew install tesseract
- Ubuntu/Debian: sudo apt-get install tesseract-ocr
- Windows: download and run the Tesseract installer
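
Before running the script, it can help to confirm that the command-line tools listed above are actually on your PATH. The check below is a small sketch using only the standard library; the function names are made up for illustration, and Chrome itself is not checked because its executable name differs between platforms.

```python
import shutil


def missing_tools(tools=("chromedriver", "tesseract"), which=shutil.which):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


def report(missing):
    """Format a short human-readable status line."""
    if not missing:
        return "All system dependencies found."
    return "Missing from PATH: " + ", ".join(missing)
```

Calling print(report(missing_tools())) gives a one-line status for the current machine.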

Setup

Clone the repository:

git clone https://github.com/amoghakella/Conference-Video-Summarizer.git
cd Conference-Video-Summarizer

Insert your Anthropic API key into summarize.py:

client = anthropic.Anthropic(
    api_key="INSERT API KEY"
)
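
If you would rather not write the key into the source file, one alternative is to read it from an environment variable. This is a sketch of that approach and not what summarize.py currently does; load_api_key and make_client are hypothetical helper names. ANTHROPIC_API_KEY is the variable the anthropic client also looks for by default when no api_key argument is given.

```python
import os


def load_api_key(env=os.environ, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or raise if it is unset."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set {name} before running the summarizer.")
    return key


def make_client():
    """Create the Anthropic client without hardcoding the key."""
    # Imported here so load_api_key can be used without the SDK installed.
    import anthropic

    return anthropic.Anthropic(api_key=load_api_key())
```

With this in place, export ANTHROPIC_API_KEY in your shell and replace the hardcoded client construction with client = make_client().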

Usage

Run the summarizer with:

python summarize.py

Example Output

When successful, the script prints summaries of each detected slideshow in the conference:

Summary of slideshow 1:
[Generated 2–3 paragraph summary here...]

Visualizing Summaries with NotebookLM

Once the summaries are generated, you can optionally feed them into NotebookLM to create interactive mind-graph visualizations of the workshop content. NotebookLM automatically structures the summarized information into a knowledge graph, making it easier to explore relationships between concepts, papers, and ideas presented in the slides.

For best results:

Run the script to generate summaries.
Copy the output into a .txt or .md file.
Upload the file to NotebookLM.
Use the mind-graph view to explore the extracted knowledge.
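
The copy step above can also be done in code rather than by hand. The sketch below assumes you have the summaries available as a list of strings (how summarize.py stores them is not shown here) and writes them to a Markdown file using the same "Summary of slideshow N:" heading as the printed output. The function names and the default file name summaries.md are illustrative.

```python
from pathlib import Path


def format_summaries(summaries):
    """Join summaries into one document, one numbered section per slideshow."""
    sections = []
    for index, summary in enumerate(summaries, start=1):
        sections.append(f"Summary of slideshow {index}:\n{summary.strip()}")
    return "\n\n".join(sections) + "\n"


def save_summaries(summaries, path="summaries.md"):
    """Write the formatted summaries to `path` and return it as a Path."""
    out = Path(path)
    out.write_text(format_summaries(summaries), encoding="utf-8")
    return out
```

The resulting file can be uploaded to NotebookLM as a single source.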

Notes

The script launches Chrome and interacts with slides automatically.
Summaries are generated with Claude, so Anthropic API credits are required.
OCR results may vary depending on slide formatting.
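
Because OCR quality varies, a light cleanup pass on the extracted text before sending it to Claude may help. The function below is one possible sketch and is not part of summarize.py: the name clean_ocr_text and the 0.5 alphanumeric-ratio threshold are assumptions chosen for illustration.

```python
import re


def clean_ocr_text(raw, min_alnum_ratio=0.5):
    """Collapse whitespace and drop lines that are mostly OCR noise."""
    kept = []
    for line in raw.splitlines():
        # Normalize runs of spaces and tabs to a single space.
        line = re.sub(r"\s+", " ", line).strip()
        if not line:
            continue
        # Keep the line only if enough of it is letters or digits.
        alnum = sum(ch.isalnum() for ch in line)
        if alnum / len(line) >= min_alnum_ratio:
            kept.append(line)
    return "\n".join(kept)
```

A stricter or looser threshold may suit slides with many symbols or equations, so treat the default as a starting point.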
