A tool for understanding podcasts and long-form videos.
Podcast-Agent turns YouTube and Bilibili podcasts or long-form videos into structured reports for faster understanding and analysis.
Podcast-Agent is useful when you want to:
- Quickly understand what a podcast or long-form video is about.
- Produce shareable reports in multiple formats, including editable Markdown, HTML, PDF, and Xiaohongshu-style images.
- Save intermediate artifacts for review, debugging, or downstream analysis.
Current input support includes YouTube and Bilibili videos.
Podcast-Agent is organized around four core layers:
- Content Ingestion: Capture essential podcast and video elements, including metadata, transcripts, and contextual signals.
- Semantic Extraction: Analyze raw content around the user's question to identify relevant evidence, key moments, and meaningful context.
- Insight Structuring: Organize extracted information into core viewpoints, logical relationships, and a coherent analytical framework.
- Report Generation: Assemble metadata, evidence, viewpoints, and summaries into a polished structured report for fast understanding.
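The way the four layers hand data to one another can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Context` class, the stage function names, and the toy keyword matching are assumptions made for illustration, not the project's actual code, and no network access or LLM call is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical container passed between layers (not the project's real class)."""
    url: str
    question: str
    transcript: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    viewpoints: list[str] = field(default_factory=list)
    report: str = ""


def _words(text: str) -> set[str]:
    # Lowercase words with trailing punctuation stripped.
    return {w.lower().strip("?.,") for w in text.split()}


def ingest(ctx: Context) -> Context:
    # Layer 1: a real implementation would fetch metadata and a transcript.
    ctx.transcript = [
        "Hosts discuss model pricing.",
        "A listener asks for latency numbers.",
    ]
    return ctx


def extract(ctx: Context) -> Context:
    # Layer 2: keep transcript lines that share a word with the question.
    question_words = _words(ctx.question)
    ctx.evidence = [line for line in ctx.transcript if question_words & _words(line)]
    return ctx


def structure(ctx: Context) -> Context:
    # Layer 3: turn each piece of evidence into a numbered viewpoint.
    ctx.viewpoints = [f"Viewpoint {i}: {e}" for i, e in enumerate(ctx.evidence, 1)]
    return ctx


def render(ctx: Context) -> Context:
    # Layer 4: assemble a plain-text report from the earlier results.
    lines = [f"Source: {ctx.url}", f"Question: {ctx.question}", *ctx.viewpoints]
    ctx.report = "\n".join(lines)
    return ctx


def run_pipeline(url: str, question: str) -> Context:
    ctx = Context(url=url, question=question)
    for stage in (ingest, extract, structure, render):
        ctx = stage(ctx)
    return ctx
```

Passing a single context object through every stage is one simple way to keep intermediate results available for review or debugging.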
src/podcast_agent/
├── sources/ Source detection and source adapters
├── elements/ Metadata, transcript fetching, and formatting
├── transcribers/ Audio transcription fallback
├── insights/ Evidence, outline, viewpoint, and summary generation
├── pipeline/ Pipeline orchestration, context, and artifact handling
├── reports/ Markdown, HTML, PDF, and Xiaohongshu report rendering
└── cli/ Command-line entry points
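As an illustration of what source detection in `sources/` might involve, the sketch below maps a URL to a source name by its hostname. The function name `detect_source` and the host suffix table are assumptions for this example; the project's real detection rules may differ.

```python
from urllib.parse import urlparse

# Illustrative host suffixes; not taken from the project's source code.
_HOSTS = {
    "youtube": ("youtube.com", "youtu.be"),
    "bilibili": ("bilibili.com", "b23.tv"),
}


def detect_source(url: str) -> str:
    """Return 'youtube' or 'bilibili', or raise ValueError for other hosts."""
    host = (urlparse(url).hostname or "").lower()
    for name, suffixes in _HOSTS.items():
        # Match the bare domain or any subdomain of it.
        if any(host == s or host.endswith("." + s) for s in suffixes):
            return name
    raise ValueError(f"unsupported source: {url}")
```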
- Python 3.10+
- ffmpeg
- Playwright Chromium
- Network access to YouTube, Bilibili, DeepSeek, Aliyun DashScope ASR, and Aliyun OSS
- Fonts are bundled; no extra font installation is required
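Two of the requirements above can be checked locally before installation. The helper below is a minimal sketch, not part of Podcast-Agent; `check_prerequisites` is a hypothetical name, and it only checks the Python version and whether `ffmpeg` is on `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("ffmpeg",)):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```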
Common system dependency installation:
# macOS
brew install python ffmpeg
# Ubuntu / Debian
sudo apt-get update
sudo apt-get install -y python3 python3-venv python3-pip ffmpeg
If Chromium fails to launch on Linux, install the required Playwright system libraries (this command needs the virtual environment created in the next step):
.venv/bin/playwright install-deps chromium
Create a virtual environment and install Python dependencies from the project root:
python -m venv .venv
.venv/bin/pip install -U pip
.venv/bin/pip install -e ".[dev,pdf,xhs]"
.venv/bin/playwright install chromium
Verify that the CLI is available:
.venv/bin/podcast-agent --help
Copy the environment template:
cp .env.example .env
Then fill in .env:
DEEPSEEK_API_KEY=
DEEPSEEK_API_BASE=https://api.deepseek.com/v1
DEEPSEEK_MODEL=deepseek-chat
YOUTUBE_COOKIES_FILE=
BILIBILI_COOKIES_FILE=
BILIBILI_USER_AGENT=
ALIYUN_API_KEY=
OSS_ENDPOINT=
OSS_BUCKET_NAME=
OSS_ACCESS_KEY_ID=
OSS_ACCESS_KEY_SECRET=
Create an API key in the DeepSeek console, then set:
DEEPSEEK_API_KEY=<your-deepseek-api-key>
DEEPSEEK_API_BASE=https://api.deepseek.com/v1
DEEPSEEK_MODEL=deepseek-chat
To use another DeepSeek model, change only DEEPSEEK_MODEL.
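Empty values in `.env` are easy to overlook. The sketch below parses simple `KEY=value` lines and reports which keys are still blank. It is a hypothetical helper, not part of the project; treating the three DeepSeek keys as required is an assumption, and the parser ignores quoting and other `.env` syntax.

```python
# Assumed to be required for this example; adjust to your setup.
REQUIRED = ("DEEPSEEK_API_KEY", "DEEPSEEK_API_BASE", "DEEPSEEK_MODEL")


def parse_env(text: str) -> dict[str, str]:
    """Parse plain KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str, required=REQUIRED) -> list[str]:
    """Return required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` would list the keys that still need a value.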
- Install the browser extension Get cookies.txt LOCALLY.
- Log in to your YouTube account.
- Open YouTube and export cookies.txt with the extension.
- Place cookies.txt in the project root:
Podcast-Agent/
├── cookies.txt
├── README.md
└── src/
- Set the cookies file path in .env:
YOUTUBE_COOKIES_FILE=./cookies.txt
If commands are run from another working directory, use an absolute path:
YOUTUBE_COOKIES_FILE=/absolute/path/to/Podcast-Agent/cookies.txt
Notes:
- The cookies file contains login credentials. Do not commit it or share it.
- If the cookies expire, export the file again.
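An exported cookies file can be sanity-checked before use. The sketch below assumes the extension writes the Netscape `cookies.txt` format, which Python's standard `http.cookiejar.MozillaCookieJar` can read. `inspect_cookies` is a hypothetical helper, not part of Podcast-Agent; it resolves the path to an absolute one and counts cookies that have already expired.

```python
from http.cookiejar import LoadError, MozillaCookieJar
from pathlib import Path


def inspect_cookies(path: str) -> dict:
    """Load a Netscape-format cookies file and summarize its contents."""
    resolved = Path(path).expanduser().resolve()
    jar = MozillaCookieJar(str(resolved))
    try:
        # Keep session and expired cookies so they can be counted.
        jar.load(ignore_discard=True, ignore_expires=True)
    except (OSError, LoadError) as exc:
        return {"path": str(resolved), "ok": False, "total": 0,
                "expired": 0, "error": str(exc)}
    cookies = list(jar)
    expired = sum(1 for cookie in cookies if cookie.is_expired())
    return {"path": str(resolved), "ok": True, "total": len(cookies),
            "expired": expired, "error": ""}
```

A high `expired` count is a hint that the file should be exported again. The returned `path` is absolute, so it can be pasted into `.env` when commands run from another working directory.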
- Install the browser extension Get cookies.txt LOCALLY.
- Log in to your Bilibili account.
- Open Bilibili and export cookies.txt with the extension.
- Place the exported file in the project root, for example:
Podcast-Agent/
├── bilibili-cookies.txt
├── README.md
└── src/
- Set the Bilibili options in .env:
BILIBILI_COOKIES_FILE=./bilibili-cookies.txt
BILIBILI_USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125 Safari/537.36
If commands are run from another working directory, use an absolute path:
BILIBILI_COOKIES_FILE=/absolute/path/to/Podcast-Agent/bilibili-cookies.txt
Prepare an Aliyun DashScope API key and OSS bucket configuration.
Set:
- ALIYUN_API_KEY: Create an API key in the Aliyun Bailian / DashScope console.
- OSS_ENDPOINT: Find the endpoint on the OSS bucket overview page, for example https://oss-cn-hangzhou.aliyuncs.com.
- OSS_BUCKET_NAME: Use the OSS bucket name.
- OSS_ACCESS_KEY_ID: Create an AccessKey in Aliyun RAM.
- OSS_ACCESS_KEY_SECRET: Use the matching AccessKey secret.
.env example:
ALIYUN_API_KEY=<your-dashscope-api-key>
OSS_ENDPOINT=https://oss-cn-hangzhou.aliyuncs.com
OSS_BUCKET_NAME=<your-oss-bucket-name>
OSS_ACCESS_KEY_ID=<your-oss-access-key-id>
OSS_ACCESS_KEY_SECRET=<your-oss-access-key-secret>
Run the bundled batch script with the default example cases:
scripts/run-full-batch.sh
To use a custom cases file, output directory, or concurrency level:
CASES_PATH=examples/full-report-cases.json \
OUTPUT_ROOT=output \
MAX_JOBS=3 \
scripts/run-full-batch.sh
The final reports will be generated at:
output/<case-id>/reports/report.md
output/<case-id>/reports/report.html
output/<case-id>/reports/report.pdf
output/<case-id>/reports/xhs/images/
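After a batch run, the paths listed above can be checked programmatically. The sketch below is a hypothetical helper, not part of the project; it only tests whether each expected path exists under a given output root and case ID.

```python
from pathlib import Path

# Relative paths taken from the list above.
REPORT_PATHS = (
    "reports/report.md",
    "reports/report.html",
    "reports/report.pdf",
    "reports/xhs/images",
)


def report_status(output_root: str, case_id: str) -> dict[str, bool]:
    """Map each expected report path to whether it exists on disk."""
    case_dir = Path(output_root) / case_id
    return {rel: (case_dir / rel).exists() for rel in REPORT_PATHS}
```

Calling `report_status("output", "<case-id>")` with a real case ID returns a dictionary that shows at a glance which outputs are missing.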
Run the full pipeline from the command line:
.venv/bin/podcast-agent full \
--url "https://www.youtube.com/watch?v=<video-id>" \
--question "Your question about the video" \
--output-dir output/my-report
Bilibili URLs are supported in the same command:
.venv/bin/podcast-agent full \
--url "https://www.bilibili.com/video/<BV-id>" \
--question "Your question about the video" \
--output-dir output/my-bilibili-report
For a complete command reference, see CLI Usage Guide.
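For scripting, the `full` command can be assembled in Python and handed to `subprocess`. The sketch below uses only the subcommand and flags shown in this section (`full`, `--url`, `--question`, `--output-dir`). `build_full_command` is a hypothetical helper name, and the default executable path assumes the virtual environment created during installation.

```python
import shlex


def build_full_command(url: str, question: str, output_dir: str,
                       executable: str = ".venv/bin/podcast-agent") -> list[str]:
    """Build the argument list for one full-pipeline run."""
    return [
        executable, "full",
        "--url", url,
        "--question", question,
        "--output-dir", output_dir,
    ]


if __name__ == "__main__":
    cmd = build_full_command(
        "https://www.youtube.com/watch?v=<video-id>",
        "Your question about the video",
        "output/my-report",
    )
    # Print a shell-quoted form for inspection. To execute it, pass the list
    # to subprocess.run(cmd, check=True) from the project root.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when the question contains spaces or punctuation.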