Paper Demo Agent

Turn any scientific paper into a live interactive demo.

Python 3.9+ | License: MIT

Quick Start | What's New in v0.4.0 | Forms | Providers | CLI | UI | Python API


What's New in v0.4.0

  • New flowchart_pro form: draw.io-quality interactive architecture diagrams powered by Cytoscape.js 3.30.2 + dagre layout. Compound nodes, zoom/pan/export, click-to-inspect detail panels, step-by-step walkthrough.
  • Vector figure extraction: extract_pdf_page now defaults to SVG output (via PyMuPDF get_svg_image) for crisp, resolution-independent paper figures.
  • Quality improvements: all 6 core output forms benchmarked at an average of about 9.3/10 (website 9.1, presentation 9.75, flowchart 9.8, latex 9.1, slides 8.8, app 9.5).
  • Bug fixes: resolved f-string {slug} NameError, Gradio version pinned to >=6.0, XSS guard added to app form, textcomp LaTeX package added for \texttimes support.
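
The f-string fix listed above concerns a common Python pitfall. As a minimal sketch (the template text, function names, and slug value are illustrative assumptions, not the project's actual code), a literal {slug} placeholder inside an f-string is evaluated immediately and raises NameError when no such variable exists; doubling the braces keeps the placeholder literal:

```python
# Illustrative only: the template and names here are assumptions, not project code.

def broken_template() -> str:
    # Inside an f-string, {slug} is evaluated right away. With no `slug`
    # variable in scope, calling this function raises NameError.
    title = "Demo"
    return f"<h1>{title}</h1><a href='/{slug}'>link</a>"  # noqa: F821


def fixed_template() -> str:
    # Doubled braces emit a literal {slug} that can be filled in later.
    title = "Demo"
    return f"<h1>{title}</h1><a href='/{{slug}}'>link</a>"


try:
    broken_template()
except NameError as exc:
    print("NameError:", exc)

template = fixed_template()
print(template.format(slug="attention-is-all-you-need"))
```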

Quick Start

Install:

pip install paper-demo-agent

Generate a demo from an arXiv paper:

paper-demo-agent demo 1706.03762

Use Claude Code credentials if you already have them:

npm install -g @anthropic-ai/claude-code
claude login
paper-demo-agent demo 1706.03762 --provider anthropic

Use Gemini CLI credentials instead:

npm install -g @google/gemini-cli
gemini
paper-demo-agent demo 1706.03762 --provider gemini

Or set API keys directly:

paper-demo-agent key set ANTHROPIC_API_KEY <token>
paper-demo-agent key set OPENAI_API_KEY <token>
paper-demo-agent demo 1706.03762 --provider openai

Launch the UI:

paper-demo-agent ui

With pipx:

pipx run paper-demo-agent ui

What It Does

Paper Demo Agent reads a paper, classifies its contribution, routes it to a specialized skill, and generates one of 11 output formats across 4 top-level categories:

  • app: Gradio or Streamlit
  • presentation: HTML slides, PowerPoint, or LaTeX/Beamer
  • page: project page, README, or blog article
  • diagram: Mermaid, Cytoscape.js (draw.io-quality), or Graphviz

Current repo highlights:

  • 13 routed skills, including model, dataset, algorithm, framework, theory, survey, findings, README, blog, Streamlit, Mermaid, and Graphviz generation
  • 15 generation tools including append_file, validate_output, render_svg, extract_pdf_page, extract_figure, extract_tables, and list_pdf_pages
  • 6 providers: Anthropic, OpenAI, Gemini, DeepSeek, Qwen, and MiniMax
  • Gradio UI with auth status, progress streaming, phase stepper, file preview, ZIP download, and one-click open
  • Expanded graphics toolkit for SVG, Mermaid, Chart.js, D3, and TikZ-based outputs

How It Works

  1. Parse the source from arXiv, URL, local PDF, or raw text.
  2. Analyze the paper to infer paper type, demo type, and best form.
  3. Route to a specialized skill.
  4. Run the generation loop: Research, Build, Polish, Validate.
  5. Return runnable output under demos/ by default.

The generator uses form-specific budgets. Current defaults:

  • presentation: build 18, polish 5
  • website, page_blog, slides, latex: build 14–16, polish 4
  • app, app_streamlit: build 14, polish 4
  • flowchart_pro: build 12, polish 3
  • flowchart: build 10, polish 3
  • page_readme, diagram_graphviz: build 6, polish 2
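
The defaults above can be read as a lookup table keyed by flat form name. The sketch below is one way to represent them; the FORM_BUDGETS name and the total_budget helper are hypothetical, and where the list gives a range (14 to 16) the upper bound is used as an assumption:

```python
from typing import Dict, Tuple

# (build, polish) budgets copied from the list above. For the forms listed
# with a 14 to 16 build range, the upper bound is assumed here.
FORM_BUDGETS: Dict[str, Tuple[int, int]] = {
    "presentation": (18, 5),
    "website": (16, 4),
    "page_blog": (16, 4),
    "slides": (16, 4),
    "latex": (16, 4),
    "app": (14, 4),
    "app_streamlit": (14, 4),
    "flowchart_pro": (12, 3),
    "flowchart": (10, 3),
    "page_readme": (6, 2),
    "diagram_graphviz": (6, 2),
}


def total_budget(form: str) -> int:
    """Return build + polish for a flat form key (hypothetical helper)."""
    build, polish = FORM_BUDGETS[form]
    return build + polish


print(total_budget("presentation"))  # 23
```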

Forms And Subtypes

Preferred CLI usage is category + subtype:

Category      Subtypes                      Internal Output
app           gradio, streamlit             app.py
presentation  revealjs, pptx, beamer        demo.html, build.py, presentation.tex
page          project, readme, blog         index.html, README.md
diagram       mermaid, cytoscape, graphviz  index.html, build.py

Examples:

paper-demo-agent demo 1706.03762 --form app --subtype streamlit
paper-demo-agent demo 1706.03762 --form presentation --subtype revealjs
paper-demo-agent demo 1706.03762 --form presentation --subtype pptx
paper-demo-agent demo 1706.03762 --form presentation --subtype beamer
paper-demo-agent demo 1706.03762 --form page --subtype project
paper-demo-agent demo 1706.03762 --form page --subtype readme
paper-demo-agent demo 1706.03762 --form page --subtype blog
paper-demo-agent demo 1706.03762 --form diagram --subtype mermaid
paper-demo-agent demo 1706.03762 --form diagram --subtype cytoscape
paper-demo-agent demo 1706.03762 --form diagram --subtype graphviz

Or use flat form keys directly:

paper-demo-agent demo 1706.03762 --form flowchart        # Mermaid.js interactive diagram
paper-demo-agent demo 1706.03762 --form flowchart_pro    # Cytoscape.js draw.io-quality diagram
paper-demo-agent demo 1706.03762 --form website          # Project page (alias for page/project)
paper-demo-agent demo 1706.03762 --form slides           # PowerPoint (alias for presentation/pptx)
paper-demo-agent demo 1706.03762 --form latex            # LaTeX/Beamer (alias for presentation/beamer)
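
The relationship between category + subtype pairs and flat form keys can be sketched as a dictionary. Only three pairs are spelled out as aliases above; the remaining entries are inferred from flat key names that appear elsewhere in this README and may not match the actual implementation. FLAT_FORM_KEYS and to_flat_key are hypothetical names:

```python
from typing import Dict, Tuple

# "stated" pairs are named as aliases in the README. "inferred" pairs are
# guesses based on flat key names used elsewhere in it.
FLAT_FORM_KEYS: Dict[Tuple[str, str], str] = {
    ("page", "project"): "website",                # stated
    ("presentation", "pptx"): "slides",            # stated
    ("presentation", "beamer"): "latex",           # stated
    ("diagram", "mermaid"): "flowchart",           # inferred
    ("diagram", "cytoscape"): "flowchart_pro",     # inferred
    ("diagram", "graphviz"): "diagram_graphviz",   # inferred
    ("app", "gradio"): "app",                      # inferred
    ("app", "streamlit"): "app_streamlit",         # inferred
    ("presentation", "revealjs"): "presentation",  # inferred
    ("page", "readme"): "page_readme",             # inferred
    ("page", "blog"): "page_blog",                 # inferred
}


def to_flat_key(category: str, subtype: str) -> str:
    """Translate a category + subtype pair into a flat form key."""
    try:
        return FLAT_FORM_KEYS[(category, subtype)]
    except KeyError:
        raise ValueError(f"unknown form/subtype pair: {category}/{subtype}") from None


print(to_flat_key("presentation", "beamer"))  # latex
```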

Input Sources

paper-demo-agent demo SOURCE accepts:

  • arXiv ID: 1706.03762
  • arXiv-prefixed ID: arxiv:1706.03762
  • arXiv URL
  • local PDF path
  • raw text
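
For scripts that need to tell these source kinds apart before calling the CLI, a rough classifier might look like the following. This is a sketch under stated assumptions, not the parser the tool uses: classify_source and its return labels are hypothetical, only new-style arXiv IDs are recognized, and anything that is not an ID, a URL, or a .pdf path is treated as raw text.

```python
import re

# New-style arXiv IDs such as 1706.03762, with an optional "arxiv:" prefix
# and optional version suffix. Old-style IDs are not handled in this sketch.
ARXIV_ID = re.compile(r"^(arxiv:)?\d{4}\.\d{4,5}(v\d+)?$", re.IGNORECASE)


def classify_source(source: str) -> str:
    """Guess which kind of input a SOURCE string is (hypothetical helper)."""
    text = source.strip()
    if ARXIV_ID.match(text):
        return "arxiv_id"
    if re.match(r"^https?://(www\.)?arxiv\.org/", text, re.IGNORECASE):
        return "arxiv_url"
    if re.match(r"^https?://", text, re.IGNORECASE):
        return "url"
    if text.lower().endswith(".pdf"):
        return "local_pdf"
    return "raw_text"


for example in ("1706.03762", "arxiv:1706.03762", "./paper.pdf"):
    print(example, "->", classify_source(example))
```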

Supported Providers

Provider   Default Model      Key                Notes
Anthropic  claude-sonnet-4-6  ANTHROPIC_API_KEY  Supports Claude Code auto-detection
OpenAI     gpt-5.2            OPENAI_API_KEY     Supports Codex CLI auto-detection
Gemini     auto-gemini-2.5    GOOGLE_API_KEY     Supports Gemini CLI and gcloud ADC
DeepSeek   deepseek-chat      DEEPSEEK_API_KEY   OpenAI-compatible provider
Qwen       qwen-max           QWEN_API_KEY       DashScope-backed
MiniMax    abab6.5-chat       MINIMAX_API_KEY    Also needs MINIMAX_GROUP_ID

Credential Resolution

Key resolution order is provider-specific:

  • Anthropic: Claude Code -> saved config -> environment -> Aider
  • Gemini: Gemini CLI -> saved config -> environment -> gcloud ADC
  • OpenAI: saved config -> environment -> OpenAI Codex CLI -> Aider
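
The orders above can be modeled as an ordered fallback chain. The sketch below assumes the first source that yields a key is the one used, which the arrows suggest but the README does not state outright; RESOLUTION_ORDER, resolve_key, and the source labels are hypothetical names, and the lookups are supplied by the caller:

```python
from typing import Callable, Dict, List, Optional, Tuple

# Source order per provider, transcribed from the list above.
RESOLUTION_ORDER: Dict[str, List[str]] = {
    "anthropic": ["claude_code", "saved_config", "environment", "aider"],
    "gemini": ["gemini_cli", "saved_config", "environment", "gcloud_adc"],
    "openai": ["saved_config", "environment", "codex_cli", "aider"],
}


def resolve_key(
    provider: str,
    lookups: Dict[str, Callable[[], Optional[str]]],
) -> Optional[Tuple[str, str]]:
    """Return (source, key) from the first source that produces a key."""
    for source in RESOLUTION_ORDER[provider]:
        lookup = lookups.get(source)
        key = lookup() if lookup is not None else None
        if key:
            return source, key
    return None


# Fake lookups for illustration; real ones would read files or os.environ.
fake = {"environment": lambda: "env-key", "aider": lambda: "aider-key"}
print(resolve_key("anthropic", fake))  # ('environment', 'env-key')
```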

Supported auto-detected sources:

  • Claude Code: ~/.claude/.credentials.json or macOS Keychain
  • Gemini CLI: ~/.gemini/oauth_creds.json, macOS Keychain, or OpenClaw profile
  • Google ADC: ~/.config/gcloud/application_default_credentials.json
  • OpenAI Codex CLI: ~/.codex/auth.json
  • Aider: ~/.aider.conf.yml
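
To check which of these files exist on a machine, a small probe like the one below can help when debugging auth status. It is a sketch only: it covers the file paths listed above, not the macOS Keychain or OpenClaw profile, and detect_credential_files is a hypothetical helper rather than part of the package.

```python
from pathlib import Path
from typing import Dict, Optional

# File locations copied from the list above (Keychain entries are skipped).
CREDENTIAL_FILES: Dict[str, str] = {
    "Claude Code": "~/.claude/.credentials.json",
    "Gemini CLI": "~/.gemini/oauth_creds.json",
    "Google ADC": "~/.config/gcloud/application_default_credentials.json",
    "OpenAI Codex CLI": "~/.codex/auth.json",
    "Aider": "~/.aider.conf.yml",
}


def detect_credential_files(home: Optional[str] = None) -> Dict[str, bool]:
    """Report which credential files exist under the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return {
        # Drop the leading "~/" and resolve against the chosen home directory.
        name: (base / location[len("~/"):]).is_file()
        for name, location in CREDENTIAL_FILES.items()
    }


for name, present in detect_credential_files().items():
    print(f"{name}: {'found' if present else 'missing'}")
```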

Web UI

Run:

paper-demo-agent ui

Optional flags:

paper-demo-agent ui --port 8080
paper-demo-agent ui --share
paper-demo-agent ui --auth admin:secret
paper-demo-agent ui --no-browser

The UI includes:

  • quick auth cards for Claude Code and Gemini CLI
  • provider dropdown with credential status
  • output category and subtype selectors
  • live progress split by Parse, Analyze, Research, Build, Polish, and Validate
  • generated file list, preview, ZIP download, and open/run actions

System Dependencies

Most outputs are pure Python or HTML and need no extra system packages.

Graphviz diagrams need the system dot binary:

# macOS
brew install graphviz

# Ubuntu / Debian
sudo apt-get install graphviz

LaTeX / Beamer output needs a TeX distribution:

# macOS
brew install --cask mactex-no-gui

# Ubuntu / Debian
sudo apt-get install texlive-latex-recommended texlive-fonts-extra
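
Before requesting Graphviz or Beamer output, it can be useful to confirm the binaries are on PATH. The check below is a minimal sketch: dot is named in this README, while pdflatex is an assumption about which TeX binary is needed, and check_system_dependencies is a hypothetical helper.

```python
import shutil
from typing import Dict, Optional

# "dot" comes from the README; "pdflatex" is an assumed stand-in for
# "a TeX distribution" and may not be the binary the tool actually calls.
REQUIRED_BINARIES: Dict[str, str] = {"graphviz": "dot", "latex": "pdflatex"}


def check_system_dependencies() -> Dict[str, Optional[str]]:
    """Map each dependency to its resolved path, or None if not on PATH."""
    return {name: shutil.which(binary) for name, binary in REQUIRED_BINARIES.items()}


for name, path in check_system_dependencies().items():
    print(f"{name}: {path or 'not found'}")
```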

CLI Reference

# Auto-pick the output
paper-demo-agent demo 1706.03762

# Pick provider and model
paper-demo-agent demo 1706.03762 --provider anthropic --model claude-opus-4-6
paper-demo-agent demo 1706.03762 --provider openai --model gpt-5.2

# Pick category + subtype
paper-demo-agent demo 1706.03762 --form app --subtype streamlit
paper-demo-agent demo 1706.03762 --form presentation --subtype beamer
paper-demo-agent demo 1706.03762 --form page --subtype readme
paper-demo-agent demo 1706.03762 --form diagram --subtype graphviz

# Local PDF
paper-demo-agent demo ./paper.pdf --form presentation --subtype pptx

# Output directory
paper-demo-agent demo 1706.03762 --output ./my-demo

# Provider list
paper-demo-agent providers

# Key management
paper-demo-agent key set ANTHROPIC_API_KEY <token>
paper-demo-agent key list
paper-demo-agent key delete ANTHROPIC_API_KEY

# Hugging Face login for gated assets
paper-demo-agent login
paper-demo-agent logout
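
When driving the CLI from a script, the demo command can be assembled as an argument list. The sketch below uses only the flags shown above; build_demo_command is a hypothetical helper, not part of the package:

```python
import shlex
from typing import List, Optional


def build_demo_command(
    source: str,
    provider: Optional[str] = None,
    model: Optional[str] = None,
    form: Optional[str] = None,
    subtype: Optional[str] = None,
    output: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `paper-demo-agent demo`, skipping unset flags."""
    command = ["paper-demo-agent", "demo", source]
    options = (
        ("--provider", provider),
        ("--model", model),
        ("--form", form),
        ("--subtype", subtype),
        ("--output", output),
    )
    for flag, value in options:
        if value is not None:
            command += [flag, value]
    return command


command = build_demo_command("./paper.pdf", form="presentation", subtype="pptx")
print(shlex.join(command))
# paper-demo-agent demo ./paper.pdf --form presentation --subtype pptx
```

The resulting list can be passed to subprocess.run(command, check=True), which avoids shell quoting issues with paths that contain spaces.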

Python API

from paper_demo_agent import PaperDemoAgent

agent = PaperDemoAgent(provider="anthropic")

result = agent.run(
    source="1706.03762",
    demo_form="page",
    demo_subtype="project",
    max_iter=25,
    on_progress=print,
)

print(result.output_dir)
print(result.main_file)
print(result.run_command)
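
Since on_progress accepts print in the example above, any callable that takes one positional argument should presumably work in its place; that is an assumption, as the callback signature is not documented here. A small collector such as the hypothetical ProgressLog below keeps the messages for later inspection:

```python
import time
from typing import Any, List, Tuple


class ProgressLog:
    """Callable that records progress messages with elapsed time.

    Assumes the agent calls on_progress with a single positional argument,
    as `on_progress=print` in the README suggests.
    """

    def __init__(self, echo: bool = True) -> None:
        self.echo = echo
        self.events: List[Tuple[float, str]] = []
        self._start = time.monotonic()

    def __call__(self, message: Any) -> None:
        elapsed = time.monotonic() - self._start
        text = str(message)
        self.events.append((elapsed, text))
        if self.echo:
            print(f"[{elapsed:6.1f}s] {text}")


log = ProgressLog()
log("Parse")  # in real use: agent.run(..., on_progress=log)
print(len(log.events))
```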

You can also use the lower-level steps:

paper = agent.parse("1706.03762")
analysis = agent.analyze(paper)
print(analysis.paper_type)
print(analysis.demo_form)
print(analysis.demo_subtype)

run_from_pdf() is available for UI-style byte uploads.


Project Structure

paper_demo_agent/
|-- agent.py
|-- cli.py
|-- config.py
|-- ui/app.py
|-- analysis/
|-- paper/
|-- providers/
|-- generation/
|   |-- generator.py
|   |-- runner.py
|   `-- tools.py
|-- graphics/
`-- skills/

Important top-level docs:

  • IMPROVEMENT_LOG.md
  • CONTRIBUTING.md

Examples:

  • examples/demo_attention_paper.py
  • examples/demo_imagenet_dataset.py
  • examples/demo_survey_paper.py

Notes And Troubleshooting

  • Large single-file outputs are handled with write_file plus append_file; the hard 300-line write limit described in older docs is no longer enforced.
  • validate_output exists as an internal generation tool, not as a public CLI command.
  • Figure and table extraction are available to the generator through extract_pdf_page, extract_figure, extract_tables, and list_pdf_pages.
  • Generated demos are written under demos/ unless --output is provided.
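
The write_file plus append_file pattern from the first note can be pictured as one initial write followed by appends. The sketch below uses plain file operations as stand-ins for the two tools, whose real signatures are not shown in this README; split_into_chunks, write_in_chunks, and the 300-line chunk size are illustrative assumptions.

```python
from pathlib import Path
from typing import List


def split_into_chunks(text: str, max_lines: int = 300) -> List[str]:
    """Split text into pieces of at most max_lines lines each."""
    lines = text.splitlines(keepends=True)
    return ["".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def write_in_chunks(path: str, text: str, max_lines: int = 300) -> int:
    """Write the first chunk, then append the rest. Returns the chunk count."""
    chunks = split_into_chunks(text, max_lines)
    target = Path(path)
    # Stand-in for write_file: create or overwrite with the first chunk.
    target.write_text(chunks[0] if chunks else "", encoding="utf-8")
    # Stand-in for append_file: add each remaining chunk to the end.
    for chunk in chunks[1:]:
        with target.open("a", encoding="utf-8") as handle:
            handle.write(chunk)
    return len(chunks)


print(len(split_into_chunks("a\n" * 700)))  # 3
```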

Development

git clone https://github.com/wshuai190/PaperDemoAgent
cd PaperDemoAgent
pip install -e ".[dev]"
pytest tests/

Contributing

See CONTRIBUTING.md.


License

MIT. See LICENSE.
