Local tool to capture web papers/articles, parse them into clean text + sections + references, and export in a few useful formats.
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
python run.py

Open:
- Library: http://127.0.0.1:8000/library/
- Collections: http://127.0.0.1:8000/collections/
- Open chrome://extensions
- Enable Developer mode
- Load unpacked → select extensions/chrome/
If your server isn’t on http://127.0.0.1:8000, edit extensions/chrome/background.js (API_ENDPOINT).
- Extension posts captures to POST /api/captures/ (URL + full HTML + best-effort main content + metadata).
- Server parses with a site-aware parser (PMC/OUP/Wiley/…) and falls back to generic heuristics.
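What the extension posts can be sketched as follows. This is a minimal illustration using only the standard library; the exact field names in the JSON body are assumptions, not the server's documented schema:

```python
import json
import urllib.request

API_ENDPOINT = "http://127.0.0.1:8000/api/captures/"  # default from background.js

def build_capture(url, html, main_content, metadata):
    """Assemble a capture payload. Field names here are illustrative."""
    return {
        "url": url,
        "html": html,             # full page HTML
        "content": main_content,  # best-effort main content
        "metadata": metadata,     # e.g. title, authors
    }

def post_capture(payload):
    """POST the capture as JSON, mirroring what the extension does."""
    req = urllib.request.Request(
        API_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)
```

With the server running, `post_capture(build_capture(...))` should land a new capture in the library.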
- Data is stored locally:
  - SQLite: data/db.sqlite3
  - Artifacts: data/artifacts/<capture_id>/
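To poke at the local store, here is a schema-agnostic sketch that lists the tables in the SQLite file (table names depend on the app's migrations, so none are assumed):

```python
import sqlite3

def list_tables(db_path="data/db.sqlite3"):
    """Return the table names present in the capture database."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        con.close()
```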
Each capture has a folder at data/artifacts/<capture_id>/, typically containing:
- page.html, content.html
- article.json, reduced.json
- sections.json, references.json
- paper.md (deterministic bundle)
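A sketch for loading one capture's JSON artifacts, assuming the filenames listed above; a file may be absent if that parsing stage produced nothing, so missing files are simply skipped:

```python
import json
from pathlib import Path

def load_artifacts(capture_id, root="data/artifacts"):
    """Load the JSON artifacts for a capture; missing files are skipped."""
    folder = Path(root) / str(capture_id)
    out = {}
    for name in ("article.json", "reduced.json", "sections.json", "references.json"):
        path = folder / name
        if path.exists():
            out[name] = json.loads(path.read_text(encoding="utf-8"))
    return out
```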
- BibTeX: /exports/bibtex/
- RIS: /exports/ris/
- Master Markdown: /exports/master.md/
- Papers JSONL: /exports/papers.jsonl/
Add ?collection=<collection_id> to export a specific collection.
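Building a collection-scoped export URL can be sketched like this (the base address is assumed to be the default server address from the quickstart):

```python
from urllib.parse import urlencode

BASE = "http://127.0.0.1:8000"

def export_url(kind, collection_id=None):
    """Build an export URL, e.g. kind='bibtex' -> /exports/bibtex/."""
    url = f"{BASE}/exports/{kind}/"
    if collection_id is not None:
        url += "?" + urlencode({"collection": collection_id})
    return url
```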
pip install -r requirements-dev.txt
pytest