Commit 2011d6b

Merge pull request #4 from PSPDFKit-labs/codex/nutrient-skill-router
Refactor skill package into a compact router with modular DWS references
2 parents 861b1e9 + 3fc5516 commit 2011d6b

15 files changed

Lines changed: 1105 additions & 58 deletions

.github/workflows/validate.yml

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
name: Validate
2+
3+
on:
4+
  pull_request:
5+
  push:
6+
    branches:
7+
      - main
8+
      - "codex/**"
9+
10+
jobs:
11+
  validate:
12+
    runs-on: ubuntu-latest
13+
    steps:
14+
      - name: Check out repository
15+
        uses: actions/checkout@v4
16+
17+
      - name: Set up Python
18+
        uses: actions/setup-python@v5
19+
        with:
20+
          python-version: "3.12"
21+
22+
      - name: Install YAML parser
23+
        run: python -m pip install --disable-pip-version-check pyyaml
24+
25+
      - name: Validate repo structure
26+
        run: python tools/validate-repo.py
27+
28+
      - name: Validate Codex metadata
29+
        run: |
30+
          python - <<'PY'
31+
          import pathlib
32+
          import yaml
33+
34+
          path = pathlib.Path("nutrient-document-processing/agents/openai.yaml")
35+
          data = yaml.safe_load(path.read_text())
36+
          interface = data.get("interface", {})
37+
          required = ["display_name", "short_description", "default_prompt"]
38+
          missing = [key for key in required if key not in interface]
39+
          if missing:
40+
              raise SystemExit(f"openai.yaml missing interface keys: {missing}")
41+
          print("openai.yaml parsed successfully")
42+
          PY
43+
44+
      - name: Compile Python scripts
45+
        run: |
46+
          python -m py_compile nutrient-document-processing/scripts/*.py nutrient-document-processing/scripts/lib/common.py
47+
48+
      - name: Smoke test script help
49+
        run: |
50+
          for script in nutrient-document-processing/scripts/*.py; do
51+
            python "$script" --help > /dev/null
52+
          done
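
Note on the "Validate Codex metadata" step: the same check may be handy to run locally before pushing. A minimal sketch, assuming hypothetical helper names (`missing_interface_keys`, `check_file`) that are not part of this commit. Unlike the inline CI snippet, it also tolerates an empty YAML file, which `yaml.safe_load` parses as `None`.

```python
"""Local sketch of the CI metadata check; helper names are hypothetical."""
import pathlib

# Keys the workflow step requires under `interface` in openai.yaml.
REQUIRED_INTERFACE_KEYS = ("display_name", "short_description", "default_prompt")


def missing_interface_keys(data):
    """Return the required interface keys absent from a parsed openai.yaml mapping."""
    # An empty file parses to None, so fall back to an empty mapping.
    interface = (data or {}).get("interface") or {}
    return [key for key in REQUIRED_INTERFACE_KEYS if key not in interface]


def check_file(path):
    """Parse the YAML file at `path` and exit non-zero if required keys are missing."""
    import yaml  # PyYAML, the parser the workflow installs; imported lazily

    data = yaml.safe_load(pathlib.Path(path).read_text())
    missing = missing_interface_keys(data)
    if missing:
        raise SystemExit(f"openai.yaml missing interface keys: {missing}")
    print("openai.yaml parsed successfully")
```

Calling `check_file("nutrient-document-processing/agents/openai.yaml")` from the repository root should mirror the CI step, provided PyYAML is installed.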

README.md

Lines changed: 18 additions & 4 deletions
@@ -2,13 +2,14 @@
22

33
<p align="center">
44
<a href="https://www.nutrient.io/api/"><img src="https://img.shields.io/badge/Nutrient-DWS%20API-blue" alt="Nutrient DWS API"></a>
5-
<a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache--2.0-green" alt="License"></a>
5+
<a href="https://www.npmjs.com/package/@nutrient-sdk/dws-mcp-server"><img src="https://img.shields.io/npm/v/@nutrient-sdk/dws-mcp-server" alt="npm version"></a>
6+
<a href="nutrient-document-processing/LICENSE.txt"><img src="https://img.shields.io/badge/license-Apache--2.0-green" alt="License"></a>
67
<a href="https://agentskills.io"><img src="https://img.shields.io/badge/Agent%20Skills-compatible-purple" alt="Agent Skills"></a>
78
</p>
89

910
<p align="center">
1011
<strong>Give your AI agent PDF superpowers — in one command.</strong><br>
11-
Convert, extract, OCR, redact, sign, and fill documents from any coding agent.
12+
Generate, convert, extract, OCR, redact, sign, archive, and optimize documents from any coding agent.
1213
</p>
1314

1415
<p align="center">
@@ -131,13 +132,17 @@ patient-records.pdf (contains PII)
131132

132133
| Capability | Description | Example prompt |
133134
|------------|-------------|----------------|
135+
| **Generate** | Create PDFs from HTML templates, uploaded assets, or remote URLs | *"Generate a PDF proposal from this HTML template"* |
134136
| 📄 **Convert** | PDF ↔ DOCX/XLSX/PPTX, HTML → PDF, images → PDF | *"Convert report.docx to PDF"* |
137+
| 🧩 **Assemble** | Merge, split, reorder, rotate, and flatten PDF packets before delivery | *"Merge these PDFs, rotate the landscape pages, and keep only pages 1-5"* |
135138
| 📝 **Extract** | Text, tables, and key-value pairs from PDFs | *"Extract all tables from invoice.pdf as Excel"* |
136139
| 🔍 **OCR** | Multi-language OCR for scanned documents | *"OCR this German scan and extract the text"* |
137140
| 🔒 **Redact** | Pattern-based + AI-powered PII redaction | *"Redact all SSNs and emails from records.pdf"* |
138141
| 💧 **Watermark** | Text or image watermarks with full styling | *"Add a DRAFT watermark to proposal.pdf"* |
139142
| ✍️ **Sign** | CMS and CAdES digital signatures | *"Digitally sign contract.pdf"* |
140143
| 📋 **Fill Forms** | Programmatic PDF form filling | *"Fill the tax form with these values…"* |
144+
| 🗂️ **Compliance** | Convert PDFs for archival or accessibility targets like PDF/A and PDF/UA | *"Convert this PDF to PDF/A-2a"* |
145+
| **Optimize** | Optimize and linearize PDFs for web delivery and download performance | *"Linearize this PDF for fast web viewing"* |
141146
| 📊 **Credits** | Monitor API usage and balance | *"How many API credits do I have left?"* |
142147

143148
---
@@ -188,28 +193,37 @@ cp -r nutrient-agent-skill/nutrient-document-processing ~/.claude/skills/
188193
```
189194
nutrient-document-processing/
190195
├── SKILL.md # Main instructions (loaded by agents)
196+
├── agents/
197+
│ └── openai.yaml # Optional Codex App metadata
198+
├── references/
199+
│ ├── REFERENCE.md # Reference index
200+
│ └── *.md # Focused cookbooks by workflow type
191201
├── scripts/
192202
│ ├── *.py # Single-operation scripts
193203
│ └── lib/common.py # Shared utilities
194204
├── assets/
205+
│ ├── nutrient.svg # Skill icon
195206
│ └── templates/
196207
│ └── custom-workflow-template.py # Runtime pipeline template
197208
├── tests/
198209
│ └── testing-guide.md
199-
└── LICENSE # Apache-2.0
210+
└── LICENSE.txt # Apache-2.0
200211
```
201212

202213
### Script Model
203214

204215
- `scripts/*.py` are single-operation scripts only.
205216
- Multi-step workflows are generated at runtime in a temporary script from `assets/templates/custom-workflow-template.py`.
206217
- Do not commit runtime pipeline scripts.
218+
- Use `references/` for HTML/URL generation, compliance outputs, and other workflows that are easier to express as direct API payloads or temporary pipelines.
207219

208220
## Documentation
209221

210222
- **[SKILL.md](nutrient-document-processing/SKILL.md)** — Agent instructions with setup and operation examples
223+
- **[Reference Index](nutrient-document-processing/references/REFERENCE.md)** — Modular cookbook for generation, conversion, extraction, security, compliance, and workflow sequencing
211224
- **[Testing Guide](nutrient-document-processing/tests/testing-guide.md)** — Manual test procedures
212225
- **[Custom Workflow Template](nutrient-document-processing/assets/templates/custom-workflow-template.py)** — Runtime pipeline starting point
226+
- **[Codex App Metadata](nutrient-document-processing/agents/openai.yaml)** — Optional manifest for Codex App packaging
213227
- **[API Playground](https://dashboard.nutrient.io/processor-api/playground/)** — Interactive API testing
214228
- **[Official API Docs](https://www.nutrient.io/guides/dws-processor/)** — Nutrient documentation
215229

@@ -219,4 +233,4 @@ Built by [Nutrient](https://www.nutrient.io/) (formerly PSPDFKit) — document S
219233

220234
## License
221235

222-
[Apache-2.0](nutrient-document-processing/LICENSE)
236+
[Apache-2.0](nutrient-document-processing/LICENSE.txt)
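
Note on the README's Script Model bullets: multi-step workflows are meant to live in a temporary script copied from `assets/templates/custom-workflow-template.py` and never be committed. One way to make that lifecycle hard to get wrong is a context manager that copies the template and always cleans up. This is a sketch under assumptions: `temporary_workflow` is a hypothetical helper, not something the repository ships, and the `ndp-workflow-<task>.py` file name simply follows the naming suggested in SKILL.md.

```python
import contextlib
import pathlib
import shutil
import tempfile


@contextlib.contextmanager
def temporary_workflow(template, task):
    """Copy the workflow template into a fresh temp directory and remove it on exit."""
    workdir = pathlib.Path(tempfile.mkdtemp(prefix="ndp-workflow-"))
    script = workdir / f"ndp-workflow-{task}.py"
    try:
        shutil.copyfile(template, script)
        yield script  # caller edits and runs the script inside the with-block
    finally:
        # Runs even if the workflow fails, so nothing is left behind to commit.
        shutil.rmtree(workdir, ignore_errors=True)
```

Inside the `with` block the caller would fill in the combined workflow and run it, for example with `subprocess.run(["uv", "run", str(script), ...])`. If the user asks to keep the script, copy it elsewhere before the block exits.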

nutrient-document-processing/SKILL.md

Lines changed: 88 additions & 54 deletions
@@ -1,83 +1,117 @@
11
---
22
name: nutrient-document-processing
33
description: >-
4-
Process documents with the Nutrient DWS API. Use this skill when the user wants to convert documents
5-
(PDF, DOCX, XLSX, PPTX, HTML, images), extract text or tables from PDFs, OCR scanned documents,
6-
redact sensitive information (PII, SSN, emails, credit cards), add watermarks, digitally sign PDFs,
7-
fill PDF forms, or check API credit usage. Activates on keywords: PDF, document, convert, extract,
8-
OCR, redact, watermark, sign, merge, compress, form fill, document processing.
4+
Process documents with Nutrient DWS. Use when the user wants to generate PDFs from HTML or URLs,
5+
convert Office/images/PDFs, assemble or split packets, OCR scans, extract text/tables/key-value
6+
pairs, redact PII, watermark, sign, fill forms, optimize PDFs, or produce compliance outputs like
7+
PDF/A or PDF/UA. Triggers include convert to PDF, merge these PDFs, OCR this scan, extract tables,
8+
redact PII, sign this PDF, make this PDF/A, or linearize for web delivery.
99
license: Apache-2.0
1010
metadata:
1111
author: nutrient-sdk
1212
version: "1.0"
1313
homepage: "https://www.nutrient.io/api/"
1414
repository: "https://github.com/PSPDFKit-labs/nutrient-agent-skill"
15-
compatibility: "Requires Node.js 18+ and internet. Works with Claude Code, Codex CLI, Gemini CLI, OpenCode, Cursor, Windsurf, GitHub Copilot, Amp, or any Agent Skills-compatible product."
15+
compatibility: "Requires Python 3.10+, uv, and internet. Works with Claude Code, Codex CLI, Gemini CLI, OpenCode, Cursor, Windsurf, GitHub Copilot, Amp, or any Agent Skills-compatible product."
16+
short-description: "Generate, convert, assemble, OCR, redact, sign, archive, and optimize documents"
1617
---
1718

1819
# Nutrient Document Processing
1920

20-
Process, convert, extract, redact, sign, and manipulate documents using the [Nutrient DWS Processor API](https://www.nutrient.io/api/).
21+
Use Nutrient DWS for managed document workflows where fidelity, compliance, or multi-step processing matters more than local-tool convenience.
2122

2223
## Setup
23-
24-
You need a Nutrient DWS API key. Get one free at <https://dashboard.nutrient.io/sign_up/?product=processor>.
25-
26-
Export the API key before running scripts:
27-
28-
```bash
29-
export NUTRIENT_API_KEY="nutr_sk_..."
30-
```
31-
32-
Scripts live in `scripts/` relative to this SKILL.md. Use the directory containing this SKILL.md as the working directory when running scripts:
33-
34-
```bash
35-
cd <directory containing this SKILL.md> && uv run scripts/<script>.py --help
36-
```
37-
38-
Page ranges use `start:end` (0-based, end-exclusive). Negative indices count from the end. Use comma-separated ranges like `0:2,3:5,-2:-1`.
39-
40-
## PDF Requirements
41-
42-
Some operations require specific document characteristics:
43-
44-
- **split.py**: Requires multi-page PDFs (2+ pages). Cannot extract a range from a single-page document.
45-
- **delete-pages.py**: Must retain at least one page. Cannot delete all pages in a document.
46-
- **sign.py**: Only accepts local file paths (not URLs).
47-
48-
## Single-Operation Scripts
49-
50-
- Convert format: `uv run scripts/convert.py --input doc.pdf --format docx --out doc.docx`
51-
- Merge files: `uv run scripts/merge.py --inputs a.pdf,b.pdf --out merged.pdf`
52-
- Split by ranges: `uv run scripts/split.py --input doc.pdf --ranges 0:2,2: --out-dir out --prefix part`
53-
- OCR: `uv run scripts/ocr.py --input scan.pdf --languages english --out scan-ocr.pdf`
54-
- Rotate pages: `uv run scripts/rotate.py --input doc.pdf --angle 90 --out rotated.pdf`
55-
- Optimize: `uv run scripts/optimize.py --input doc.pdf --out optimized.pdf`
56-
- Extract text: `uv run scripts/extract-text.py --input doc.pdf --out text.json`
57-
- Extract tables: `uv run scripts/extract-table.py --input doc.pdf --out tables.json`
58-
- Extract key-value pairs: `uv run scripts/extract-key-value-pairs.py --input doc.pdf --out kvp.json`
59-
- Add text watermark: `uv run scripts/watermark-text.py --input doc.pdf --text CONFIDENTIAL --out watermarked.pdf`
60-
- AI redact: `uv run scripts/redact-ai.py --input doc.pdf --criteria "Remove all SSNs" --mode apply --out redacted.pdf`
61-
- Sign: `uv run scripts/sign.py --input doc.pdf --out signed.pdf`
62-
- Password protect: `uv run scripts/password-protect.py --input doc.pdf --user-password upass --owner-password opass --out protected.pdf`
63-
- Add pages: `uv run scripts/add-pages.py --input doc.pdf --count 2 --out with-pages.pdf`
64-
- Delete pages: `uv run scripts/delete-pages.py --input doc.pdf --pages 0,2,-1 --out trimmed.pdf`
65-
- Duplicate/reorder pages: `uv run scripts/duplicate-pages.py --input doc.pdf --pages 2,0,1,1 --out reordered.pdf`
24+
- Get a Nutrient DWS API key at <https://dashboard.nutrient.io/sign_up/?product=processor>.
25+
- Direct API calls use `Authorization: Bearer $NUTRIENT_API_KEY`.
26+
```bash
27+
export NUTRIENT_API_KEY="nutr_sk_..."
28+
```
29+
- MCP setups commonly use `@nutrient-sdk/dws-mcp-server` with `NUTRIENT_DWS_API_KEY`.
30+
- Scripts live in `scripts/` relative to this SKILL.md. Use the directory containing this SKILL.md as the working directory:
31+
```bash
32+
cd <directory containing this SKILL.md> && uv run scripts/<script>.py --help
33+
```
34+
- Page ranges use `start:end` with 0-based indexes and end-exclusive semantics. Negative indexes count from the end.
35+
36+
## When to use
37+
- Generate PDFs from HTML templates, uploaded assets, or remote URLs.
38+
- Convert Office, HTML, image, and PDF files between supported formats.
39+
- OCR scans and extract text, tables, or key-value pairs.
40+
- Redact PII, watermark, sign, fill forms, merge, split, rotate, flatten, or encrypt PDFs.
41+
- Produce delivery targets like PDF/A, PDF/UA, optimized PDFs, or linearized PDFs.
42+
- Check credits before large, batch, or AI-heavy runs.
43+
44+
## Tool preference
45+
1. Prefer `scripts/*.py` for covered single-operation workflows.
46+
2. Use `assets/templates/custom-workflow-template.py` for multi-step jobs that should still run through the Python client.
47+
3. Use the modular `references/` docs and direct API payloads for capabilities that do not yet have a dedicated helper script, especially HTML/URL generation and compliance tuning.
48+
4. Use local PDF utilities only for lightweight inspection. Use Nutrient when output fidelity or compliance matters.
49+
50+
## Single-operation scripts
51+
- `convert.py` -> convert between `pdf`, `pdfa`, `pdfua`, `docx`, `xlsx`, `pptx`, `png`, `jpeg`, `webp`, `html`, and `markdown`
52+
- `merge.py` -> merge multiple files into one PDF
53+
- `split.py` -> split one PDF into multiple PDFs by page ranges
54+
- `add-pages.py` -> append blank pages
55+
- `delete-pages.py` -> remove specific pages
56+
- `duplicate-pages.py` -> reorder or duplicate pages into a new PDF
57+
- `rotate.py` -> rotate selected pages
58+
- `ocr.py` -> OCR scanned PDFs or images
59+
- `extract-text.py` -> extract text to JSON
60+
- `extract-table.py` -> extract tables
61+
- `extract-key-value-pairs.py` -> extract key-value pairs
62+
- `watermark-text.py` -> apply a text watermark
63+
- `redact-ai.py` -> detect and apply AI-powered redactions
64+
- `sign.py` -> digitally sign a local PDF
65+
- `password-protect.py` -> write encrypted output PDFs
66+
- `optimize.py` -> apply optimization and linearization-style options via JSON
6667

6768
## Multi-Step Workflow Rule
68-
6969
Do not add new committed pipeline scripts under `scripts/`.
7070

7171
When the user asks for multiple operations in one run:
72-
73-
1. Copy `assets/templates/custom-workflow-template.py` to a temporary location (for example `/tmp/ndp-workflow-<task>.py`).
72+
1. Copy `assets/templates/custom-workflow-template.py` to a temporary location such as `/tmp/ndp-workflow-<task>.py`.
7473
2. Implement the combined workflow in that temporary script.
7574
3. Run it with `uv run /tmp/ndp-workflow-<task>.py ...`.
7675
4. Return generated output files.
7776
5. Delete the temporary script unless the user explicitly asks to keep it.
7877

79-
## Rules
78+
## PDF Requirements
79+
- `split.py` requires a multi-page PDF and cannot extract ranges from a single-page document.
80+
- `delete-pages.py` must retain at least one page and cannot delete the entire document.
81+
- `sign.py` only accepts local file paths for the main PDF.
82+
83+
## Decision rules
84+
- Prefer a helper script when one already covers the requested operation cleanly.
85+
- If you control the source markup, prefer HTML generation over browser print workflows.
86+
- Use remote `file.url` inputs when the source already lives at a stable URL and you want to avoid local uploads.
87+
- Use `output.type` for conversion and finalization targets. Use `actions` for transformations when building direct API payloads.
88+
- OCR before text extraction, key-value extraction, or semantic redaction on scans.
89+
- Prefer preset or regex redaction when the target is explicit. Use AI redaction only for contextual or natural-language requests.
90+
- Use the PDF manipulation reference for merge, split, rotate, flatten, and page-range workflows instead of inferring those payloads from conversion examples.
91+
- Treat PDF/A and PDF/UA as compliance targets, not cosmetic export formats. Choose the target up front and validate final artifacts when requirements are contractual.
92+
- For PDF/UA, clean born-digital inputs and structured HTML usually tag better than rasterized or flattened source PDFs.
93+
- For delivery optimization, linearize or optimize unsigned output artifacts instead of mutating already signed files.
94+
- When the user asks for multiple steps, keep destructive or final steps late in the sequence. Use the workflow recipes when ordering is ambiguous.
95+
96+
## Anti-patterns
97+
- Do not OCR born-digital PDFs just because the task mentions extraction. Extract first and OCR only if the text layer is missing.
98+
- Do not flatten forms or annotations until the user confirms the artifact no longer needs to stay editable.
99+
- Do not sign, archive, or linearize intermediate working files. Keep those as final-delivery steps.
100+
- Do not promise PDF/A or PDF/UA compliance without a validation step when the requirement is contractual.
101+
- Do not commit temporary workflow scripts under `scripts/`.
102+
103+
## Reference map
104+
Read only what you need:
105+
106+
- `references/request-basics.md` -> endpoint model, auth, multipart vs JSON, credits, limits, and errors
107+
- `references/generation-and-conversion.md` -> HTML/URL generation and format conversion
108+
- `references/pdf-manipulation.md` -> merge, split, page-range, rotate, and flatten workflows
109+
- `references/extraction-and-ocr.md` -> OCR, text extraction, tables, and key-value workflows
110+
- `references/security-signing-and-forms.md` -> redaction, watermarking, signatures, forms, and passwords
111+
- `references/compliance-and-optimization.md` -> PDF/A, PDF/UA, optimization, and linearization
112+
- `references/workflow-recipes.md` -> end-to-end sequencing patterns for common business document workflows
80113

114+
## Rules
81115
- Fail fast when required arguments are missing.
82116
- Write outputs to explicit paths and print created files.
83117
- Do not log secrets.
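
Note on the page-range convention in the Setup section: `start:end` with 0-based indexes, an exclusive end, and negative indexes counting from the end matches Python slice semantics. The sketch below expands a range spec into concrete page indexes. `resolve_ranges` is a hypothetical helper for illustration, not the parser the scripts use, and it assumes that out-of-range bounds are clamped the way Python slices clamp them.

```python
def resolve_ranges(spec, page_count):
    """Expand a comma-separated `start:end` spec into lists of 0-based page indexes."""
    pages = range(page_count)
    resolved = []
    for part in spec.split(","):
        start_text, sep, end_text = part.strip().partition(":")
        if not sep:
            raise ValueError(f"expected start:end, got {part!r}")
        # An empty side means "open-ended", as in `2:` (page 2 through the last page).
        start = int(start_text) if start_text else None
        end = int(end_text) if end_text else None
        resolved.append(list(pages[start:end]))
    return resolved


# Six-page document: the first two pages, pages 3-4, and the second-to-last page.
print(resolve_ranges("0:2,3:5,-2:-1", 6))  # [[0, 1], [3, 4], [4]]
```

Under these semantics `-2:-1` selects only the second-to-last page, because the end index is exclusive. To include the final page, leave the end open, as in `-2:`.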

nutrient-document-processing/agents/openai.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
interface:
2+
  display_name: "Nutrient Document Processing"
3+
  short_description: "Generate, convert, assemble, OCR, redact, sign, archive, and optimize documents"
4+
  icon_small: "./assets/nutrient.svg"
5+
  icon_large: "./assets/nutrient.svg"
6+
  default_prompt: "Use $nutrient-document-processing to generate, convert, assemble, OCR, extract, redact, sign, fill, archive, optimize, or linearize this document, then return the output files and a concise summary."
Lines changed: 4 additions & 0 deletions