feat: extract-text skill bundle + bundled GSD/Superpowers skills #88
…skills
Bundles Anthropic's extract-text Rust binary at /usr/local/bin/ for unified plain-text extraction (docx/odt/epub/xlsx/pptx/rtf/html/ipynb) used by the new file-reading skill. Adds PyMuPDF (import fitz) and xlrd to support pdf-reading and legacy .xls parsing. Forward-ported from internal fork (be1b27a).
Pull GSD (github.com/gsd-build/get-shit-done) and Superpowers
(github.com/obra/superpowers) at build time into /opt/skills-external/.
Entrypoint symlinks them into ~/.claude/{skills,agents,commands,hooks,
get-shit-done} so /home/assistant volume stays small.
Main AI keeps reading from /mnt/skills/ (unchanged) — GSD is invisible
to it. Claude Code inside the container gets /gsd:* slash-commands,
gsd-* agents, superpowers skills, hooks (SessionStart, Pre/PostToolUse)
wired via settings.json, and the get-shit-done engine reachable as
`gsd` CLI.
Forward-ported from internal fork (07dcead).
📝 Walkthrough
This PR introduces document reading capabilities by adding …
Changes: Document Reading Skills & Extract-Text Infrastructure
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
…nse disclaimer
- Pin GSD and Superpowers to released tags (v1.9.9, v5.0.7) via build ARGs for reproducibility; overridable with --build-arg.
- Move extract-text binary from repo root to vendor/extract-text/ with a README explaining licensing and the open-source fallback path. Update Dockerfile COPY accordingly.
- Merge the two git-clone RUN layers into one; replace non-functional bash ** glob with find -name .git -type d -exec rm.
- Guard every settings.json hook command with [ -f ... ] && ... || true so missing upstream files do not error every Claude Code session.
- Add skills/README.md with a licensing matrix + Anthropic-skill disclaimer covering existing docx/pdf/pptx/xlsx and the new file-reading/pdf-reading. Operators without an Anthropic agreement get a one-liner to drop the affected directories before build.
- CHANGELOG: add Unreleased section describing the PR and noting the followup to fetch extract-text via curl + sha256 instead of vendoring.
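The `.git` cleanup from the commit message above can be sketched as follows. This is a minimal illustration, not the PR's actual Dockerfile line: `scrub_git_dirs` is a hypothetical helper name, and the `-prune` and `{} +` details are assumptions that go beyond the `find -name .git -type d -exec rm` fragment quoted in the message. For context, bash only treats `**` as recursive when `globstar` is enabled, and Docker `RUN` lines run under `/bin/sh -c` by default, which may be why the original glob did not behave as intended.

```shell
# scrub_git_dirs is a hypothetical helper, not a name taken from the PR.
# It removes every directory named .git below the given root.
scrub_git_dirs() {
    # -prune keeps find from descending into a directory it is about to delete.
    find "$1" -type d -name .git -prune -exec rm -rf {} +
}

# Example (path from the PR description):
#   scrub_git_dirs /opt/skills-external
```

Using `find` keeps the behaviour independent of which shell runs the build step.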
tests/test-pr88-skills.sh runs eight grouped checks against a built image:
1. extract-text binary: executable, --help, real round-trip parsing of docx/xlsx/pptx/ipynb (built on the fly with python-docx/openpyxl/python-pptx, then extracted). This proves the binary is more than just a runnable ELF and actually handles every format the file-reading skill advertises.
2. New Python deps from this PR (PyMuPDF/fitz, xlrd 2.x) import cleanly.
3. file-reading + pdf-reading skills mounted with SKILL.md / REFERENCE.md / LICENSE.txt; skills/README.md disclaimer present in image.
4. /opt/skills-external/gsd populated with agents/commands/hooks/get-shit-done; gsd CLI symlink invokes; >=5 agents shipped.
5. /opt/skills-external/superpowers populated with skills/commands/agents; >=5 skills shipped.
6. No leftover .git directories under /opt/skills-external/ (regression guard for the old ** glob bug).
7. settings.json is valid JSON; all 8 hook commands carry the `[ -f … ] && … || true` guard added in the cleanup commit.
8. Entrypoint wires gsd-* into ~/.claude/agents/ and at least 4 of the 8 expected GSD hook files into ~/.claude/hooks/, which proves the symlink loop in the entrypoint actually ran end-to-end.
Run with: ./tests/test-pr88-skills.sh [image-name]
Default image: open-computer-use:latest
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
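The hook guard checked in item 7 can be illustrated with a small shell sketch. `run_guarded` and the example path are hypothetical; only the `[ -f … ] && … || true` shape comes from the commit message, and the real hook commands in settings.json are not reproduced here.

```shell
# run_guarded is a hypothetical wrapper that mirrors the guard shape
# "[ -f FILE ] && COMMAND || true" described in the commit message.
run_guarded() {
    [ -f "$1" ] && sh "$1" || true
}

# A missing hook file is skipped, and the overall exit status is still 0.
run_guarded /nonexistent/example-hook.sh
echo "exit status with missing hook: $?"
```

One consequence of this shape is that `|| true` also masks a non-zero exit from a hook that does exist, so the guard trades failure visibility for sessions that never error on a missing file.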
@coderabbitai full review please
Shows an open-source model scraping GitHub Trending, building an Excel chart, and shipping an editorial-style web dashboard via the web-artifacts-builder skill — one chat, no Anthropic key required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ard, CHANGELOG followups
- Dockerfile: comment now states upstream tags are mutable; for strict reproducibility, override with a commit SHA via --build-arg.
- tests/test-pr88-skills.sh: section [3/8] heading describes what is actually checked (skills layout + skills/README.md). Drop misleading comment about vendor/extract-text/README.md (not in image).
- tests/test-pr88-skills.sh: settings.json hook-guard regex no longer hard-codes the gsd- prefix, so future hooks added without that prefix are still verified to carry the [ -f ... ] && ... || true guard.
- CHANGELOG: Known followups now lists three real compromises explicitly: vendored binary blob (sha256 fetch TBD, no public Anthropic release URL), mutable tag pinning (commit-SHA override available), PyMuPDF AGPL-3.0 conveyance question (drop PyMuPDF or add NOTICE).
No functional change. Documentation honesty + test-guard robustness only.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Dockerfile`:
- Around line 599-618: The Dockerfile documents allowing commit-SHA pinning via
build args GSD_REF and SUPERPOWERS_REF, but the current git clone lines use
--branch which only accepts branch/tag names; update the clone logic so it
supports arbitrary SHAs: for each repo (the git clone lines that reference
GSD_REF and SUPERPOWERS_REF) clone the repo shallowly into /tmp, then if the
provided ref is a commit SHA (e.g. 40-hex) fetch that ref and git checkout it
(git -C /tmp/<repo> fetch --depth 1 origin <sha> && git -C /tmp/<repo> checkout
<sha>), otherwise use git clone --depth 1 --branch <ref>; apply this pattern to
both the get-shit-done and superpowers clone steps so build-arg SHAs work as
documented.
- Around line 393-416: The policy currently grants
Write(/home/assistant/.claude/**) which lets the agent modify executable guard
scripts referenced under the "hooks" entries (SessionStart, PreToolUse,
PostToolUse) and specific files like /home/assistant/.claude/hooks/gsd-*.js,
creating a persistence vector; remove the blanket
Write(/home/assistant/.claude/**) and instead grant write access only to
explicit non-executable file patterns that need editing (e.g., config/state/log
files) while changing the hooks/, commands/, agents/, and skills/ directories to
Read-only (retain Read(/home/assistant/.claude/**) for those paths), and update
the permission list where "Write(/home/assistant/.claude/**)" appears so all
hook scripts referenced by SessionStart, PreToolUse, PostToolUse remain
read-only.
In `@requirements.txt`:
- Line 15: The PyMuPDF entry in requirements.txt (PyMuPDF==1.24.10) introduces
AGPL-3.0/licensing obligations; either remove/replace it or document/accept the
license and obtain a commercial license. To fix: (a) if replacing, delete the
PyMuPDF==1.24.10 line in requirements.txt and add an alternative (e.g.,
pdfminer.six or pypdf) and update any imports referencing PyMuPDF/fitz in the
codebase to the alternative's API; or (b) if keeping, add a LICENSES.md entry
and a project-level legal signoff note stating AGPL acceptance or commercial
license procurement and ensure CI/docs include the licensing decision before
distribution. Ensure all references to the symbol fitz or PyMuPDF in code are
updated or validated against the chosen action.
In `@vendor/extract-text/README.md`:
- Line 15: The vendored extract-text binary is x86_64-only and the Dockerfile
unconditionally installs it to /usr/local/bin/extract-text; update the
Dockerfile to detect the target architecture (e.g., via uname -m or build
ARG/PLATFORM) and skip installing the binary on non-x86_64 (linux/arm64) with a
graceful fallback (skip install and use an open-source alternative or build from
source), and/or add a clear note in vendor/extract-text/README.md and the main
README requirements stating the x86_64-only limitation and suggested
alternatives; reference the extract-text binary name and the Dockerfile install
step so reviewers can locate and change the conditional install and the
documentation entries.
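One way to implement the conditional install suggested above, as a sketch only: `install_extract_text` is a hypothetical helper, the source and destination paths are passed in rather than taken from the Dockerfile, and the optional third argument exists purely so the architecture can be overridden when testing.

```shell
# install_extract_text <src> <dest> [arch]
# Hypothetical helper: installs the vendored binary only on x86_64.
install_extract_text() {
    # Default to the machine architecture reported by uname.
    case "${3:-$(uname -m)}" in
        x86_64|amd64)
            install -m 0755 "$1" "$2"
            ;;
        *)
            echo "extract-text: no prebuilt binary for this architecture, skipping" >&2
            ;;
    esac
}

# Example:
#   install_extract_text vendor/extract-text/extract-text /usr/local/bin/extract-text
```

Inside a Dockerfile, BuildKit's automatic `TARGETARCH` argument could stand in for `uname -m`; it reports x86_64 as `amd64`, which is why both spellings are matched.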
⛔ Files ignored due to path filters (1)
- docs/demo-qwen-trending.gif is excluded by !**/*.gif
📒 Files selected for processing (13)
- CHANGELOG.md
- Dockerfile
- README.md
- requirements.txt
- skills/README.md
- skills/public/file-reading/LICENSE.txt
- skills/public/file-reading/SKILL.md
- skills/public/pdf-reading/LICENSE.txt
- skills/public/pdf-reading/REFERENCE.md
- skills/public/pdf-reading/SKILL.md
- tests/test-pr88-skills.sh
- vendor/extract-text/README.md
- vendor/extract-text/extract-text
    # Refs are pinned to upstream tags. Tags are mutable (upstream can re-tag);
    # for strict reproducibility, override with a commit SHA instead:
    # --build-arg GSD_REF=<sha> --build-arg SUPERPOWERS_REF=<sha>
    # To bump, change the ARGs below and rebuild.
    ARG GSD_REF=v1.9.9
    ARG SUPERPOWERS_REF=v5.0.7

    # GSD (Get Shit Done) - commands, agents, hooks, engine
    # NOTE: upstream repo has no skills/ dir - gsd-* skills are generated by the
    # official npx installer. Users invoke via /gsd:<cmd> slash-commands instead.
    RUN git clone --depth 1 --branch "${GSD_REF}" https://github.com/gsd-build/get-shit-done.git /tmp/gsd && \
        mkdir -p /opt/skills-external/gsd/get-shit-done \
            /opt/skills-external/gsd/agents \
            /opt/skills-external/gsd/commands \
            /opt/skills-external/gsd/hooks && \
        cp -r /tmp/gsd/get-shit-done/. /opt/skills-external/gsd/get-shit-done/ && \
        cp /tmp/gsd/agents/gsd-*.md /opt/skills-external/gsd/agents/ && \
        cp -r /tmp/gsd/commands/. /opt/skills-external/gsd/commands/ && \
        cp -r /tmp/gsd/hooks/. /opt/skills-external/gsd/hooks/ && \
        git clone --depth 1 --branch "${SUPERPOWERS_REF}" https://github.com/obra/superpowers.git /tmp/superpowers && \
🧩 Analysis chain
🌐 Web query:
Does git clone --branch support checking out an arbitrary commit hash, or does --branch only accept branch and tag names?
💡 Result:
No. git clone --branch is intended for branch or tag names, not arbitrary commit hashes. Authoritative documentation says --branch “point[s] the newly created HEAD to branch instead” and that it can “take tags and detaches the HEAD at that commit” (i.e., the tag name is what's supplied; the tag resolves to a commit). It does not say that --branch accepts raw hexadecimal commit IDs. If you want to start from an arbitrary commit during clone, use git clone --revision=<rev> instead; the docs state --revision accepts a “ref name … that peels down to a commit, or a hexadecimal object name” and that --revision is incompatible with --branch. Example (arbitrary commit): git clone --revision=<commit> <repository-url>. You can then optionally detach HEAD or create your own branch after the clone.
Citations:
- 1: https://git-scm.com/docs/git-clone/2.36.0.html
- 2: https://git-scm.com/docs/git-clone.html
- 3: https://chromium.googlesource.com/external/github.com/git/git/+/HEAD/Documentation/git-clone.adoc
The documented SHA pinning escape hatch does not work with the current clone command.
The comment says callers can pass a commit SHA via build args, but both repos are fetched with git clone --branch "${...}", which only accepts branch and tag names, not commit hashes. To support arbitrary commit pinning, use git clone --revision instead (which is incompatible with --branch). Currently this leaves the image pinned to mutable tags while baking in third-party commands and hooks that execute at runtime.
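A shell sketch of clone logic that accepts either a tag/branch name or a full commit SHA is shown below. The helper names `is_commit_sha` and `clone_at_ref` are hypothetical, only full 40-hex commit ids are recognized, and fetching by SHA assumes the server permits it (GitHub generally does for reachable commits). It uses `git init` plus `git fetch` rather than `git clone --revision`, since that option only exists in recent Git releases.

```shell
# Succeeds when the argument looks like a full 40-hex commit id.
is_commit_sha() {
    printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{40}$'
}

# clone_at_ref <url> <ref> <dest>   (hypothetical helper)
clone_at_ref() {
    if is_commit_sha "$2"; then
        # --branch rejects raw SHAs, so fetch exactly one commit instead.
        git init --quiet "$3" &&
        git -C "$3" remote add origin "$1" &&
        git -C "$3" fetch --quiet --depth 1 origin "$2" &&
        git -C "$3" checkout --quiet --detach FETCH_HEAD
    else
        # Tags and branches keep the existing shallow-clone path.
        git clone --quiet --depth 1 --branch "$2" "$1" "$3"
    fi
}

# Example (ref value taken from the Dockerfile default):
#   clone_at_ref https://github.com/obra/superpowers.git v5.0.7 /tmp/superpowers
```

Both branches stay shallow, so switching a build arg from a tag to a SHA should not change the amount of history pulled into the image layer.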
    pypdf==5.9.0
    pdfplumber==0.11.9
    pypdfium2==4.30.0
    PyMuPDF==1.24.10
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check PyPI for PyMuPDF version and license info
curl -s https://pypi.org/pypi/PyMuPDF/json | jq -r '.releases["1.24.10"] | if . then "✓ Version 1.24.10 exists" else "✗ Version not found" end'
curl -s https://pypi.org/pypi/PyMuPDF/json | jq -r '.info | "License: \(.license)"'
# Check for security advisories
gh api graphql -f query='
{
securityVulnerabilities(first: 5, ecosystem: PIP, package: "PyMuPDF") {
nodes {
advisory {
summary
severity
publishedAt
}
vulnerableVersionRange
firstPatchedVersion {
identifier
}
}
}
}'
Repository: Yambr/open-computer-use
Length of output: 435
Resolve AGPL-3.0 licensing implications before distribution.
PyMuPDF 1.24.10 is dual licensed under AGPL-3.0 or Artifex Commercial License. Using the AGPL-3.0 variant requires either accepting AGPL conveyance obligations (source code disclosure) when distributing the image publicly, or switching to the commercial license. This requires an explicit decision: either replace PyMuPDF with an alternative, accept AGPL compliance, or negotiate the commercial license. Without resolution, public distribution of this image carries legal risk.
Note: A security advisory exists for PyMuPDF versions 1.26.5–1.26.7 (path traversal in main.py), but version 1.24.10 is unaffected.
Adds demo-frames/, docs/demo-prompts/, tools/ to .gitignore — these are local scratch space for regenerating the hero GIF and shouldn't be checked in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Tighten Write permissions: drop .claude/**, allow only CLAUDE.md + settings.json (CodeRabbit #1; prevents agent self-mutation of hooks)
- Fix incorrect SHA-pinning hint: `git clone --branch` takes branch and tag names only, not raw SHAs; document the no-checkout+fetch+checkout alternative (CodeRabbit #2)
- Document x86_64-only constraint in vendor/extract-text/README.md; qemu-emulated arm64 builds segfault (CodeRabbit #4)
- Add THIRD-PARTY-LICENSES.md with PyMuPDF AGPL disclaimer + full bundled-deps license table; link from README License section (CodeRabbit #3; repo doesn't distribute images, conveyance is on whoever builds and hosts)
- test-pr88-skills.sh: skip round-trip parsing under qemu emulation, strip entrypoint banner before JSON parse, lower hook-count expectation to upstream reality (2 of 8), add security-regression assert that Write(.claude/**) is not granted
- CHANGELOG: drop PyMuPDF Known followup (closed via notice), clarify reproducibility followup
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
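The security-regression assert mentioned in the test-script bullet could look roughly like this. It is a sketch, not the code in tests/test-pr88-skills.sh: `assert_no_broad_write` is a hypothetical name, and it does a plain fixed-string search of a settings file instead of parsing the JSON.

```shell
# Hypothetical check: fail if the blanket write grant appears in the file.
assert_no_broad_write() {
    # -F treats the pattern literally, so the parentheses and ** are not special.
    if grep -Fq 'Write(/home/assistant/.claude/**)' "$1"; then
        echo "FAIL: blanket Write(/home/assistant/.claude/**) is granted" >&2
        return 1
    fi
    echo "PASS: no blanket write grant"
}

# Example (assumes the settings file has been copied out of the image):
#   assert_no_broad_write ./settings.json
```

A literal string match is deliberately strict: it only catches the exact grant that was removed, so a differently spelled broad pattern would need its own check.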
Summary
Bundles the Anthropic `extract-text` Rust CLI, two new skills (`file-reading`, `pdf-reading`), and the GSD + Superpowers skill bundles into the sandbox image. All Anthropic-licensed materials carry a disclaimer in `skills/README.md`. Third-party dep licensing (PyMuPDF AGPL etc.) covered in `THIRD-PARTY-LICENSES.md`.
Commits
- … (74cc3d0): pin upstream tags, guard hook commands, scrub `.git` dirs, vendor binary under `vendor/extract-text/`, expanded Known followups
- … (e246186): CodeRabbit findings + test methodology fixes
Verification
    docker build --platform linux/amd64 -t open-computer-use:latest .
    ./tests/test-pr88-skills.sh
    ./tests/test-docker-image.sh open-computer-use:latest
    ./tests/test-no-corporate.sh
    ./tests/test-project-structure.sh
All four scripts green locally (24 + 54 + 14 + 22 PASS). The PR88 smoke script auto-detects qemu emulation and skips real-document round-trip on non-amd64 hosts (binary segfaults under qemu but works natively on amd64 production hosts).
Resolved CodeRabbit findings
- `Write(/home/assistant/.claude/**)` was too broad: the agent could overwrite its own hook scripts (persistence path). Narrowed to `Write(CLAUDE.md)` + `Write(settings.json)` only. Test asserts the regression.
- … `GSD_REF`/`SUPERPOWERS_REF` with raw SHAs: `git clone --branch` doesn't accept SHAs. Fixed comment to document the `clone --no-checkout && fetch <sha> && checkout <sha>` alternative.
- … `THIRD-PARTY-LICENSES.md` with explicit AGPL disclaimer + full bundled-deps table; linked from README License section.
- … `vendor/extract-text/README.md`. Project already mandates `--platform linux/amd64`; arm64-native builds need to remove the `COPY` and use OSS fallbacks.
Known followups
- `extract-text` binary is vendored under `vendor/extract-text/` (~2 MB blob). A future patch should fetch it at build time with sha256 verification and remove the blob from git history.
- … (`v1.9.9`, `v5.0.7`). Tags are mutable; for strict reproducibility a future patch should switch to `git clone --no-checkout && git fetch <sha> && git checkout <sha>`.
🤖 Generated with Claude Code
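The first followup (fetching the binary at build time with sha256 verification) might be sketched like this. Everything specific here is an assumption: `verify_sha256` and `fetch_verified` are hypothetical helpers, the URL and digest would have to come from wherever the binary ends up being published (an earlier commit message notes there is no public release URL yet), and `sha256sum` is assumed to be available, as it is on most Linux images.

```shell
# verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
    # sha256sum prints "<digest>  <file>"; keep only the digest field.
    [ "$(sha256sum "$1" | cut -d ' ' -f 1)" = "$2" ]
}

# fetch_verified <url> <expected-hex-digest> <dest>   (hypothetical helper)
fetch_verified() {
    curl -fsSL "$1" -o "$3" || return 1
    if ! verify_sha256 "$3" "$2"; then
        echo "sha256 mismatch for $3" >&2
        # Never leave an unverified binary on disk.
        rm -f "$3"
        return 1
    fi
    chmod 0755 "$3"
}
```

Pinning the digest in the Dockerfile would make the build fail loudly if the downloaded artifact ever changes, which is the property the vendored blob currently provides implicitly.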