Skip to content

feat(plugins): YAML-driven plugin catalog + Claude/Codex marketplaces#91

Merged
sayalinvidia merged 11 commits into
NVIDIA:mainfrom
jasonnvidia:feat/plugin-build-modes
May 27, 2026
Merged

feat(plugins): YAML-driven plugin catalog + Claude/Codex marketplaces#91
sayalinvidia merged 11 commits into
NVIDIA:mainfrom
jasonnvidia:feat/plugin-build-modes

Conversation

@jasonnvidia
Copy link
Copy Markdown
Contributor

Summary

Adds an NVIDIA-branded plugin layer on top of the existing skills/ catalog so the same skills can be installed via Claude marketplace (claude plugin install) and Codex marketplace (codex plugin add), without duplicating skill content or fragmenting curation.

The substantive change is infrastructure (~140 lines of Python in .github/scripts/build-plugins.py, ~30 lines added to sync-skills.yml, a new 78-line validate-plugins.yml workflow); the rest of the diff is the first plugin definition (plugins.d/nvidia-skills.yml) and its generated artifacts under plugins/nvidia-skills/.

The daily skills sync continues to be the only producer of skills/. Plugin trees are derived from skills/ plus per-plugin YAML, regenerated atomically inside the same sync run, and validated on every PR.


What this adds

1. plugins.d/ — plugin catalog YAML

One file per plugin, plus _defaults.yml for shared author/license/capability fields. The first plugin defined is nvidia-skills curating two cuopt skills.

plugins.d/
├── _defaults.yml         # shared defaults (skill_files: copy, license, brand color, …)
├── nvidia-skills.yml     # the NVIDIA-official plugin spec
└── README.md             # schema + onboarding instructions

Per-plugin YAML controls:

  • identity (name, display_name, description, keywords)
  • curation (include_skills: — list of paths under skills/)
  • materialization mode (skill_files: copy | symlink — see below)
  • per-marketplace opt-out (marketplace_enabled.{claude,codex}: false)

2. .github/scripts/build-plugins.py — derived-artifact generator

Idempotent build script that, from plugins.d/ and skills/, regenerates:

  • plugins/<name>/skills/<skill>/ — curated subset, materialized as either real-dir copies (rsync) or relative symlinks
  • plugins/<name>/.claude-plugin/plugin.json — Anthropic plugin manifest
  • plugins/<name>/.codex-plugin/plugin.json — OpenAI Codex plugin manifest with interface block
  • entries in .claude-plugin/marketplace.json and .agents/plugins/marketplace.json (top-level fields preserved; plugins: array rebuilt)

Modes:

skill_files: What lands in plugins/<name>/skills/ Required when
copy (default) rsync'd real directories shipping to Codex (codex plugin add silently drops symlinks during install) — confirmed empirically against codex 0.132
symlink relative symlinks → ../../../skills/<Product>/<skill> Claude-only or npx skills add consumers; avoids duplicated SKILL.md

The current default (copy) keeps every consumer happy at the cost of duplicated SKILL.md content under plugins/. Plugins can opt into symlink per-file when Codex isn't a target.

3. .github/workflows/validate-plugins.yml — drift guard

Runs on PRs touching plugins.d/, skills/, plugins/, either marketplace.json, or the build script. Re-runs build-plugins.sh --check, which rebuilds into the working tree and exits non-zero if the committed plugin tree drifted from the YAML + skills sources.

4. .github/workflows/sync-skills.yml — one new step

Inserted between the existing Track dropped skills and Regenerate README tables steps:

      - name: Rebuild plugin catalog
        run: |
          set -euo pipefail
          .github/scripts/build-plugins.sh
          if ! git diff --quiet plugins/ \
              .claude-plugin/marketplace.json \
              .agents/plugins/marketplace.json; then
            if ! grep -qx "- plugin catalog" /tmp/changed-components.txt 2>/dev/null; then
              echo "- plugin catalog" >> /tmp/changed-components.txt
            fi
          fi

This regenerates copy-mode plugin trees atomically in the same PR as the upstream skills update, so validate-plugins.yml --check is never red on a sync PR.


How it all fits together

                 cron 06/18 UTC
                       │
         ┌─────────────▼─────────────┐
         │  sync-skills.yml          │
         │  ① rsync from product     │
         │     repos → skills/       │
         │  ② enforce sig+card,      │
         │     drop non-compliant    │
         │  ③ track dropped skills   │
         │  ④ rebuild plugin catalog │  ← NEW
         │     (build-plugins.sh)    │
         │  ⑤ regenerate-readme.sh   │
         │  ⑥ open PR                │
         └─────────────┬─────────────┘
                       │
                  PR opens
                       │
         ┌─────────────▼─────────────┐
         │  validate-plugins.yml     │  ← NEW
         │  build-plugins.sh --check │
         │     (drift guard)         │
         └───────────────────────────┘

skills/ remains the only source of truth that gets written — the build only derives plugins/, .claude-plugin/marketplace.json, and .agents/plugins/marketplace.json from it.


Why this design

A few decisions worth surfacing for review:

  1. skills/ is canonical, plugins/ is derived. The build script never edits skills/ and never touches components.d/. This keeps the existing daily-sync invariant intact — no new authoritative directory the team has to remember.

  2. Soft-fail policy. Curation references that the daily sync can legitimately invalidate (rename, removal, compliance-driven SKILL.md drop, basename collision) emit warnings and skip the entry instead of dying. The plugin ships with whatever resolved; the sync PR stays mergeable; a maintainer reconciles plugins.d/<plugin>.yml in a follow-up. We hit this empirically when upstream cuopt restructured its skill paths during testing.

  3. Two materialization modes, copy as default. Empirical testing showed codex plugin add (codex 0.132) silently drops symlinks during install, leaving the cached plugin's skills/ empty. To keep Codex consumers working out of the box, copy is the default. Plugins that won't ship to Codex can opt into symlink mode and avoid duplicated SKILL.md content.

  4. Marketplace identity fields are preserved, not regenerated. The build only rewrites plugins: arrays in marketplace.json files. Top-level name, displayName, metadata.description, owner, etc. are hand-edited and left alone — so the marketplace identity isn't accidentally clobbered by a build run.

  5. Curated plugins (plugins/<name>/.skills-manifest.yml) coexist. A plugin directory with a hand-edited manifest skips the catalog-build path entirely; only its skills/ tree is regenerated, and its marketplace.json entry is preserved verbatim. This leaves the door open for a non-NVIDIA-managed plugin that still wants the build system to refresh its skills tree.


Testing

Local testing exercised:

  • Baseline: build-plugins.sh produces plugins/nvidia-skills/skills/cuopt-routing-api-python/ and cuopt-user-rules/ cleanly in copy mode (real directories with SKILL.md + skill-card.md + skill.oms.sig).
  • Drift guard: build-plugins.sh --check is green on a clean working tree, red on hand-edits under plugins/.
  • Soft-fail scenarios — each verified to log a clear error/warning and continue:
    • missing include_skills: path on disk (rename/removal)
    • include_skills: path is not a directory (rare upstream change)
    • no SKILL.md under a curated path (compliance enforcement dropped them all)
    • duplicate skill basename across include_skills: entries
    • malformed plugins.d/<x>.yml
    • skill_files: with an invalid value (falls back to copy)
    • bad plugin name (not kebab-case)
    • empty skills: list in a curated .skills-manifest.yml
    • missing name: field in plugins.d
    • duplicate plugin name across plugins.d files
  • Symlink mode: confirmed produces relative symlinks resolving correctly through to ../../../skills/<Product>/<skill> and that claude plugin install follows them; confirmed codex plugin add silently drops them (motivating copy as default).
  • Mode switch: flipped nvidia-skills.yml between copy and symlink and verified the build cleanly converts the on-disk tree both directions.
  • Merge from upstream/main during development: catch'd a real-world skill-rename via the soft-fail path (cuopt skill restructure + NemoClaw removal during an in-flight sync).

CI integration tested:

  • validate-plugins.yml --check on this branch → green.

What's not yet tested in production CI:

  • The new Rebuild plugin catalog step inside sync-skills.yml running against a real product-repo clone. Recommend manually triggering sync-skills.yml via workflow_dispatch once after merge to validate end-to-end before the next cron tick.

Risk assessment

Concern Likelihood Impact Mitigation
Build step fails on ubuntu-latest due to PyYAML install low sync fails wrapper script handles --user then --break-system-packages fallback; same install pattern works in validate-plugins.yml today
Curated cuopt skill renamed/removed by an upstream-product sync moderate over time warning logged; plugin ships fewer skills soft-fail policy — no PR block
First production run hits an environment quirk low sync fails on first cron tick workflow_dispatch smoke-test recommended after merge
PR title slightly longer (now includes "plugin catalog" when applicable) certain cosmetic only accurate summary of what changed

What stays guaranteed-safe:

  • No existing step in sync-skills.yml was modified or removed; one step was inserted.
  • skills/ is still the only catalog-source written by sync.
  • All other workflows (dco, verify-authors, team-request, request-nvskills-ci) untouched.
  • Hard-fail still applies to genuine config disasters: missing _defaults.yml, missing marketplace.json, drift-check fail.

Follow-up work (deliberately out of scope)

Captured as a TODO comment block in validate-plugins.yml:

  • claude plugin validate per generated plugin manifest + claude plugin validate .claude-plugin/marketplace.json (verified locally against @anthropic-ai/claude-code@2.1.145; deferred to keep this PR lean).
  • A Codex-side validator. The codex CLI has no plugin validate subcommand; options are an install-as-validator dry-run, a hand-rolled JSON schema check, or skipping (relying on the build's schema parity).

Other follow-ups, not required for this PR:

  • Open a plugins-curation-debt tracking issue (mirroring the existing missing-compliance pattern) when soft-fail warnings fire, so curation gaps don't get lost in workflow logs.
  • Future plugins (NemoClaw, RAG, VSS, etc.) once those teams' skills stabilize.

Test plan

  • build-plugins.sh produces clean output in copy mode
  • build-plugins.sh produces clean output in symlink mode
  • build-plugins.sh --check green on clean tree, red on hand-edits
  • All 10 soft-fail scenarios log warnings/errors and continue
  • Drift check green after the sync's auto-rebuild step
  • workflow_dispatch run of sync-skills.yml after merge confirms the new step works end-to-end against real product repos
  • First scheduled sync after merge produces a clean PR

Introduces a build pipeline that turns plugins.d/<name>.yml definitions
into installable plugin trees, keeping the canonical ./skills/ catalog
as the single source of truth.

- plugins.d/_defaults.yml + plugins.d/nvidia-skills.yml: hand-edited
  plugin spec (curated subset, branding, capabilities)
- .github/scripts/build-plugins.{py,sh}: regenerates plugins/<name>/
  (.{claude,codex}-plugin/plugin.json, skills/) and updates both
  top-level marketplace.json files
- .github/workflows/validate-plugins.yml: PR drift guard
- plugins/nvidia-skills/: generated catalog plugin tree
- .claude-plugin/marketplace.json + .agents/plugins/marketplace.json:
  top-level marketplaces pointing at the generated subtree

The build script supports two materialization strategies via the
`materialize_skills:` field:

  copy     - rsync skill content into the plugin tree. Works for Codex
             and Claude install but duplicates SKILL.md.
  symlink  - relative symlinks back into ./skills/. No duplication and
             works for Claude + npx skills, but Codex 0.132 silently
             drops symlinks during `codex plugin add`.

nvidia-skills uses `symlink` on this branch to demonstrate the
no-duplication design; flip to `copy` in plugins.d/nvidia-skills.yml
when shipping for Codex local-marketplace support.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Mirror the Codex side, which already declares "skills": "./skills/".
Claude expects an array of paths, so we emit ["./skills/"] (container
path; Claude scans immediate children for SKILL.md).

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Rename the per-plugin YAML option that selects how `include_skills` show
up under `plugins/<name>/skills/` from `materialize_skills` to
`skill_files`, with values `copy | symlink`. Both modes remain supported;
this branch's purpose is to provide that choice.

`copy` stays the default in plugins.d/_defaults.yml — Codex 0.132 silently
drops symlinks during `codex plugin add`, so real files are required for
local-marketplace install. Plugins that ship to Claude / `npx skills add`
only can opt into `symlink` per-plugin to avoid SKILL.md duplication.

Touched:
- .github/scripts/build-plugins.py: read spec.get("skill_files"),
  updated error message
- plugins.d/_defaults.yml: skill_files: copy
- plugins.d/nvidia-skills.yml: comment now points at skill_files (no
  override; inherits copy)
- plugins.d/README.md: heading + override example

Also folds in the marketplace identity rename from another branch
(`nvidia-skills` → `nvidia-official` in both marketplace.json files) and
regenerates plugins/nvidia-skills/skills/ as real directories under the
new copy default.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Carry over the trailing comment block from
jasonnvidia/skills@feat/plugin-marketplace describing how to add
upstream CLI validation as a follow-up: Claude Code CLI
`claude plugin validate` for per-plugin manifests and
.claude-plugin/marketplace.json (verified 2026-05-21), and three
options for Codex-side coverage given that the codex CLI has no
`plugin validate` subcommand. Documentation only; no behavior change.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Set `skill_files: symlink` in plugins.d/nvidia-skills.yml. The on-disk
plugins/nvidia-skills/skills/ tree is now three relative symlinks back
into ./skills/<Product>/<skill>/ instead of rsync'd copies, eliminating
the duplicated SKILL.md content from the previous commit.

Compatible with `claude plugin install` and `npx skills add`. NOT
compatible with `codex plugin add` (Codex 0.132 silently drops symlinks
during install) — flip back to `copy` to test Codex.

The default in plugins.d/_defaults.yml stays `copy`; this is a
per-plugin override.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
The module docstring still claimed skills/ was always materialized as
relative symlinks. Update it to describe the `skill_files: copy | symlink`
choice (default `copy`), and refresh the curated-plugin docstring to
match. No behavior change.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Upstream's daily skills sync (7946eaa) reorganized skills/cuopt/ and
removed skills/NemoClaw/ entirely. The previous three curated entries
in plugins.d/nvidia-skills.yml were all broken:

- skills/cuopt/cuopt-install/                      → removed upstream
- skills/cuopt/cuopt-numerical-optimization-api-python/ → split into
  api-c / api-cli / server-api-python (no Python LP wrapper anymore)
- skills/NemoClaw/nemoclaw-user-get-started/       → entire NemoClaw
                                                     directory removed

Repoint the plugin at a single still-present skill,
skills/cuopt/cuopt-routing-api-python, so the build resolves and the
plugin ships symlinks consumers can actually follow. Wider re-curation
(adding more cuopt-* skills, restoring NemoClaw when it returns
upstream) deferred to a follow-up.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Append skills/cuopt/cuopt-user-rules to include_skills so the plugin
ships base end-user guidance (covers install / routing / LP / MILP / QP
/ server) alongside the existing cuopt-routing-api-python skill.
Trailing-slash also normalized for consistency.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Drop the skill_files: symlink override from plugins.d/nvidia-skills.yml
so the plugin inherits skill_files: copy from _defaults.yml. Rebuild
materializes the two curated cuopt skills as real directory copies
(including the new skill-card.md and skill.oms.sig sidecars now
shipping from upstream cuopt) instead of relative symlinks.

This is the layout Codex local-marketplace install requires (codex
plugin add silently drops symlinks). Trade-off: plugins/<name>/skills/
now duplicates SKILL.md content, and copy-mode plugin trees go stale
whenever the daily skills sync touches a curated source path.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Two changes so copy-mode plugins (the default) coexist with the daily
skills sync without manually rebuilding the plugin tree:

1. .github/scripts/build-plugins.py — soft-fail include_skills entries
   that the daily sync can legitimately invalidate:
     * include_skills path missing on disk      (rename / removal)
     * path not a directory                     (rare upstream change)
     * no SKILL.md under a curated path         (compliance enforcement
                                                 dropped every skill)
     * duplicate skill basename across entries  (rename collision)
   Each emits a loud "warning: ... skipping" log line and the build
   continues. Plugin ships with whatever curated skills actually
   resolved. Static config errors (bad yaml, invalid skill_files mode,
   etc.) still hard-fail.

2. .github/workflows/sync-skills.yml — run build-plugins.sh after the
   compliance-enforcement step and before regenerate-readme, so the
   sync PR ships a refreshed plugin tree alongside the skills/ updates.
   Mark "plugin catalog" as a changed component when the rebuild
   produced a diff so the PR summary surfaces it and the has_changes
   gate fires.

Net effect:
- Copy-mode plugins stay in sync with the canonical catalog
  automatically; validate-plugins.yml --check no longer trips on the
  daily sync.
- Renames / compliance drops in upstream skills do not block the
  sync PR — they surface as warnings and as a curation-debt entry in
  plugins.d/<name>.yml that a maintainer can reconcile in a follow-up.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
Convert six previously-fatal die() sites to error-or-warning logs that
let the build continue past per-plugin misconfigurations. The principle
matches the expand_skill_paths soft-fail introduced in e6335f0: one bad
input no longer takes down the whole catalog.

Helpers:
  error()              non-exiting stderr logger (companion to die())
  read_yaml_lenient()  parse-or-None YAML reader for per-plugin files

Sites converted (build_plugins.py):
  A. read_yaml(<plugins.d/x.yml>)         malformed → log + skip file
  B. invalid skill_files=<value>          → warning + fallback to "copy"
  C. plugin name not kebab-case           → log + skip plugin (moved
                                             into discover so the bad
                                             entry never enters the
                                             catalog dict)
  D. curated .skills-manifest.yml empty   → warning + materialize empty
                                             skills/ tree
  E. missing 'name' in plugins.d file     → log + skip file
  F. duplicate plugin name across         → log + first-alphabetically
     plugins.d/                              wins, the rest are skipped
                                             (matches expand_skill_paths)

Sites kept as die() (structural / guard / manual-invocation):
  - read_yaml(_defaults.yml)
  - missing .claude-plugin/marketplace.json
  - missing .agents/plugins/marketplace.json
  - --only NAME with no match
  - --check drift detection

Smoke-tested all six locally: each malformed input emits a loud
error/warning line and the build still completes with the rest of
the catalog intact.

Signed-off-by: Jason Dudash <jdudash@nvidia.com>
sayalinvidia added a commit to sayalinvidia/sayali-skills-test that referenced this pull request May 27, 2026
Squash-merge of test PR #42 — validates plugin catalog build OOTB against
this repo's skills/ tree. validate-plugins.yml ran green pre-merge.

Source: NVIDIA/skills PR #91 (feat/plugin-build-modes by @jasonnvidia)

Signed-off-by: Sayali Kandarkar <skandarkar@nvidia.com>
@sayalinvidia
Copy link
Copy Markdown
Collaborator

LGTM! approved workflows to run!

Copy link
Copy Markdown
Collaborator

@sayalinvidia sayalinvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! merging

@sayalinvidia sayalinvidia merged commit d7b16ae into NVIDIA:main May 27, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants