feat(skills): add nemoclaw-maintainer-issue-autopilot skill by cjagwani · Pull Request #3521 · NVIDIA/NemoClaw

cjagwani · 2026-05-14T16:23:10Z

Summary

Coder autopilot — takes the simplest in-scope NemoClaw issue and ships a minimum-scope PR end-to-end through nine gated stages. Resumable state survives conversation breaks.
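The resumable state mentioned above can be sketched as an atomic write-then-rename, so an interrupted run never leaves a half-written file. This is only an illustration: the state file location and the `stage` / `issue` fields are hypothetical and not taken from the skill.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write state to a temp file in the same directory, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in one step when both paths share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_state(path: Path) -> dict:
    """Return the saved state, or a fresh (hypothetical) Stage 0 state if none exists."""
    if not path.exists():
        return {"stage": 0, "issue": None}
    return json.loads(path.read_text(encoding="utf-8"))
```

A resumed run would call `load_state` first and continue from the recorded stage rather than restarting.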

Behavior

Local-only by default — drafts only, never posts to GitHub.
Emits a JSON sidecar (/tmp/nemoclaw-skill-output-issue-autopilot-<run_id>.json) for chaining with sibling maintainer skills in the suite.

Conformance audit — Claude Agent Skills best practices

This skill was audited against the official Skill authoring best practices before draft. Per-item evidence:

Core quality

| Item | Status | Evidence |
| --- | --- | --- |
| Description specific + key terms | ✓ | `description` is 670 chars (under 1024 cap), first word `Ships` (third-person, per spec) |
| Description has WHAT + WHEN | ✓ | Explicit `Use when…` trigger phrase present in the description |
| SKILL.md body under 500 lines | ✓ | Currently 250 lines (50%) |
| Additional details in separate files | ✓ | Supporting files: `MULTI-MODEL-TESTING.md`, `STAGE-9-ACCEPTANCE-GATE.md`, `RESUMABLE-STATE.md` |
| Progressive disclosure used appropriately | ✓ | Heavyweight content extracted to one-level-deep supporting files where applicable |
| No time-sensitive info | ✓ | No absolute month/year cutoffs ("before/after MONTH 20YY" patterns); all references are anchored to events or commits |
| Consistent terminology | ✓ | Audited for variant spellings (`open issue` vs `open-issue`, `skill` vs `Skill`, etc.) |
| Examples are concrete | ✓ | 8 real PR/issue references in SKILL.md: #2757, #3115, #3265, #3280, #3295 (+3 more) |
| File references one level deep | ✓ | All supporting files linked directly from SKILL.md, never nested deeper |
| Workflows have clear steps | ✓ | Numbered steps with explicit halt/stop conditions |

Code and scripts

| Item | Status | Notes |
| --- | --- | --- |
| Scripts solve problems vs punt | ✓ | Markdown-only skill: no executable scripts in `scripts/` |
| Error handling explicit | ✓ | "Halt conditions" section enumerates non-obvious failure modes |
| No voodoo constants | ✓ | Thresholds (e.g. `--min-confidence 0.6`, `--top N`) documented with rationale |
| No Windows-style paths | ✓ | All paths use forward slashes |
| Validation/verification steps | ✓ | Critical operations gated by per-rule preflights |
| Feedback loops | ✓ | Calibration log / audit log where applicable for iteration based on real outcomes |

Testing

| Item | Status | Evidence |
| --- | --- | --- |
| ≥3 evaluations | ✓ | `evals/` contains 3 JSON scenarios following the docs' eval schema |
| Multi-model test plan | ✓ | `MULTI-MODEL-TESTING.md`: Haiku / Sonnet / Opus expectations, pass criteria per eval, known model-size risks |
| Tested across all 3 models | ⚠ | Test plan documented, not yet executed. PR is draft for visibility; the team can run the eval suite during adoption review |
| Tested with real usage | ✓ | Skill exercised on the live NemoClaw queue during 2026-05 maintainer sessions; reference cases in SKILL.md |

Frontmatter constraints (validated)

name: nemoclaw-maintainer-issue-autopilot — under 64 chars, lowercase + hyphens, no reserved words ("anthropic" / "claude")
description: third-person verb-initial, under 1024 chars, no XML tags, includes explicit Use when… trigger
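The frontmatter constraints listed above could be checked mechanically. The following is a minimal sketch, not the validator used for this PR: the function name is hypothetical, limits are treated as inclusive maximums, and digits are allowed in `name`, both of which are assumptions.

```python
import re
from typing import List

RESERVED_WORDS = ("anthropic", "claude")
# Assumption: lowercase letters, digits, and single hyphens between segments.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
XML_TAG_PATTERN = re.compile(r"</?[A-Za-z][^>]*>")


def check_frontmatter(name: str, description: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if len(name) > 64:
        problems.append("name exceeds 64 characters")
    if not NAME_PATTERN.match(name):
        problems.append("name must be lowercase with hyphens")
    if any(word in name.lower() for word in RESERVED_WORDS):
        problems.append("name contains a reserved word")
    if len(description) > 1024:
        problems.append("description exceeds 1024 characters")
    if XML_TAG_PATTERN.search(description):
        problems.append("description contains XML tags")
    if "use when" not in description.lower():
        problems.append("description lacks a 'Use when' trigger")
    return problems
```

Returning a list rather than raising on the first failure lets a reviewer see every violation in one pass.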

Notes for reviewers

Part of an 11-skill maintainer suite. Draft for visibility. Because the team keeps fewer than 10 PRs open at once, 6 of the suite's PRs are open and 5 are closed with their branches preserved; reopen them via gh pr reopen <num> as slots free up.

🤖 Generated with Claude Code

Coder autopilot: takes the simplest in-scope NemoClaw issue and ships a minimum-scope PR end-to-end.

Nine stages with user gates: local-branch precheck, selection, scope check, reproduce or refute, test-first implementation, PR open, batch self-review, CI watch, CodeRabbit fix loop, perfect-match acceptance gate.

Resumable state file survives conversation breaks. Pre-commit identity check rejects 'Test User' fallbacks. Emits a JSON sidecar for chaining with sibling maintainer skills.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
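The pre-commit identity check described in this commit message might look roughly like the sketch below. The function names and the exact rejection rule are assumptions; only the 'Test User' fallback comes from the commit message. The `git config --get` calls are standard Git.

```python
import subprocess
from typing import Optional, Tuple

# Assumption: only the literal 'Test User' placeholder is rejected.
REJECTED_NAMES = {"test user"}


def identity_problem(name: str, email: str) -> Optional[str]:
    """Return a reason to halt before committing, or None if the identity looks real."""
    if not name.strip() or not email.strip():
        return "git user.name or user.email is not set"
    if name.strip().lower() in REJECTED_NAMES:
        return f"git user.name is a placeholder: {name!r}"
    return None


def read_git_identity() -> Tuple[str, str]:
    """Read user.name and user.email from the effective Git configuration."""
    def get(key: str) -> str:
        result = subprocess.run(
            ["git", "config", "--get", key],
            capture_output=True, text=True, check=False,
        )
        return result.stdout.strip()

    return get("user.name"), get("user.email")
```

A caller would run `identity_problem(*read_git_identity())` before each commit and stop if it returns a message.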

copy-pr-bot · 2026-05-14T16:23:13Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-05-14T16:23:17Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1fa9d679-9771-48d4-92bc-28c9a463840d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions · 2026-05-14T16:24:21Z

E2E Advisor Recommendation

Required E2E: cloud-inference-e2e
Optional E2E: skill-agent-e2e

Dispatch hint: cloud-inference-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required E2E

cloud-inference-e2e (high): This is the closest existing E2E coverage for repository skill assets: it runs live cloud inference and includes repo .agents/skills validation for SKILL.md frontmatter/body via test/e2e/e2e-cloud-experimental/features/skill/lib/validate_repo_skills.sh. The PR adds a new SKILL.md under .agents/skills, so this should block merge for basic skill-loadability confidence.

Optional E2E

skill-agent-e2e (high): Useful adjacent confidence for the skill injection and agent-read path, but it uses a smoke fixture rather than the new issue-autopilot skill, so it is not directly merge-blocking for this PR.

New E2E recommendations

maintainer-skill-semantic-evals (high): The new issue-autopilot skill defines critical semantic behavior—Stage 0 must run first, state must be atomically persisted and resumed, identity must be checked before commits, and Stage 9 must halt on missing clauses—but no existing E2E job appears to execute the added .agents/skills/.../evals/*.json scenarios against the actual skill.
- Suggested test: Add a workflow-dispatchable maintainer-skill-evals E2E job that runs the new eval JSON fixtures in a sandboxed git repository with fake gh responses and asserts the expected_behavior clauses for Stage 0 ordering, --resume handling, and Stage 9 perfect-match halting.
repo-skill-validation (medium): Existing repo skill validation checks only SKILL.md frontmatter/body and does not validate auxiliary skill files or eval JSON schema. This PR adds multiple companion markdown files and eval JSON assets that could drift or become malformed without an E2E/CI guard.
- Suggested test: Extend the skill validation coverage to discover .agents/skills/*/evals/*.json, validate JSON shape, ensure referenced skill ids match the parent skill, and check that linked companion markdown files referenced by SKILL.md exist.
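The repo-skill-validation suggestion above could start from something like the following sketch. It is not an existing NemoClaw check: the function names are hypothetical, the only eval key it requires is `expected_behavior` (the one named in the advisor output), and it assumes companion files are referenced with ordinary relative markdown links. The skill-id cross-check is left out because the eval schema is not shown here.

```python
import json
import re
from pathlib import Path
from typing import List

# Assumption: companion files are linked as [text](FILE.md) relative to SKILL.md.
MD_LINK = re.compile(r"\]\(([^)#\s]+\.md)\)")


def validate_skill(skill_dir: Path) -> List[str]:
    """Check one skill directory: eval JSON shape and linked companion files."""
    problems = []
    evals_dir = skill_dir / "evals"
    if evals_dir.is_dir():
        for eval_file in sorted(evals_dir.glob("*.json")):
            try:
                data = json.loads(eval_file.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                problems.append(f"{eval_file.name}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(data, dict) or "expected_behavior" not in data:
                problems.append(f"{eval_file.name}: missing expected_behavior")
    skill_md = skill_dir / "SKILL.md"
    if skill_md.exists():
        for target in MD_LINK.findall(skill_md.read_text(encoding="utf-8")):
            if "://" in target:
                continue  # skip absolute URLs
            if not (skill_dir / target).exists():
                problems.append(f"SKILL.md links to missing file {target}")
    return problems


def validate_all(skills_root: Path) -> List[str]:
    """Run validate_skill over every skill directory under skills_root."""
    problems = []
    for skill_dir in sorted(p for p in skills_root.iterdir() if p.is_dir()):
        problems += [f"{skill_dir.name}: {msg}" for msg in validate_skill(skill_dir)]
    return problems
```

Running `validate_all(Path(".agents/skills"))` in CI and failing on a non-empty result would catch malformed eval fixtures and dangling companion links.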

Dispatch hint

Workflow: nightly-e2e.yaml
jobs input: cloud-inference-e2e

Adds the following to satisfy the Claude Agent Skills best-practices checklist (https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices):

- Three evaluation scenarios in evals/ following the docs' eval schema
- Multi-model test plan in MULTI-MODEL-TESTING.md (Haiku / Sonnet / Opus expectations, pass criteria, known risks)
- Terminology normalized to single canonical form
- Concrete reference cases (real-but-anonymized examples) where the prior SKILL.md was abstract
- Progressive-disclosure splits where SKILL.md was approaching the 500-line soft limit (issue-autopilot, scope-issues)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

jyaunches · 2026-05-15T13:13:53Z

Superseded by the NemoClaw team skills GitLab repository snapshot: https://gitlab-master.nvidia.com/jyaunches/nemoclaw-team-skills. Closing this PR so skill sharing continues in the dedicated team-skills repo instead of merging these skills into NemoClaw directly.

jyaunches closed this May 15, 2026

cjagwani wants to merge 2 commits into main from ship-skill-issue-autopilot

3 participants
