
feat(skills): add nemoclaw-maintainer-issue-autopilot skill#3521

Closed
cjagwani wants to merge 2 commits into main from ship-skill-issue-autopilot

@cjagwani cjagwani commented May 14, 2026

Summary

Coder autopilot — takes the simplest in-scope NemoClaw issue and ships a minimum-scope PR end-to-end through nine gated stages. Resumable state survives conversation breaks.
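One common way to make a state file survive an interrupted session is to write it to a temporary file and rename it into place, so a reader never sees a half-written file. The sketch below illustrates that pattern only. The skill's real state format lives in RESUMABLE-STATE.md and is not shown in this PR, so the function names `save_state` and `load_state` and the example fields are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Persist state atomically: write a temp file, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must sit in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in a single step, overwriting any older state.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_state(path: Path):
    """Return the saved state, or None when no earlier run left a state file."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None
```

A resumed run could call `load_state` first and continue from the recorded stage when the result is not `None`.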

Behavior

  • Local-only by default — drafts only, never posts to GitHub.
  • Emits a JSON sidecar (/tmp/nemoclaw-skill-output-issue-autopilot-<run_id>.json) for chaining with sibling maintainer skills in the suite.
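To make the sidecar convention concrete, here is a minimal sketch of how a producer and a consumer might agree on the file location. Only the path pattern comes from the description above. The helper names and the payload fields are assumptions for illustration, since the PR does not show the sidecar schema.

```python
import json
from pathlib import Path

SKILL_SLUG = "issue-autopilot"


def sidecar_path(run_id: str, base_dir: str = "/tmp") -> Path:
    """Build the sidecar path following the naming pattern in the PR description."""
    return Path(base_dir) / f"nemoclaw-skill-output-{SKILL_SLUG}-{run_id}.json"


def write_sidecar(run_id: str, payload: dict, base_dir: str = "/tmp") -> Path:
    """Write the payload as JSON; the payload keys here are hypothetical."""
    path = sidecar_path(run_id, base_dir)
    path.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")
    return path


def read_sidecar(run_id: str, base_dir: str = "/tmp") -> dict:
    """Load a sidecar written by an earlier skill in the chain."""
    return json.loads(sidecar_path(run_id, base_dir).read_text(encoding="utf-8"))
```

A sibling skill that knows the `run_id` can then locate the file without any other shared state.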

Conformance audit — Claude Agent Skills best practices

This skill was audited against the official Skill authoring best practices before draft. Per-item evidence:

Core quality

| Item | Status | Evidence |
| --- | --- | --- |
| Description specific + key terms | | description is 670 chars (under 1024 cap), first word Ships (third-person, per spec) |
| Description has WHAT + WHEN | | Explicit Use when… trigger phrase present in the description |
| SKILL.md body under 500 lines | | Currently 250 lines (50%) |
| Additional details in separate files | | Supporting files: MULTI-MODEL-TESTING.md, STAGE-9-ACCEPTANCE-GATE.md, RESUMABLE-STATE.md |
| Progressive disclosure used appropriately | | Heavyweight content extracted to one-level-deep supporting files where applicable |
| No time-sensitive info | | No absolute month/year cutoffs ("before/after MONTH 20YY" patterns); all references are anchored to events or commits |
| Consistent terminology | | Audited for variant spellings (open issue vs open-issue, skill vs Skill, etc.) |
| Examples are concrete | | 8 real PR/issue references in SKILL.md: #2757, #3115, #3265, #3280, #3295 (+3 more) |
| File references one level deep | | All supporting files linked directly from SKILL.md, never nested deeper |
| Workflows have clear steps | | Numbered steps with explicit halt/stop conditions |

Code and scripts

| Item | Status | Notes |
| --- | --- | --- |
| Scripts solve problems vs punt | | This is a markdown-only skill, with no executable scripts in scripts/ |
| Error handling explicit | | "Halt conditions" section enumerates non-obvious failure modes |
| No voodoo constants | | Thresholds (e.g. --min-confidence 0.6, --top N) documented with rationale |
| No Windows-style paths | | All paths use forward slashes |
| Validation/verification steps | | Critical operations gated by per-rule preflights |
| Feedback loops | | Calibration log / audit log where applicable for iteration based on real outcomes |

Testing

| Item | Status | Evidence |
| --- | --- | --- |
| ≥3 evaluations | | evals/ contains 3 JSON scenarios following the docs' eval schema |
| Multi-model test plan | | MULTI-MODEL-TESTING.md: Haiku / Sonnet / Opus expectations, pass criteria per eval, known model-size risks |
| Tested across all 3 models | | Test plan documented, not yet executed. PR is draft for visibility; the team can run the eval suite during adoption review |
| Tested with real usage | | Skill exercised on the live NemoClaw queue during 2026-05 maintainer sessions; reference cases in SKILL.md |

Frontmatter constraints (validated)

  • name: nemoclaw-maintainer-issue-autopilot — under 64 chars, lowercase + hyphens, no reserved words ("anthropic" / "claude")
  • description: third-person verb-initial, under 1024 chars, no XML tags, includes explicit Use when… trigger
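The constraints listed above are mechanical enough to check in a few lines. The following is a rough sketch of such a check, not the validator used in this repository. The function name is hypothetical, the 64 and 1024 limits are treated as inclusive maximums, and digits are allowed in names, both of which are assumptions.

```python
import re

RESERVED_WORDS = ("anthropic", "claude")
# Assumption: digits are permitted alongside lowercase letters and hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9-]+$")
XML_TAG_PATTERN = re.compile(r"<[^>]+>")


def validate_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if len(name) > 64:
        problems.append("name exceeds 64 characters")
    if not NAME_PATTERN.match(name):
        problems.append("name must use lowercase letters, digits, and hyphens only")
    for word in RESERVED_WORDS:
        if word in name:
            problems.append(f"name contains reserved word '{word}'")
    if not description.strip():
        problems.append("description is empty")
    if len(description) > 1024:
        problems.append("description exceeds 1024 characters")
    if XML_TAG_PATTERN.search(description):
        problems.append("description contains XML tags")
    if "use when" not in description.lower():
        problems.append("description lacks an explicit 'Use when' trigger")
    return problems
```

For example, `validate_frontmatter("nemoclaw-maintainer-issue-autopilot", "Ships a PR. Use when an issue is in scope.")` returns an empty list under these rules.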

Notes for reviewers

Part of an 11-skill maintainer suite. Draft for visibility. The team's <10 open-PR policy means 6 are open and 5 are closed-but-branch-preserved; reopen via gh pr reopen <num> as slots free up.

🤖 Generated with Claude Code

Coder autopilot: takes the simplest in-scope NemoClaw issue and
ships a minimum-scope PR end-to-end. Nine stages with user gates -
local-branch precheck, selection, scope check, reproduce or refute,
test-first implementation, PR open, batch self-review, CI watch,
CodeRabbit fix loop, perfect-match acceptance gate.

Resumable state file survives conversation breaks. Pre-commit
identity check rejects 'Test User' fallbacks. Emits a JSON sidecar
for chaining with sibling maintainer skills.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
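The pre-commit identity check mentioned in this commit message could look roughly like the sketch below. It is an illustration under assumptions: the skill itself is markdown-only, so the function names and the exact rejection rules here are hypothetical. The only detail taken from the message is that a 'Test User' fallback name must be rejected.

```python
import subprocess

# Assumption: names are compared case-insensitively after trimming whitespace.
FALLBACK_NAMES = {"test user"}


def is_fallback_identity(name: str, email: str) -> bool:
    """Treat an empty name or email, or a known placeholder name, as a fallback identity."""
    cleaned = name.strip().lower()
    return cleaned == "" or cleaned in FALLBACK_NAMES or email.strip() == ""


def git_config_value(key: str) -> str:
    """Read one git config value; returns an empty string when the key is unset."""
    result = subprocess.run(
        ["git", "config", "--get", key],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def check_commit_identity() -> None:
    """Stop before committing when git would sign the commit with a placeholder identity."""
    name = git_config_value("user.name")
    email = git_config_value("user.email")
    if is_fallback_identity(name, email):
        raise SystemExit(f"Refusing to commit as {name!r} <{email}>: set a real git identity first")
```

Keeping the decision in a pure function (`is_fallback_identity`) makes the rule testable without a git repository.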
copy-pr-bot Bot commented May 14, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai Bot commented May 14, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1fa9d679-9771-48d4-92bc-28c9a463840d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions Bot commented May 14, 2026

E2E Advisor Recommendation

Required E2E: cloud-inference-e2e
Optional E2E: skill-agent-e2e

Dispatch hint: cloud-inference-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required E2E

  • cloud-inference-e2e (high): This is the closest existing E2E coverage for repository skill assets: it runs live cloud inference and includes repo .agents/skills validation for SKILL.md frontmatter/body via test/e2e/e2e-cloud-experimental/features/skill/lib/validate_repo_skills.sh. The PR adds a new SKILL.md under .agents/skills, so this should block merge for basic skill-loadability confidence.

Optional E2E

  • skill-agent-e2e (high): Useful adjacent confidence for the skill injection and agent-read path, but it uses a smoke fixture rather than the new issue-autopilot skill, so it is not directly merge-blocking for this PR.

New E2E recommendations

  • maintainer-skill-semantic-evals (high): The new issue-autopilot skill defines critical semantic behavior—Stage 0 must run first, state must be atomically persisted and resumed, identity must be checked before commits, and Stage 9 must halt on missing clauses—but no existing E2E job appears to execute the added .agents/skills/.../evals/*.json scenarios against the actual skill.
    • Suggested test: Add a workflow-dispatchable maintainer-skill-evals E2E job that runs the new eval JSON fixtures in a sandboxed git repository with fake gh responses and asserts the expected_behavior clauses for Stage 0 ordering, --resume handling, and Stage 9 perfect-match halting.
  • repo-skill-validation (medium): Existing repo skill validation checks only SKILL.md frontmatter/body and does not validate auxiliary skill files or eval JSON schema. This PR adds multiple companion markdown files and eval JSON assets that could drift or become malformed without an E2E/CI guard.
    • Suggested test: Extend the skill validation coverage to discover .agents/skills/*/evals/*.json, validate JSON shape, ensure referenced skill ids match the parent skill, and check that linked companion markdown files referenced by SKILL.md exist.
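The repo-skill-validation suggestion above can be sketched as a small directory check. This is a hedged illustration, not an existing script in the repository. The function name is hypothetical, and the required eval keys (`skills`, `query`, `expected_behavior`) are an assumption about the eval schema, based on the `expected_behavior` clauses mentioned in the advisor output.

```python
import json
import re
from pathlib import Path

# Assumption: each eval file is a JSON object carrying at least these keys.
REQUIRED_EVAL_KEYS = {"skills", "query", "expected_behavior"}
# Matches relative markdown links such as [state](RESUMABLE-STATE.md).
LINK_PATTERN = re.compile(r"\]\(([^)#\s]+\.md)\)")


def validate_skill_dir(skill_dir: Path) -> list[str]:
    """Check eval JSON shape, skill id references, and companion file links for one skill."""
    problems = []
    skill_id = skill_dir.name
    for eval_file in sorted(skill_dir.glob("evals/*.json")):
        try:
            data = json.loads(eval_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{eval_file.name}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(data, dict):
            problems.append(f"{eval_file.name}: top level must be a JSON object")
            continue
        missing = REQUIRED_EVAL_KEYS - data.keys()
        if missing:
            problems.append(f"{eval_file.name}: missing keys {sorted(missing)}")
        skills = data.get("skills")
        if not isinstance(skills, list) or skill_id not in skills:
            problems.append(f"{eval_file.name}: 'skills' does not list parent skill {skill_id}")
    skill_md = skill_dir / "SKILL.md"
    if skill_md.is_file():
        for target in LINK_PATTERN.findall(skill_md.read_text(encoding="utf-8")):
            if "://" in target:
                continue  # external URL, not a companion file
            if not (skill_dir / target).is_file():
                problems.append(f"SKILL.md links to missing file {target}")
    return problems
```

A CI job could run this over every directory under .agents/skills and fail when any returned list is non-empty.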

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: cloud-inference-e2e

Adds the following to satisfy the Claude Agent Skills best-practices
checklist (https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices):

- Three evaluation scenarios in evals/ following the docs' eval schema
- Multi-model test plan in MULTI-MODEL-TESTING.md (Haiku / Sonnet /
  Opus expectations, pass criteria, known risks)
- Terminology normalized to single canonical form
- Concrete reference cases (real-but-anonymized examples) where the
  prior SKILL.md was abstract
- Progressive-disclosure splits where SKILL.md was approaching the
  500-line soft limit (issue-autopilot, scope-issues)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
@jyaunches
Contributor

Superseded by the NemoClaw team skills GitLab repository snapshot: https://gitlab-master.nvidia.com/jyaunches/nemoclaw-team-skills. Closing this PR so skill sharing continues in the dedicated team-skills repo instead of merging these skills into NemoClaw directly.

@jyaunches jyaunches closed this May 15, 2026
