Add gh-aw agent factory automation suite #1631

Draft
IEvangelist wants to merge 2 commits into main from dapine/agent-factory-automation

Conversation

IEvangelist (Member) commented May 18, 2026

Summary

Adopts GitHub Agentic Workflows (gh aw) and a tuned set of workflows from Peli's Agent Factory, covering the categories I want to lean on first: incoming-issue triage, stale-backlog triage, CI failure diagnosis, doc maintenance, and security compliance.

| Workflow | Trigger | Effect |
| --- | --- | --- |
| issue-triage | issues: [opened, reopened] | Labels, comments, types, and (for obvious spam) closes the issue. Rubric tightened to this repo's actual label set: area-* (or needs-area-label), at-most-one of bug / enhancement / documentation, plus azure / external / good first issue where clearly applicable. Explicitly forbidden to invent labels. |
| backlog-triage (new) | schedule: weekly on monday + workflow_dispatch | Picks the oldest open issues that have no area label, no needs-area-label, and no type label, and walks at most 10 per run (dispatch input allows up to 25) through the same rubric. Idempotent via a <!-- backlog-triage --> marker comment so we don't re-triage. Aimed at the ~60 stale issues that pre-date this automation. |
| ci-doctor | workflow_run on ["Aspire Samples CI"] completed on main | Files a [ci-doctor] issue (labels automation, ci) with root-cause analysis when the protected branch breaks. |
| doc-updater | schedule: weekly on monday + workflow_dispatch | Scope-narrowed: opens a [docs] PR updating only samples/*/README.md files whose corresponding sample folder had material code changes in the last 7 days. Explicitly forbidden to touch the root README.md, CODE_OF_CONDUCT.md, SECURITY.md, LICENSE, or anything outside samples/. |
| malicious-code-scan | schedule: daily + workflow_dispatch | Reviews the last 3 days of code changes for supply-chain / exfiltration patterns and files Code Scanning alerts (not issues). |
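
To make the backlog-triage selection rule above concrete, here is a minimal Python sketch of it. It is illustrative only and is not the workflow's actual pre-activation step: the `Issue` class, `needs_backlog_triage`, and `pick_batch` are hypothetical names, and it assumes that "type label" means one of bug / enhancement / documentation and that issues are already loaded in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Values taken from the PR description; the names are hypothetical.
MARKER = "<!-- backlog-triage -->"
TYPE_LABELS = {"bug", "enhancement", "documentation"}
DEFAULT_BATCH = 10   # issues walked per scheduled run
MAX_BATCH = 25       # upper bound for the workflow_dispatch input


@dataclass
class Issue:
    """Hypothetical in-memory view of an open issue."""
    number: int
    created_at: datetime
    labels: set = field(default_factory=set)
    comments: list = field(default_factory=list)


def needs_backlog_triage(issue: Issue) -> bool:
    """True when the issue has no area label, no needs-area-label,
    no type label, and no marker comment from an earlier run."""
    has_area = any(label.startswith("area-") for label in issue.labels)
    has_needs_area = "needs-area-label" in issue.labels
    has_type = bool(TYPE_LABELS & issue.labels)
    already_triaged = any(MARKER in body for body in issue.comments)
    return not (has_area or has_needs_area or has_type or already_triaged)


def pick_batch(open_issues: list, limit: int = DEFAULT_BATCH) -> list:
    """Return the oldest untriaged issues, capped at `limit` (never above 25)."""
    limit = max(1, min(limit, MAX_BATCH))
    candidates = [i for i in open_issues if needs_backlog_triage(i)]
    candidates.sort(key=lambda i: i.created_at)  # oldest first
    return candidates[:limit]
```

Because a triaged issue carries the marker comment, calling `pick_batch` again after a run skips it, which is the idempotency property the workflow relies on.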

A sixth workflow, agentics-maintenance.yml, is auto-generated by gh aw compile because doc-updater uses expires; it sweeps expired draft items on a schedule.

Also in this PR: labeler workflows now actually run

The existing labeler-cache-retention.yml, labeler-predict-issues.yml, and labeler-predict-pulls.yml workflows were imported from the dotnet/issue-labeler onboarding template, including its github.repository_owner == 'dotnet' guard. Since this repo lives under microsoft/, those guards were silently no-opping every scheduled and issues: opened run — the ML labeler hasn't actually been labeling anything here.

This PR flips the three guards from 'dotnet' to 'microsoft' so the existing labeler infrastructure (already trained and pinned by SHA) actually fires. labeler-train.yml and labeler-promote.yml have no such guard and are intentionally left alone.

Known caveats

A few things worth calling out before this gets enabled in earnest:

  • set-issue-type may silently no-op. The triage workflows attempt to set a repo-level issue type (Bug/Feature/Task), which only works if the parent organization has issue types configured and exposed to this repo. If they aren't, the call fails silently and the agent falls back to a type label (bug / enhancement / documentation). This is intentional and documented in the prompt body.
  • ci-doctor only investigates main failures. PR-CI failures are deliberately out of scope: the doctor's job is to alert maintainers when the protected branch breaks, not to comment on every red PR. If we want a per-PR fix-it agent later, pr-fix from the same catalog is a drop-in candidate.
  • malicious-code-scan requires Code Scanning to be enabled. Microsoft repos generally have Advanced Security (and therefore Code Scanning) enabled, but if alerts don't appear in the Security tab after the first run, check that before assuming a workflow bug.
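
The set-issue-type fallback in the first caveat can be pictured with the rough Python sketch below. It is not how the agent is implemented (the real behavior lives in the prompt body). The three callbacks are hypothetical stand-ins for whatever sets an issue type, reads it back, and adds a label, and detecting the silent no-op by reading the type back is an assumption made for illustration.

```python
from typing import Callable, Optional


def apply_type(
    issue_number: int,
    issue_type: str,       # e.g. "Bug"
    fallback_label: str,   # e.g. "bug"
    set_issue_type: Callable[[int, str], None],
    get_issue_type: Callable[[int], Optional[str]],
    add_label: Callable[[int, str], None],
) -> str:
    """Try to set the issue type; if it did not stick, add a type label.

    Returns "type" when the issue type was applied and "label" when the
    fallback label was used. All callbacks are hypothetical.
    """
    set_issue_type(issue_number, issue_type)  # may silently do nothing
    if get_issue_type(issue_number) == issue_type:
        return "type"
    # Assumption: a missing or different type means the call was a no-op.
    add_label(issue_number, fallback_label)
    return "label"
```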

Authentication model

Two credential systems. Both must be configured by a maintainer before these workflows can do anything other than fail-fast.

1. AI engine — COPILOT_GITHUB_TOKEN

All Copilot-engine workflows need a fine-grained PAT with Copilot Requests: Read permission, owned by a user with an active Copilot license. GitHub App tokens are not accepted for the engine (gh-aw auth reference).

2. GitHub tools + safe-outputs — ASPIRE_BOT GitHub App

Every write-back operation (labels, comments, issues, PRs, code-scanning alerts) is wired to mint a short-lived, scope-minimized token via actions/create-github-app-token using the new ASPIRE_BOT GitHub App. Two new repo secrets:

  • ASPIRE_BOT_APP_ID — App ID (or Client ID) of the ASPIRE_BOT App.
  • ASPIRE_BOT_PRIVATE_KEY — PEM-encoded private key for the App.

The ASPIRE_BOT App needs to be installed on this repo with at minimum:

  • Contents: Read & write (for doc-updater's PR branch pushes)
  • Issues: Read & write (issue-triage, backlog-triage, ci-doctor)
  • Pull requests: Read & write (doc-updater)
  • Metadata: Read
  • Code scanning alerts: Read & write (malicious-code-scan)
  • Actions: Read (ci-doctor reading workflow run logs)

Security review

gh aw flagged the following deltas at compile time. I've reviewed each:

New restricted secrets — both intentional, both consumed only by actions/create-github-app-token to mint scoped App tokens:

  • ASPIRE_BOT_APP_ID
  • ASPIRE_BOT_PRIVATE_KEY

New pinned action (in malicious-code-scan.lock.yml) — required for filing code-scanning alerts:

  • github/codeql-action/upload-sarif@e46ed2cbd01164d986452f91f178727624ae40d7 (v4.35.3)

All actions in the generated .lock.yml files are pinned by SHA, and the pin manifest is recorded in .github/aw/actions-lock.json.
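
For reviewers who want to spot-check the "pinned by SHA" claim by hand, a small Python sketch like the following could scan a workflow file's text for `uses:` references that are not pinned to a full 40-character commit SHA. This is a hypothetical helper, not part of gh aw, and the regular expressions are a simplification that assumes one `uses:` entry per line.

```python
import re

# owner/repo[/path]@<40 hex chars>
PINNED = re.compile(r"^[\w.-]+/[\w./-]+@[0-9a-f]{40}$")
# A `uses:` key, optionally preceded by a list dash; captures the reference.
USES = re.compile(r"^\s*(?:-\s*)?uses:\s*(\S+)")


def unpinned_actions(workflow_text: str) -> list:
    """Return every `uses:` reference that is not pinned to a full commit SHA.

    Local actions (./path) and docker:// references are skipped because
    they have no SHA to pin.
    """
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES.match(line)
        if not match:
            continue
        ref = match.group(1).strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        if not PINNED.match(ref):
            unpinned.append(ref)
    return unpinned
```

An empty result for each generated .lock.yml file would be consistent with the statement above; a tag-style reference such as `actions/checkout@v4` would be reported.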

How to develop these workflows

The CLI: gh extension install github/gh-aw. Then from the repo root:

gh aw validate                 # schema-check all workflows
gh aw compile                  # regenerate *.lock.yml (commit both .md and .lock.yml)
gh aw status                   # list registered workflows
gh aw run <workflow-name>      # trigger on GitHub Actions
gh aw audit <run-id-or-url>    # download logs and generate a report
gh aw disable|enable <name>    # toggle individual workflows

*.lock.yml files are marked linguist-generated=true merge=ours in .gitattributes. Never hand-edit them; re-run gh aw compile after changing the source .md. The prompt bodies of the .md files are imported at run time via {{#runtime-import ...}}, so prose-only edits don't churn the lock file.

For the index/readme that ships in this PR, see .github/workflows/agentic.md — it mirrors the style of the existing labeler.md.

Why these workflows?

These map to the most-trafficked patterns from the agent factory tour: triage (both new and stale), fault investigation, doc hygiene, security compliance. Other useful agents (daily-repo-status, pr-fix, ChatOps-style assistants, etc.) can be added later via gh aw add githubnext/agentics/<workflow-name> and are intentionally deferred until this baseline is proven.

Marking as draft

I'm leaving this PR as a draft until the three secrets above are configured on the repo — otherwise the workflows will all fail their first scheduled tick. Once secrets are in place, ready for review.

IEvangelist and others added 2 commits May 18, 2026 08:30
Adopt GitHub Agentic Workflows (gh aw) and patterns from Peli's Agent
Factory. Initial suite covers four categories:

- issue-triage          - triage and label new issues, close obvious spam
- ci-doctor             - investigate 'Aspire Samples CI' failures on main
                          and file a [ci-doctor] issue with root-cause
- doc-updater           - retuned for this repo: keeps each sample's
                          samples/<name>/README.md in sync with code
                          changes in its own folder; strictly scoped, never
                          touches the root README, CODE_OF_CONDUCT, SECURITY,
                          LICENSE, or anything outside samples/
- malicious-code-scan   - daily supply-chain / exfiltration review; files
                          Code Scanning alerts, not issues

All four workflows use the Copilot engine. All GitHub write-back operations
(labels, comments, issues, PRs, code-scanning alerts) go through a dedicated
GitHub App (ASPIRE_BOT) via actions/create-github-app-token, so each job
gets a short-lived, scope-minimized token instead of a broad PAT.

Adds .github/workflows/agentic.md as an index documenting the suite,
required secrets, and required App permissions - parallel to the existing
labeler.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…labelers

Followup to the initial agent factory PR addressing a self-review pass:

issue-triage.md
- Replace generic Step 3 rubric (priority labels, platform labels,
  needs-info/question/duplicate/spam labels) with this repo's actual
  label set: at-most-one area-* (or needs-area-label), at-most-one of
  bug/enhancement/documentation, plus azure/external/good-first-issue
  where clearly applicable. Explicitly forbid inventing labels.
- Document that set_issue_type may silently no-op if org-level issue
  types aren't exposed; in that case fall back to a type *label*.
- Adjust incomplete-issue guidance for Aspire/.NET context.

doc-updater.md
- Drop schedule from daily to weekly-on-monday. Daily README churn was
  too aggressive for a samples repo with low daily commit volume.
- Update the prompt body (24h window -> 7d window, "yesterday" date
  computation -> 7-day window) to match the new cadence.

backlog-triage.md (NEW)
- Fifth workflow that attacks the 60+ stale, never-triaged issues a
  small batch at a time. Weekly schedule + workflow_dispatch (max 10
  issues per run, dispatch input allows up to 25).
- Pre-activation step picks the oldest open issues that have no area
  label, no needs-area-label, and no type label, then filters out any
  issue that already carries a "<!-- backlog-triage -->" marker
  comment. Idempotent across runs.
- Reuses the same rubric as issue-triage. Adds a "Staleness check"
  step that recommends closure (but does NOT close) for issues
  referencing removed samples or fixed-elsewhere bugs, leaving final
  judgment to maintainers.

labeler-cache-retention.yml, labeler-predict-issues.yml,
labeler-predict-pulls.yml
- Flip the github.repository_owner == 'dotnet' guard to 'microsoft'
  so the dotnet/issue-labeler ML labeler actually runs on this repo
  instead of being a permanent no-op. Comment in each file updated to
  match. labeler-train.yml and labeler-promote.yml have no such guard
  and are intentionally left alone.

agentic.md
- Add backlog-triage row to the catalog.
- Note the new doc-updater cadence (weekly).
- Add a "Known caveats" section covering: set-issue-type may no-op
  without org issue types, ci-doctor only fires on main (PR-CI is
  intentionally out of scope), and malicious-code-scan requires Code
  Scanning to be enabled on the repo.

Validated: gh aw compile --approve -> 5 workflows, 0 errors, 0
warnings; gh aw validate -> 5 workflows, 0 errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>