patch: Add evals category for agent / workflow projects (closes #44) #46

Merged: rdwj merged 1 commit into main from feat/44-evals-patch-category on May 7, 2026

rdwj (Collaborator) commented May 6, 2026

Step 2 of #42 — adds an evals category to AGENT_FILE_CATEGORIES.

Stacked on top of #43; please merge that first or the diff in this PR will be noisy.

What

  • New `evals` category covers `evals/__init__.py`, `evals/assertions.py`, `evals/discovery.py`, `evals/mock_factory.py`, `evals/run_evals.py`, and `evals/README.md`. It sets `ask_before_patch=True` since users may have customized the harness.
  • `evals/evals.yaml` (the test plan) and `evals/fixtures/**` (test data) go to `AGENT_NEVER_PATCH` because those are user-authored.
  • New `patch evals` subcommand wires through the existing `_patch_category` helper.
  • Updated `CLAUDE.md` with the new agent/workflow category list.
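
For reviewers who want a concrete picture of the bullets above, here is a minimal sketch of how such a category entry and its `ask_before_patch` gate could look. The `FileCategory` dataclass, the `files_in_category` and `patch_category` functions, the `confirm` callback, and the use of `fnmatchcase` are all assumptions made for illustration; the PR does not show the real shape of `AGENT_FILE_CATEGORIES` or the signature of `_patch_category`. Only the category name, the six file paths, and the flag value come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable


@dataclass(frozen=True)
class FileCategory:
    """Hypothetical record type; the project's real structure may differ."""

    description: str
    patterns: tuple[str, ...]
    ask_before_patch: bool = False


# Assumed registry shape. Only the "evals" entry is taken from this PR.
AGENT_FILE_CATEGORIES = {
    "evals": FileCategory(
        description="Eval harness machinery",
        patterns=(
            "evals/__init__.py",
            "evals/assertions.py",
            "evals/discovery.py",
            "evals/mock_factory.py",
            "evals/run_evals.py",
            "evals/README.md",
        ),
        ask_before_patch=True,
    ),
}


def files_in_category(name: str, candidates: list[str]) -> list[str]:
    """Return the candidate paths matched by the category's patterns."""
    category = AGENT_FILE_CATEGORIES[name]
    return [
        path
        for path in candidates
        if any(fnmatchcase(path, pattern) for pattern in category.patterns)
    ]


def patch_category(
    name: str, candidates: list[str], confirm: Callable[[str], bool]
) -> list[str]:
    """Illustrative stand-in for a patch helper.

    Returns the files that would be updated. When the category sets
    ask_before_patch, each file is included only if confirm(path) is true.
    """
    category = AGENT_FILE_CATEGORIES[name]
    approved = []
    for path in files_in_category(name, candidates):
        if category.ask_before_patch and not confirm(path):
            continue
        approved.append(path)
    return approved
```

In this sketch, `evals/evals.yaml` never matches the category because it is not among the patterns, so it would not be offered for patching even before any never-patch check runs.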

Out of scope

Per-template .fips-template.yaml manifests — tracked in #45.

Test plan

  • 4 new unit tests in TestEvalsCategory covering category placement, pattern coverage, ask_before_patch behavior, and never-patch entries for user inputs.
  • Full suite passes: 290 passed.
  • `black src tests`: clean.
  • `ruff check src tests`: clean.

Closes #44.

The agent and workflow templates ship a full eval harness under
`evals/` (assertions, discovery, mock_factory, runner, package
init, README). None of those files were covered by any patch
category, so updates were invisible to `fips-agents patch check`.

This adds an `evals` category to AGENT_FILE_CATEGORIES covering
just the harness machinery and registers a `patch evals`
subcommand. Set ask_before_patch=True since users may have
customized the harness.

User-authored eval inputs (`evals/evals.yaml` and `evals/fixtures/`)
go to AGENT_NEVER_PATCH so the test plan and data fixtures stay
under the user's control.
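
The never-patch behavior described above can be sketched as a simple glob filter. This is an illustration under stated assumptions, not the project's matcher: the `is_never_patch` and `patchable` functions are hypothetical names, the real `AGENT_NEVER_PATCH` list may contain other entries, and `fnmatchcase` is a stand-in whose `*` also matches `/`, which may differ from the semantics the project actually uses.

```python
from fnmatch import fnmatchcase

# The two entries named in this PR; the real list may hold more.
AGENT_NEVER_PATCH = ("evals/evals.yaml", "evals/fixtures/**")


def is_never_patch(path: str) -> bool:
    """True if the path matches any never-patch pattern (assumed glob semantics)."""
    return any(fnmatchcase(path, pattern) for pattern in AGENT_NEVER_PATCH)


def patchable(paths: list[str]) -> list[str]:
    """Drop user-authored eval inputs so they are never offered for patching."""
    return [path for path in paths if not is_never_patch(path)]
```

With these definitions, `patchable(["evals/run_evals.py", "evals/evals.yaml"])` keeps only the harness file, and anything under `evals/fixtures/` is filtered out regardless of nesting depth.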

Stacks on top of #43.

Assisted-by: Claude Code (Opus 4.7)

rdwj force-pushed the feat/44-evals-patch-category branch from 7baed48 to 13fc468 on May 7, 2026 at 00:10.
rdwj merged commit 94aac71 into main on May 7, 2026.
rdwj added a commit that referenced this pull request on May 7, 2026:

- Add v0.12.0 changelog entry (manifest loader, evals category, MCP
  claude category, never-patch matcher fix, pattern gap fills).
- Update Patch Commands section: list .fips-template.yaml manifest
  support, add Gateway/UI category table, refresh per-type tables to
  match the actual category surface after #43, #46, #48, #49.
- Expand the user-customized-files paragraph to cover the new
  AGENT_NEVER_PATCH entries and the gateway/UI never-patch list.

Assisted-by: Claude Code (Opus 4.7)