Update claude and codex skills. Fix corresponding sections in README etc. #104

Merged
anagainaru merged 3 commits into main from skills_update on Jun 6, 2026

ScSteffen commented Jun 4, 2026

Collaborator

Summary

Consolidates the agent-skill set into four workflow-oriented skills maintained in parallel for both Claude Code and Codex, and removes the now-dead visualization config (baseline/output) that no bundled renderer consumes. Docs (CLAUDE.md, README.md) and example configs are brought in line with the current src/apeiron/ package layout.

Motivation & Context

The old skills were fine-grained, per-task scaffolds (new-detector,
new-updater, new-config, visualize, lint-check, …) that no longer matched how people actually use the framework, and several referenced a visualization entry point (src/visualize.py) and VisualizationCfg fields that the package no longer ships. This PR replaces them with task-level workflows and prunes the stale config/doc references so the repo describes what actually exists.

Approach

  • Skills, reworked into 4 workflows (each maintained for both .claude/skills/
    and .codex/skills/):
    • install-apeiron — add Apeiron as a dependency to another project.
    • explore-examples — run a bundled MNIST/CIFAR example.
    • custom-experiment — scaffold harness + data utils + TOML for your own
      data/model and run it.
    • integrate-apeiron — bolt drift detection / CL onto an existing training loop.
    • Removed the old granular Claude skills (debug-experiment, explain,
      lint-check, new-config, new-detector, new-harness, new-updater,
      run-experiment, visualize).
  • Config cleanup: VisualizationCfg keeps only input (the CSV path run
    metrics are written to); the unused baseline and output fields are dropped.
    Example TOMLs (mnist, cifar10_vgg11, cifar10_vit) are updated to match.
  • Docs: CLAUDE.md and README.md updated to the src/apeiron/ package
    paths, MLflow/WandB logging backends, the EnsembleDetector
    NotImplementedError caveat, and a new "Agent Skills" README section. Removed
    references to the dropped src/visualize.py entry point.

Screenshots / Logs (optional)

N/A — docs/config/tooling change.

API / CLI Changes

  • VisualizationCfg — removed fields baseline: float and output: str; only
    input: str remains (default "output/cl_only.csv").
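
As a rough illustration of the resulting shape, the sketch below uses a plain dataclass as a stand-in for VisualizationCfg. The dataclass form is an assumption; the real class in src/apeiron/config/configuration.py may be built differently and may treat unknown fields differently. Only the field name and default are taken from this PR.

```python
from dataclasses import dataclass, fields


@dataclass
class VisualizationCfg:
    """Hypothetical stand-in for the real config class (assumed to be a dataclass)."""

    # CSV path that run metrics are written to (default as stated in this PR).
    input: str = "output/cl_only.csv"


cfg = VisualizationCfg()
field_names = [f.name for f in fields(cfg)]
print(field_names)  # ['input']

# With a plain dataclass, the removed fields are rejected as unknown keywords.
try:
    VisualizationCfg(baseline=0.9)  # illustrative value only
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```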

Breaking Changes

  • TOML configs that set [visualization] baseline = … or [visualization] output = …
    must drop those keys; only input is still recognized. (The bundled example
    configs are already updated.)

Performance (optional)

N/A.

Security & Privacy

  • No secrets committed
  • Input validation added where needed (N/A — no new input paths)

Dependencies

  • None added or removed. Only the poetry.lock content-hash/generator header changed.

Testing Plan

  • Unit tests — tests/test_config.py::test_visualization_cfg updated to assert
    the remaining input default.
  • Integration tests
  • e2e / smoke test
  • Manual steps: poetry run pytest, poetry run ruff check .,
    poetry run mypy .

Documentation

  • Docstrings updated
  • User docs / README updated
  • CHANGELOG entry

Checklist

  • Code formatted (Ruff) → ruff format --check
  • Lint passes (Ruff) → ruff check .
  • Types pass (mypy/pyright) → mypy src
  • Tests pass (pytest) → pytest -q
  • Backward compatibility considered (see Breaking Changes)
  • Adequate comments for tricky parts
  • CI green

Risk & Rollback Plan

Low risk — docs, skills, and a small config-field removal. Rollback by reverting
the PR.

Notes for Reviewers

  • Start with src/apeiron/config/configuration.py + tests/test_config.py for
    the one behavioral change.
  • The bulk of the diff is skill markdown moving from many granular skills to four
    workflow skills, mirrored across .claude/skills/ and .codex/skills/.
  • Note the README's reminder to keep the two skill trees in sync.
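
One way to check that the two trees stay aligned is to compare skill directory names, as in the sketch below. It assumes each skill lives in its own subdirectory of .claude/skills/ and .codex/skills/; the helper names are hypothetical, and the script compares names only, not file contents.

```python
from pathlib import Path


def skill_names(tree: Path) -> set[str]:
    """Names of the immediate subdirectories of a skills tree (empty if missing)."""
    if not tree.is_dir():
        return set()
    return {entry.name for entry in tree.iterdir() if entry.is_dir()}


def skill_tree_diff(claude_tree: Path, codex_tree: Path) -> dict[str, list[str]]:
    """Report skills that exist in one tree but not in the other."""
    claude, codex = skill_names(claude_tree), skill_names(codex_tree)
    return {
        "only_in_claude": sorted(claude - codex),
        "only_in_codex": sorted(codex - claude),
    }


if __name__ == "__main__":
    # Run from the repository root; both lists are empty when the trees match.
    print(skill_tree_diff(Path(".claude/skills"), Path(".codex/skills")))
```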

anagainaru left a comment

Collaborator

Looks great, I'll test them in a little bit and come back to approve. This is a good base on which to add the driver, drift detection and CL skills.

anagainaru left a comment

Collaborator

All the skills except custom-experiment went well. I had some issues creating a good harness with custom-experiment. I think this will go better once we have better documentation for the model harness. Let's merge this for now.

anagainaru merged commit 4fb7939 into main Jun 6, 2026
3 checks passed
anagainaru deleted the skills_update branch June 6, 2026 02:50

2 participants