AI-ModCon · anagainaru · Jun 6, 2026 · Jun 4, 2026
diff --git a/.claude/skills/custom-experiment/SKILL.md b/.claude/skills/custom-experiment/SKILL.md
@@ -0,0 +1,90 @@
+---
+name: custom-experiment
+description: |
+  Run an apeiron experiment on the user's OWN dataset and model end-to-end.
+  Use when the user wants to bring their own data + architecture (beyond the
+  shipped MNIST/CIFAR examples), scaffold a custom model harness, write a config
+  for it, smoke-test it, and run the full experiment. Self-contained: it creates
+  the harness, data utilities, and TOML, registers them in the example factory,
+  and runs. For trying the bundled examples instead, use explore-examples; for
+  adding apeiron to a separate project's training loop, use integrate-apeiron.
+argument-hint: "<short_name> [config_output_path]"
+user-invocable: true
+allowed-tools:
+  - Bash
+  - Read
+  - Write
+  - Edit
+  - Glob
+  - Grep
+---
+
+Scaffold and run an apeiron experiment on the user's own data and model.
+
+## Arguments
+- `$1`: Short name for the dataset/harness (lowercase, e.g. `fashionmnist`, `mytabular`). Used for the `examples/$1/` dir and the `data.name` factory key.
+- `$2`: Optional output path for the TOML config. Defaults to `examples/$1/$1.toml`.
+
+## Procedure
+
+### 1. Gather the specifics from the user
+Ask only for what isn't already provided:
+- Dataset source and how to load it (torchvision, HuggingFace, local files, custom `Dataset`).
+- Model architecture (CNN, MLP, ViT, …), input shape, number of classes/outputs.
+- Type of drift to simulate on the stream (e.g. affine transforms for images, feature noise for tabular). apeiron's examples simulate drift inside `update_data_stream()`.
+- Whether pretrained weights exist and, if so, their path (optional; the harness should tolerate their absence).
+- Which drift detector and CL updater to start with (default `ADWINDetector` + `base`).
+
+### 2. Read the current patterns (don't hardcode signatures — they rot)
+Mirror the live source rather than assuming method names:
+```bash
+cat src/apeiron/model/torch_model_harness.py   # the ABC + abstract methods to implement
+cat examples/mnist/model.py                      # canonical harness
+cat examples/mnist/utils.py                       # data-loading + drift-sim pattern
+cat examples/utils.py                             # get_example() factory to extend
+grep -nA6 "class .*Cfg" src/apeiron/config/configuration.py  # config fields
+```
+Implement exactly the `@abstractmethod`s the ABC declares. The list currently includes `get_optmizer` (spelled that way in the ABC, so match it exactly), `update_data_stream`, `get_stream_dataloader`, `get_hist_dataloaders`, `get_train_dataloaders`, and `get_criterion`. Set `self.eval_metrics` with at least an `accuracy` entry from `apeiron.evaluation.metrics`.
+
+### 3. Scaffold the files
+- `examples/$1/__init__.py` — empty.
+- `examples/$1/model.py` — `BaseModelHarness` subclass calling `super().__init__(cfg=cfg, model=<nn.Module>)`, implementing every abstract method, applying cumulative drift in `update_data_stream()`, and returning `(None, None)` from `get_hist_dataloaders()` on the first task.
+- `examples/$1/utils.py` — dataset loaders, a deterministic drift transform, a `TransformedView` wrapper, and a `make_loader(...)` factory (follow the MNIST utils structure).
+- Config at `$2` (default `examples/$1/$1.toml`) with `[model]`, `[data]` (`name = "$1"`), `[train]`, `[drift_detection]`, optional `[continual_learning]`, `[visualization]`. Read an existing config for the exact key set.
+
+### 4. Register in the factory (in-repo)
+Add a branch to `get_example()` in `examples/utils.py`:
+```python
+elif cfg.data.name == "$1":
+    from examples.$1.model import <HarnessClass>
+    return <HarnessClass>(cfg=cfg)
+```
+
+### 5. Validate
+```bash
+poetry run python -c "import tomllib; tomllib.load(open('$2','rb')); print('TOML OK')"
+poetry run python -c "from examples.utils import get_example; print('factory OK')"
+```
+If `pretrained_path` is set, confirm the file exists; warn if missing (run will train from scratch).
+
+### 6. Smoke-test before the full run
+Run a tiny, fast pass to catch wiring errors cheaply, then **confirm with the user** before the real run:
+```bash
+poetry run python -m src.main --config $2 \
+  --set train.max_iter=2 \
+  --set drift_detection.max_stream_updates=2 \
+  --set drift_detection.detection_interval=1 \
+  --set device=cpu \
+  --set logging.backend=none
+```
+If it fails, read the traceback, fix the harness/config, and re-run the smoke test. Do not proceed until it completes cleanly.
+
+### 7. Full run and report
+```bash
+poetry run python -m src.main --config $2
+```
+Report drift events, final accuracy, and the output CSV path (the config's `visualization.input`). Note the package emits this CSV for inspection; it does not ship a built-in dashboard renderer.
+
+## Notes
+- Registration here uses the in-repo factory pattern. To instead drive apeiron from your *own* project without editing this repo, use the integrate-apeiron skill.
+- The repo also has older `new-harness` / `new-config` skills covering pieces of this; they are stale (pre-`src/apeiron/` layout) and slated for refresh — prefer this skill.
diff --git a/.claude/skills/debug-experiment/SKILL.md b/.claude/skills/debug-experiment/SKILL.md
diff --git a/.claude/skills/explain/SKILL.md b/.claude/skills/explain/SKILL.md
diff --git a/.claude/skills/explore-examples/SKILL.md b/.claude/skills/explore-examples/SKILL.md
@@ -0,0 +1,64 @@
+---
+name: explore-examples
+description: |
+  Run a bundled apeiron example experiment to explore the framework's
+  capabilities. Use when the user wants to try the software, run a default/demo
+  experiment, see drift detection and continual learning in action, or pick from
+  the shipped MNIST/CIFAR configs. Presents a menu of available example configs,
+  runs the chosen one, and reports where the metrics CSV landed. For running the
+  user's OWN data/model/config, use the custom-experiment skill instead.
+argument-hint: "[config_path]"
+user-invocable: true
+allowed-tools:
+  - Bash
+  - Read
+  - Glob
+  - Grep
+---
+
+Run a bundled apeiron example so the user can see the framework working end-to-end.
+
+## Arguments
+- `$1`: Optional path to a specific bundled config. If given, skip the menu and run it directly (still apply steps 3–5). If omitted, present the menu (step 1).
+
+## Procedure
+
+### 1. Build the menu dynamically (do not hardcode the list — it rots)
+Discover the shipped configs and summarize each from its own contents:
+```bash
+find examples -name "*.toml" -type f | sort
+```
+For each config, read the key fields to describe it (`data.name`, `model.name`, `drift_detection.detector_name`, `continual_learning.update_mode`). Present a numbered menu like:
+`1) examples/mnist/mnist.toml — MNIST, ADWIN detector, base updater`
+Then ask the user which to run.
+
+### 2. Default to MNIST; flag missing pretrained weights for others
+- **MNIST is the guaranteed hands-off path** — `examples/mnist/mnist.pth` ships with the repo. Recommend it for a first run.
+- For any non-MNIST choice (e.g. CIFAR), check the config's `pretrained_path` before running:
+  ```bash
+  ls -la <pretrained_path> 2>/dev/null || echo "MISSING"
+  ```
+  If the weight file is missing, tell the user plainly: this example needs weights that don't ship with the repo, so the run will train from scratch (slow) or fail to load. Let them decide whether to continue or switch to MNIST.
+
+### 3. Ask which metrics-logging backend to use (per run)
+The config default is `wandb`. Before running, ask the user to choose, and pass it as an override so no edits are needed:
+- **none** — `--set logging.backend=none` (no account/network; best for a quick local look)
+- **wandb** — `--set logging.backend=wandb` (run `wandb login` first if not authenticated)
+- **mlflow** — `--set logging.backend=mlflow` (local tracking by default)
+
+### 4. Show the config and run it
+- Briefly summarize the chosen config (dataset, model, detector, updater, device, batch size) so the user can confirm.
+- Run from the project root:
+  ```bash
+  poetry run python -m src.main --config <config_path> --set logging.backend=<choice>
+  ```
+- This is a real training/monitoring run and may take a while. Stream output; do not silently background it.
+
+### 5. Report results
+- Summarize from the run output: whether drift was detected and how many times, final accuracy, and the output CSV path (the config's `visualization.input`).
+- The package emits this CSV for inspection; it does not ship a built-in dashboard renderer, so point the user at the CSV for further plotting.
+
+## Notes
+- Quick first run, copy-paste safe: `poetry run python -m src.main --config examples/mnist/mnist.toml --set logging.backend=none`
+- Useful overrides to demonstrate capabilities: `--set drift_detection.detector_name=PageHinkleyDetector`, `--set continual_learning.update_mode=ewc_online`, `--set device=cpu`.
+- If `poetry` isn't set up yet, point the user at the install/dev-setup step first.