feat: capture campaign metadata in enriched campaign.yaml at init #116

Open
susiejojo wants to merge 1 commit into
AI-native-Systems-Research:main from
susiejojo:feat/campaign-metadata-enrichment

@susiejojo

Summary

  • At campaign init, copies campaign.yaml into the work directory enriched with a runtime: block (target_repo, target_commit, nous_version, started_at)
  • Adds optional metadata field to campaign schema for user-supplied tags/goals
  • Each git/importlib call is individually failure-tolerant — missing git or non-repo targets degrade gracefully to null

Related Issue

Closes #115

Design

One enriched file per run captures everything — original config, user metadata, and runtime context. The source campaign.yaml is never modified. The enriched copy is only written on fresh init (not resume).

# Written to .nous/<run_id>/campaign.yaml
runtime:
  target_repo: "AI-native-Systems-Research/inference-sim"
  target_commit: "abc123..."
  nous_version: "0.1.0"
  started_at: "2026-05-21T13:20:00+00:00"
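
The enrichment step described above can be sketched as follows. This is a minimal illustration, not the PR's code: the function names `build_runtime_block` and `enrich_campaign` and the sample `name` field are hypothetical, and serialization to YAML is left out so the sketch stays stdlib-only.

```python
from datetime import datetime, timezone


def build_runtime_block(target_repo, target_commit, nous_version, now=None):
    """Assemble the runtime mapping; any captured field may be None."""
    now = now or datetime.now(timezone.utc)
    return {
        "target_repo": target_repo,
        "target_commit": target_commit,
        "nous_version": nous_version,
        # Timezone-aware ISO 8601, matching the started_at format shown above.
        "started_at": now.isoformat(timespec="seconds"),
    }


def enrich_campaign(campaign, runtime):
    """Return a shallow copy with a runtime key; the input dict is not modified."""
    enriched = dict(campaign)
    enriched["runtime"] = runtime
    return enriched


# Hypothetical campaign content, for illustration only.
source = {"name": "demo", "metadata": {"tags": ["baseline"]}}
fixed_now = datetime(2026, 5, 21, 13, 20, tzinfo=timezone.utc)
enriched = enrich_campaign(
    source, build_runtime_block(None, None, "0.1.0", now=fixed_now)
)
```

Because the copy is shallow, nested values such as `metadata` are shared with the source dict, which is fine as long as the copy is serialized immediately and then discarded.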

Test Plan

  • Enriched campaign.yaml written with runtime block (6 new tests)
  • User metadata passes through
  • No overwrite on resume
  • Graceful degradation without git
  • Real git repo captures target_commit
  • No enriched copy when campaign_path omitted
  • Full test suite passes (325 tests, 0 failures)

🤖 Generated with Claude Code

@susiejojo susiejojo force-pushed the feat/campaign-metadata-enrichment branch from 9e59a14 to c6c3d90 Compare May 21, 2026 19:15
@mtoslalibu
Collaborator

PR Review — Summary of Findings

Critical (1)

URL parsing bug — HTTPS GitHub URLs misparse

orchestrator/iteration.py in _capture_runtime_meta: An HTTPS remote like https://github.com/org/repo.git satisfies ":" in remote and "github.com" in remote (because of the : in https://), enters the first branch, and split(":")[-1] produces //github.com/org/repo.git — not org/repo.

Fix: Use remote.startswith("git@github.com:") for the SSH branch, or use a regex.
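
To make the failure concrete, here is a small sketch that reproduces the condition described above and one possible fix. `parse_buggy` and `parse_fixed` are hypothetical names; the real `_capture_runtime_meta` may be structured differently, and stripping the `.git` suffix and returning `None` for an empty remote are assumptions.

```python
import re


def parse_buggy(remote):
    # Condition from the review: any ":" plus "github.com" takes the SSH branch.
    if ":" in remote and "github.com" in remote:
        return remote.split(":")[-1]
    return remote


_SSH_PREFIX = "git@github.com:"
_HTTPS_RE = re.compile(r"^https://github\.com/(?P<slug>[^/]+/[^/]+?)(?:\.git)?/?$")


def parse_fixed(remote):
    """Return org/repo for GitHub remotes, the raw URL otherwise, None if empty."""
    if remote.startswith(_SSH_PREFIX):
        slug = remote[len(_SSH_PREFIX):]
        return slug[:-4] if slug.endswith(".git") else slug
    match = _HTTPS_RE.match(remote)
    if match:
        return match.group("slug")
    return remote or None


print(parse_buggy("https://github.com/org/repo.git"))  # //github.com/org/repo.git
print(parse_fixed("https://github.com/org/repo.git"))  # org/repo
print(parse_fixed("git@github.com:org/repo.git"))      # org/repo
```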


Important (3)

  1. Revert example file change: examples/campaign.yaml adds an optional metadata block. Examples should be minimal; the schema already documents the field.

  2. Unprotected yaml.safe_dump / atomic_write — If campaign data is non-serializable or disk fails, campaign init crashes with an unhelpful traceback. Wrap in try/except with a clear error message.

  3. stderr=subprocess.DEVNULL hides diagnostic info — User sees "could not capture X" but no reason why. Consider capture_output=True and logging stderr in the warning.
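
For item 3, one way to surface the reason for a failure is sketched below. The helper `run_git`, the logger name, and the timeout value are hypothetical; only `subprocess.run` with `capture_output=True` comes from the suggestion itself.

```python
import logging
import subprocess

logger = logging.getLogger("nous.sketch")  # hypothetical logger name


def run_git(args, cwd=None, timeout=5):
    """Return stripped stdout, or None after logging why the command failed."""
    cmd = ["git", *args]
    try:
        result = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True,
            timeout=timeout, check=True,
        )
    except OSError as exc:
        # git is not installed, or cwd does not exist.
        logger.warning("could not run %s: %s", " ".join(cmd), exc)
        return None
    except subprocess.TimeoutExpired:
        logger.warning("%s timed out after %ss", " ".join(cmd), timeout)
        return None
    except subprocess.CalledProcessError as exc:
        # Include git's own stderr so the user sees the reason, not just the field.
        logger.warning(
            "%s failed (exit %s): %s",
            " ".join(cmd), exc.returncode, (exc.stderr or "").strip(),
        )
        return None
    return result.stdout.strip() or None
```

A caller would then use something like `run_git(["rev-parse", "HEAD"], cwd=repo_path)` and store the result, which is `None` on any failure.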


Design Question: GitHub-centric target_repo field

The URL parsing logic is GitHub-specific. For non-GitHub remotes, the else branch stores the raw remote URL. For local paths that aren't git repos at all, both target_repo and target_commit gracefully degrade to null (no crash). But the field name target_repo implies an org/repo identifier — this is misleading for non-GitHub users. Consider falling back to repo_path itself as the identifier when there's no parseable remote.
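
A possible fallback order for the identifier is sketched here. `choose_target_repo` is a hypothetical helper, and preferring the raw remote over the local path is an assumption about what the field should hold, not something the PR specifies.

```python
from pathlib import Path


def choose_target_repo(slug, remote, repo_path):
    """Pick the most specific identifier available, or None if there is none."""
    if slug:                    # parsed org/repo from a recognized remote
        return slug
    if remote:                  # unrecognized host: keep the raw remote URL
        return remote
    if repo_path is not None:   # no remote at all: fall back to the local path
        return str(Path(repo_path).resolve())
    return None
```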


Test Gaps

  • Remote URL parsing untested — existing git-repo test has no remote. Need parametrized tests for SSH, HTTPS, ssh:// format, non-GitHub, and empty string.
  • nous_version git-SHA fallback untested — no test forces PackageNotFoundError and verifies the fallback.
  • Enrichment with repo_path set untested — all tests use repo_path=None, not the production path (.nous/<run_id>/).
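
The nous_version fallback gap could be covered along these lines. This is a sketch under assumptions: `resolve_version` is a hypothetical helper, and the `fallback` callable stands in for whatever git-SHA lookup the PR performs. Passing a distribution name that is not installed forces `PackageNotFoundError` without any mocking.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name, fallback):
    """Return the installed version, else the fallback's result, else None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        pass
    try:
        return fallback()
    except Exception:
        # The fallback is failure-tolerant too: degrade to None.
        return None


def failing_fallback():
    raise RuntimeError("git unavailable")


# A name assumed not to be installed, so the fallback path runs.
missing = "no-such-distribution-for-this-sketch"
from_fallback = resolve_version(missing, lambda: "abc1234")
from_failure = resolve_version(missing, failing_fallback)
```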

Acceptable / No Action Needed

| Item | Verdict |
| --- | --- |
| Shallow copy (dict(campaign)) | Safe; immediately serialized and discarded |
| Subprocess latency (up to 3 git calls) | Runs once on fresh init only |
| Nested try/except pattern | Correct for independent-degradation design |
| Function size (~40 lines) | Single responsibility, don't split |

Recommendation

Request changes for the URL parsing bug and example revert. The rest is solid.

At campaign init, writes an enriched copy of campaign.yaml into the work
directory with a `runtime:` block containing target_repo, target_commit,
nous_version, and started_at. Also adds optional `metadata` field to
campaign schema for user-supplied tags/goals.

Each git/importlib call is individually failure-tolerant — missing git
or non-repo targets degrade gracefully to null. The enriched copy is only
written on fresh init (not resume).

Closes AI-native-Systems-Research#115

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@susiejojo susiejojo force-pushed the feat/campaign-metadata-enrichment branch from c6c3d90 to 476474e Compare May 22, 2026 14:01
@susiejojo
Author

Addressed in 476474e:

  • URL parsing bug: Fixed — SSH branch now uses remote.startswith("git@github.com:") instead of generic ":" in remote
  • Example reverted: Removed metadata block from examples/campaign.yaml
  • Enrichment write protected: Wrapped in try/except (OSError, yaml.YAMLError) with logger.warning
  • Test gaps filled: Added parametrized URL parsing tests (7 cases), nous_version git-SHA fallback test, and repo_path enrichment test
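
The protected write, combined with the no-overwrite-on-resume rule, might look roughly like this. It is a stdlib-only stand-in: `write_enriched_once` and the logger name are hypothetical, the temp-file-plus-`os.replace` step approximates whatever `atomic_write` does, and the text is assumed to be serialized already (in the PR, `yaml.YAMLError` would be caught around the `yaml.safe_dump` call as well).

```python
import logging
import os
import tempfile
from pathlib import Path

logger = logging.getLogger("nous.sketch")  # hypothetical logger name


def write_enriched_once(dest, text):
    """Write text to dest atomically unless dest exists; return True if written."""
    dest = Path(dest)
    if dest.exists():
        return False  # resume: keep the copy written at the original init
    try:
        dest.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                handle.write(text)
            os.replace(tmp, dest)  # atomic rename within the same directory
        except OSError:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
    except OSError as exc:
        # Init continues without the enriched copy instead of crashing.
        logger.warning("could not write enriched campaign file %s: %s", dest, exc)
        return False
    return True


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / ".nous" / "run1" / "campaign.yaml"
    first = write_enriched_once(target, "runtime:\n  nous_version: '0.1.0'\n")
    second = write_enriched_once(target, "runtime: {}\n")
    kept = target.read_text(encoding="utf-8")
```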

Left stderr=DEVNULL as-is — the warning already identifies which field/path failed, and raw git stderr adds noise the user can reproduce themselves.


Development

Successfully merging this pull request may close these issues.

Capture campaign metadata in state.json at init time
