Build Shared CI Workflow for Agent Runs #2854

@FScholPer

Goal

Provide reusable workflow/action(s) for agent execution and validation.

Why

Developers need one standard pipeline shape across repos, with clearly defined required checks and artifact outputs.

Tasks

  • Add reusable workflow for the mandatory Lane A path.
  • Integrate stages: build -> test -> policy -> artifact publish.
  • Normalize failure reasons and status names across repos.
  • Document adoption steps for maintainers.
  • Add optional Lane B integration hooks that never affect merge eligibility.
  • Add a lightweight candidate validation step before any expensive benchmark or full task-set run.
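The stage ordering and the Lane A / Lane B split above could be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: the `Status` values, the `StageResult` shape, the reason strings, and the function names are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Status(str, Enum):
    # Hypothetical normalized status names shared across repos.
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


@dataclass
class StageResult:
    name: str
    lane: str
    status: Status
    reason: str = ""  # normalized failure reason, empty on success


# Lane A stages in the order given in the task list.
LANE_A_STAGES = ["build", "test", "policy", "artifact-publish"]


def run_lane_a(stages: Dict[str, Callable[[], None]]) -> List[StageResult]:
    """Run mandatory stages in order; skip everything after the first failure."""
    results: List[StageResult] = []
    blocked = False
    for name in LANE_A_STAGES:
        if blocked:
            results.append(StageResult(name, "A", Status.SKIPPED, "upstream-failure"))
            continue
        try:
            stages[name]()  # a missing stage raises KeyError and counts as failed
            results.append(StageResult(name, "A", Status.PASSED))
        except Exception as exc:
            blocked = True
            results.append(StageResult(name, "A", Status.FAILED, type(exc).__name__))
    return results


def run_lane_b(hooks: Dict[str, Callable[[], None]]) -> List[StageResult]:
    """Run optional hooks; failures are recorded but never block other hooks."""
    results: List[StageResult] = []
    for name, hook in hooks.items():
        try:
            hook()
            results.append(StageResult(name, "B", Status.PASSED))
        except Exception as exc:
            results.append(StageResult(name, "B", Status.FAILED, type(exc).__name__))
    return results


def merge_eligible(results: List[StageResult]) -> bool:
    """Merge eligibility depends only on Lane A results."""
    lane_a = [r for r in results if r.lane == "A"]
    return bool(lane_a) and all(r.status is Status.PASSED for r in lane_a)
```

Because `merge_eligible` filters on `lane == "A"`, a failing or absent Lane B hook cannot change the outcome, which is the property the optional-hooks task asks for.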

Required Pipeline Outputs

  • Policy decision artifact
  • Test summary artifact
  • Traceability/quality metrics artifact
  • Run metadata (commit, lane, tool versions)
  • Validation status artifact for the cheap pre-benchmark candidate check
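As one illustration of the run metadata output, the sketch below writes commit, lane, and tool versions to a JSON file. The file name `run-metadata.json`, the key names, and the function signature are assumptions for this example; `GITHUB_SHA` is the commit variable GitHub Actions sets for a run, and the fallback value `"unknown"` is likewise an assumption.

```python
import json
import os
import platform
from pathlib import Path
from typing import Dict, Optional


def write_run_metadata(
    out_dir: str,
    lane: str,
    tool_versions: Optional[Dict[str, str]] = None,
    commit: Optional[str] = None,
) -> Path:
    """Write a small JSON artifact describing the run (hypothetical schema)."""
    metadata = {
        # Prefer an explicit commit, then the CI-provided SHA, then a placeholder.
        "commit": commit or os.environ.get("GITHUB_SHA", "unknown"),
        "lane": lane,
        # Always record the interpreter version; callers add other tools.
        "tool_versions": {"python": platform.python_version(), **(tool_versions or {})},
    }
    path = Path(out_dir) / "run-metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys keep the artifact stable across runs for easy diffing.
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True))
    return path
```

A workflow step could call `write_run_metadata("artifacts", lane="A")` and upload the resulting file alongside the other artifacts.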

Docs-as-code pilot tasks

  • Implement workflow job sequence using existing traceability scripts.
  • Upload metrics and gate outputs as artifacts.
  • Expose check names intended for branch protection.
  • Add a fast harness validation step: import candidate, instantiate harness, run one tiny task spec, verify expected trace filenames.
  • Reuse the pilot's small summary-first trace layout so failed tasks can be inspected without replaying a whole run.
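The fast harness validation step (import candidate, instantiate harness, run one tiny task spec, verify expected trace filenames) might look something like the sketch below. The candidate interface is an assumption: it presumes each candidate module exposes a `Harness` class with a `run(task_spec, out_dir)` method, and the trace filenames `summary.json` and `trace.jsonl` are placeholders rather than the pilot's actual layout.

```python
import importlib
from pathlib import Path
from typing import Tuple

# Hypothetical trace filenames; replace with the pilot's real layout.
EXPECTED_TRACE_FILES = {"summary.json", "trace.jsonl"}

# Hypothetical tiny task spec, meant to finish in seconds.
TINY_TASK = {"id": "smoke-1", "prompt": "noop"}


def validate_candidate(module_name: str, out_dir: str) -> Tuple[bool, str]:
    """Cheap pre-benchmark check. Returns (ok, normalized_reason)."""
    # 1. Import the candidate.
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:
        return False, f"import-failed:{type(exc).__name__}"

    # 2. Instantiate the harness (assumed class name: Harness).
    harness_cls = getattr(module, "Harness", None)
    if harness_cls is None:
        return False, "missing-harness"
    try:
        harness = harness_cls()
    except Exception as exc:
        return False, f"instantiate-failed:{type(exc).__name__}"

    # 3. Run one tiny task spec.
    try:
        harness.run(TINY_TASK, out_dir)
    except Exception as exc:
        return False, f"run-failed:{type(exc).__name__}"

    # 4. Verify the expected trace filenames were produced.
    out_path = Path(out_dir)
    produced = {p.name for p in out_path.iterdir()} if out_path.is_dir() else set()
    missing = sorted(EXPECTED_TRACE_FILES - produced)
    if missing:
        return False, "missing-traces:" + ",".join(missing)
    return True, "ok"
```

Returning a short reason string instead of raising keeps the result easy to write into a validation status artifact, and malformed candidates stop here before any benchmark time is spent.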

Done When

  • At least 2 pilot repos run the shared workflow end-to-end.
  • Required check names are stable and documented.
  • Lane B can be disabled without breaking merge-critical checks.
  • Malformed candidates fail in the cheap validation stage before they consume benchmark time.
  • The pilot workflow emits enough structured trace metadata for agents and maintainers to inspect failures from an index before drilling into per-task artifacts.
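The last criterion, inspecting failures from an index before opening per-task artifacts, could be sketched as below. The layout (`index.json` at the run root, one `tasks/<id>/trace.json` per task) and the field names are assumptions for illustration; the pilot's actual summary-first layout may differ.

```python
import json
from pathlib import Path
from typing import Dict, List


def write_index(run_dir: str, results: List[Dict[str, str]]) -> Path:
    """Write per-task traces plus a small top-level index (hypothetical layout)."""
    root = Path(run_dir)
    tasks = []
    for result in results:
        task_dir = root / "tasks" / result["id"]
        task_dir.mkdir(parents=True, exist_ok=True)
        trace_path = task_dir / "trace.json"
        trace_path.write_text(json.dumps(result))
        # The index holds only what is needed to triage a run.
        tasks.append({
            "id": result["id"],
            "status": result["status"],
            "reason": result.get("reason", ""),
            "trace": trace_path.relative_to(root).as_posix(),
        })
    index_path = root / "index.json"
    index_path.write_text(json.dumps({"tasks": tasks}, indent=2))
    return index_path


def failed_tasks(run_dir: str) -> List[Dict[str, str]]:
    """List non-passing tasks by reading only the index, not the traces."""
    index = json.loads((Path(run_dir) / "index.json").read_text())
    return [task for task in index["tasks"] if task["status"] != "passed"]
```

An agent or maintainer would call `failed_tasks` first and open only the `trace` paths it returns, so a large run never needs to be replayed or fully loaded to find what broke.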

Parent: #2852

Status: Backlog
