8 changes: 8 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,8 @@
# EvalOps org-default changes are broad by definition.
* @haasonsaas

# Workflow, contract, and helper-script changes deserve extra operator attention.
.github/contracts/ @haasonsaas
.github/scripts/ @haasonsaas
.github/workflows/ @haasonsaas
services.yaml @haasonsaas
201 changes: 201 additions & 0 deletions .github/contracts/engineering-practices.yml
@@ -0,0 +1,201 @@
schema_version: evalops.engineering_practices.v1
contract_id: evalops.github.engineering-practices
owner_repo: evalops/.github
status: proposed
workflow:
  name: engineering-practices-control-plane
  correctness_model: >
    EvalOps engineering practices are correct when they are backed by live
    GitHub evidence, scoped by repository tier, and connected to a runnable
    check or adoption ledger instead of living only as prose.
  threat_model: >
    The highest-risk failure mode is high-throughput agent-assisted change
    landing without a durable review, release, security, or evidence contract.
    Audits must degrade to non-mutating reports when credentials are missing
    and must never publish a green report from partial or empty data.
source_records:
  - id: evalops.github.engineering-practices.source.org-contract
    path: profile/ENGINEERING_PRACTICES.md
    digest: sha256
  - id: evalops.github.engineering-practices.source.service-catalog
    path: services.yaml
    digest: sha256
  - id: evalops.github.engineering-practices.source.control-plane-readme
    path: README.md
    digest: sha256
repo_tiers:
  critical:
    description: "Product, runtime, deployment, and org-control-plane repos that should block merges on practice drift."
    repos:
      - evalops/platform
      - evalops/deploy
      - evalops/ensemble
      - evalops/maestro-internal
      - evalops/maestro
      - evalops/cerebro
      - evalops/chat
      - evalops/.github
    required_controls:
      - org-rulesets
      - agent-review-lane
      - backlog-lifecycle
      - release-train-state
      - security-slo
      - operating-rails
      - evidence-first-done
  standard:
    description: "Actively maintained product, SDK, data, and infrastructure repos that should report drift before enforcement."
    repos:
      - evalops/hopper
      - evalops/nimbus
      - evalops/kestrel
      - evalops/diffscope
      - evalops/conductor
      - evalops/console
      - evalops/eval2otel
      - evalops/agent-pm
    required_controls:
      - backlog-lifecycle
      - security-slo
      - operating-rails
      - evidence-first-done
  experimental:
    description: "Research and spike repos where lightweight reporting is preferred over blocking policy."
    repos: []
    required_controls:
      - operating-rails
practices:
  - id: org-rulesets
    title: "GitHub-native rulesets for repo tiers"
    why: "Branch protection is currently repo-local and uneven; org rulesets give EvalOps a central merge-safety contract."
    adoption: "Start in evaluate mode for critical repos, then promote required checks once each repo has the matching workflows."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Org Rulesets"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
      - .github/workflows/engineering-practices-audit.yml
    signals:
      - org_ruleset_count
      - protected_critical_repos
  - id: backlog-lifecycle
    title: "Generated backlog lifecycle"
    why: "Guardrail and conformance issues are useful only when fingerprints, ownership, and close conditions stay machine-readable."
    adoption: "Require generated backlog issues to carry a class key, source fingerprints, last-seen window, and explicit close evidence."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Backlog Lifecycle"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
      - .github/scripts/sweep-recent-review-feedback.rb
    signals:
      - open_guardrail_backlog_issues
      - stale_closing_comments
  - id: release-train-state
    title: "Release-train state machine"
    why: "Repeated hold and image-sync PRs should converge on a single desired-state record instead of multiplying operational PRs."
    adoption: "Track one active train record per environment with owner, TTL, receipt, rollback receipt, and idempotent PR updates."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Release Trains"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
    signals:
      - deploy_release_train_duplicate_prs
      - deploy_image_sync_prs
  - id: agent-review-lane
    title: "Required agent review lane"
    why: "Agent-assisted throughput is high enough that review-thread closure, EvalOpsBot review, and CODEOWNERS need to be standard rails."
    adoption: "Critical repos require EvalOpsBot review request plumbing, review-thread guard, CODEOWNERS, and stable check contexts."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Agent Review"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
      - .github/scripts/verify-evalopsbot-review-setup.rb
    signals:
      - evalopsbot_workflow_adoption
      - review_thread_guard_adoption
      - codeowners_adoption
  - id: security-slo
    title: "Security remediation SLOs"
    why: "Security defaults exist, but open alerts need explicit tiered owners, burn-down windows, and suppression evidence without enabling expensive default scanners."
    adoption: "Critical repos should track critical/high Dependabot and secret-scanning alerts against age-based SLOs. CodeQL and GitHub default code scanning are explicitly not part of this baseline."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Security SLO"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
    signals:
      - dependabot_open_alerts
      - secret_scanning_open_alerts
  - id: operating-rails
    title: "Repo operating rails by class"
    why: "AGENTS.md, CODEOWNERS, dependency policy, Codex rails, Pysa, and runner-label config should be applied by repo class, not memory."
    adoption: "Critical repos get the full rail set; standard repos report missing rails until promoted."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Operating Rails"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
      - .github/workflows/codex-rails-check.yml
    signals:
      - agents_adoption
      - codex_rails_adoption
      - dependency_policy_adoption
      - runner_label_config_adoption
  - id: evidence-first-done
    title: "Evidence-first definition of done"
    why: "EvalOps sells governance and operational proof; engineering changes should leave smoke evidence, artifact receipts, and withheld-data notes."
    adoption: "Every critical repo PR should connect user-visible changes to smoke fixtures, artifact receipts, telemetry, and rollback evidence."
    source:
      path: profile/ENGINEERING_PRACTICES.md
      heading: "Evidence First"
    checked_by:
      - .github/scripts/audit-engineering-practices.rb
      - .github/pull_request_template.md
    signals:
      - pr_template_evidence_checklist
      - runtime_smoke_guardrail_backlog
live_audit:
  owner: evalops
  sampled_repos:
    - evalops/platform
    - evalops/deploy
    - evalops/ensemble
    - evalops/maestro-internal
    - evalops/maestro
    - evalops/cerebro
    - evalops/chat
    - evalops/.github
    - evalops/hopper
    - evalops/nimbus
    - evalops/kestrel
  required_files:
    critical:
      - AGENTS.md
      - .github/CODEOWNERS
      - .github/workflows/review-thread-guard.yml
      - .github/workflows/evalopsbot-review-request.yml
      - .github/workflows/codex-rails-check.yml
    standard:
      - AGENTS.md
  issue_queries:
    guardrail_candidate: 'org:evalops is:issue is:open archived:false "Guardrail candidate" in:title'
    acceptance_harness: 'org:evalops is:issue is:open archived:false "Add a research-backed acceptance harness" in:title'
    conformance_contract: 'org:evalops is:issue is:open archived:false "Promote latent specs into a documented conformance contract" in:title'
    provenance_evidence: 'org:evalops is:issue is:open archived:false "Make provenance and evidence traceability first-class" in:title'
    telemetry_slo: 'org:evalops is:issue is:open archived:false "Expose operational telemetry and SLO gates" in:title'
  release_train_queries:
    deploy_hold_prs: 'repo:evalops/deploy is:pr is:merged merged:>=2026-05-06 "Hold prod-continuous release train" in:title'
    deploy_image_sync_prs: 'repo:evalops/deploy is:pr is:merged merged:>=2026-05-06 "sync" "image" in:title'
  security_alert_slo:
    critical_days: 1
    high_days: 7
    medium_days: 30
    excluded_scanners:
      - codeql
      - github-code-scanning-default-setup
commands:
  local_contract_check: "ruby .github/scripts/audit-engineering-practices.rb --contract-only"
  live_report: "ruby .github/scripts/audit-engineering-practices.rb --json-output engineering-practices-audit.json --markdown-output engineering-practices-audit.md"
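
Reviewer note: the contract above implies two checks that are easy to state in code. First, every entry in a tier's `required_controls` should name a practice `id` that actually exists. Second, an open alert breaches its SLO once it is older than the `*_days` value for its severity. The sketch below is illustrative only and is not the logic in `audit-engineering-practices.rb`, which this diff does not show. The function names `unknown_controls` and `slo_breached` are hypothetical, the data is a hand-trimmed copy of the contract written as plain Python structures so no YAML parser is needed, and treating "older than N days" as a strict comparison is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Practice ids copied from the `practices` list in the contract.
PRACTICE_IDS = {
    "org-rulesets", "backlog-lifecycle", "release-train-state",
    "agent-review-lane", "security-slo", "operating-rails",
    "evidence-first-done",
}

# `required_controls` per tier, copied from `repo_tiers`.
REQUIRED_CONTROLS = {
    "critical": [
        "org-rulesets", "agent-review-lane", "backlog-lifecycle",
        "release-train-state", "security-slo", "operating-rails",
        "evidence-first-done",
    ],
    "standard": [
        "backlog-lifecycle", "security-slo", "operating-rails",
        "evidence-first-done",
    ],
    "experimental": ["operating-rails"],
}

# Age limits copied from `security_alert_slo`.
SECURITY_ALERT_SLO = {"critical_days": 1, "high_days": 7, "medium_days": 30}


def unknown_controls(required_controls, practice_ids):
    """Return {tier: [control, ...]} for controls with no matching practice id."""
    problems = {}
    for tier, controls in required_controls.items():
        missing = sorted(set(controls) - set(practice_ids))
        if missing:
            problems[tier] = missing
    return problems


def slo_breached(severity, opened_at, now, slo=SECURITY_ALERT_SLO):
    """True when an alert is older than the limit for its severity.

    Severities without a `<severity>_days` entry (for example "low") are
    treated as having no SLO. That is an assumption of this sketch.
    """
    limit_days = slo.get(f"{severity}_days")
    if limit_days is None:
        return False
    return now - opened_at > timedelta(days=limit_days)


if __name__ == "__main__":
    print(unknown_controls(REQUIRED_CONTROLS, PRACTICE_IDS))
    now = datetime(2026, 5, 10, tzinfo=timezone.utc)
    opened = datetime(2026, 5, 8, tzinfo=timezone.utc)
    print(slo_breached("critical", opened, now))
    print(slo_breached("high", opened, now))
```

A contract-only check along these lines needs no credentials, which matches the threat model's requirement that audits degrade to non-mutating reports when credentials are missing.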