
v9 roadmap completion: 48/53 items #125

Merged
toasterbook88 merged 7 commits into main from v9-roadmap-completion
May 18, 2026

@toasterbook88
Owner

This PR completes 48 of 53 items from the AXIS v9 authority-first governance roadmap. It also adds the benchmark evidence behind the decision to keep the remaining 5 Phase G performance items blocked.

What is included

Production Hardening (B)

  • Remove 6 fmt.Printf DEBUG leaks from execution path (internal/execution/guarded.go)
  • Add ClusterState.Version with version-gated migrations (tombstones->failures, heartbeat normalization)
  • Add minimal RepairEvent types in internal/repairs/types.go
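
A minimal sketch of what a version-gated migration can look like, for readers unfamiliar with the pattern. Only the `Version` field and the tombstones->failures migration are named in this PR; the type `clusterState`, the `migration` table, and the version numbers are hypothetical and are not taken from the AXIS source.

```go
package main

// clusterState is a hypothetical stand-in for ClusterState. Only the
// Version field is confirmed by the PR; the other fields are assumptions.
type clusterState struct {
	Version    int
	Tombstones []string
	Failures   []string
}

// migration upgrades a state from version `from` to version `from+1`.
type migration struct {
	from  int
	apply func(*clusterState)
}

// migrations is ordered by `from`, so one pass can chain several upgrades.
var migrations = []migration{
	{from: 0, apply: func(s *clusterState) {
		// Assumed shape of the tombstones->failures migration.
		s.Failures = append(s.Failures, s.Tombstones...)
		s.Tombstones = nil
	}},
}

// runMigrations applies every migration whose `from` matches the current
// version and reports whether the state was changed. A state that is
// already at the latest version is left untouched.
func runMigrations(s *clusterState) bool {
	changed := false
	for _, m := range migrations {
		if s.Version == m.from {
			m.apply(s)
			s.Version = m.from + 1
			changed = true
		}
	}
	return changed
}
```

Because each step is gated on the stored version, running `runMigrations` a second time on the same state is a no-op, which is the property that separates it from maintenance that runs on every cycle.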

Surface Clarity (C)

  • docs/lifecycle.md -- 6-state taxonomy (stable/experimental/scaffolded/dormant/deprecated/internal-only) with inheritance rules
  • Label all 34 internal packages with lifecycle state via doc.go
  • Restructure README into stable vs. experimental sections
  • Fix 6 stale docs (phase-tracking, current-state, architecture, hybrid-ai-router-plan, distributed-cognitive-architecture, AGENTS.md)
  • hack/lifecycle-check.go -- CI enforcement for stable->stable import closure
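
The stable->stable import closure rule can be sketched as follows. This is an illustration of the rule only, not the contents of hack/lifecycle-check.go: the function name `violations`, the map-based inputs, and the choice to ignore unlabeled imports are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// violations returns one message for every stable package that imports a
// labeled package whose lifecycle state is not "stable". Imports with no
// label (for example the standard library) are ignored in this sketch.
func violations(lifecycle map[string]string, imports map[string][]string) []string {
	var out []string
	for pkg, deps := range imports {
		if lifecycle[pkg] != "stable" {
			continue // only stable packages are constrained
		}
		for _, dep := range deps {
			state, labeled := lifecycle[dep]
			if labeled && state != "stable" {
				out = append(out, fmt.Sprintf("%s (stable) imports %s (%s)", pkg, dep, state))
			}
		}
	}
	sort.Strings(out) // deterministic output for CI logs
	return out
}
```

A CI wrapper would typically exit non-zero when the returned slice is non-empty.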

Reservation CLI (D)

  • axis reservations list [--json|--ndjson] -- table or streaming output
  • axis reservations inspect [--json] -- full metadata
  • axis reservations release [--force] [--json] -- safe deletion with confirmation
  • 12 tests in cmd/axis/reservations_test.go
  • docs/reservations.md -- operator-facing semantics
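
For readers comparing the two machine-readable modes, the sketch below shows the usual difference between a single JSON array and NDJSON (one object per line, suitable for streaming). The `reservation` struct and its fields are hypothetical; the actual output schema is defined by the CLI and docs/reservations.md.

```go
package main

import (
	"encoding/json"
	"io"
	"strings"
)

// reservation is a hypothetical record shape used only for illustration.
type reservation struct {
	ID   string `json:"id"`
	Node string `json:"node"`
}

// writeJSON emits the whole list as one JSON array.
func writeJSON(w io.Writer, rs []reservation) error {
	return json.NewEncoder(w).Encode(rs)
}

// writeNDJSON emits one JSON object per line, so a consumer can process
// records as they arrive instead of waiting for the closing bracket.
func writeNDJSON(w io.Writer, rs []reservation) error {
	enc := json.NewEncoder(w)
	for _, r := range rs {
		if err := enc.Encode(r); err != nil {
			return err
		}
	}
	return nil
}

// render runs one of the writers against an in-memory buffer.
func render(write func(io.Writer, []reservation) error, rs []reservation) (string, error) {
	var sb strings.Builder
	if err := write(&sb, rs); err != nil {
		return "", err
	}
	return sb.String(), nil
}
```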

Coverage + Benchmarks (E)

  • cmd/axis/exit_test.go -- 100% coverage
  • internal/versioncmp/versioncmp_test.go -- 100% coverage
  • cmd/axis/summary_test.go -- dashboard contract tests, 0%->90%
  • internal/mcp/server_tools_test.go -- MCP coverage 43.4%->88.7%
  • internal/execution/guarded_test.go -- error branch coverage 60.2%->79.8%
  • internal/facts/collectors_test.go + parsers_test.go -- edge cases
  • 8 placement benchmarks + 4 snapshot benchmarks
  • internal/transport/ssh_bench_test.go -- SSH reuse benchmark

Profiling (F)

  • axis serve --pprof / axis daemon --pprof -- runtime profiling endpoints
  • benchmarks/ -- baseline profiles (placement, snapshot, build)
  • docs/profiling.md -- operator profiling workflow
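
One common way to gate profiling endpoints behind a flag is sketched below, using the standard net/http/pprof handlers on a dedicated mux. The function names, the `/healthz` route, and the `enablePprof` parameter are assumptions for illustration; how `--pprof` is actually wired in AXIS is not shown in this PR description.

```go
package main

import (
	"net/http"
	"net/http/pprof"
)

// newMux builds the server mux. Profiling handlers are registered only
// when enablePprof is true. Importing net/http/pprof also registers its
// handlers on http.DefaultServeMux, so serving a dedicated mux keeps them
// unreachable unless they are added here explicitly.
func newMux(enablePprof bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	if enablePprof {
		mux.HandleFunc("/debug/pprof/", pprof.Index) // also serves heap, goroutine, etc.
		mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
		mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
		mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
		mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	}
	return mux
}

// hasRoute reports whether the mux has a registered pattern for path.
func hasRoute(mux *http.ServeMux, path string) bool {
	req, err := http.NewRequest(http.MethodGet, path, nil)
	if err != nil {
		return false
	}
	_, pattern := mux.Handler(req)
	return pattern != ""
}
```

With the flag on, a CPU profile could then be collected with the standard tooling, for example `go tool pprof http://<host>:<port>/debug/pprof/profile`, where host and port depend on the deployment.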

Structural Cleanup (H)

  • Mesh disable flag via discovery.enabled config (backward-compatible: default ON)
  • Delete dead Fatal() from cmd/axis/exit.go
  • Deduplicate dashboard UI constants into internal/ui/color.go
  • Compile-gate internal/safety/ with !safety_scaffolded build tag + no-op stubs

Authority Audit Docs (A)

11 governance documents establishing canonical ownership, mutation boundaries, and violation detection:

  • docs/authority-reservation.md
  • docs/authority-identity.md
  • docs/authority-freshness.md
  • docs/authority-observations.md
  • docs/authority-execution.md
  • docs/authority-config.md
  • docs/authority-secrets.md
  • docs/authority-cache.md
  • docs/authority-observability.md
  • docs/authority-violations.md
  • docs/authority-transition.md (6-phase transition protocol)

Future Design Docs

  • docs/future/reservation-doctor.md
  • docs/future/consistency-model.md

Metrics

| Metric | Before | After |
| --- | --- | --- |
| Total coverage | 70.4% | 74.1% |
| MCP coverage | 43.4% | 88.7% |
| Dashboard coverage | 0% | 90% |
| Execution coverage | 60.2% | 79.8% |
| Benchmarks | 0 | 10 |
| Authority docs | 0 | 13 |
| Lifecycle labels | 0 | 34 packages + 33 commands/routes |

Phase G -- Performance (BLOCKED)

5 items remain blocked per roadmap discipline (evidence before optimization):

| Item | Status | Rationale |
| --- | --- | --- |
| G-2 SSH cross-cycle reuse | BLOCKED | Benchmark shows AXIS already reuses connections within Collect(). Cross-cycle pooling adds complexity for marginal gain without production profile evidence. |
| G-4 Context caching | BLOCKED | Speculative without profile evidence |
| G-5 Double-buffer snapshot | BLOCKED | High danger; requires immutable snapshot prerequisite |
| G-6 sync.Pool JSON buffers | BLOCKED | Speculative without heap profile evidence |
| G-8 Struct field ordering | BLOCKED | Micro-optimization without allocation proof |

Verification

All changes pass:

  • make build
  • make test (all 34 packages)
  • make test-race
  • make lint
  • make coverage (gates: knowledge 90.9%, api 80.2%, mcp 88.7%, ui 94.0%)
  • go run hack/lifecycle-check.go

William and others added 6 commits May 17, 2026 23:08
…ir events, and tests

B-1: Remove 6 fmt.Printf DEBUG leaks from internal/execution/guarded.go
B-3: Add cmd/axis/exit_test.go with 100% coverage of Error() and ExitCode()
E-1: Add internal/versioncmp/versioncmp_test.go with 100% coverage
B-4: Add internal/repairs/types.go with minimal RepairEvent types
B-5: Add Version field to ClusterState with version-gated migrations
G-3: Separate runMigrations() (version-gated) from runMaintenance() (always runs)

Also update golden files for state JSON output including version field.

All tests pass. Lint passes. Coverage gates pass (70.7%).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add Config.IsMeshEnabled() helper that preserves backward compat:
  mesh defaults ON when discovery config is absent, follows Enabled
  when explicitly configured.
- Gate mesh creation in daemon.NewDefault() behind IsMeshEnabled().
- Gate mesh startup in cmd/axis/serve.go behind IsMeshEnabled().
- Add unit tests for IsMeshEnabled() and NewDefault() mesh behavior.

Closes H-4.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
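
The backward-compatible default described in this commit (mesh ON when the discovery config is absent, otherwise follow Enabled) can be sketched with pointer fields, which let "not set" be distinguished from "set to false". The struct layout below is an assumption; only the `IsMeshEnabled()` name and its behavior come from the commit message.

```go
package main

// DiscoveryConfig is a hypothetical shape for the discovery section.
// Enabled is a pointer so that an omitted key differs from `false`.
type DiscoveryConfig struct {
	Enabled *bool
}

// Config is a hypothetical top-level config; Discovery is nil when the
// section is absent from the file.
type Config struct {
	Discovery *DiscoveryConfig
}

// IsMeshEnabled defaults to true when nothing was configured and follows
// the explicit value otherwise.
func (c *Config) IsMeshEnabled() bool {
	if c == nil || c.Discovery == nil || c.Discovery.Enabled == nil {
		return true // backward-compatible default: mesh ON
	}
	return *c.Discovery.Enabled
}
```

Existing config files that never mention discovery therefore keep their current behavior, and only an explicit `enabled: false` turns the mesh off.
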
Production hardening:
- Remove DEBUG fmt.Printf leaks from execution/guarded.go (B-1)
- Add ClusterState.Version with version-gated migrations (B-5)
- Add minimal RepairEvent types in internal/repairs/types.go (B-4)
- Add mesh disable flag via config Discovery.Enabled (H-4)
- Add pprof endpoints to axis serve --pprof (F-1)
- Delete dead Fatal() code from exit.go (H-6)

Authority audits (10 docs):
- docs/authority-reservation.md (AUTH-1)
- docs/authority-identity.md (AUTH-2)
- docs/authority-freshness.md (AUTH-3)
- docs/authority-observations.md (AUTH-4)
- docs/authority-execution.md (AUTH-5)
- docs/authority-config.md (AUTH-6)
- docs/authority-secrets.md (AUTH-7)
- docs/authority-cache.md (AUTH-8)
- docs/authority-observability.md (AUTH-9)
- docs/authority-violations.md (AUTH-11)

Surface clarity:
- docs/lifecycle.md — taxonomy with inheritance rules (C-1)
- Label all 34 internal packages with lifecycle state (C-2)
- Restructure README into stable/experimental sections (C-5)
- Fix stale docs: phase-tracking, current-state, architecture,
  distributed-cognitive-architecture, hybrid-ai-router-plan,
  AGENTS.md (C-6)

Coverage + benchmarks:
- cmd/axis/exit_test.go — 100% coverage (B-3)
- internal/versioncmp/versioncmp_test.go — 100% coverage (E-1)
- internal/mcp/server_tools_test.go — 43.4% → 88.7% (E-2)
- internal/facts/ tests — 70.7% → 74.0% (E-3)
- internal/execution/guarded_test.go — 60.2% → 79.8% (E-4)
- cmd/axis/summary_test.go — dashboard contract tests (E-5)
- internal/placement/placement_bench_test.go (E-6)
- internal/snapshot/snapshot_bench_test.go (E-6)

Structural:
- examples/ directory with nodes.yaml, basic-usage.md,
  reservations.md, daemon-setup.md (H-1)

All tests pass. Lint passes. Coverage gates pass (70.7%).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Features:
- axis reservations list [--json|--ndjson] (D-1)
- axis reservations inspect <id> [--json] (D-2)
- axis reservations release <id> [--force] [--json] (D-3)
- pprof endpoints via axis serve --pprof (F-1)
- Baseline benchmarks and profiles in benchmarks/ (F-2)
- docs/profiling.md profiling workflow documentation (F-3)

Governance:
- docs/lifecycle.md taxonomy with inheritance rules (C-1)
- Label all 34 internal packages with lifecycle state (C-2)
- Label all CLI commands and API routes (C-3, C-4)
- Restructure README into stable/experimental (C-5)
- Fix stale docs: phase-tracking, current-state, architecture,
  distributed-cognitive-architecture, hybrid-ai-router-plan (C-6)
- hack/lifecycle-check.go CI enforcement (C-7)
- docs/authority-transition.md transition protocol (A-10)

Authority audits (10 docs):
- docs/authority-reservation.md (AUTH-1)
- docs/authority-identity.md (AUTH-2)
- docs/authority-freshness.md (AUTH-3)
- docs/authority-observations.md (AUTH-4)
- docs/authority-execution.md (AUTH-5)
- docs/authority-config.md (AUTH-6)
- docs/authority-secrets.md (AUTH-7)
- docs/authority-cache.md (AUTH-8)
- docs/authority-observability.md (AUTH-9)
- docs/authority-violations.md (AUTH-11)

Structural:
- examples/ directory (H-1)
- docs/decisions/dashboard-command.md → KEEP (H-2)
- docs/decisions/v2-reservations-endpoint.md → STABILIZE (H-3)
- Compile-gate safety/structured with build tag (H-5)
- Delete dead Fatal() code (H-6)
- Deduplicate dashboard UI constants (H-7)
- docs/reservations.md operator documentation (D-4)

Optimizations:
- Precompute rank keys → 54% faster for 50 nodes (G-1)
- Pre-lowercase script registry strings (G-7)

All tests pass. Lint passes. Coverage gates pass (70.7%).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
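
As background on G-1, precomputing rank keys generally means evaluating the ranking function once per node before sorting, rather than inside the comparator where it runs on every comparison. The sketch below shows that general technique only; the `node` fields and the `score` formula are hypothetical and do not reflect the AXIS placement logic.

```go
package main

import "sort"

// node is a hypothetical placement candidate.
type node struct {
	Name    string
	FreeMem int
	Load    float64
}

// score is a made-up ranking function standing in for a costlier one.
func score(n node) float64 {
	return float64(n.FreeMem) / (1 + n.Load)
}

// rankNodes computes each node's key exactly once, sorts by the cached
// key (highest first), and returns the nodes in ranked order. Ties keep
// their input order because the sort is stable.
func rankNodes(nodes []node) []node {
	type keyed struct {
		n   node
		key float64
	}
	ks := make([]keyed, len(nodes))
	for i, n := range nodes {
		ks[i] = keyed{n: n, key: score(n)}
	}
	sort.SliceStable(ks, func(i, j int) bool { return ks[i].key > ks[j].key })
	out := make([]node, len(ks))
	for i, k := range ks {
		out[i] = k.n
	}
	return out
}
```
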
- docs/future/reservation-doctor.md (D-5)
- docs/future/consistency-model.md (H-9)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds internal/transport/ssh_bench_test.go with BenchmarkSSHConnectionReuse
and BenchmarkSSHConnectionReusePerCommand.

Results show ~15x per-command speedup from connection reuse. AXIS already
achieves this within each Collect() (30+ commands over one connection).
Cross-cycle reuse would add complexity for marginal gain in typical
deployments; no profile evidence justifies it.

Also converts test helpers in ssh_lifecycle_test.go to accept testing.TB
so they can be reused from benchmarks.

Phase G status: BLOCKED — insufficient evidence for cross-cycle SSH reuse.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
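
The testing.TB conversion mentioned in this commit works because both *testing.T and *testing.B satisfy that interface. A minimal sketch, with a hypothetical `fakeConn` standing in for a real SSH connection and hypothetical helper names (the actual helpers live in ssh_lifecycle_test.go and are not reproduced here):

```go
package main

import "testing"

// fakeConn is a hypothetical stand-in for an SSH connection.
type fakeConn struct {
	commands int
	closed   bool
}

func (c *fakeConn) Run(cmd string) { c.commands++ }
func (c *fakeConn) Close()         { c.closed = true }

// dialFake accepts testing.TB, so it can be called from tests (*testing.T)
// and from benchmarks (*testing.B) alike.
func dialFake(tb testing.TB) *fakeConn {
	tb.Helper()
	c := &fakeConn{}
	tb.Cleanup(c.Close) // closed automatically when the test or benchmark ends
	return c
}

// benchReuse dials once and runs b.N commands over the same connection,
// which is the reuse pattern the commit message describes.
func benchReuse(b *testing.B) {
	conn := dialFake(b)
	b.ResetTimer() // exclude setup from the measurement
	for i := 0; i < b.N; i++ {
		conn.Run("uptime")
	}
}

// runReuseBenchmark runs the benchmark outside `go test` for a quick check.
func runReuseBenchmark() testing.BenchmarkResult {
	return testing.Benchmark(benchReuse)
}
```
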
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces significant structural and functional improvements to the AXIS CLI and daemon. Key changes include a reservation ledger with associated CLI commands (reservations list, inspect, release), a --pprof flag for daemon profiling, and structured safety evaluation scaffolding. The pull request also formalizes package lifecycle states (stable, experimental, scaffolded, etc.) and introduces a lifecycle-check tool to enforce dependency constraints.

My feedback concerns the Load() function in internal/state/state.go, which now performs maintenance tasks (pruning and reclaiming stale state) and saves the state file if changes are made. This makes Load() non-idempotent and potentially slow for CLI commands. I recommend moving this maintenance to an explicit Maintain() function called only by the daemon or a dedicated maintenance task.

Comment thread internal/state/state.go Outdated
Removes the time-dependent maintenance side effect from state.Load(), making
it idempotent and safe for any caller. Adds public state.Maintain(s) that
performs prune/reclaim/normalize and returns whether state was modified.

Callers updated:
- daemon refresh loop: Load + Maintain + Save
- daemon Meta(): Load + Maintain
- axis task context: Load + Maintain (in-memory cleanup for display)
- axis context show: Load + Maintain (in-memory cleanup for display)
- degraded contract tests: Load + Maintain

Fixes 5 failing tests that previously relied on Load() doing maintenance.
Addresses PR review feedback from gemini-code-assist.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
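
The split described in this commit can be sketched as below: loading has no side effects, maintenance is an explicit in-memory step that reports whether anything changed, and the caller decides whether to persist. The types, the `now` parameter, the expiry rule, and the save-only-when-changed policy are assumptions made to keep the example self-contained; the PR's `state.Maintain(s)` takes only the state.

```go
package main

// entry and state are hypothetical stand-ins for the real state types.
type entry struct {
	ID        string
	ExpiresAt int64 // Unix seconds
}

type state struct {
	Entries []entry
}

// maintain prunes expired entries in memory and reports whether the state
// was modified. It never touches disk.
func maintain(s *state, now int64) bool {
	kept := s.Entries[:0]
	for _, e := range s.Entries {
		if e.ExpiresAt > now {
			kept = append(kept, e)
		}
	}
	changed := len(kept) != len(s.Entries)
	s.Entries = kept
	return changed
}

// refresh mirrors the daemon pattern Load + Maintain + Save. In this
// sketch the save is skipped when maintenance changed nothing.
func refresh(load func() (*state, error), save func(*state) error, now int64) error {
	s, err := load()
	if err != nil {
		return err
	}
	if maintain(s, now) {
		return save(s)
	}
	return nil
}
```

Read-only callers such as the display commands would call the load step and `maintain` but never the save step, so their cleanup stays in memory.
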
@toasterbook88 toasterbook88 merged commit 09da0ef into main May 18, 2026
8 checks passed
@toasterbook88 toasterbook88 deleted the v9-roadmap-completion branch May 18, 2026 12:38
toasterbook88 added a commit that referenced this pull request May 18, 2026
…#126)

Addresses review feedback on PR #125.

- state.Load() no longer performs maintenance or re-saves
- Maintenance is now explicit via state.Maintain()
- Daemon, CLI, and tests updated to call Maintain() after Load()

All tests, race tests, lint, and coverage gates pass.

---------

Co-authored-by: William <cranium@cranium.lan>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>