Control self-hosted runner lanes #414

@cbusillo

Objective

Let Launchplane eventually manage self-hosted GitHub Actions runner lanes instead of only observing them.

Finish Line

Launchplane safely manages the runner lane lifecycle from an explicit policy.

Current Status

State: Future work; intentionally parked until runner lane inventory exists and merge-train behavior proves runner capacity is worth automating.
Next action: Complete #413 first, then design runner-control policy and host executor boundaries.
Blocked by: #413.
Last verified: 2026-05-07 after manually adding chris-testing-sellyouroutboard-3.

Scope

  • Desired runner lane counts per repo/host/pool.
  • Runner provisioning/registration and service/container lifecycle.
  • Draining lanes before restart/removal.
  • Health checks and automatic restart for unhealthy lanes.
  • Label management for lane identity and capability labels.
  • Guardrails so Launchplane cannot delete or reconfigure unmanaged runners by accident.
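
A minimal sketch of what a policy record covering these scope items might look like. Everything here is an assumption for illustration: the `LanePolicy` name, its fields, the `<pool>-<index>` identity label scheme, and the `validate` helper are hypothetical and not taken from Launchplane. Only the `self-hosted` label is the default GitHub applies to self-hosted runners.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanePolicy:
    """Desired lanes for one repo/host/pool combination (hypothetical shape)."""

    repo: str
    host: str
    pool: str
    desired_lanes: int
    capability_labels: tuple[str, ...] = ()
    managed: bool = False  # observe-only unless explicitly opted in

    def lane_labels(self, index: int) -> list[str]:
        """Return the identity label plus capability labels for lane `index`."""
        identity = f"{self.pool}-{index}"  # assumed naming scheme
        return ["self-hosted", identity, *self.capability_labels]


def validate(policy: LanePolicy) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if policy.desired_lanes < 0:
        errors.append("desired_lanes must be non-negative")
    if not policy.pool:
        errors.append("pool must be non-empty")
    return errors
```

Keeping `managed` false by default means a policy entry can describe a lane for inventory purposes without granting Launchplane permission to change it.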

Acceptance Criteria

  • Launchplane has a policy model for desired runner lanes and allowed hosts.
  • Runner control requires explicit opt-in per host and repo.
  • Launchplane can create a new runner lane with canonical labels and verify GitHub sees it online.
  • Launchplane can drain a lane, wait for no active job, restart it, and verify it returns online.
  • Launchplane can remove only lanes it owns and leaves manual/unmanaged lanes untouched.
  • All mutating actions support dry-run and audit records.
  • A failed action leaves the system in a safe state and records a clear recovery comment or log entry.
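
The ownership, dry-run, and audit criteria above could be approached with a planner that only ever proposes changes to lanes carrying an ownership marker. This is a sketch under stated assumptions: the `launchplane-managed` marker label, the `Lane` and `Action` types, and the `plan` and `apply` functions are hypothetical, and the real ownership signal may well live somewhere other than a label.

```python
from dataclasses import dataclass
from typing import Callable

OWNERSHIP_LABEL = "launchplane-managed"  # hypothetical ownership marker


@dataclass(frozen=True)
class Lane:
    name: str
    labels: frozenset[str]
    busy: bool = False


@dataclass(frozen=True)
class Action:
    kind: str  # "create", "remove", "drain", or "skip"
    lane: str
    reason: str


def plan(desired: int, observed: list[Lane], pool: str) -> list[Action]:
    """Compare desired count with owned lanes; unmanaged lanes are only skipped."""
    actions = [
        Action("skip", lane.name, "unmanaged lane")
        for lane in observed
        if OWNERSHIP_LABEL not in lane.labels
    ]
    owned = sorted(
        (lane for lane in observed if OWNERSHIP_LABEL in lane.labels),
        key=lambda lane: lane.name,
    )
    if len(owned) < desired:
        taken = {lane.name for lane in observed}
        index, missing = 1, desired - len(owned)
        while missing > 0:
            name = f"{pool}-{index}"
            if name not in taken:  # never reuse the name of an existing lane
                actions.append(Action("create", name, "below desired count"))
                missing -= 1
            index += 1
    else:
        surplus = len(owned) - desired
        for lane in owned[::-1][:surplus]:
            # Busy lanes are drained first rather than removed outright.
            kind = "drain" if lane.busy else "remove"
            actions.append(Action(kind, lane.name, "above desired count"))
    return actions


def apply(
    actions: list[Action],
    executor: Callable[[Action], None],
    dry_run: bool = True,
) -> list[dict]:
    """Run mutating actions unless dry_run is set; always return audit records."""
    audit = []
    for action in actions:
        executed = False
        if not dry_run and action.kind != "skip":
            executor(action)
            executed = True
        audit.append({
            "kind": action.kind,
            "lane": action.lane,
            "reason": action.reason,
            "dry_run": dry_run,
            "executed": executed,
        })
    return audit
```

Defaulting `dry_run` to `True` makes the non-mutating path the one a caller gets by accident, and the audit list is produced in both modes so a dry run and a real run can be compared.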

Relationships

Blocked by #413. Related to #410 because merge-train scheduling may use runner capacity signals, but runner control should not block the Level 1 merge train.

Validation

  • Unit tests for desired-state planning and ownership guardrails.
  • A live smoke test on a disposable runner lane before touching production repo lanes.
  • Manual rollback instructions documented for service/container cleanup.
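
The live smoke test could exercise the drain, wait, restart, and verify cycle from the acceptance criteria. The sketch below assumes the host and GitHub interactions are injected as callables, because the issue leaves open how a lane is drained or restarted. The `drain_and_restart` function, its parameters, and its return codes are all hypothetical.

```python
import time
from typing import Callable


def drain_and_restart(
    lane: str,
    stop_accepting: Callable[[str], None],
    is_busy: Callable[[str], bool],
    restart: Callable[[str], None],
    is_online: Callable[[str], bool],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return "online" on success, otherwise a code naming the step that stalled."""
    stop_accepting(lane)

    # Wait for any active job to finish before touching the service.
    deadline = clock() + timeout_s
    while is_busy(lane):
        if clock() >= deadline:
            return "drain-timeout"  # lane is left draining and is NOT restarted
        sleep(poll_s)

    restart(lane)

    # Confirm the lane is reported online again.
    deadline = clock() + timeout_s
    while not is_online(lane):
        if clock() >= deadline:
            return "restart-unverified"
        sleep(poll_s)
    return "online"
```

Returning a step-specific code instead of raising keeps the failure path explicit: on `drain-timeout` the running job has not been interrupted, which is one possible reading of "safe state" for this step.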

Decisions

  • Awareness first, control later.
  • Do not let Launchplane mutate runner services until it can inventory ownership and health reliably.
  • Treat host-level execution as a privileged ops boundary with explicit opt-in.
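
One way to express the explicit opt-in boundary is a check that every mutating path must pass before reaching a host executor. This is a sketch only: the `OptIn` record, the `ControlNotAllowed` exception, and `require_opt_in` are hypothetical names, and where the opt-in list is stored is not decided in this issue.

```python
from dataclasses import dataclass


class ControlNotAllowed(PermissionError):
    """Raised when runner control is attempted without an explicit opt-in."""


@dataclass(frozen=True)
class OptIn:
    host: str
    repo: str


def require_opt_in(host: str, repo: str, opt_ins: frozenset[OptIn]) -> None:
    """Raise unless this exact host/repo pair has been opted in."""
    if OptIn(host, repo) not in opt_ins:
        raise ControlNotAllowed(
            f"runner control is not enabled for {repo} on {host}"
        )
```

Matching on the exact host and repo pair, with no wildcards, means opting in one repo on a host does not implicitly grant control over other repos on the same host.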

Open Questions

  • Should Launchplane control systemd runner services, Dockerized runners, or both?
  • Should runner scaling be event-driven, scheduled, or manually requested?
  • How should Launchplane store host credentials or delegate host actions safely?

Metadata

Labels: plan (Durable planning issue), plan:active (Current active plan)
