Kelos Strategist Agent @gjkim42
Area: New Use Cases + API Extension
Summary
One of the most requested AI agent use cases in 2025-2026 is automatic CI failure diagnosis and remediation: when CI fails on a PR, an agent investigates the failure, proposes a fix, and pushes it to the branch. Kelos is uniquely positioned to serve this use case but currently lacks the API surface to trigger agents based on CI check outcomes. This proposal describes the use case, identifies the specific gap in githubPullRequests, and proposes a checkConclusion filter (optionally narrowed by checkNames) plus {{.FailedChecks}} and {{.CheckConclusion}} template variables to enable it.
The Use Case: Auto-Fix Broken CI
Who benefits
- Any team with CI pipelines: a red CI check that blocks a PR is among the most common sources of developer friction
- Open source maintainers: contributor PRs frequently fail CI due to lint, formatting, or test issues that are mechanical to fix
- Large organizations: at scale, CI failures consume significant developer time on repetitive debugging
What the workflow looks like
- A developer opens a PR
- CI runs and fails (lint error, test failure, build break, etc.)
- Kelos discovers the PR has a failing check and spawns an agent
- The agent reads the CI failure logs, diagnoses the problem, and pushes a fix commit
- CI re-runs on the updated branch
- If CI passes, the PR is ready for review and developer time is saved
Why Kelos is uniquely suited
Unlike standalone CI-fix bots, Kelos provides:
- Full codebase context: Agents clone the repo and understand the full project, not just the diff
- Configurable agent types and models: Use a fast/cheap model for lint fixes, a capable model for test failures
- Concurrency controls: `maxConcurrency` prevents overwhelming CI with fix attempts
- Branch locking: Only one agent works on a branch at a time (already implemented)
- Task pipelines: Can chain diagnosis → fix → verification steps via `dependsOn`
- Cost governance: `maxTotalTasks`, `ttlSecondsAfterFinished`, and proposed `costBudget` (#788, "API: Add costBudget to TaskSpawner for spending limits based on actual token usage and USD cost")
- Priority scheduling: `priorityLabels` can prioritize fixing important PRs first
Current Gap
1. No CI check status filtering on githubPullRequests
The GitHubPullRequestsSpec currently filters by: labels, excludeLabels, state, reviewState, commentPolicy, author, draft, priorityLabels (api/v1alpha1/taskspawner_types.go:185-260).
There is no way to filter PRs by their CI check conclusion. A spawner watching for open PRs will discover ALL PRs regardless of their CI status, which means:
- Agents would be spawned for PRs where CI hasn't even started yet
- Agents would be spawned for PRs where CI is passing (wasting credits)
- No way to distinguish between different failure types
2. No check failure details in prompt template variables
The WorkItem struct (internal/source/source.go:10-33) provides template variables for GitHub PRs: {{.Branch}}, {{.ReviewState}}, {{.ReviewComments}}. There are no variables for CI check status or failure details. An agent spawned to fix CI would need to discover the failure independently, rather than receiving it in the prompt.
3. The workaround is fragile
Today, the closest approximation is:
- Use a GitHub Action that comments `/kelos fix-ci` when checks fail
- Use `triggerComment: "/kelos fix-ci"` on the spawner
This is fragile (requires maintaining a separate GHA workflow), doesn't pass failure context to the agent, and loses the structured auditability that a first-class source filter would provide.
Proposed API Changes
1. Add checkConclusion filter to GitHubPullRequestsSpec
```go
// In api/v1alpha1/taskspawner_types.go, add to GitHubPullRequestsSpec:

// CheckConclusion filters pull requests by the aggregate conclusion of
// their latest check suite. Only PRs where at least one check run
// matches the specified conclusion are discovered.
// Supported values: "failure", "success", "neutral", "cancelled",
// "timed_out", "action_required", "any".
// When unset or "any", check conclusion does not gate discovery.
// +kubebuilder:validation:Enum=failure;success;neutral;cancelled;timed_out;action_required;any
// +optional
CheckConclusion string `json:"checkConclusion,omitempty"`

// CheckNames optionally restricts which check runs are considered
// when evaluating checkConclusion. When empty, all check runs are
// considered. When set, only checks whose name matches one of these
// values are evaluated.
// Example: ["ci", "lint", "test"] would only trigger on failures
// from checks named "ci", "lint", or "test".
// +optional
CheckNames []string `json:"checkNames,omitempty"`
```
2. Add check failure details to WorkItem and prompt templates
```go
// In internal/source/source.go, add to WorkItem:

// FailedChecks contains formatted details of failed check runs for
// GitHub PR sources when checkConclusion filtering is enabled.
// Includes check name, conclusion, and output summary.
FailedChecks string

// CheckConclusion is the aggregate check conclusion for GitHub PR sources.
CheckConclusion string
```
This exposes {{.FailedChecks}} and {{.CheckConclusion}} in prompt templates.
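As a rough illustration of how the two proposed variables would reach a prompt, the sketch below renders a template with Go's standard `text/template` package. It assumes prompt templates are expanded with `text/template` (the `{{.Field}}` syntax suggests this, but it is not confirmed here). The `workItem` type and `renderPrompt` helper are hypothetical stand-ins, not the actual Kelos code, and the sample values are made up.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// workItem is a hypothetical stand-in for the WorkItem struct in
// internal/source/source.go; only fields used by this sketch are included.
type workItem struct {
	Number          int
	Title           string
	Branch          string
	FailedChecks    string // proposed field
	CheckConclusion string // proposed field
}

// renderPrompt expands a prompt template against a work item.
// (Hypothetical helper; the real rendering path may differ.)
func renderPrompt(tmpl string, item workItem) (string, error) {
	t, err := template.New("prompt").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, item); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Illustrative values only.
	item := workItem{
		Number:          42,
		Title:           "Add retry logic",
		Branch:          "feature/retry",
		FailedChecks:    "- lint (failure): 2 issues found",
		CheckConclusion: "failure",
	}
	out, err := renderPrompt(
		"CI on PR #{{.Number}} concluded with {{.CheckConclusion}}.\nFailed checks:\n{{.FailedChecks}}\n",
		item,
	)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Keeping `FailedChecks` as a pre-formatted string, as the proposal does, means template authors do not need to iterate over check runs themselves.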
3. Implementation in GitHub PR source
The github_pr.go source already calls the GitHub API to discover PRs. To implement check filtering:
- After fetching PRs, call `GET /repos/{owner}/{repo}/commits/{ref}/check-runs` for each PR's head SHA
- Filter check runs by `CheckNames` if specified
- Evaluate aggregate conclusion against `CheckConclusion` filter
- Populate `WorkItem.FailedChecks` with failure details (check name, conclusion, output title/summary)
- Use conditional requests (ETag, already implemented via `ETagTransport`) to minimize API calls
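The filtering and formatting steps above could look roughly like the following. This is a minimal sketch, not the actual `github_pr.go` code: the `checkRun` type models only the subset of the check-runs response used here, the helper names are hypothetical, and the output format of `formatFailedChecks` is one possible choice. It follows the "at least one check run matches" semantics from the proposed field comment and ignores runs that have not completed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// checkRun models the subset of a GitHub check run used by this sketch.
type checkRun struct {
	Name       string `json:"name"`
	Status     string `json:"status"`
	Conclusion string `json:"conclusion"` // JSON null leaves this empty
	Output     struct {
		Title   string `json:"title"`
		Summary string `json:"summary"`
	} `json:"output"`
}

type checkRunsResponse struct {
	CheckRuns []checkRun `json:"check_runs"`
}

// filterByName keeps runs whose name is listed; an empty list keeps all runs.
func filterByName(runs []checkRun, names []string) []checkRun {
	if len(names) == 0 {
		return runs
	}
	allowed := make(map[string]bool, len(names))
	for _, n := range names {
		allowed[n] = true
	}
	var out []checkRun
	for _, r := range runs {
		if allowed[r.Name] {
			out = append(out, r)
		}
	}
	return out
}

// matchesConclusion reports whether at least one completed run has the
// wanted conclusion. An empty or "any" filter does not gate discovery.
func matchesConclusion(runs []checkRun, want string) bool {
	if want == "" || want == "any" {
		return true
	}
	for _, r := range runs {
		if r.Status == "completed" && r.Conclusion == want {
			return true
		}
	}
	return false
}

// formatFailedChecks renders one line per completed, failed run.
func formatFailedChecks(runs []checkRun) string {
	var sb strings.Builder
	for _, r := range runs {
		if r.Status != "completed" || r.Conclusion != "failure" {
			continue
		}
		fmt.Fprintf(&sb, "- %s (%s): %s\n", r.Name, r.Conclusion, r.Output.Title)
	}
	return sb.String()
}

func main() {
	// Made-up response body for illustration.
	body := `{"check_runs":[
	  {"name":"lint","status":"completed","conclusion":"failure","output":{"title":"2 issues","summary":"unused variable"}},
	  {"name":"test","status":"in_progress","conclusion":null,"output":{"title":null,"summary":null}},
	  {"name":"docs","status":"completed","conclusion":"failure","output":{"title":"broken link","summary":""}}]}`
	var resp checkRunsResponse
	if err := json.Unmarshal([]byte(body), &resp); err != nil {
		panic(err)
	}
	runs := filterByName(resp.CheckRuns, []string{"lint", "test"})
	if matchesConclusion(runs, "failure") {
		fmt.Print(formatFailedChecks(runs))
	}
}
```

With `checkNames` set to `lint` and `test`, the failing `docs` check is ignored and only the `lint` failure is reported. A PR whose checks are all still running would not match `failure`, which addresses the "CI hasn't even started yet" case described under Current Gap.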
API call budget: One additional API call per PR per poll cycle. With maxConcurrency limiting the number of active tasks and typical poll intervals of 2-5 minutes, this adds modest API overhead. The check-runs endpoint also supports conditional requests.
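To put a number on that overhead, here is a back-of-the-envelope sketch. It assumes one check-runs request per discovered PR per poll cycle, so the count scales with the number of PRs matching the spawner's other filters. The figure of 20 open PRs is illustrative, and the 5,000 requests/hour limit is an assumption based on GitHub's documented primary rate limit for user-token REST requests; other authentication modes have different limits. Savings from conditional requests are not modeled.

```go
package main

import "fmt"

// extraCallsPerHour estimates the added check-runs requests per hour,
// assuming one request per discovered PR per poll cycle.
func extraCallsPerHour(openPRs int, pollIntervalMinutes float64) float64 {
	if pollIntervalMinutes <= 0 {
		return 0
	}
	return float64(openPRs) * 60 / pollIntervalMinutes
}

func main() {
	// Assumption: hourly request budget for the token in use.
	const assumedHourlyLimit = 5000.0

	// Illustrative workload: 20 PRs match the spawner's other filters.
	const openPRs = 20

	for _, minutes := range []float64{2, 5} {
		calls := extraCallsPerHour(openPRs, minutes)
		fmt.Printf("poll every %.0f min: %.0f extra calls/hour (%.1f%% of assumed limit)\n",
			minutes, calls, 100*calls/assumedHourlyLimit)
	}
}
```

Under these assumptions, 20 PRs polled every 2 minutes add 600 requests per hour, and every 5 minutes add 240.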
Example Configurations
Example 1: Auto-fix lint failures
```yaml
apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: ci-lint-fixer
spec:
  when:
    githubPullRequests:
      labels: ["ok-to-autofix"]
      checkConclusion: failure
      checkNames: ["lint", "fmt", "vet"]
      reporting:
        enabled: true
  maxConcurrency: 3
  taskTemplate:
    type: claude-code
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    branch: "{{.Branch}}"
    ttlSecondsAfterFinished: 3600
    promptTemplate: |
      A CI check has failed on PR #{{.Number}}: {{.Title}}

      Failed checks:
      {{.FailedChecks}}

      Your job:
      1. Check out the PR branch (already done)
      2. Run the failing check locally to reproduce
      3. Fix the issue (lint, formatting, or vet errors)
      4. Commit and push the fix

      Do NOT change any logic or behavior. Only fix the specific
      lint/format/vet errors reported by CI.
```
Example 2: Diagnose and fix test failures with a pipeline
```yaml
apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: ci-test-fixer
spec:
  when:
    githubPullRequests:
      checkConclusion: failure
      checkNames: ["test", "unit-tests", "integration-tests"]
      commentPolicy:
        triggerComment: "/kelos fix-tests"
        minimumPermission: write
      reporting:
        enabled: true
  maxConcurrency: 1
  taskTemplate:
    type: claude-code
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    branch: "{{.Branch}}"
    ttlSecondsAfterFinished: 7200
    promptTemplate: |
      CI test failure on PR #{{.Number}}: {{.Title}}

      Failed checks:
      {{.FailedChecks}}

      Steps:
      1. Read the failing test output above
      2. Reproduce the failure locally with `make test`
      3. Determine if this is a test bug or a code bug
      4. If it's a code bug, fix the code
      5. If it's a test bug (test needs updating for new behavior), fix the test
      6. Run `make test` to verify your fix
      7. Commit and push
```
Example 3: Open-source maintainer: auto-fix contributor PRs
```yaml
apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: contributor-ci-helper
spec:
  when:
    githubPullRequests:
      checkConclusion: failure
      excludeLabels: ["do-not-autofix"]
      draft: false
      reporting:
        enabled: true
  maxConcurrency: 2
  taskTemplate:
    type: claude-code
    model: sonnet
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    branch: "{{.Branch}}"
    ttlSecondsAfterFinished: 3600
    promptTemplate: |
      A contributor's PR #{{.Number}} ("{{.Title}}") has failing CI.

      Failed checks:
      {{.FailedChecks}}

      Help the contributor by fixing the CI failure. Focus only on
      mechanical fixes (formatting, lint, missing imports, test updates
      for API changes). If the failure requires design decisions or
      significant code changes, leave a comment explaining the issue
      instead of attempting a fix.
```
Interaction with Existing and Proposed Features
| Feature | How it works with CI-failure remediation |
|---|---|
| `maxConcurrency` (implemented) | Prevents spawning too many fix agents at once |
| Branch locking (implemented) | Ensures only one fix agent per branch |
| `ttlSecondsAfterFinished` (implemented) | Cleans up completed fix tasks |
| `reporting` (implemented) | Posts progress comments on the PR |
| `priorityLabels` (implemented) | Prioritize fixing high-priority PRs first |
| `retriggerOnPush` (#752) | Would re-engage agent when CI fails again after a fix attempt |
| `filePatterns` (#778) | Could combine: only trigger when changed files match patterns AND CI fails |
| `costBudget` (#788) | Limits spending on CI fix attempts |
| `retryStrategy` (#730) | Could retry with a stronger model if fix attempt fails |
| `checkConclusion` (this proposal) | The core enabler: gates discovery on CI outcome |
Why This Is a Growth Opportunity
- Broad appeal: Every team with CI/CD pipelines is a potential user. CI failure is one of the most common developer friction points.
- Easy to demo: "Watch this: I push broken code, and Kelos fixes it automatically" is a compelling demo.
- Incremental adoption: Teams can start by auto-fixing lint/format issues (low risk), then expand to test failures (medium risk). No big-bang adoption needed.
- Differentiator: Most AI coding tools operate at the IDE level. CI-failure remediation operates at the pipeline level, a space where Kelos's Kubernetes-native, event-driven architecture is a natural fit.
- Complements existing use cases: Can be combined with the PR review spawner (#kelos-reviewer) and dependency upgrade validation (#722, "New use case: Intelligent dependency upgrade validation with Dependabot/Renovate PR agent workflows") for a comprehensive PR automation story.
/kind feature