New use case: CI-failure-driven agent remediation with check status filtering on githubPullRequests #809

@kelos-bot

Description

🤖 Kelos Strategist Agent @gjkim42

Area: New Use Cases + API Extension

Summary

One of the most requested AI agent use cases in 2025-2026 is automatic CI failure diagnosis and remediation: when CI fails on a PR, an agent investigates the failure, proposes a fix, and pushes it to the branch. Kelos is uniquely positioned to serve this use case but currently lacks the API surface to trigger agents based on CI check outcomes. This proposal describes the use case, identifies the specific gap in githubPullRequests, and proposes a checkConclusion filter and a {{.FailedChecks}} template variable to enable it.

The Use Case: Auto-Fix Broken CI

Who benefits

  • Any team with CI pipelines: the most common developer friction point is a red CI check that blocks a PR
  • Open source maintainers: contributor PRs frequently fail CI due to lint, formatting, or test issues that are mechanical to fix
  • Large organizations: at scale, CI failures consume significant developer time on repetitive debugging

What the workflow looks like

  1. A developer opens a PR
  2. CI runs and fails (lint error, test failure, build break, etc.)
  3. Kelos discovers the PR has a failing check and spawns an agent
  4. The agent reads the CI failure logs, diagnoses the problem, and pushes a fix commit
  5. CI re-runs on the updated branch
  6. If CI passes, the PR is ready for review and developer time is saved

Why Kelos is uniquely suited

Unlike standalone CI-fix bots, Kelos provides:

  • Full codebase context: Agents clone the repo and understand the full project, not just the diff
  • Configurable agent types and models: Use a fast/cheap model for lint fixes, a capable model for test failures
  • Concurrency controls: maxConcurrency prevents overwhelming CI with fix attempts
  • Branch locking: Only one agent works on a branch at a time (already implemented)
  • Task pipelines: Can chain diagnosis → fix → verification steps via dependsOn
  • Cost governance: maxTotalTasks, ttlSecondsAfterFinished, and proposed costBudget (API: Add costBudget to TaskSpawner for spending limits based on actual token usage and USD cost #788)
  • Priority scheduling: priorityLabels can prioritize fixing important PRs first

Current Gap

1. No CI check status filtering on githubPullRequests

The GitHubPullRequestsSpec currently filters by: labels, excludeLabels, state, reviewState, commentPolicy, author, draft, priorityLabels (api/v1alpha1/taskspawner_types.go:185-260).

There is no way to filter PRs by their CI check conclusion. A spawner watching for open PRs will discover ALL PRs regardless of their CI status, which means:

  • Agents would be spawned for PRs where CI hasn't even started yet
  • Agents would be spawned for PRs where CI is passing (wasting credits)
  • No way to distinguish between different failure types

2. No check failure details in prompt template variables

The WorkItem struct (internal/source/source.go:10-33) provides template variables for GitHub PRs: {{.Branch}}, {{.ReviewState}}, {{.ReviewComments}}. There are no variables for CI check status or failure details. An agent spawned to fix CI would need to discover the failure independently, rather than receiving it in the prompt.

3. The workaround is fragile

Today, the closest approximation is:

  1. Use a GitHub Action that comments /kelos fix-ci when checks fail
  2. Use triggerComment: "/kelos fix-ci" on the spawner

This is fragile (requires maintaining a separate GHA workflow), doesn't pass failure context to the agent, and loses the structured auditability that a first-class source filter would provide.
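For illustration, the GitHub Actions side of this workaround might look roughly like the following. This is a hypothetical sketch (workflow name, job name, and comment body are illustrative); the `check_suite` event and `github-script` action are standard GitHub features, but note that `check_suite.pull_requests` is empty for PRs from forks, which is one of the ways this approach breaks down.

```yaml
# .github/workflows/kelos-fix-ci.yml -- hypothetical workaround sketch
name: trigger-kelos-on-ci-failure
on:
  check_suite:
    types: [completed]
jobs:
  trigger:
    if: github.event.check_suite.conclusion == 'failure'
    runs-on: ubuntu-latest
    permissions:
      issues: write
    steps:
      - name: Comment on affected PRs to trigger the spawner
        uses: actions/github-script@v7
        with:
          script: |
            // Empty for fork PRs -- a known gap in this workaround.
            for (const pr of context.payload.check_suite.pull_requests) {
              await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: pr.number,
                body: "/kelos fix-ci",
              });
            }
```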

Proposed API Changes

1. Add checkConclusion filter to GitHubPullRequestsSpec

// In api/v1alpha1/taskspawner_types.go, add to GitHubPullRequestsSpec:

// CheckConclusion filters pull requests by the conclusions of the
// check runs on their head commit. A PR is discovered only when at
// least one considered check run matches the specified conclusion.
// Supported values: "failure", "success", "neutral", "cancelled",
// "timed_out", "action_required", "any".
// When unset or "any", check conclusion does not gate discovery.
// +kubebuilder:validation:Enum=failure;success;neutral;cancelled;timed_out;action_required;any
// +optional
CheckConclusion string `json:"checkConclusion,omitempty"`

// CheckNames optionally restricts which check runs are considered
// when evaluating checkConclusion. When empty, all check runs are
// considered. When set, only checks whose name matches one of these
// values are evaluated.
// Example: ["ci", "lint", "test"] would only trigger on failures
// from checks named "ci", "lint", or "test".
// +optional
CheckNames []string `json:"checkNames,omitempty"`

2. Add check failure details to WorkItem and prompt templates

// In internal/source/source.go, add to WorkItem:

// FailedChecks contains formatted details of failed check runs for
// GitHub PR sources when checkConclusion filtering is enabled.
// Includes check name, conclusion, and output summary.
FailedChecks string

// CheckConclusion is the aggregate check conclusion for GitHub PR sources.
CheckConclusion string

This exposes {{.FailedChecks}} and {{.CheckConclusion}} in prompt templates.

3. Implementation in GitHub PR source

The github_pr.go source already calls the GitHub API to discover PRs. To implement check filtering:

  1. After fetching PRs, call GET /repos/{owner}/{repo}/commits/{ref}/check-runs for each PR's head SHA
  2. Filter check runs by CheckNames if specified
  3. Evaluate aggregate conclusion against CheckConclusion filter
  4. Populate WorkItem.FailedChecks with failure details (check name, conclusion, output title/summary)
  5. Use conditional requests (ETag, already implemented via ETagTransport) to minimize API calls
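The filtering and formatting steps above (2-4) could be sketched as follows. This is a hedged outline under stated assumptions: the `checkRun` type, `matchingRuns`, and `formatFailedChecks` are hypothetical names, and the real implementation would operate on the GitHub client's check-run types rather than this minimal struct.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// checkRun is a minimal stand-in for the fields returned by
// GET /repos/{owner}/{repo}/commits/{ref}/check-runs.
type checkRun struct {
	Name       string
	Conclusion string // "failure", "success", "neutral", ...
	Summary    string
}

// matchingRuns returns the check runs that gate discovery:
// restricted to checkNames when non-empty, and matching the
// configured conclusion unless it is "" or "any". A PR is
// discovered when this returns at least one run.
func matchingRuns(runs []checkRun, checkNames []string, conclusion string) []checkRun {
	var out []checkRun
	for _, r := range runs {
		if len(checkNames) > 0 && !slices.Contains(checkNames, r.Name) {
			continue
		}
		if conclusion != "" && conclusion != "any" && r.Conclusion != conclusion {
			continue
		}
		out = append(out, r)
	}
	return out
}

// formatFailedChecks renders matched runs into the proposed
// {{.FailedChecks}} template variable, one line per check.
func formatFailedChecks(runs []checkRun) string {
	var b strings.Builder
	for _, r := range runs {
		fmt.Fprintf(&b, "- %s: %s (%s)\n", r.Name, r.Conclusion, r.Summary)
	}
	return b.String()
}
```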

API call budget: one additional API call per PR per poll cycle. As a rough upper bound, 50 open PRs polled every 2 minutes adds about 1,500 calls per hour, comfortably within GitHub's 5,000 requests/hour authenticated REST limit. The check-runs endpoint also supports conditional requests (ETag, already implemented via ETagTransport), and 304 Not Modified responses do not count against the rate limit.

Example Configurations

Example 1: Auto-fix lint failures

apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: ci-lint-fixer
spec:
  when:
    githubPullRequests:
      labels: ["ok-to-autofix"]
      checkConclusion: failure
      checkNames: ["lint", "fmt", "vet"]
      reporting:
        enabled: true
  maxConcurrency: 3
  taskTemplate:
    type: claude-code
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    branch: "{{.Branch}}"
    ttlSecondsAfterFinished: 3600
    promptTemplate: |
      A CI check has failed on PR #{{.Number}}: {{.Title}}

      Failed checks:
      {{.FailedChecks}}

      Your job:
      1. Check out the PR branch (already done)
      2. Run the failing check locally to reproduce
      3. Fix the issue (lint, formatting, or vet errors)
      4. Commit and push the fix

      Do NOT change any logic or behavior. Only fix the specific
      lint/format/vet errors reported by CI.

Example 2: Diagnose and fix test failures with a pipeline

apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: ci-test-fixer
spec:
  when:
    githubPullRequests:
      checkConclusion: failure
      checkNames: ["test", "unit-tests", "integration-tests"]
      commentPolicy:
        triggerComment: "/kelos fix-tests"
        minimumPermission: write
      reporting:
        enabled: true
  maxConcurrency: 1
  taskTemplate:
    type: claude-code
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    branch: "{{.Branch}}"
    ttlSecondsAfterFinished: 7200
    promptTemplate: |
      CI test failure on PR #{{.Number}}: {{.Title}}

      Failed checks:
      {{.FailedChecks}}

      Steps:
      1. Read the failing test output above
      2. Reproduce the failure locally with `make test`
      3. Determine if this is a test bug or a code bug
      4. If it's a code bug, fix the code
      5. If it's a test bug (test needs updating for new behavior), fix the test
      6. Run `make test` to verify your fix
      7. Commit and push

Example 3: Auto-fix contributor PRs (open-source maintainers)

apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: contributor-ci-helper
spec:
  when:
    githubPullRequests:
      checkConclusion: failure
      excludeLabels: ["do-not-autofix"]
      draft: false
      reporting:
        enabled: true
  maxConcurrency: 2
  taskTemplate:
    type: claude-code
    model: sonnet
    credentials:
      type: oauth
      secretRef:
        name: claude-credentials
    workspaceRef:
      name: my-workspace
    branch: "{{.Branch}}"
    ttlSecondsAfterFinished: 3600
    promptTemplate: |
      A contributor's PR #{{.Number}} ("{{.Title}}") has failing CI.

      Failed checks:
      {{.FailedChecks}}

      Help the contributor by fixing the CI failure. Focus only on
      mechanical fixes (formatting, lint, missing imports, test updates
      for API changes). If the failure requires design decisions or
      significant code changes, leave a comment explaining the issue
      instead of attempting a fix.

Interaction with Existing and Proposed Features

| Feature | How it works with CI-failure remediation |
| --- | --- |
| maxConcurrency (implemented) | Prevents spawning too many fix agents at once |
| Branch locking (implemented) | Ensures only one fix agent per branch |
| ttlSecondsAfterFinished (implemented) | Cleans up completed fix tasks |
| reporting (implemented) | Posts progress comments on the PR |
| priorityLabels (implemented) | Prioritizes fixing high-priority PRs first |
| retriggerOnPush (#752) | Would re-engage the agent when CI fails again after a fix attempt |
| filePatterns (#778) | Could combine: only trigger when changed files match patterns AND CI fails |
| costBudget (#788) | Limits spending on CI fix attempts |
| retryStrategy (#730) | Could retry with a stronger model if a fix attempt fails |
| checkConclusion (this proposal) | The core enabler: gates discovery on CI outcome |

Why This Is a Growth Opportunity

  1. Broad appeal: Every team with CI/CD pipelines is a potential user. CI failure is the single most common developer friction point.
  2. Easy to demo: "Watch this: I push broken code, and Kelos fixes it automatically" is a compelling pitch.
  3. Incremental adoption: Teams can start by auto-fixing lint/format issues (low risk), then expand to test failures (medium risk). No big-bang adoption needed.
  4. Differentiator: Most AI coding tools operate at the IDE level. CI-failure remediation operates at the pipeline level, a space where Kelos's Kubernetes-native, event-driven architecture is a natural fit.
  5. Complements existing use cases: Can be combined with the PR review spawner (#kelos-reviewer) and dependency upgrade validation (New use case: Intelligent dependency upgrade validation with Dependabot/Renovate PR agent workflows #722) for a comprehensive PR automation story.

/kind feature
