Commit 7d64d1b

Add autosolve actions and workflows for automated issue resolution
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
1 parent 197c84a commit 7d64d1b

15 files changed

Lines changed: 225 additions & 227 deletions

.github/workflows/github-issue-autosolve.yml

Lines changed: 8 additions & 5 deletions
@@ -77,6 +77,10 @@ on:
 type: string
 required: false
 default: "autosolve[bot]@users.noreply.github.com"
+timeout_minutes:
+type: number
+required: false
+default: 20
 secrets:
 repo_token:
 required: true
@@ -148,7 +152,7 @@ jobs:
 needs: check
 if: needs.check.outputs.pr_exists != 'true'
 runs-on: ubuntu-latest
-timeout-minutes: 120
+timeout-minutes: ${{ inputs.timeout_minutes }}
 permissions:
 contents: read
 issues: write
@@ -165,6 +169,8 @@ jobs:
 - uses: actions/checkout@v5
 with:
 fetch-depth: 0
+# Prevent the checkout credential helper from overriding the
+# fork_push_token used later for git push to the fork.
 persist-credentials: false
 
 # Checkout cockroachdb/actions at the ref the caller used in their
@@ -224,13 +230,12 @@ jobs:
 CLAUDE_CODE_USE_VERTEX: ${{ inputs.auth_mode == 'vertex' && '1' || '' }}
 ANTHROPIC_VERTEX_PROJECT_ID: ${{ inputs.auth_mode == 'vertex' && inputs.vertex_project_id || '' }}
 CLOUD_ML_REGION: ${{ inputs.auth_mode == 'vertex' && inputs.vertex_region || '' }}
-CLAUDE_CODE_USE_BEDROCK: ${{ inputs.auth_mode == 'bedrock' && '1' || '' }}
 
 - name: Install Claude CLI
 shell: bash
 run: ${{ env.ACTIONS_DIR }}/run_step.sh shared install_claude
 env:
-CLAUDE_CLI_VERSION: "2.1.76"
+CLAUDE_CLI_VERSION: "2.1.79"
 
 - name: Build assessment prompt
 id: assess_prompt
@@ -252,7 +257,6 @@ jobs:
 CLAUDE_CODE_USE_VERTEX: ${{ inputs.auth_mode == 'vertex' && '1' || '' }}
 ANTHROPIC_VERTEX_PROJECT_ID: ${{ inputs.auth_mode == 'vertex' && inputs.vertex_project_id || '' }}
 CLOUD_ML_REGION: ${{ inputs.auth_mode == 'vertex' && inputs.vertex_region || '' }}
-CLAUDE_CODE_USE_BEDROCK: ${{ inputs.auth_mode == 'bedrock' && '1' || '' }}
 PROMPT_FILE: ${{ steps.assess_prompt.outputs.prompt_file }}
 INPUT_MODEL: ${{ inputs.model }}
 
@@ -298,7 +302,6 @@ jobs:
 CLAUDE_CODE_USE_VERTEX: ${{ inputs.auth_mode == 'vertex' && '1' || '' }}
 ANTHROPIC_VERTEX_PROJECT_ID: ${{ inputs.auth_mode == 'vertex' && inputs.vertex_project_id || '' }}
 CLOUD_ML_REGION: ${{ inputs.auth_mode == 'vertex' && inputs.vertex_region || '' }}
-CLAUDE_CODE_USE_BEDROCK: ${{ inputs.auth_mode == 'bedrock' && '1' || '' }}
 PROMPT_FILE: ${{ steps.impl_prompt.outputs.prompt_file }}
 INPUT_MODEL: ${{ inputs.model }}
 INPUT_ALLOWED_TOOLS: ${{ inputs.allowed_tools }}

CHANGELOG.md

Lines changed: 0 additions & 2 deletions
@@ -16,7 +16,5 @@ Breaking changes are prefixed with "Breaking Change: ".
 security, push to fork, and create PRs using Claude.
 - `github-issue-autosolve` reusable workflow: turnkey GitHub Issues
 integration with issue comments and label management.
-- `jira-autosolve` reusable workflow: turnkey Jira integration composing
-autosolve/assess + autosolve/implement with ticket comments and transitions.
 - `autotag-from-changelog` action: tag and push from CHANGELOG.md version
 change.

README.md

Lines changed: 6 additions & 43 deletions
@@ -63,7 +63,8 @@ task is suitable for automated resolution.
 | `result` | Full Claude result text |
 
 **`autosolve/implement`** — Runs Claude to implement a solution, validates
-changes against blocked paths, pushes to a fork, and creates a PR.
+changes against blocked paths, pushes to a fork, and creates a single-commit
+PR.
 
 ```yaml
 - uses: cockroachdb/actions/autosolve/implement@v1
@@ -84,13 +85,12 @@ changes against blocked paths, pushes to a fork, and creates a PR.
 | `allowed_tools` | *(read/write/git tools)* | Claude `--allowedTools` string |
 | `model` | `claude-opus-4-6` | Claude model ID |
 | `max_retries` | `3` | Maximum implementation attempts |
-| `timeout_minutes` | `60` | Maximum wall-clock time |
 | `create_pr` | `true` | Whether to create a PR from the changes |
 | `pr_base_branch` | *(repo default)* | Base branch for the PR |
 | `pr_labels` | `autosolve` | Comma-separated labels to apply |
 | `pr_draft` | `true` | Whether to create as a draft PR |
 | `pr_title` | *(from commit)* | PR title |
-| `pr_body_template` | *(built-in)* | Template with `{{SUMMARY}}`, `{{STATS}}`, `{{BRANCH}}` placeholders |
+| `pr_body_template` | *(built-in)* | Template with `{{SUMMARY}}`, `{{BRANCH}}` placeholders |
 | `fork_owner` | | GitHub user/org that owns the fork |
 | `fork_repo` | | Fork repository name |
 | `fork_push_token` | | PAT with push access to the fork |
@@ -110,26 +110,6 @@ changes against blocked paths, pushes to a fork, and creates a PR.
 
 #### Reusable Workflows
 
-**Jira Autosolve** — Composes assess + implement with Jira comments and ticket
-transitions. Triggered via `workflow_call`.
-
-```yaml
-jobs:
-solve:
-uses: cockroachdb/actions/.github/workflows/jira-autosolve.yml@v1
-with:
-ticket_id: PROJ-123
-title: ${{ needs.parse.outputs.title }}
-description: ${{ needs.parse.outputs.description }}
-jira_base_url: https://yourcompany.atlassian.net
-fork_owner: my-bot
-fork_repo: my-repo
-secrets:
-jira_token: ${{ secrets.JIRA_TOKEN }}
-fork_push_token: ${{ secrets.FORK_PUSH_TOKEN }}
-pr_create_token: ${{ secrets.PR_CREATE_TOKEN }}
-```
-
 **GitHub Issue Autosolve** — Composes assess + implement with GitHub issue
 comments and label management. Triggered via `workflow_call`.
 
@@ -150,8 +130,8 @@ jobs:
 
 #### Authentication
 
-**Reusable workflows** accept `auth_mode` as an input (`vertex`, `bedrock`, or
-omit for API key) and handle env var setup internally.
+**Reusable workflows** accept `auth_mode` as an input (`vertex` or omit for API
+key) and handle env var setup internally.
 
 **Direct composite action usage** requires the caller to set up auth and pass
 the env vars on each action step:
@@ -174,24 +154,7 @@ the env vars on each action step:
 ```
 
 Alternatively, set `ANTHROPIC_API_KEY` in the environment for direct API
-access, or configure Bedrock with `CLAUDE_CODE_USE_BEDROCK=1` and `AWS_REGION`.
-
-#### Caller checkout
-
-When using `workflow_dispatch`, `actions/checkout` defaults to the branch that
-triggered the workflow. This can include unrelated commits from that branch in
-the autosolve PR. Always check out the PR base branch explicitly:
-
-```yaml
-- uses: actions/checkout@v5
-with:
-ref: main # checkout the PR base branch, not the trigger ref
-fetch-depth: 0
-persist-credentials: false # prevent checkout's credential helper from interfering with fork push
-```
-
-The `issues: [labeled]` trigger doesn't have this problem since it always runs
-on the default branch.
+access.
 
 ## Development
 

actions_helpers.sh

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,9 @@
 log_error() { echo "::error::$*"; }
 log_warning() { echo "::warning::$*"; }
 log_notice() { echo "::notice::$*"; }
+# Plain informational output — no GitHub annotation, just step log output.
+# Use for multi-line diagnostic data where ::notice:: would be inappropriate.
+log_info() { echo "$*"; }
 
 # Write a single-line output: set_output key value
 set_output() {
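
To make the distinction in the new comment concrete, here is a minimal sketch that reuses the `log_notice` and `log_info` definitions from the hunk above. The sample messages are made up for illustration.

```shell
#!/usr/bin/env bash
# Definitions copied from the diff above.
log_notice() { echo "::notice::$*"; }
log_info() { echo "$*"; }

# A one-line status is a reasonable fit for an annotation.
log_notice "Assessment: PROCEED"

# Multi-line diagnostic text is printed unchanged, with no workflow-command
# prefix in front of it, so it only appears in the step log.
log_info "first line of diagnostic output
second line of diagnostic output"
```

The only difference between the two helpers is the `::notice::` prefix, which GitHub Actions interprets as a workflow command.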

autosolve/assess/action.yml

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ inputs:
 claude_cli_version:
 description: Claude CLI version to install.
 required: false
-default: "2.1.76"
+default: "2.1.79"
 
 outputs:
 assessment:

autosolve/implement/action.yml

Lines changed: 2 additions & 6 deletions
@@ -26,10 +26,6 @@ inputs:
 description: Maximum implementation attempts.
 required: false
 default: "3"
-timeout_minutes:
-description: Maximum wall-clock time for implementation.
-required: false
-default: "60"
 create_pr:
 description: Whether to create a PR from the changes.
 required: false
@@ -51,7 +47,7 @@ inputs:
 required: false
 default: ""
 pr_body_template:
-description: "Template for the PR body. Supports placeholders: {{SUMMARY}}, {{STATS}}, {{BRANCH}}."
+description: "Template for the PR body. Supports placeholders: {{SUMMARY}}, {{BRANCH}}."
 required: false
 default: ""
 fork_owner:
@@ -89,7 +85,7 @@ inputs:
 claude_cli_version:
 description: Claude CLI version to install.
 required: false
-default: "2.1.76"
+default: "2.1.79"
 
 outputs:
 status:

autosolve/prompts/implementation-footer.md

Lines changed: 35 additions & 13 deletions
@@ -4,23 +4,45 @@ Implement the task described above.
 1. Read CLAUDE.md (if it exists) for project conventions, build commands,
 test commands, and commit message format.
 2. Understand the codebase and the task requirements.
-3. Implement the minimal changes required. Prefer backwards-compatible
+3. When fixing bugs, prefer a test-first approach:
+a. Write a test that demonstrates the bug (verify it fails).
+b. Apply the fix.
+c. Verify the test passes.
+Skip writing a dedicated test when the fix is trivial and self-evident
+(e.g., adding a timeout, fixing a typo), the behavior is impractical to
+unit test (e.g., network timeouts, OS-level behavior), or the fix is a
+documentation-only change. The goal is to prove the bug existed and
+confirm it's resolved, not to test for testing's sake.
+4. Implement the minimal changes required. Prefer backwards-compatible
 changes wherever possible — avoid breaking existing APIs, interfaces,
 or behavior unless the task explicitly requires it.
-4. Run relevant tests to verify your changes work. Only test the specific
+5. Run relevant tests to verify your changes work. Only test the specific
 packages/files affected by your changes.
-5. If tests fail, fix the issues and re-run. Only report FAILED if you
+6. If tests fail, fix the issues and re-run. Only report FAILED if you
 cannot make tests pass after reasonable effort.
-6. Stage all your changes with `git add`. Do not commit — the action
-handles committing.
-7. Write a short commit message summary (one line, under 72 characters)
-and save it to `.autosolve-commit-message` in the repo root. Focus on
-*why* the change was made, not what files changed. Use imperative mood
-(e.g., "Fix timeout in retry loop" not "Fixed timeout" or "Changes to
-retry logic"). If CLAUDE.md specifies a commit message format, follow
-that instead.
-8. Write a PR description and save it to `.autosolve-pr-body` in the repo
-root. This will be used as the body of the pull request. Include:
+7. Stage all your changes with `git add`. Do not commit — the action
+handles committing. All changes will be squashed into a single commit,
+so organize your work accordingly.
+8. Write a commit message and save it to `.autosolve-commit-message` in
+the repo root. Use standard git format: a subject line (under 72
+characters, imperative mood), a blank line, then a body explaining
+what was changed and why. Since all changes go into a single commit,
+the message should cover the full scope of the change. Focus on
+helping a reviewer understand the commit — do NOT list individual
+files. Example:
+```
+Fix timeout in retry loop
+
+The retry loop was using a hardcoded 5s timeout which was too short
+for large payloads. Increased to 30s and made it configurable via
+the RETRY_TIMEOUT env var. Added a test that verifies retry behavior
+with slow responses.
+```
+If CLAUDE.md specifies a commit message format, follow that instead.
+9. Write a PR description and save it to `.autosolve-pr-body` in the repo
+root. This will be used as the body of the pull request. The PR
+description and commit message serve similar purposes for single-commit
+PRs, but the PR description should be more reader-friendly. Include:
 - A brief summary of what was changed and why (2-3 sentences max).
 - What testing was done (tests added, tests run, manual verification).
 Do NOT include a list of changed files — reviewers can see that in the
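
As a rough illustration of the commit message format described in step 8 of the prompt, the sketch below checks a message file for a subject line under 72 characters followed by a blank line. Nothing in this diff says the action performs such a check; the function name `check_commit_message` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check, not part of the autosolve action.
# Usage: check_commit_message <path-to-message-file>
check_commit_message() {
  local file="$1" subject second
  subject="$(sed -n '1p' "$file")"
  second="$(sed -n '2p' "$file")"

  # The subject must exist and be under 72 characters.
  [ -n "$subject" ] || { echo "empty subject line" >&2; return 1; }
  [ "${#subject}" -lt 72 ] || { echo "subject too long" >&2; return 1; }

  # A body, if present, must be separated from the subject by a blank line.
  [ -z "$second" ] || { echo "missing blank line after subject" >&2; return 1; }
}
```

A caller could run `check_commit_message .autosolve-commit-message` before committing. Imperative mood and the quality of the body are left to a human reviewer.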

autosolve/run_step.sh

Lines changed: 10 additions & 3 deletions
@@ -1,10 +1,17 @@
 #!/usr/bin/env bash
 # Entry point for autosolve action steps.
 #
-# Usage: run_step.sh <script> <function> [args...]
+# Composite action steps run in a fresh shell, so sourcing scripts directly
+# would leave them cd'd to the scripts/ directory instead of the workspace.
+# This wrapper solves three problems:
+# 1. Sources the target script (which cd's to its own directory for clean
+# relative imports of shared.sh, actions_helpers.sh, etc.).
+# 2. Restores the original working directory so the function runs in the
+# caller's workspace (where the repo checkout lives).
+# 3. Manages a shared AUTOSOLVE_TMPDIR across composite action steps
+# (each step is a new shell process).
 #
-# Sources autosolve/scripts/<script>.sh (which sources its own deps),
-# then calls <function> from the original working directory.
+# Usage: run_step.sh <script> <function> [args...]
 #
 # Examples:
 # run_step.sh shared validate_inputs
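
The three numbered points in the header comment can be sketched roughly as follows. This is not the actual `run_step.sh` body, which the hunk does not show. The function name `run_step_sketch`, the explicit `scripts_dir` argument, and the `mktemp` fallback are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the wrapper pattern, not the real run_step.sh.
# Usage: run_step_sketch <scripts_dir> <script> <function> [args...]
run_step_sketch() {
  local scripts_dir="$1" script="$2" func="$3"
  shift 3

  # Remember the caller's workspace before anything changes directory.
  local original_dir="$PWD"

  # Reuse a temp dir if an earlier step exported one; otherwise create it.
  # (How the real wrapper shares it across steps is not shown in the diff.)
  export AUTOSOLVE_TMPDIR="${AUTOSOLVE_TMPDIR:-$(mktemp -d)}"

  # Source the target script (`.` is the portable spelling of `source`).
  # It is expected to cd to its own directory so that its relative
  # imports resolve.
  # shellcheck disable=SC1090
  . "$scripts_dir/$script.sh" || return 1

  # Return to the workspace so the function operates on the checkout.
  cd "$original_dir" || return 1
  "$func" "$@"
}
```

Saving `$PWD` before sourcing is the key step: the sourced script changes directory as a side effect, and the wrapper undoes that before calling the requested function.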

autosolve/scripts/assess.sh

Lines changed: 16 additions & 14 deletions
@@ -7,12 +7,12 @@ source ../../actions_helpers.sh
 source ./shared.sh
 
 run_assessment() {
-command -v claude >/dev/null || { log_error "claude CLI not found on PATH"; return 1; }
+require_command claude
 local prompt_file="${PROMPT_FILE:?PROMPT_FILE must be set}"
-local model="${INPUT_MODEL:-claude-opus-4-6}"
+local model="${INPUT_MODEL:?INPUT_MODEL must be set}"
 local output_file="$AUTOSOLVE_TMPDIR/assessment.json"
 
-echo "Running assessment with model: $model"
+log_notice "Running assessment with model: $model"
 
 local exit_code=0
 claude --print \
@@ -27,6 +27,8 @@ run_assessment() {
 fi
 
 local result_text
+# extract_result returns non-zero when the marker isn't found; prevent
+# set -e from exiting so we can handle missing results below.
 result_text="$(extract_result "$output_file" "ASSESSMENT_RESULT")" || true
 
 if [ -z "$result_text" ]; then
@@ -35,11 +37,14 @@ run_assessment() {
 return 1
 fi
 
+# Log the full assessment result so it appears in the action run logs.
+log_info "$result_text"
+
 if echo "$result_text" | grep --quiet "ASSESSMENT_RESULT - PROCEED"; then
-echo "Assessment: PROCEED"
+log_notice "Assessment: PROCEED"
 set_output "assessment" "PROCEED"
 elif echo "$result_text" | grep --quiet "ASSESSMENT_RESULT - SKIP"; then
-echo "Assessment: SKIP"
+log_notice "Assessment: SKIP"
 set_output "assessment" "SKIP"
 else
 log_error "Assessment result did not contain a valid PROCEED or SKIP marker"
@@ -60,18 +65,15 @@ set_assess_outputs() {
 
 # Extract summary: everything before the ASSESSMENT_RESULT line
 local summary
-summary="$(echo "$result_text" | sed '/^ASSESSMENT_RESULT/d' | head -50)"
+summary="$(truncate_output 200 "$(echo "$result_text" | sed '/^ASSESSMENT_RESULT/d')")"
 
 set_output "assessment" "$assessment"
 set_output_multiline "summary" "$summary"
 set_output_multiline "result" "$result_text"
 
-{
-echo "## Autosolve Assessment"
-echo "**Result:** $assessment"
-if [ -n "$summary" ]; then
-echo "### Summary"
-echo "$summary"
-fi
-} >> "${GITHUB_STEP_SUMMARY:-/dev/null}"
+write_step_summary <<EOF
+## Autosolve Assessment
+**Result:** $assessment
+$([ -n "$summary" ] && printf '### Summary\n%s' "$summary")
+EOF
 }
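
The `|| true` guard that the new comment explains can be seen in isolation in the sketch below. The `extract_result_sketch` function is a hypothetical stand-in: the real `extract_result` lives in `shared.sh` and is not part of this diff. The stand-in simply greps for a marker and fails when the marker is absent.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical stand-in for extract_result: print lines containing the
# marker, returning non-zero when none are found (grep's exit status).
extract_result_sketch() {
  local file="$1" marker="$2"
  grep "$marker" "$file"
}

# Prints the marker line, or "missing" when the marker is absent.
read_marker() {
  local file="$1" result_text
  # Without `|| true`, a failed command substitution in a plain assignment
  # would abort the script under `set -e` before the check below runs.
  result_text="$(extract_result_sketch "$file" "ASSESSMENT_RESULT")" || true
  if [ -z "$result_text" ]; then
    echo "missing"
    return 0
  fi
  echo "$result_text"
}
```

Declaring `result_text` with `local` on a separate line from the assignment matters here: `local var="$(cmd)"` would report the exit status of `local` itself and hide the failure either way.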

autosolve/scripts/assess_test.sh

Lines changed: 5 additions & 6 deletions
@@ -2,7 +2,6 @@
 # Tests for assess.sh functions.
 # shellcheck disable=SC2034 # Variables are read by sourced functions
 set -euo pipefail
-trap 'echo "Error occurred at line $LINENO"; exit 1' ERR
 
 cd "$(dirname "${BASH_SOURCE[0]}")"
 source ../../test_helpers.sh
@@ -23,7 +22,7 @@ test_outputs_proceed() {
 printf 'The task is clear and bounded.\nASSESSMENT_RESULT - PROCEED\n' > "$AUTOSOLVE_TMPDIR/assessment_result.txt"
 ASSESS_RESULT=PROCEED
 set_assess_outputs
-grep -q 'assessment=PROCEED' "$GITHUB_OUTPUT"
+check_contains 'assessment=PROCEED' "$GITHUB_OUTPUT"
 }
 expect_success "set_assess_outputs: PROCEED" test_outputs_proceed
 
@@ -33,7 +32,7 @@ test_outputs_skip() {
 printf 'Too ambiguous for automation.\nASSESSMENT_RESULT - SKIP\n' > "$AUTOSOLVE_TMPDIR/assessment_result.txt"
 ASSESS_RESULT=SKIP
 set_assess_outputs
-grep -q 'assessment=SKIP' "$GITHUB_OUTPUT"
+check_contains 'assessment=SKIP' "$GITHUB_OUTPUT"
 }
 expect_success "set_assess_outputs: SKIP" test_outputs_skip
 
@@ -46,7 +45,7 @@ test_outputs_summary_strips_marker() {
 # Extract just the summary block (between summary<<DELIM and DELIM) and verify marker is absent
 local summary
 summary=$(sed -n '/^summary<</,/^GHEOF_/p' "$GITHUB_OUTPUT")
-echo "$summary" | grep -q 'This is the reasoning' && ! echo "$summary" | grep -q 'ASSESSMENT_RESULT'
+echo "$summary" | check_contains 'This is the reasoning' && ! echo "$summary" | check_contains 'ASSESSMENT_RESULT'
 }
 expect_success "set_assess_outputs: summary strips marker" test_outputs_summary_strips_marker
 
@@ -56,8 +55,8 @@ test_outputs_no_result_file() {
 rm -f "$AUTOSOLVE_TMPDIR/assessment_result.txt"
 ASSESS_RESULT=ERROR
 set_assess_outputs
-grep -q 'assessment=ERROR' "$GITHUB_OUTPUT"
+check_contains 'assessment=ERROR' "$GITHUB_OUTPUT"
 }
-expect_success "set_assess_outputs: no result file" test_outputs_no_result_file
+expect_success "set_assess_outputs: no result file results in ERROR" test_outputs_no_result_file
 
 print_results
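
The updated tests call `check_contains` both with a file argument and on piped input. The helper itself is defined in `test_helpers.sh`, which this diff does not show, so the following is only a guess at a helper that would support both call shapes. The name `check_contains_sketch` and its failure message are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the real check_contains from test_helpers.sh.
# Usage: check_contains_sketch PATTERN [FILE]   (reads stdin without FILE)
check_contains_sketch() {
  local pattern="$1"
  if [ "$#" -ge 2 ]; then
    # File form: search the named file.
    grep -q -- "$pattern" "$2" && return 0
  else
    # Pipe form: search standard input.
    grep -q -- "$pattern" && return 0
  fi
  echo "expected to find '$pattern'" >&2
  return 1
}
```

One plausible motivation for a wrapper like this over bare `grep -q` is that a failed assertion can print what it was looking for, which makes test failures easier to read.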
