|
1 | 1 | --- |
2 | 2 | name: pr-build-status |
3 | | -description: "Retrieve Azure DevOps build information for GitHub Pull Requests, including build IDs, stage status, and failed jobs." |
| 3 | +description: "Retrieve and analyze Azure DevOps build failures for GitHub PRs. Use when CI fails. CRITICAL: Collect ALL errors from ALL platforms FIRST, write hypotheses to file, then fix systematically." |
4 | 4 | metadata: |
5 | 5 | author: dotnet-maui |
6 | | - version: "1.0" |
| 6 | + version: "2.0" |
7 | 7 | compatibility: Requires GitHub CLI (gh) authenticated with access to dotnet/fsharp repository. |
8 | 8 | --- |
9 | 9 |
|
10 | 10 | # PR Build Status Skill |
11 | 11 |
|
12 | | -Retrieve Azure DevOps build information for GitHub Pull Requests. |
| 12 | +Retrieve and systematically analyze Azure DevOps build failures for GitHub PRs. |
13 | 13 |
|
14 | | -## Tools Required |
| 14 | +## CRITICAL: Collect-First Workflow |
15 | 15 |
|
16 | | -This skill uses `bash` together with `pwsh` (PowerShell 7+) to run the PowerShell scripts. No file editing or other tools are required. |
| 16 | +**DO NOT push fixes until ALL errors are collected and reproduced locally.** |
17 | 17 |
|
18 | | -## When to Use |
| 18 | +LLMs tend to focus on the first error found and ignore others. This causes: |
| 19 | +- Multiple push/wait/fail cycles |
| 20 | +- CI results being overwritten before full analysis |
| 21 | +- Missing platform-specific failures (Linux vs Windows vs MacOS) |
19 | 22 |
|
20 | | -- User asks about CI/CD status for a PR |
21 | | -- User asks about failed checks or builds |
22 | | -- User asks "what's failing on PR #XXXXX" |
23 | | -- User wants to see test results |
| 23 | +### Mandatory Workflow |
| 24 | + |
| 25 | +``` |
| 26 | +1. COLLECT ALL → Get errors from ALL jobs across ALL platforms |
| 27 | +2. DOCUMENT → Write CI_ERRORS.md with hypotheses per platform |
| 28 | +3. REPRODUCE → Run each failing test LOCALLY (in isolation!) |
| 29 | +4. FIX → Fix each issue, verify locally |
| 30 | +5. PUSH → Only after ALL issues verified fixed |
| 31 | +``` |
24 | 32 |
|
25 | 33 | ## Scripts |
26 | 34 |
|
27 | 35 | All scripts are in `.github/skills/pr-build-status/scripts/` |
28 | 36 |
|
29 | 37 | ### 1. Get Build IDs for a PR |
30 | | -```bash |
| 38 | +```powershell |
31 | 39 | pwsh .github/skills/pr-build-status/scripts/Get-PrBuildIds.ps1 -PrNumber <PR_NUMBER> |
32 | 40 | ``` |
33 | 41 |
|
34 | | -### 2. Get Build Status |
35 | | -```bash |
| 42 | +### 2. Get Build Status (List ALL Failed Jobs) |
| 43 | +```powershell |
| 44 | +# Get overview of all stages and jobs |
36 | 45 | pwsh .github/skills/pr-build-status/scripts/Get-BuildInfo.ps1 -BuildId <BUILD_ID> |
37 | | -# For failed jobs only: |
| 46 | +
|
| 47 | +# Get ONLY failed jobs (use this to see all failing platforms) |
38 | 48 | pwsh .github/skills/pr-build-status/scripts/Get-BuildInfo.ps1 -BuildId <BUILD_ID> -FailedOnly |
39 | 49 | ``` |
40 | 50 |
|
41 | 51 | ### 3. Get Build Errors and Test Failures |
42 | | -```bash |
43 | | -# Get all errors (build errors + test failures) |
| 52 | +```powershell |
| 53 | +# Get ALL errors (build errors + test failures) - USE THIS FIRST |
44 | 54 | pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId <BUILD_ID> |
45 | 55 |
|
46 | | -# Get only build/compilation errors |
47 | | -pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId <BUILD_ID> -ErrorsOnly |
| 56 | +# Filter to specific job (after getting overview) |
| 57 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId <BUILD_ID> -JobFilter "*Linux*" |
| 58 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId <BUILD_ID> -JobFilter "*Windows*" |
| 59 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId <BUILD_ID> -JobFilter "*MacOS*" |
| 60 | +``` |
| 61 | + |
| 62 | +### 4. Direct API Access (for detailed logs) |
| 63 | +```powershell |
| 64 | +# Get timeline with all jobs |
| 65 | +$uri = "https://dev.azure.com/dnceng-public/public/_apis/build/builds/<BUILD_ID>/timeline?api-version=7.1" |
| 66 | +Invoke-RestMethod -Uri $uri | Select-Object -ExpandProperty records | Where-Object { $_.result -eq "failed" } |
48 | 67 |
|
49 | | -# Get only test failures |
50 | | -pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId <BUILD_ID> -TestsOnly |
| 68 | +# Get specific log content |
| 69 | +$logUri = "https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_apis/build/builds/<BUILD_ID>/logs/<LOG_ID>" |
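| 70 | +# Split the plain-text log into lines so Select-String returns only the matching lines
| 70 | +(Invoke-RestMethod -Uri $logUri) -split "\r?\n" | Select-String "Failed|Error|FAIL"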
51 | 71 | ``` |
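
One way to find a value for `<LOG_ID>` is to read it from the timeline records. The sketch below assumes each record carries a `log` object with an `id` field, as the Azure DevOps timeline response normally does. `Get-FailedLogId` is a hypothetical helper name, not one of this skill's scripts.

```powershell
# Hypothetical helper: list the log IDs of failed timeline records.
# Assumption: each record exposes .name, .result and an optional .log.id.
function Get-FailedLogId {
    param(
        [Parameter(Mandatory)] [object[]] $Records
    )
    $Records |
        Where-Object { $_.result -eq 'failed' -and $_.log } |
        ForEach-Object {
            [pscustomobject]@{ Name = $_.name; LogId = $_.log.id }
        }
}

# Usage (same timeline URI as above):
# $records = (Invoke-RestMethod -Uri $uri).records
# Get-FailedLogId -Records $records
```

Records without a `log` entry are skipped, so every row returned can be plugged into the log URL.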
52 | 72 |
|
53 | | -## Workflow |
| 73 | +## Step-by-Step Analysis Procedure |
| 74 | + |
| 75 | +### Step 1: Get Failed Build ID |
| 76 | +```powershell |
| 77 | +pwsh .github/skills/pr-build-status/scripts/Get-PrBuildIds.ps1 -PrNumber XXXXX |
| 78 | +# Note the BuildId of the build in the FAILED state
| 79 | +``` |
54 | 80 |
|
55 | | -1. Get build IDs: `scripts/Get-PrBuildIds.ps1 -PrNumber XXXXX` |
56 | | -2. For each build, get status: `scripts/Get-BuildInfo.ps1 -BuildId YYYYY` |
57 | | -3. For failed builds, get error details: `scripts/Get-BuildErrors.ps1 -BuildId YYYYY` |
| 81 | +### Step 2: List ALL Failed Jobs (Cross-Platform!) |
| 82 | +```powershell |
| 83 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildInfo.ps1 -BuildId YYYYY -FailedOnly |
| 84 | +``` |
| 85 | +**IMPORTANT**: Note jobs from EACH platform: |
| 86 | +- Linux jobs |
| 87 | +- Windows jobs |
| 88 | +- MacOS jobs |
| 89 | +- Different test configurations (net10.0 vs net472, etc.) |
| 90 | + |
| 91 | +### Step 3: Get Errors Per Platform |
| 92 | +```powershell |
| 93 | +# Collect errors from EACH platform separately |
| 94 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId YYYYY -JobFilter "*Linux*" |
| 95 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId YYYYY -JobFilter "*Windows*" |
| 96 | +pwsh .github/skills/pr-build-status/scripts/Get-BuildErrors.ps1 -BuildId YYYYY -JobFilter "*MacOS*" |
| 97 | +``` |
| 98 | + |
| 99 | +### Step 4: Write CI_ERRORS.md |
| 100 | +Create a file in the session workspace with ALL findings:
| 101 | +```markdown |
| 102 | +# CI Errors for PR #XXXXX - Build YYYYY |
| 103 | + |
| 104 | +## Failed Jobs Summary |
| 105 | +| Platform | Job Name | Error Type | |
| 106 | +|----------|----------|------------| |
| 107 | +| Linux | ... | Test | |
| 108 | +| Windows | ... | Test | |
| 109 | + |
| 110 | +## Hypothesis Per Platform |
| 111 | + |
| 112 | +### Linux/MacOS Failures |
| 113 | +- Error: "The type 'int' is not defined" |
| 114 | +- Hypothesis: Missing FSharp.Core reference in test setup |
| 115 | +- Reproduction: `dotnet test ... -f net10.0` |
| 116 | + |
| 117 | +### Windows Failures |
| 118 | +- Error: "Expected cache hits for generic patterns" |
| 119 | +- Hypothesis: Flaky test assertion; passes when run together with other tests
| 120 | +- Reproduction: `dotnet test ... --filter "FullyQualifiedName~rigid generic"` |
| 121 | + |
| 122 | +## Reproduction Commands |
| 123 | +... |
| 124 | + |
| 125 | +## Fix Verification Checklist |
| 126 | +- [ ] Linux error reproduced locally |
| 127 | +- [ ] Windows error reproduced locally |
| 128 | +- [ ] Fix verified for Linux |
| 129 | +- [ ] Fix verified for Windows |
| 130 | +- [ ] Tests run IN ISOLATION (not just with other tests) |
| 131 | +``` |
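
The "Failed Jobs Summary" rows can be drafted from the timeline records rather than typed by hand. This is a minimal sketch, assuming records have `type`, `name` and `result` fields and that the platform can be guessed from the job name. `ConvertTo-FailedJobRow` is a hypothetical helper, and the error type column is left as `...` to fill in after reading the logs.

```powershell
# Hypothetical helper: turn failed job records into Markdown table rows.
function ConvertTo-FailedJobRow {
    param(
        [Parameter(Mandatory)] [object[]] $Records
    )
    $Records |
        Where-Object { $_.type -eq 'Job' -and $_.result -eq 'failed' } |
        ForEach-Object {
            $record = $_
            # Assumption: the job name mentions its platform.
            $platform = switch -Regex ($record.name) {
                'Linux'   { 'Linux'; break }
                'Windows' { 'Windows'; break }
                'MacOS'   { 'MacOS'; break }
                default   { 'Unknown' }
            }
            "| $platform | $($record.name) | ... |"
        }
}

# Usage:
# $records = (Invoke-RestMethod -Uri $uri).records
# ConvertTo-FailedJobRow -Records $records
```

Jobs whose names match none of the patterns are reported as `Unknown` so they are not silently dropped from the summary.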
| 132 | + |
| 133 | +### Step 5: Reproduce Locally BEFORE Fixing |
| 134 | +```powershell |
| 135 | +# Run failing tests IN ISOLATION (critical!) |
| 136 | +dotnet test ... --filter "FullyQualifiedName~FailingTestName" -f net10.0 |
| 137 | +
|
| 138 | +# Run multiple times to check for flakiness |
| 139 | +for ($i = 1; $i -le 3; $i++) { dotnet test ... } |
| 140 | +``` |
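
The repeat loop above shows the output but does not count how many runs failed. A small sketch, assuming the command under test signals failure through a non-zero exit code (as `dotnet test` does when tests fail). `Get-FailureCount` is a hypothetical helper name.

```powershell
# Hypothetical helper: run a command several times and count the failing runs.
function Get-FailureCount {
    param(
        [Parameter(Mandatory)] [scriptblock] $Command,
        [int] $Runs = 3
    )
    $failures = 0
    for ($i = 1; $i -le $Runs; $i++) {
        # Show the command's output, but keep it out of the return value.
        & $Command | Out-Host
        if ($LASTEXITCODE -ne 0) { $failures++ }
    }
    return $failures
}

# Usage (the "..." placeholder is the same elided project path as above):
# Get-FailureCount -Runs 3 -Command { dotnet test ... --filter "FullyQualifiedName~FailingTestName" -f net10.0 }
```

A result between 0 and the number of runs suggests flakiness. A result equal to the number of runs suggests a deterministic failure.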
| 141 | + |
| 142 | +### Step 6: Fix and Verify |
| 143 | +Only after ALL issues are reproduced:
| 144 | +1. Fix each issue |
| 145 | +2. Verify each fix locally (run test in isolation!) |
| 146 | +3. Run full test suite |
| 147 | +4. Check formatting |
| 148 | +5. THEN push |
| 149 | + |
| 150 | +## Common Pitfalls |
| 151 | + |
| 152 | +### ❌ Mistake: Focus on First Error Only |
| 153 | +``` |
| 154 | +See Linux error → Fix → Push → Wait → See Windows error → Fix → Push → ... |
| 155 | +``` |
| 156 | + |
| 157 | +### ✅ Correct: Collect All First |
| 158 | +``` |
| 159 | +See Linux error → See Windows error → See MacOS error → Document all → |
| 160 | +Fix all → Verify all locally → Push once |
| 161 | +``` |
| 162 | + |
| 163 | +### ❌ Mistake: Run Tests Together |
| 164 | +``` |
| 165 | +dotnet test ... --filter "OverloadCacheTests" # All 8 pass together |
| 166 | +``` |
| 167 | + |
| 168 | +### ✅ Correct: Run Tests in Isolation |
| 169 | +``` |
| 170 | +dotnet test ... --filter "FullyQualifiedName~specific test name" # May fail alone! |
| 171 | +``` |
58 | 172 |
|
59 | 173 | ## Prerequisites |
60 | 174 |
|
61 | 175 | - `gh` (GitHub CLI) - authenticated |
62 | | -- `pwsh` (PowerShell 7+) |
| 176 | +- `pwsh` (PowerShell 7+) |
| 177 | +- Local build environment matching CI |