Prevent silent regression of #17527: pin cascade + if/else ordering, add GA ship-build guard#17557
Draft
radical wants to merge 3 commits into
Draft
Conversation
…nel.ps1 PR microsoft#17528 fixed microsoft#17527 by reordering the channel-detect cascade in `eng/pipelines/templates/build_sign_native.yml` so a release-branch build with `DotNetFinalVersionKind=release` (the steady state during stabilization) resolves to `staging` rather than `stable`. That reorder is the single most load-bearing line in the fix, and it lived as inline `pwsh:` with zero test coverage. A future maintainer reordering the arms back to "obvious" order would re-introduce the staging-misroute and only the next stabilizing build would notice. Extract the cascade into `eng/scripts/compute-cli-channel.ps1` with the same parameters the YAML step previously bound (`-Reason`, `-SourceBranch`, `-PrNumber`, `-Override`, `-VersionKind`). The YAML step keeps the `DotNetFinalVersionKind` resolution (which needs the in-tree `dotnet` + `Versions.props`) and forwards everything else to the script. The script preserves the existing `Write-Host` diagnostics and the `##vso[task.setvariable variable=aspireCliChannel]` emission so the rest of `build_sign_native.yml` is unaffected; the resolved channel additionally shows up on stdout via `Write-Output` for non-AzDO callers. Add `tests/Infrastructure.Tests/PowerShellScripts/ComputeCliChannelTests.cs` with a Theory-based set of cases that reuses the existing `PowerShellCommand` helper. Cascade cases covered: - release-branch + versionKind=release → `staging` (the microsoft#17527 fix-pin) - release-branch + versionKind=prerelease → `staging` (early stabilization) - internal/release branch → `staging` - main + versionKind=release → `stable` (unchanged) - main + versionKind=prerelease → `daily` (unchanged) - override `stable` beats the release-branch arm (the GA ship-build path) - override `staging` and `daily` route as requested - override `auto` falls through to the cascade - PullRequest with numeric `prNumber` → `pr-<N>` Negative cases: - PullRequest with the unresolved `$(System.PullRequest.PullRequestNumber)` macro string → throws (defense-in-depth that was previously untested) - override `garbage` → throws (the `-notin` validation arm) - override `Stable` (mixed case) → normalized to `stable` (pins the `.ToLowerInvariant()` workaround for PowerShell's case-insensitive `-notin`; without it, the binary would build but `IdentityChannelReader.IsValidChannel` would reject it at CLI startup) No behavior change for the source build itself: the script's output (both the `##vso[task.setvariable]` and the human-readable diagnostics) matches the previous inline block byte-for-byte. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…icrosoft#17527 regression The fix that landed in PR microsoft#17528 reorders the `if/else` in `PackagingService.GetChannelsAsync` so the `stagingIdentityChannel` arm runs before `stagingChannelConfigured || stagingChannelRequested`. The new test added in that PR — `GetChannelsAsync_WhenIdentityChannelIsStagingOnStableShapedCli_DefaultsToStableQuality` — calls `GetChannelsAsync()` with no `requestedChannelName`, so a future refactor that put `stagingChannelRequested` first (and forced `Both`) would still pass that test, silently re-introducing the microsoft#17527 misroute on the combinations that actually occur in production: - `InitCommand.cs:671` — `GetChannelsAsync(ct, identityChannel)` on a staging CLI running `aspire init`. - `IntegrationPackageSearchService.cs:26` (used by `aspire add`) and `DotNetBasedAppHostServerProject.cs:334` once a project's aspire.config.json pins `channel: staging`. Add three tests that exercise both signals together: - identity=staging + requested=staging + stable-shaped → `Stable`. This is the explicit ordering pin: if `stagingChannelRequested` ran first the result would be `Both` and the test fails. - identity=staging + requested=staging + prerelease-shaped → `Both`. The companion case asserts the identity arm itself consults `_isStableShapedCliVersion`; a refactor that dropped the predicate from the identity arm would flip this prerelease case from `Both` to `Stable` and silently misroute Aspire.* away from the shared daily feed where prerelease-shaped staging packages actually live. - identity=staging + `configuration["channel"]=staging` + stable-shaped → `Stable`. Same shape but for the configured-arm path, which is what `aspire add` against a staging-pinned project hits. No production code changes. Verified by running just the new tests: `dotnet test --project tests/Aspire.Cli.Tests/Aspire.Cli.Tests.csproj --no-launch-profile -- --filter-class "*.PackagingServiceTests" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"` → 53/53 pass (50 existing + 3 new). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After PR microsoft#17528 reordered the channel-detect cascade so release-branch source builds default to `AspireCliChannel=staging`, the GA ship path silently depends on the release manager remembering to queue `microsoft-aspire` (definition 1602) with `aspireCliChannelOverride=stable`. Following the release docs at face value still selects "any successful signed build", so a single missed override at queue time publishes a `staging`-baked CLI binary to nuget.org — `aspire init` then writes a nuget.config that maps Aspire.* to the staging feed for end users, and `aspire add` 404s once the SHA-derived darc-pub feed for that build expires. microsoft#17527 is the underlying issue. Add a release-pipeline guard that fails fast if the selected source build is not stable-channel. Mechanism, three pieces: - `eng/scripts/compute-cli-channel.ps1` additionally emits `##vso[build.addbuildtag]aspire-cli-channel - <channel>` so the source build's tag set carries the resolved channel. The tag shape mirrors the existing `release-version - X.Y.Z` and `BAR ID - <id>` tags so the consumer can reuse the same tag-fetch REST call. `build.addbuildtag` is idempotent across jobs in the same build, so per-RID `build_sign_native` invocations all setting the same tag is safe — AzDO dedupes them on the build. - `eng/pipelines/release-publish-nuget.yml` adds a `Validate Source Build CLI Channel` step right after the BAR ID extraction. It fetches the source build's tags, finds the `aspire-cli-channel - *` tag, and fails the pipeline unless the resolved channel is `stable`. The step is gated by a new `AllowNonStableCliChannel` advanced parameter (default `false`) — the documented escape hatch for an intentional non-stable ship. `DryRun` also bypasses the guard so the pipeline remains testable against any source build. Error messages name the cause (missing tag vs. non-stable channel), point at the `microsoft-aspire` definition and the `aspireCliChannelOverride=stable` queue-time parameter, and tell the operator how to bypass. - `docs/release-process.md` updates: the Signed Build prerequisite highlights the override requirement; the source-build selection step tells the release manager to verify the `aspire-cli-channel - stable` tag at pick time; the Advanced parameters table documents `AllowNonStableCliChannel`. Test coverage in `ComputeCliChannelTests.cs` extends the Theory-based output assertions to also pin the `##vso[build.addbuildtag]aspire-cli-channel - <channel>` emission, so a future refactor that drops it would break the test suite rather than silently disabling the pipeline guard. 13.5+ planning: this guard is a 13.4 safety net for the same risk that microsoft#17550 is scoped to remove entirely by decoupling channel identity from build-time baking via a sidecar manifest written at acquisition time. Once microsoft#17550 lands, the queue-time override and this guard become moot and can be removed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17557Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17557" |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Three follow-ups to PR #17528 (which fixed #17527). Each addresses an inline review thread on that PR; together they make a silent regression of the staging-feed misroute hard to re-introduce, and add a release-pipeline safety net for the related GA-ship foot-gun.
What was at risk
PR #17528 closed #17527 by reordering the
release/*-branch arm of the AspireCliChannel cascade ahead of theversionKind=releasearm ineng/pipelines/templates/build_sign_native.yml, and by routing the synthesized staging channel's quality through the CLI build's version shape on thestagingIdentityChannelarm ofPackagingService.GetChannelsAsync. The fix is correct, but it leaves three load-bearing invariants with no test coverage and one queue-time foot-gun on the GA ship path:That error block is what the GA ship path now raises if the operator forgets
aspireCliChannelOverride=stableat queue time. Before this PR, the same condition silently shipped a staging-baked CLI to nuget.org users.What this PR adds
Three independent commits, each landable on its own:
1.
refactor(cli): extract Aspire CLI channel cascade to compute-cli-channel.ps1The cascade lived as inline
pwsh:inbuild_sign_native.ymlwith zero test coverage. A future maintainer reordering the arms back to "obvious" order would re-introduce #17527 and only the next stabilizing build would notice.Moves the cascade to
eng/scripts/compute-cli-channel.ps1with-Reason,-SourceBranch,-PrNumber,-Override,-VersionKindparameters. The YAML step retains theDotNetFinalVersionKindMSBuild resolution (it needs the in-tree dotnet +Versions.props) and forwards the rest to the script.Write-Hostdiagnostics and the##vso[task.setvariable variable=aspireCliChannel]emission are preserved verbatim; a newWrite-Output $channelline gives non-AzDO callers (tests, dev runs) a stable stdout contract.Adds
tests/Infrastructure.Tests/PowerShellScripts/ComputeCliChannelTests.cs(13 cases, Theory-based, reuses the existingPowerShellCommandhelper thatStageNativeCliToolPackagesTestsetc. already use). Cases pin every cascade arm including the #17527 fix-pin, the override path (includingStable-mixed-case normalization), and boththrowpaths (non-digit PR number, invalid override).2.
test(cli): pin PackagingService identity+requested ordering against #17527 regressionThe new test added in #17528 (
GetChannelsAsync_WhenIdentityChannelIsStagingOnStableShapedCli_DefaultsToStableQuality) callsGetChannelsAsync()with norequestedChannelName, so a future refactor that putstagingChannelRequestedfirst would still pass it — silently re-introducing the misroute for the combinations that actually occur in production:InitCommand.cs:671—GetChannelsAsync(ct, identityChannel)on a staging CLI runningaspire initIntegrationPackageSearchService.cs:26andDotNetBasedAppHostServerProject.cs:334onceaspire.config.jsonpinschannel: stagingAdds three tests that exercise both signals together:
identity=staging + requested=stagingfor both stable-shaped (expectStable, pins ordering) and prerelease-shaped (expectBoth, pins the predicate) version shapes, plusidentity=staging + configuration["channel"]=staging + stable-shaped(pins the configured-arm path thataspire addhits). No production code change.3.
ci(release): guard GA ship build against AspireCliChannel != stableAfter #17528,
release/*source builds default toAspireCliChannel=staging. The GA ship path silently depends on the release manager queueingmicrosoft-aspire(definition 1602) withaspireCliChannelOverride=stable. Without this guard, one missed queue-time parameter ships astaging-baked CLI to nuget.org andaspire initwrites a staging-feed nuget.config for every GA user.Three pieces:
eng/scripts/compute-cli-channel.ps1additionally emits##vso[build.addbuildtag]aspire-cli-channel - <channel>so the source build's tag set carries the resolved channel. Tag shape mirrors the existingrelease-version - X.Y.ZandBAR ID - <id>tags so the consumer reuses the same tag-fetch REST call.build.addbuildtagis idempotent across jobs, so the per-RIDbuild_sign_nativeinvocations setting the same tag is safe.eng/pipelines/release-publish-nuget.ymladds aValidate Source Build CLI Channelstep right after BAR ID extraction. Fetches the source build's tags, findsaspire-cli-channel - *, and fails unless the resolved channel isstable. Gated by a newAllowNonStableCliChanneladvanced parameter (defaultfalse) — the documented escape hatch.DryRunalso bypasses the guard so the pipeline remains testable against any source build. Error messages name the cause, point at the right pipeline + queue-time override, and tell the operator how to bypass.docs/release-process.mdupdates: the Signed Build prerequisite highlightsaspireCliChannelOverride=stable; the source-build selection step tells the release manager to verifyaspire-cli-channel - stableat pick time; the Advanced parameters table documentsAllowNonStableCliChannel.The
ComputeCliChannelTestsTheory assertions extend to also pin the##vso[build.addbuildtag]emission, so a future refactor that drops it would break the test suite rather than silently disabling the pipeline guard.Surprises and call-outs
Builds queued before this PR lands will fail the guard because they have no
aspire-cli-channel - *tag. That's intentional — the guard's error message tells the operator to queue a fresh source build with the override. For the 13.4 GA, you'll want a fresh source build off this branch.The whole queue-time-override mechanism is intended to be temporary. Issue #17550 is scoped to remove it for 13.5+ by decoupling channel identity from build-time baking via a sidecar manifest written at acquisition time. When that lands, both the
aspireCliChannelOverrideparameter and theValidate Source Build CLI Channelguard become moot and should be removed together.No source-build behavior change in Items 1 and 2. Item 1 is a pure refactor (verified by smoke-testing every cascade case locally); Item 2 is pure test addition. The only source-build behavior change is in Item 3, where
compute-cli-channel.ps1adds theaddbuildtagemission — that's additive on the source build and a no-op for any caller other than the new release-publish-nuget guard.Three logical commits, one PR. Items are independently landable but bundled here for review economy. If reviewers prefer to ship Items 1 + 2 first and Item 3 separately, the commits are clean and can be split.
Validation
Also smoke-tested
compute-cli-channel.ps1locally against every cascade arm including both throw paths and the mixed-case override.