Fix flaky `CanRunOnIdleTask` by polling instead of sleeping by andyleejordan · Pull Request #2314 · PowerShell/PowerShellEditorServices

andyleejordan · 2026-06-16T22:24:42Z

Summary

`CanRunOnIdleTask` (and its twin `CanRunOnIdleInProfileTask`) in
`PsesInternalHostTests.cs` were intermittently failing on the net462 (Windows
PowerShell 5.1) CI leg. `CanRunOnIdleTask` was just caught failing on PR #2298's
Windows job (`Assert.Collection()` … `Assert.False`/`Assert.True` at line ~200). Both
tests carried a `// TODO: Why is this racy?` above a hard-coded `Thread.Sleep(2000)`.

Root cause

`PsesInternalHost.OnPowerShellIdle` calls
`_mainRunspaceEngineIntrinsics.Events.GenerateEvent(PSEngineEvent.OnIdle, ...)`,
which only enqueues the OnIdle event. For a subscriber registered with
`-Action { ... }` (exactly what these tests register), PowerShell does not run
the action scriptblock inline at `GenerateEvent` time. Instead, it becomes a pending
action that the engine's event manager dispatches asynchronously, on the
pipeline thread, at the next process-pending-actions point (around subsequent
pipeline invocations). `OnPowerShellIdle` then runs a tiny artificial pipeline
(…param() 0) to nudge event processing, but the action's actual execution is
never synchronized with that pipeline returning or with the test's later
`$handled` read.

So the fixed `Thread.Sleep(2000)` was only a timing guess. On a fast runner the
action finishes first; on the slower WinPS leg 2 seconds is sometimes not enough,
leaving `$global:handled` still `$false` at the assertion, hence the intermittent
failure.

The key realization: each additional pipeline execution gives the engine
another chance to drain the pending action, so re-reading the handler variable in
a loop both waits for and drives completion. That makes polling a real
deterministic fix rather than just a longer sleep.

Change

- Add a shared `WaitForHandledAsync(psesHost, variableName)` helper that polls the
  handler variable via `ExecutePSCommandAsync<bool>` until it reports `$true`
  (~200 ms between polls, ~15 s ceiling via `CancellationTokenSource`). On timeout
  it returns the last observed value so the existing
  `Assert.Collection(handled, Assert.True)` still fails loudly.
- Use it in `CanRunOnIdleTask` (`$handled`) and `CanRunOnIdleInProfileTask`
  (`$handledInProfile`), replacing both `Thread.Sleep(2000)` calls and removing the
  `// TODO: Why is this racy?` comments. Registration, the pre-assert
  (`Assert.False`), and the `OnPowerShellIdle` delegate call are unchanged.

No production code or test-profile script changes — test file only.
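
For readers who want the shape of the polling loop without opening the diff, here is a minimal, self-contained sketch of the technique described above. It is not the PR's actual `WaitForHandledAsync`: the names `PollingSketch`, `WaitForTrueAsync`, and `ReadHandled` are hypothetical, and the host is replaced by a plain delegate so the snippet runs on its own. In the real tests that delegate would presumably wrap the `ExecutePSCommandAsync<bool>` read of `$handled`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the host (assumption): the variable reads true on the third read.
int reads = 0;
Task<IReadOnlyList<bool>> ReadHandled() =>
    Task.FromResult<IReadOnlyList<bool>>(new[] { ++reads >= 3 });

IReadOnlyList<bool> handled = await PollingSketch.WaitForTrueAsync(
    ReadHandled, TimeSpan.FromMilliseconds(10), TimeSpan.FromSeconds(1));
Console.WriteLine($"handled={handled[0]} after {reads} reads");
// prints "handled=True after 3 reads"

// Hypothetical helper, illustrating poll-until-true with a ceiling.
public static class PollingSketch
{
    public static async Task<IReadOnlyList<bool>> WaitForTrueAsync(
        Func<Task<IReadOnlyList<bool>>> read,
        TimeSpan interval,
        TimeSpan ceiling)
    {
        // The ceiling only bounds the waiting between polls; a read that has
        // already started is allowed to finish.
        using CancellationTokenSource cts = new(ceiling);

        IReadOnlyList<bool> last = await read().ConfigureAwait(false);
        while (!(last.Count == 1 && last[0]))
        {
            try
            {
                await Task.Delay(interval, cts.Token).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                // Ceiling reached: return the last observation so the
                // caller's assertion fails with a real value.
                break;
            }

            last = await read().ConfigureAwait(false);
        }

        return last;
    }
}
```

Returning the last observed list, rather than throwing on timeout, is what lets an unchanged `Assert.Collection(handled, Assert.True)` report the failure in the usual way.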

Validation

- Built and ran both tests on net8.0
  (`Invoke-Build TestPS74 -TestFilter 'FullyQualifiedName~CanRunOnIdle'`): green
  across repeated runs, ~0.4 s each vs. the old fixed 2 s.
- net462 can't run on macOS, but the dispatch mechanism is identical across
  targets; only latency differs. The 15 s ceiling is ~7.5× the old 2 s window and
  self-terminates on success, so it's strictly safer on the slow leg without
  slowing the fast one.

`CanRunOnIdleTask` (and its twin `CanRunOnIdleInProfileTask`) were flaky on the net462 (Windows PowerShell 5.1) CI leg; the former was just caught failing on PR #2298's Windows job.

The root cause is that `PsesInternalHost.OnPowerShellIdle` calls `Events.GenerateEvent(PSEngineEvent.OnIdle, ...)`, which only *enqueues* the event. For a subscriber registered with `-Action {...}`, PowerShell doesn't run the action scriptblock inline; it becomes a pending action that the engine dispatches asynchronously on the pipeline thread, around subsequent pipeline invocations. So the action's execution was never synchronized with the test's `$handled` read, and the fixed `Thread.Sleep(2000)` was just a timing guess: sometimes too short on the slower WinPS leg, leaving `$global:handled` still `$false` at the assertion.

The key realization is that each *additional* pipeline execution gives the engine another chance to drain the pending action, so re-reading the handler variable in a loop both waits for *and* drives completion. I replaced the sleep with a shared `WaitForHandledAsync` helper that polls the variable (~200ms apart, ~15s ceiling) until it reports `$true`, returning the last observed value on timeout so the assertion still fails loudly. This keeps the tests' intent intact and isn't merely a longer sleep.

I validated both tests on net8.0 (green across repeated runs, ~0.4s each vs. the old fixed 2s); net462 can't run on macOS, but the mechanism is identical across targets and the 15s ceiling self-terminates on success, so it's strictly safer on the slow leg without slowing the fast one.

Drafted by Copilot (Claude Opus 4.8).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR fixes flaky `CanRunOnIdleTask` and `CanRunOnIdleInProfileTask` tests that intermittently failed on the net462 (Windows PowerShell 5.1) CI leg. The root cause was a hard-coded `Thread.Sleep(2000)` that was too short for the asynchronous event dispatch on slower runners. The fix replaces both sleeps with a shared polling helper that actively drives event processing while waiting.

Changes:

- Adds a new `OnIdleTestHelpers.WaitForHandledAsync` static helper that polls a PowerShell handler variable via `ExecutePSCommandAsync<bool>` in a loop (200 ms between polls, 15 s timeout ceiling), returning the last observed value on timeout so assertions still fail loudly.
- Replaces the `Thread.Sleep(2000)` + manual `ExecutePSCommandAsync` read in both `CanRunOnIdleTask` and `CanRunOnIdleInProfileTask` with calls to the new helper, also removing the `// TODO: Why is this racy?` comments.

Copilot AI review requested due to automatic review settings June 16, 2026 22:24

Copilot started reviewing on behalf of andyleejordan June 16, 2026 22:25

Copilot AI reviewed Jun 16, 2026

andyleejordan wants to merge 1 commit into main from andyleejordan/fix-racy-idle-task-test