Observed shortly after #1880 was merged.
I expect this will haunt us until we tweak the test.
Cursor-generated writeup:
Description
The WSL CI lane `Test linux-64 / py3.12, 13.2.0, wheels, rtx4090, wsl` can fail in the `Run cuda.bindings benchmarks (smoke test)` step with a zero-duration `pyperf` measurement:

```
python run_pyperf.py --fast --loops 1 --min-time 0
...
ValueError: benchmark function returned zero
```
This does not look like a functional CUDA regression. In the failing run, the job had already completed:
- `Run cuda.pathfinder tests with see_what_works`
- `Run cuda.bindings tests`

and then failed specifically in:

- `Run cuda.bindings benchmarks (smoke test)`
The benchmark smoke step currently comes from `.github/workflows/test-wheel-linux.yml`:

```shell
pip install pyperf
pushd cuda_bindings/benchmarks
python run_pyperf.py --fast --loops 1 --min-time 0
popd
```
The benchmark implementation that was called out in the PR #1817 analysis is:
```python
def bench_pointer_get_attribute(loops: int) -> float:
    _cuPointerGetAttribute = cuda.cuPointerGetAttribute
    _attr = ATTRIBUTE
    _ptr = PTR
    t0 = time.perf_counter()
    for _ in range(loops):
        _cuPointerGetAttribute(_attr, _ptr)
    return time.perf_counter() - t0
```
With `--loops 1 --min-time 0`, a zero-length timing sample on WSL and fast hardware seems plausible, so this looks like a benchmark flake rather than a deterministic product failure.
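The failure mode can be reproduced without CUDA at all: with a single loop iteration, the body can finish between two `perf_counter()` reads that round to the same value, and pyperf rejects the non-positive sample. A minimal sketch (the `check_sample` helper is hypothetical and only mimics the rejection that produces the error message seen in the CI log; it is not pyperf's actual internals):

```python
import time

def bench_noop(loops: int) -> float:
    # Same shape as bench_pointer_get_attribute, with a no-op body.
    t0 = time.perf_counter()
    for _ in range(loops):
        pass
    return time.perf_counter() - t0

def check_sample(dt: float) -> float:
    # Hypothetical stand-in for pyperf's sample validation: a sample
    # that is not strictly positive is rejected with the same message
    # that appears in the failing CI run.
    if dt <= 0:
        raise ValueError("benchmark function returned zero")
    return dt
```

On a sufficiently coarse clock, `bench_noop(1)` can return exactly `0.0`, which `check_sample` would reject the same way pyperf did here.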
Failing example
Passing comparisons

- `main` run: job 70467435657
- `main` run on the same lane: job 70460149358
Why this looks flaky
- The failure occurs in the benchmark smoke step, not in the regular `cuda.bindings` test suite.
- Nearby runs of the same WSL lane passed.
- The exception comes from `pyperf` rejecting a zero-duration sample, not from a CUDA API error.
Possible follow-ups
- Increase the amount of work in the smoke benchmark, for example by raising `--loops` or using a positive `--min-time`.
- Retry the benchmark smoke step once before failing the whole job.
- Consider using different benchmark-smoke settings on WSL than on non-WSL Linux runners.
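The first follow-up could also be enforced inside the harness itself rather than only via CLI flags. A hedged sketch (this helper is hypothetical, not part of `run_pyperf.py`): keep doubling the loop count until the sample is measurably positive, which is roughly what a positive `--min-time` asks pyperf to do:

```python
import time

def bench_with_floor(inner, loops: int = 1, min_time: float = 1e-6) -> float:
    # Hypothetical guard: grow the workload until one timed sample
    # exceeds min_time. Approximates the effect of running pyperf
    # with a positive --min-time, done by hand.
    while True:
        t0 = time.perf_counter()
        for _ in range(loops):
            inner()
        dt = time.perf_counter() - t0
        if dt > min_time:
            return dt
        loops *= 2  # zero/near-zero sample: double the work and retry
```

With this guard, even `bench_with_floor(lambda: None)` returns a strictly positive sample, so the `benchmark function returned zero` failure cannot occur.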