Fix flaky gamma_inc partials test by ChrisRackauckas-Claude · Pull Request #822 · JuliaDiff/ForwardDiff.jl

ChrisRackauckas-Claude · 2026-07-03T18:04:01Z

Summary

The gamma_inc partials assertions in test/DualTest.jl fail sporadically on CI —
roughly one job per full-matrix run, on arbitrary OS/Julia-version combinations (e.g.
"Julia lts - windows-latest" in the 2026-06-22 master run, "Julia 1 - ubuntu-latest" in
the 2026-06-07 run), always of the form

partials(pq[i]) ≈ PARTIALS * Calculus.derivative(x -> gamma_inc(a, x, ind...)[i], 1 + PRIMAL) rtol=tol

with V === Float32 and random a/PRIMAL narrowly missing rtol = 5f-4.

Diagnosis

The reference, not ForwardDiff, is at fault, for two reasons:

1. For V === Float32, Calculus.derivative central-differences the Float32
   evaluation of gamma_inc, whose finite-difference error is of the same order as the
   5f-4 tolerance, so random draws sporadically cross it (~4% of draws, measured).
2. For ind = 1 / ind = 2, gamma_inc deliberately evaluates a reduced-accuracy
   approximation (~14 / ~6 significant digits). Finite-differencing a function with
   ~1e-6 intrinsic noise amplifies that noise by 1/h; the measured reference error
   reaches 2000–3000% of the true derivative in the worst draws. The existing
   tol^(1/2^ind) loosening (up to rtol ≈ 0.15) was compensating for this.

In every sampled failing draw, ForwardDiff's Float32 derivative was within 4e-8
relative of the true derivative (Float64 dual evaluation), while the finite-difference
reference was off by 2%–3000%.
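
The noise-amplification argument can be reproduced outside Julia with a small sketch. Everything here is an illustrative assumption rather than code from the test suite: `gamma_inc_p` is a hand-rolled power series for the regularized lower incomplete gamma function, `round_sig` imitates a ~6-significant-digit approximation by rounding, and the step `h = 1e-5` is hypothetical (it is not claimed to be the step Calculus.jl chooses).

```python
import math

def gamma_inc_p(a, x, terms=200):
    """Regularized lower incomplete gamma P(a, x), summed from its power series."""
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x) / math.gamma(a)

def analytic_derivative(a, x):
    """dP/dx = exp(-x) * x^(a-1) / Gamma(a), the closed form quoted in this PR."""
    return math.exp(-x) * x ** (a - 1) / math.gamma(a)

def round_sig(v, digits):
    """Round v to `digits` significant decimal digits (stand-in for a coarse approximation)."""
    if v == 0.0:
        return 0.0
    return round(v, digits - 1 - math.floor(math.log10(abs(v))))

def central_difference(f, x, h):
    """Two-point central difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Illustrative inputs, not values drawn by the test suite.
a, x, h = 2.0, 1.5, 1e-5

exact = analytic_derivative(a, x)
clean = central_difference(lambda t: gamma_inc_p(a, t), x, h)
noisy = central_difference(lambda t: round_sig(gamma_inc_p(a, t), 6), x, h)

clean_rel_err = abs(clean - exact) / exact
noisy_rel_err = abs(noisy - exact) / exact
print(f"full-accuracy reference: relative error {clean_rel_err:.2e}")
print(f"6-digit reference:       relative error {noisy_rel_err:.2e}")
```

With function values near 0.44 rounded to six digits, the difference `f(x + h) - f(x - h)` can only be a multiple of 1e-6, so the estimate is quantized in steps of 1e-6 / (2h) = 0.05 and cannot land closer than a few percent to the true derivative of about 0.335. The full-accuracy difference has no such floor.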

Fix

ForwardDiff's rule (src/dual.jl) is the analytic exp(-x)·x^(a-1)/Γ(a), independent
of ind. So compare the partials against a finite difference of the full-accuracy
(ind = 0) Float64 evaluation, at the base tolerance — the ind-dependent
loosening remains only on the value comparison, where it belongs (values genuinely
differ by the requested approximation accuracy). Float64(1 + PRIMAL) converts after
the V-precision addition, so the reference is evaluated at exactly the primal of the
dual input.
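
Why the conversion order matters can be shown with a short sketch. This is a Python analogue under stated assumptions, not the test code: `to_float32` emulates Float32 rounding via `struct`, and the value 0.1 is an arbitrary stand-in for a random PRIMAL.

```python
import struct

def to_float32(v):
    """Round a Python float (Float64) to the nearest Float32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

# Illustrative Float32 PRIMAL; the test draws this at random.
primal = to_float32(0.1)

# Analogue of Float64(1 + PRIMAL): add in Float32, then widen.
convert_after = to_float32(1.0 + primal)

# Analogue of 1 + Float64(PRIMAL): widen first, then add in Float64.
convert_before = 1.0 + primal

gap = abs(convert_after - convert_before)
print(convert_after, convert_before, gap)
```

The two evaluation points differ by a few Float32 rounding units. Converting after the addition keeps the reference at the same x the dual number carries, so the comparison does not pick up an extra shift in the evaluation point.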

Measured over 300000 random (a, PRIMAL, PARTIALS) draws × all ind variants
(2.4M assertions): old reference fails ~4% of draws; new reference fails 0, even at
the base (unloosened) tolerance.

test/DualTest.jl passes locally (Julia 1.12, --check-bounds=yes).

Note

Opened as a draft by an agent on behalf of @ChrisRackauckas. Please ignore
until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

The partials assertions compared ForwardDiff's analytic gamma_inc derivative against Calculus.jl central differences of the V-precision, ind-approximation function.

Two problems: for V === Float32 the finite difference of a Float32 function has error comparable to the 5f-4 tolerance, and for ind = 1/2 gamma_inc itself is a reduced-accuracy approximation whose intrinsic noise the finite difference amplifies far beyond any tolerance. With random `a`/`PRIMAL` this failed sporadically (~4% of random draws; roughly one CI job per master run).

ForwardDiff's rule is exp(-x)x^(a-1)/gamma(a), independent of `ind`, so compare against a finite difference of the full-accuracy (ind = 0) Float64 evaluation instead, at the *base* tolerance (the ind-dependent loosening is only needed for the value comparison, which keeps it).

Measured over 300000 random (a, PRIMAL, PARTIALS) draws x all ind variants (2.4M assertions): old reference fails ~4% of draws, new reference fails 0 - while every observed old-reference failure had the Float32 dual derivative within 4e-8 of the true derivative, i.e. the reference, not ForwardDiff, was wrong.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Vx7zQ96NYk4VV4ML2s3kAC

codecov · 2026-07-03T18:11:04Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.74%. Comparing base (090ddbb) to head (cb0e8ac).

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #822   +/-   ##
=======================================
  Coverage   90.74%   90.74%           
=======================================
  Files          11       11           
  Lines        1070     1070           
=======================================
  Hits          971      971           
  Misses         99       99


ChrisRackauckas-Claude · 2026-07-03T18:40:48Z

A fresh occurrence of this flake just hit CI on #823 ("Julia min-patch - macOS - NaN-safe disabled", 2026-07-03): Partials(0.0029103528, 0.0038809243, 0.002506475) ≈ Partials(0.0029121826, 0.0038833646, 0.0025080508) at rtol=0.0005 — a ~6e-4 relative miss of the Float32 finite-difference reference, exactly the failure mode this PR removes.
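
The size of that miss can be checked directly from the two printed Partials. The sketch below computes component-wise relative differences, which is an assumption made for illustration and not necessarily how the test's ≈ aggregates the three components.

```python
# Values copied from the failing assertion quoted above.
left = (0.0029103528, 0.0038809243, 0.002506475)
right = (0.0029121826, 0.0038833646, 0.0025080508)
rtol = 5e-4

# Relative difference per component, scaled by the larger magnitude.
rel_diffs = [abs(l - r) / max(abs(l), abs(r)) for l, r in zip(left, right)]

for d in rel_diffs:
    print(f"relative difference {d:.3e} (rtol {rtol:.0e})")
```

All three components miss by nearly the same relative amount, slightly above the tolerance.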

ChrisRackauckas-Claude mentioned this pull request Jul 3, 2026

Skip JET QA tests on Julia 1.13 until JET supports it #823

Draft

ChrisRackauckas-Claude wants to merge 1 commit into JuliaDiff:master from ChrisRackauckas-Claude:fix-gamma-inc-flaky-test
