Fix flaky gamma_inc partials test #822
Draft
ChrisRackauckas-Claude wants to merge 1 commit into
The partials assertions compared ForwardDiff's analytic gamma_inc derivative against Calculus.jl central differences of the V-precision, ind-approximation function. Two problems: for V === Float32 the finite difference of a Float32 function has error comparable to the 5f-4 tolerance, and for ind = 1 or ind = 2 gamma_inc itself is a reduced-accuracy approximation whose intrinsic noise the finite difference amplifies far beyond any tolerance. With random `a`/`PRIMAL` this failed sporadically (~4% of random draws; roughly one CI job per master run).

ForwardDiff's rule is exp(-x)x^(a-1)/gamma(a), independent of `ind`, so compare against a finite difference of the full-accuracy (ind = 0) Float64 evaluation instead, at the *base* tolerance (the ind-dependent loosening is only needed for the value comparison, which keeps it).

Measured over 300000 random (a, PRIMAL, PARTIALS) draws x all ind variants (2.4M assertions): old reference fails ~4% of draws, new reference fails 0. Every observed old-reference failure had the Float32 dual derivative within 4e-8 of the true derivative, i.e. the reference, not ForwardDiff, was wrong.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Vx7zQ96NYk4VV4ML2s3kAC
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #822 +/- ##
=======================================
Coverage 90.74% 90.74%
=======================================
Files 11 11
Lines 1070 1070
=======================================
Hits 971 971
Misses 99 99
A fresh occurrence of this flake just hit CI on #823 ("Julia min-patch - macOS - NaN-safe disabled", 2026-07-03):
Summary
The `gamma_inc` partials assertions in `test/DualTest.jl` fail sporadically on CI: roughly one job per full-matrix run, on arbitrary OS/Julia-version combinations (e.g. "Julia lts - windows-latest" in the 2026-06-22 master run, "Julia 1 - ubuntu-latest" in the 2026-06-07 run), always of the form
with `V === Float32` and random `a`/`PRIMAL` narrowly missing `rtol = 5f-4`.
Diagnosis
The reference, not ForwardDiff, is at fault, for two reasons:
1. For `V === Float32`, `Calculus.derivative` central-differences the Float32 evaluation of `gamma_inc`, whose finite-difference error is of the same order as the `5f-4` tolerance, so random draws sporadically cross it (~4% of draws, measured).
2. For `ind = 1`/`ind = 2`, `gamma_inc` deliberately evaluates a reduced-accuracy approximation (~14 / ~6 significant digits). Finite-differencing a function with ~1e-6 intrinsic noise amplifies that noise by `1/h`; the measured reference error reaches 2000–3000% of the true derivative in the worst draws. The existing `tol^(1/2^ind)` loosening (up to `rtol ≈ 0.15`) was compensating for this.

In every sampled failing draw, ForwardDiff's Float32 derivative was within 4e-8 (relative) of the true derivative (Float64 dual evaluation), while the finite-difference reference was off by 2%–3000%.
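The noise amplification described above can be reproduced without any packages. The sketch below is only an illustration under stated assumptions: it uses the special case `a = 1`, where the regularized lower incomplete gamma function has the closed form `1 - exp(-x)` and the derivative `exp(-x)·x^(a-1)/Γ(a)` reduces to `exp(-x)`. The names `central_diff`, `p1`, `dp1`, and `p1_lowacc` are hypothetical, the step size `cbrt(eps(T))` is a common textbook choice rather than a statement about Calculus.jl's internals, and rounding to 6 significant digits merely simulates a reduced-accuracy evaluation (it is not how `gamma_inc` approximates).

```julia
# Hand-rolled central difference, standing in for a finite-difference reference.
function central_diff(f, x::T) where {T<:AbstractFloat}
    h = cbrt(eps(T))                  # assumed step size; common for central differences
    return (f(x + h) - f(x - h)) / (2h)
end

# Closed form of the regularized lower incomplete gamma function for a = 1.
p1(x) = 1 - exp(-x)
# Its analytic derivative: exp(-x) * x^(a-1) / Γ(a) with a = 1.
dp1(x) = exp(-x)

# Simulated reduced-accuracy evaluation: keep roughly 6 significant digits.
p1_lowacc(x) = round(p1(x); sigdigits = 6)

# Relative error, always measured in Float64 against the analytic derivative.
relerr(approx, exact) = abs(Float64(approx) - exact) / abs(exact)

x = 1.5
exact = dp1(x)

err64  = relerr(central_diff(p1, x), exact)            # Float64, full accuracy
err32  = relerr(central_diff(p1, Float32(x)), exact)   # Float32 evaluation
errlow = relerr(central_diff(p1_lowacc, x), exact)     # 6-digit evaluation

println("Float64 reference error:        ", err64)
println("Float32 reference error:        ", err32)
println("6-digit approx reference error: ", errlow)
```

In this sketch the 6-digit values can only move in steps of 1e-6, so the difference quotient moves in steps of about `1e-6 / (2h) ≈ 0.08`, which is coarse next to a true derivative of about 0.22. The full-accuracy Float64 difference has no such floor, which is why it makes the better reference.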
Fix
ForwardDiff's rule (`src/dual.jl`) is the analytic `exp(-x)·x^(a-1)/Γ(a)`, independent of `ind`. So compare the partials against a finite difference of the full-accuracy (`ind = 0`) Float64 evaluation, at the base tolerance; the `ind`-dependent loosening remains only on the value comparison, where it belongs (values genuinely differ by the requested approximation accuracy). `Float64(1 + PRIMAL)` converts after the V-precision addition, so the reference is evaluated at exactly the primal of the dual input.
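The point about `Float64(1 + PRIMAL)` comes down to the order of widening and addition. A minimal sketch, assuming a hypothetical Float32 primal of `0.1f0` (the variable names `x_dual`, `x_ref`, and `x_off` are illustrative and do not come from the test file):

```julia
PRIMAL = 0.1f0                  # hypothetical primal for the V === Float32 case

x_dual = 1 + PRIMAL             # Float32 addition: the point the dual input sits at
x_ref  = Float64(1 + PRIMAL)    # add in Float32, then widen: exactly the same point
x_off  = 1 + Float64(PRIMAL)    # widen first, then add in Float64: a nearby point

println(x_ref == x_dual)        # true
println(x_off == x_dual)        # false
println(abs(x_off - x_ref))     # offset introduced by widening too early
```

Widening after the addition keeps the Float32 rounding of `1 + PRIMAL` in the reference's evaluation point, so the comparison does not pick up an extra offset between the two evaluation points.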
Measured over 300000 random `(a, PRIMAL, PARTIALS)` draws × all `ind` variants (2.4M assertions): old reference fails ~4% of draws; new reference fails 0, even at the base (unloosened) tolerance.
`test/DualTest.jl` passes locally (Julia 1.12, `--check-bounds=yes`).
Note
Opened as a draft by an agent on behalf of @ChrisRackauckas. Please ignore
until reviewed by @ChrisRackauckas.
🤖 Generated with Claude Code