It appears that _use_closure exists because the pre-PR AD path had two ways to represent “differentiate log density with respect to x, while everything else is constant”:
- Build a one-argument closure over the model state.
- Call DI with explicit constant arguments.
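For concreteness, the two representations can be sketched with DifferentiationInterface directly. This is only an illustration: `toy_logdensity`, `make_closure`, and the data are hypothetical stand-ins, not DynamicPPL code, and ForwardDiff is assumed to be installed.

```julia
using DifferentiationInterface   # provides prepare_gradient, gradient, Constant
import ForwardDiff               # loads the backend for AutoForwardDiff

# Hypothetical stand-in for a log density: `x` is differentiated, `data` is constant.
toy_logdensity(x, data) = -0.5 * sum(abs2, x .- data)

# Helper so the anonymous function really captures `data` (a true closure).
make_closure(data) = x -> toy_logdensity(x, data)

backend = AutoForwardDiff()
x = [1.0, 2.0, 3.0]
data = [0.5, 0.5, 0.5]

# Way 1: a one-argument closure over the constant state.
closure = make_closure(data)
prep_closure = prepare_gradient(closure, backend, x)
g_closure = gradient(closure, prep_closure, backend, x)

# Way 2: pass the constant explicitly as a DI context.
prep_constant = prepare_gradient(toy_logdensity, backend, x, Constant(data))
g_constant = gradient(toy_logdensity, prep_constant, backend, x, Constant(data))
```

Both calls compute the same gradient. What differs is what DI sees during preparation: a closure type with the data baked in, or a plain function plus a `Constant` context.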
That intent is documented in src/logdensityfunction.jl. The performance tradeoff was backend-specific:
- AutoForwardDiff, AutoMooncake, and AutoEnzyme were set to false, meaning “don’t use a closure”, because that was measured to be faster in the DI setup.
- AutoReverseDiff{compile} was set to true because the compiled-tape path would otherwise keep recompiling if the constants were passed the DI way.
- For unknown backends, the default was false because the non-closure DI path was generally better.
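The shape of that per-backend switch can be sketched as a small trait function. Everything here is hypothetical (`ToyBackend`, `toy_use_closure`, `toy_target`); it mirrors the defaults described above using toy types, not the actual definitions in src/logdensityfunction.jl.

```julia
# Hypothetical backend types standing in for the ADTypes ones.
abstract type ToyBackend end
struct ToyForward <: ToyBackend end
struct ToyReverse{compile} <: ToyBackend end
struct ToyOther <: ToyBackend end

# Trait: fall back to "no closure", opt in per backend.
toy_use_closure(::ToyBackend) = false
toy_use_closure(::ToyReverse) = true

toy_logdensity(x, data) = -0.5 * sum(abs2, x .- data)

# Return the callable to differentiate plus any constants to pass alongside x.
function toy_target(backend::ToyBackend, f, data)
    if toy_use_closure(backend)
        return (x -> f(x, data)), ()   # constants captured in the closure
    else
        return f, (data,)              # constants passed explicitly
    end
end

x = [1.0, 2.0]
data = [0.5, 0.5]
f_rev, consts_rev = toy_target(ToyReverse{true}(), toy_logdensity, data)
f_fwd, consts_fwd = toy_target(ToyForward(), toy_logdensity, data)

value_rev = f_rev(x, consts_rev...)
value_fwd = f_fwd(x, consts_fwd...)
```

Either branch evaluates the same function; the trait only decides how the constants reach it.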
So there was a performance reason for _use_closure, but it was tied to DI’s calling convention, not to AD in general. If we switch to the native gradient API in #1363, however:
- In the new AbstractPPL path, prepare(adtype, problem, x) expects a stable problem object that then supports prepare(problem, x).
- If you wrap LogDensityAt in an anonymous closure, you throw away that problem type and replace it with a compiler-generated function type.
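The type-erasure point can be shown without any AD package. In this sketch `ToyLogDensityAt`, `toy_prepare`, and `wrap` are hypothetical stand-ins for LogDensityAt and the preparation step; the real AbstractPPL signatures may differ.

```julia
# Hypothetical stand-in for LogDensityAt: a callable struct carrying model state.
struct ToyLogDensityAt{D}
    data::D
end
(p::ToyLogDensityAt)(x) = -0.5 * sum(abs2, x .- p.data)

# Hypothetical preparation step that dispatches on the problem type.
toy_prepare(problem::ToyLogDensityAt, x) = (problem = problem, buffer = similar(x))

# Wrapping in an anonymous closure keeps the behaviour but hides the type.
wrap(p) = x -> p(x)

problem = ToyLogDensityAt([0.5, 0.5])
x = [1.0, 2.0]
wrapped = wrap(problem)

same_value = wrapped(x) == problem(x)                     # same numbers
still_a_problem = wrapped isa ToyLogDensityAt             # type information is gone
can_prepare_problem = applicable(toy_prepare, problem, x)
can_prepare_wrapped = applicable(toy_prepare, wrapped, x) # no matching method
```

The closure evaluates identically, but any method that dispatches on the problem type no longer applies to it.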
Are there any genuine reasons to keep _use_closure other than as a workaround for current DI limitations?