
VarName rework#150

Merged
penelopeysm merged 67 commits into main from py/newvarname
Jan 12, 2026

@penelopeysm
Member

@penelopeysm penelopeysm commented Dec 23, 2025

Closes #49 (we can't always use views but this PR does so where possible; see the _maybe_view function for details)
Closes #136 (by removing the methods)
Closes #148 (note that we can't fully remove the dependency on Accessors.jl, not without reimplementing a ton of stuff, but we have removed the dependency on its types)

This PR modifies VarName to use in-house optics, instead of the optics from Accessors.jl. See the HISTORY.md for details. This PR also adds proper documentation, which you can read at https://turinglang.org/AbstractPPL.jl/previews/PR150/varname/.

I'm aware that this is practically impossible to review, but if you are seeing this, you can take heart from the fact that pretty much all the pre-existing tests for high-level functions like prefix, unprefix, varname_leaves, hasvalue, getvalue, have not been touched, and continue to pass. The only cases where I had to change the tests were those where the actual semantics were changed, e.g. begin is now no longer automatically concretised. (In fact, this PR also expands the test suite by quite a bit.) Besides, DynamicPPL CI passes just fine with only a handful of interface changes, and benchmarks don't show any performance regressions: TuringLang/DynamicPPL.jl#1185

Essentially, I think it's fair to say that once you have constructed a @varname(...) it will pretty much behave the same way it used to. For the most part you just need to change the types, e.g. Iden instead of typeof(identity).


Of course, this is no longer true if you are digging into getoptic(vn) and things like that. The data structure changes are all inside src/varname/optic.jl and src/varname/varname.jl. It looks complicated, but that's because I have made it very general: it's more general than the old VarName used to be (begin/end no longer need to be concretised so early), and also more general than Accessors.IndexLens is (it accepts keyword arguments). The reason is that, if we are undertaking a big refactoring, we may as well do it correctly. Otherwise, if in the future we want Turing to work with other array types (looking at DimArray in particular), we will have to come back and fix it again.

Note that keyword argument support for getindex/setindex! can be a bit patchy, but:

  1. These can (probably) be added as non-breaking patches over time. For example, I think varname_and_value_leaves might need patches to make it work nicely, which I haven't implemented.
  2. Sometimes it's not our fault: the Julia ecosystem just hasn't caught up. For example, BangBang.setindex!! will error if you try to use keyword arguments (so here I've defaulted to using Base.setindex!). In fact, if anything, I think it's good that we find cases like these, because they push the ecosystem forward.
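
For readers unfamiliar with keyword-argument indexing, here is a minimal sketch of what it means at the Base level. The `Labelled` type and the `position_of` helper are hypothetical and made up for illustration; they are not part of AbstractPPL, BangBang, or DimensionalData. The sketch only shows that `Base.getindex` and `Base.setindex!` can carry keyword arguments, which is the kind of call the generalised index optic would need to forward.

```julia
# Hypothetical container whose elements are addressed by keyword, not position.
struct Labelled
    names::Vector{Symbol}
    values::Vector{Float64}
end

# Hypothetical helper: find the position of a label, or throw if it is absent.
function position_of(a::Labelled, name::Symbol)
    i = findfirst(==(name), a.names)
    i === nothing && throw(KeyError(name))
    return i
end

# Keyword-only methods on Base's generic indexing functions.
Base.getindex(a::Labelled; name::Symbol) = a.values[position_of(a, name)]
function Base.setindex!(a::Labelled, v; name::Symbol)
    a.values[position_of(a, name)] = v
    return a
end

a = Labelled([:x, :y], [1.0, 2.0])
y = getindex(a; name=:y)      # read through the keyword
setindex!(a, 5.0; name=:x)    # write through the keyword, mutating `a`
```

The calls are written in function form rather than bracket syntax so that the keyword forwarding is explicit.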

@github-actions
Contributor

AbstractPPL.jl documentation for PR #150 is available at:
https://TuringLang.github.io/AbstractPPL.jl/previews/PR150/

@codecov

codecov bot commented Dec 23, 2025

Codecov Report

❌ Patch coverage is 86.73740% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.84%. Comparing base (0086beb) to head (7022297).
⚠️ Report is 1 commits behind head on main.

Files with missing lines    Patch %    Lines
src/varname/optic.jl        87.98%     25 Missing ⚠️
src/varname/varname.jl      84.93%     11 Missing ⚠️
src/varname/prefix.jl        0.00%      9 Missing ⚠️
src/varname/hasvalue.jl     89.28%      3 Missing ⚠️
src/varname/leaves.jl       75.00%      2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #150      +/-   ##
==========================================
- Coverage   86.40%   84.84%   -1.56%     
==========================================
  Files           9       10       +1     
  Lines         456      561     +105     
==========================================
+ Hits          394      476      +82     
- Misses         62       85      +23     

@penelopeysm penelopeysm changed the title [WIP] VarName rework VarName rework Dec 24, 2025
@penelopeysm penelopeysm marked this pull request as ready for review December 24, 2025 21:24
@yebai yebai requested a review from sunxd3 December 27, 2025 23:44
@yebai
Member

yebai commented Dec 27, 2025

One very high-level comment—without having looked closely at the code changes—is that support for begin, end, and Colon indexing is not always desirable from a statistical perspective. MCMC algorithms generally assume that parameters indexed by VarName correspond to fixed-dimensional objects, and that this dimensionality remains constant throughout inference unless explicitly specified otherwise (e.g., via of-types).

Member

@sunxd3 sunxd3 left a comment

heroic effort!

some minor issues, in general, this is definitely the right way to go

@penelopeysm
Member Author

@yebai: I don't think what you're describing is an issue with indexing with begin or end per se. The issue is the variable changing type or size between iterations. These are orthogonal problems. For example, if m has the same type and size across all iterations, then m[begin] will always consistently refer to the same thing in all versions of m, even though the begin index is not "concretised". Now conversely say m is a 2x2 matrix in one MCMC iteration, and a 1x5 matrix in another iteration. Then, even a "simple" / "fully concretised" varname like m[2] will refer to a different thing between iterations. The way we currently concretise colons (turning, say, m[:] into m[1:4]) doesn't solve that problem either, because the VarName cannot actually guard against the size/type of m changing: it will just return 1:4 regardless of whether m actually has four elements. We could try to handle (or forbid) the size/type changing thing in DynamicPPL or Turing, but I don't think that it should affect AbstractPPL or the VarName data type.
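
To make the two cases concrete, here is a small sketch in plain Base Julia (no AbstractPPL types involved; the arrays are made-up stand-ins for the value of `m` in different MCMC iterations):

```julia
# Case 1: same type and size in every iteration.
# `begin` consistently picks out the first element.
m_iter1 = [10.0 20.0; 30.0 40.0]
m_iter2 = [11.0 21.0; 31.0 41.0]
first1 = m_iter1[begin]
first2 = m_iter2[begin]

# Case 2: the size changes between iterations.
# Even the fully concrete linear index 2 now refers to a different position.
a = reshape(1:4, 2, 2)            # 2x2
b = reshape(1:5, 1, 5)            # 1x5
pos_a = CartesianIndices(a)[2]    # second row, first column
pos_b = CartesianIndices(b)[2]    # first row, second column

# A colon that was eagerly turned into 1:4 does not track the new size either.
stale = 1:4
silently_short = length(b[stale]) < length(b[:])   # drops the fifth element
c = reshape(1:3, 1, 3)
out_of_bounds = !checkbounds(Bool, c, stale)       # 1:4 no longer fits at all
```

In both failure modes the index itself is perfectly concrete; what changed is the array underneath it.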

penelopeysm and others added 3 commits December 30, 2025 01:41
The idea of concretization is not new to AbstractPPL.
However, there are some differences:

- Colons are no longer concretized: they *always* remain as Colons, even after calling `concretize`.
Member

Why? Doesn't this undermine the purpose of concretisation? I understand the comment in the DPPL PR of eventually getting rid of concretisation in DPPL, but this doesn't seem like the solution to that.

Member Author

I have thought about this a lot, and I haven't actually found any reason why concretisation is needed, as long as shadow arrays in VNT are implemented. (I know you don't need the link, just leaving it there for future readers.) I would rather get rid of the entire concretise function, but I think it's better to err on the safe side for now.

Member Author

@penelopeysm penelopeysm Jan 12, 2026

I mean, let's put it this way, colons are already quite broken in DPPL, so I can't be making anything worse.

But also I'm quite sure I'm making it better.

Member

But I think if concretize exists, then it should really concretize, and I think that would involve getting rid of colons.

Member Author

@penelopeysm penelopeysm Jan 12, 2026

I see. I think maybe we have different notions of concretise, or maybe the old and new ones don't line up. A Colon is still something you can pass to getindex just fine -- just like any other index:

julia> x = rand(3)
3-element Vector{Float64}:
 0.8410817279085441
 0.21204011256805433
 0.288585774687376

julia> getindex(x, Colon())
3-element Vector{Float64}:
 0.8410817279085441
 0.21204011256805433
 0.288585774687376

Only begin and end are special because they are lowered, somewhere in Base Julia, so you can't call getindex with begin:

julia> getindex(x, begin)
ERROR: ParseError:
# Error @ REPL[4]:1:18
getindex(x, begin)
#                ╙ ── unexpected `)`
Stacktrace:
 [1] top-level scope
   @ none:1

and instead

julia> @code_typed x[begin]
CodeInfo(
1 ─ %1  = $(Expr(:boundscheck))::Bool
└──       goto #5 if not %1
2 ─ %3  = Base.sub_int(i, 1)::Int64
│   %4  = Base.bitcast(Base.UInt, %3)::UInt64
│   %5  = Base.getfield(A, :size)::Tuple{Int64}
│   %6  = $(Expr(:boundscheck, true))::Bool
│   %7  = Base.getfield(%5, 1, %6)::Int64
│   %8  = Base.bitcast(Base.UInt, %7)::UInt64
│   %9  = Base.ult_int(%4, %8)::Bool
└──       goto #4 if not %9
3 ─       goto #5
4 ─ %12 = Core.tuple(i)::Tuple{Int64}
│         invoke Base.throw_boundserror(A::Vector{Float64}, %12::Tuple{Int64})::Union{}
└──       unreachable
5 ─ %15 = Base.getfield(A, :ref)::MemoryRef{Float64}
│   %16 = Base.memoryrefnew(%15, i, false)::MemoryRef{Float64}
│   %17 = Base.memoryrefget(%16, :not_atomic, false)::Float64
└──       return %17
) => Float64

I think concretisation really means to resolve these lowered expressions, which is the same approach that is used in Accessors itself (Accessors doesn't have any special Colon-handling code), but is not the same as what old AbstractPPL does. I think the point is, you don't need to know what x is to index into x[:], because you can just pass Colon to getindex and let Julia handle that for you. On the other hand, you can't do the equivalent of getindex(x, end) without knowing what x is.

If we took the old notion of AbstractPPL.concretize, then one could argue that linear indices should also be concretised to Cartesian indices.
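
A short sketch of the distinction, using only Base Julia. The names `deferred_end` and `resolve` are hypothetical, chosen for illustration; they are not the identifiers used in AbstractPPL.

```julia
x = [0.1, 0.2, 0.3]

# A Colon is an ordinary value: it can be stored and handed to getindex
# without knowing anything about `x`.
idx = Colon()
whole = getindex(x, idx)

# `begin` and `end` are syntax. Inside brackets they are rewritten in terms
# of firstindex / lastindex of the array being indexed.
same_last = x[end] == getindex(x, lastindex(x))
same_first = x[begin] == getindex(x, firstindex(x))

# A deferred `end` can therefore be stored as a function and only resolved
# ("concretised") once the array is available.
deferred_end = lastindex
resolve(f, arr) = f(arr)
last_value = getindex(x, resolve(deferred_end, x))

# For comparison: a linear index is already usable as-is, even though it also
# has a Cartesian equivalent that depends on the shape of the array.
m = reshape(1:6, 2, 3)
cart = CartesianIndices(m)[4]
```

Under this reading, only the deferred function needs the array in order to become an index; the Colon and the linear index do not.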

Member

This makes sense, I hadn't realised our notion of concrete was changing.

The `Base.get` and `Accessors.set` methods for VarNames have been removed (these were responsible for method ambiguities).
Instead of using these methods you can first convert the `VarName` to an optic using `varname_to_optic(vn)`, and then use the getter and setter methods on the optics.

VarNames cannot be composed with optics now (compose the optics yourself).
Member

Can I still prefix an optic with a VarName? That feels like it would be handy. I agree that compose is not the right term for it.

Member Author

Fair.

Member Author

Not a fan of unnecessary identifier reuse, so I've called it append_optic and exported it.

@mhauru
Member

mhauru commented Jan 5, 2026

I was a bit overly terse there: I'm generally a fan of this. I didn't read most of the code, but I like the idea of what is being done here. Just had the above two questions about interface changes.

@penelopeysm penelopeysm requested a review from sunxd3 January 12, 2026 11:37
canview(optic::Accessors.IndexLens, x) = false
function canview(optic::Accessors.IndexLens, x::AbstractArray)
return checkbounds(Bool, x, optic.indices...)
function canview(optic::Index, x::AbstractArray)
Member

do we call this with unconcretized optic?

julia> AbstractPPL.canview(@opticof(_[end]), rand(3))
ERROR: ArgumentError: unable to check bounds for indices of type AbstractPPL.DynamicIndex{Symbol, typeof(lastindex)}
Stacktrace:
 [1] checkindex(::Type{Bool}, inds::Base.OneTo{Int64}, i::AbstractPPL.DynamicIndex{Symbol, typeof(lastindex)})
   @ Base ./abstractarray.jl:751
 [2] checkbounds
   @ ./abstractarray.jl:689 [inlined]
 [3] canview(optic::Index{Tuple{AbstractPPL.DynamicIndex{Symbol, typeof(lastindex)}}, @NamedTuple{}, Iden}, x::Vector{Float64})
   @ AbstractPPL ~/TuringLang/AbstractPPL.jl/src/varname/hasvalue.jl:28
 [4] top-level scope
   @ REPL[6]:1

Member Author

Ah great spot, I didn't think to check that. I guess we have to concretise first and then recurse.

Member Author

Fixed now with tests!

julia> AbstractPPL.canview(@opticof(_[end]), rand(3))
true
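
The "concretise first, then check bounds" idea can be sketched as follows. This is not the code from the PR: `LazyIndex`, `resolve` and `can_view` are hypothetical stand-ins for illustration, and the real `DynamicIndex` type has a different definition.

```julia
# Hypothetical stand-in for an index that is only known once the array is.
struct LazyIndex{F}
    f::F    # e.g. firstindex or lastindex
end

# Ordinary indices (integers, ranges, colons) pass through unchanged.
resolve(i, x, dim, nidx) = i

# Lazy indices are evaluated against the array: as a linear index when there
# is a single index, otherwise along the dimension they appear in.
resolve(i::LazyIndex, x, dim, nidx) = nidx == 1 ? i.f(x) : i.f(x, dim)

# Resolve every index first, and only then ask Base whether it is in bounds.
function can_view(x::AbstractArray, inds...)
    n = length(inds)
    concrete = ntuple(d -> resolve(inds[d], x, d, n), n)
    return checkbounds(Bool, x, concrete...)
end

ok_end = can_view(rand(3), LazyIndex(lastindex))        # like _[end]
ok_mixed = can_view(rand(2, 2), LazyIndex(lastindex), 1) # like _[end, 1]
too_far = can_view(rand(3), 4)                           # plain out-of-bounds index
```

Passing the unresolved `LazyIndex` straight to `checkbounds` would fail in the same way as the stack trace above, because Base does not know how to bounds-check a custom index type.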

Member

@sunxd3 sunxd3 left a comment

Looks good other than one small question.

@penelopeysm
Member Author

OK. I think we will wait for the dust to settle with DynamicPPL before I attempt to use this upstream, but I'm happy to release a version.

@penelopeysm penelopeysm merged commit b4157a4 into main Jan 12, 2026
10 of 11 checks passed
@penelopeysm penelopeysm deleted the py/newvarname branch January 12, 2026 16:01
Development

Successfully merging this pull request may close these issues.

Drop Accessors dependency
Fix method ambiguities
Use views in concretization

4 participants