Rework index manipulation API by lkdvos · Pull Request #416 · QuantumKitHub/TensorKit.jl

lkdvos · 2026-04-26T02:44:55Z

Summary

This PR overhauls the index manipulation API in src/tensors/indexmanipulations.jl to match TensorOperations dispatch conventions, reduces code duplication in the implementation, and adds a dedicated documentation page.
The main goal was code simplification: the overall number of lines went down, even though I added some docs 🎉

API changes

Unified in-place interface: permute!, braid!, transpose!, and repartition! now directly accept α, β, backend, and allocator as optional arguments (with defaults One(), Zero(), DefaultBackend(), DefaultAllocator()), following the TensorOperations dispatch pattern. The old add_permute!, add_braid!, and add_transpose! are deprecated and forward to the new functions.
allocator support: previously, the index manipulation functions did not support a custom allocator at all. It is now a positional argument in both the public and internal interfaces, consistent with TensorOperations convention.
Out-of-place functions (permute, braid, transpose, repartition) gain backend as a new keyword argument alongside the now-supported allocator keyword.
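
To make the α/β convention concrete, here is a minimal sketch on plain arrays, assuming the usual TensorOperations-style semantics `dst ← β * dst + α * permute(src)`. The helper `scaled_permute!` is hypothetical and only illustrates the scaling behavior; it is not the TensorKit implementation and ignores the backend and allocator arguments.

```julia
# Sketch of the scaling convention dst ← β * dst + α * permute(src) on plain arrays.
# `scaled_permute!` is a hypothetical helper, not part of TensorKit or TensorOperations.
function scaled_permute!(dst::AbstractArray, src::AbstractArray, perm, α = true, β = false)
    permuted = permutedims(src, perm)
    if iszero(β)
        # β == 0 overwrites dst, so stale (possibly NaN) contents are never read
        dst .= α .* permuted
    else
        dst .= β .* dst .+ α .* permuted
    end
    return dst
end

src = reshape(collect(1.0:6.0), 2, 3)

dst = ones(3, 2)
scaled_permute!(dst, src, (2, 1))             # defaults: overwrite dst with the permuted src

acc = ones(3, 2)
scaled_permute!(acc, src, (2, 1), 2.0, 1.0)   # accumulate: acc + 2 * permuted src
```

With the defaults the call behaves like a plain in-place permutation; passing explicit α and β covers what the deprecated `add_*` variants were used for.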

Implementation changes

All permutation operations now route through braid!, eliminating duplicate codepaths.
Dedicated braid! method added for AdjointTensorMap.
add_transform! kernels for TensorMap refactored to operate on the raw data vector rather than the full TensorMap. Because the data vector has no symmetry type, this avoids recompilation for every TensorMap type combination, improving compilation time.
Various minor simplifications throughout.

github-actions · 2026-04-26T02:54:42Z

After the build completes, the updated documentation will be available here

lkdvos · 2026-04-26T12:59:05Z

~~This might also resolve #413, where I tried to compute cond with eigh_vals instead of LinearAlgebra.eigvals.~~

codecov · 2026-04-26T22:16:21Z

Codecov Report

❌ Patch coverage is 89.83957% with 19 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/tensors/indexmanipulations.jl | 89.44% | 17 Missing ⚠️ |
| ext/TensorKitMooncakeExt/indexmanipulations.jl | 90.00% | 1 Missing ⚠️ |
| src/tensors/tensoroperations.jl | 50.00% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ext/TensorKitAMDGPUExt/roctensormap.jl | `52.11% <100.00%> (+0.68%)` ⬆️ |
| ext/TensorKitCUDAExt/cutensormap.jl | `74.32% <100.00%> (-0.35%)` ⬇️ |
| src/TensorKit.jl | `13.79% <ø> (ø)` |
| src/planar/planaroperations.jl | `72.79% <100.00%> (ø)` |
| src/tensors/braidingtensor.jl | `70.39% <100.00%> (ø)` |
| src/tensors/treetransformers.jl | `83.16% <100.00%> (+0.44%)` ⬆️ |
| ext/TensorKitMooncakeExt/indexmanipulations.jl | `96.11% <90.00%> (ø)` |
| src/tensors/tensoroperations.jl | `93.95% <50.00%> (ø)` |
| src/tensors/indexmanipulations.jl | `90.04% <89.44%> (+16.10%)` ⬆️ |

... and 1 file with indirect coverage changes

Co-authored-by: Jutho <Jutho@users.noreply.github.com>

[skip ci]

Docs content is being added back in a stacked follow-up PR to keep this one reviewable. The minimal docs/src/lib/tensors.md change is kept here because removing the @docs block for the now-deprecated add_permute!/add_braid!/add_transpose! wrappers is required for the docs build to succeed.

lkdvos · 2026-05-19T20:14:48Z

I've separated out the docs changes in an attempt to not conflate the two distinct changes in this PR. I think right now everything here should pass, so it should be ready for a round of review, before it can be merged.
The associated docs changes will then be for a separate PR.

Jutho · 2026-05-29T21:20:18Z

+function braid(
+        t::AdjointTensorMap, (p₁, p₂)::Index2Tuple, levels::IndexTuple;
+        kwargs...
+    )
+    p₁′ = adjointtensorindices(t, p₂)
+    p₂′ = adjointtensorindices(t, p₁)
+    perm = adjointtensorindices(adjoint(t), ntuple(identity, numind(t)))
+    levels′ = TupleTools.getindices(levels, perm)
+    return adjoint(braid(adjoint(t), (p₁′, p₂′), levels′; kwargs...))
 end


Is this a completely new definition? I am wondering about its correctness, in particular with respect to the definition of levels′. Given that the adjoint of an overbraid is an underbraid, it might be that we want to change levels to map(-, levels), in combination with applying the permutation perm?

Ok in some simple example that I tested it with, the current implementation is correct, so never mind.

I did end up adding some tests for this, since it is indeed a new implementation. I think I've convinced myself that, while you do want underbraids on the adjoint, this ends up being the case because the index order gets reversed.

What index order? I am still confused. The index mappings are such that you do the corresponding braid on the adjoint tensor? In what sense is that reversing index order?

As a side note (since I like derailing PRs 😄 ), one thing I noticed in trying to test this, is that copy(::AdjointTensor) produces a new ::AdjointTensor. This is different from Matrix, where copy of an adjoint produces a regular Matrix. Also TensorMap(::AdjointTensorMap) doesn't work, so it took me a while to find a good way to reinstantiate an AdjointTensorMap as its corresponding TensorMap.
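
For reference, the `Matrix` behavior mentioned above can be checked with the LinearAlgebra standard library alone. This is only a small illustration of the Base convention, not a statement about what TensorKit should do:

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
Aadj = A'          # lazy wrapper of type Adjoint{Float64, Matrix{Float64}}
B = copy(Aadj)     # materializes the adjoint into a plain Matrix{Float64}

@show typeof(Aadj) typeof(B)
```

So for dense matrices `copy` is the idiomatic way to drop the lazy wrapper, which is the step that was hard to find for `AdjointTensorMap`.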

Jutho · 2026-05-29T22:56:01Z

-                end
+            # 2. Recoupling: buffer_dst = buffer_src * U^T  (each output tree is a linear
+            #    combination of input trees weighted by the recoupling coefficients).
+            U′ = Adapt.adapt(typeof(data_dst), StridedView(U))


What is the point of wrapping U in a StridedView here before the adapt call? data_dst is not a StridedView at this point, right? Is this equivalent to StridedView(Adapt.adapt(typeof(data_dst), U))?

It is in spirit. The reason for the change in order is that storagetype yields something with ndims = 1, while U is a Matrix, so we have the freedom to be slightly more liberal with the strided implementation, but the standard CuArray doesn't capture that.

Jutho · 2026-05-29T22:57:58Z

+            #    using a trivial permutation so the layout is canonical before the matmul.
+            @inbounds for (i, struct_src_i) in enumerate(structs_src)
+                TO.tensoradd!(
+                    sreshape(buffer_src[:, i], sz_src), StridedView(data_src, sz_src, struct_src_i...),


This contains a getindex that relies on StridedView producing a view, so if we ever change this behavior, this is where we will have to be careful.

Jutho · 2026-05-29T23:02:02Z

+            # 1. Extract: copy each source block into column i of buffer_src as a flat vector,
+            #    using a trivial permutation so the layout is canonical before the matmul.
+            @inbounds for (i, struct_src_i) in enumerate(structs_src)
+                TO.tensoradd!(


Does it make sense to simply call copy! for the ptriv case?

Jutho · 2026-05-29T23:02:58Z

+                p, false, α * coeff, β, backend, allocator
+            )
+        else # Multi-tree block: pack → recoupling matmul → unpack.
+            rows, cols = size(U)


Is there ever a case where this is not square? If so, does it make sense to do the trivial permutation on the largest of the two (src vs dest), and the non-trivial permutation on the smallest?
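
For readers following the thread, the pack → recoupling matmul → unpack pattern from the diff comments can be sketched on plain arrays. Everything here is an assumption made for illustration: `blocks_src` stands in for the source fusion-tree blocks, `U` is a made-up recoupling matrix (taken rectangular on purpose, which may or may not occur in practice), and ordinary `Array`s replace the `StridedView`s and allocator-managed buffers of the real kernel.

```julia
# Hypothetical stand-ins: `blocks_src` plays the role of the source fusion-tree blocks,
# `U` the recoupling matrix (rows ↔ destination trees, columns ↔ source trees).
sz_src = (2, 3)
perm = (2, 1)
blocks_src = [rand(sz_src...) for _ in 1:3]   # 3 source trees
U = rand(2, 3)                                # 2 destination trees

# 1. Pack: each source block becomes one column of buffer_src (no permutation yet).
buffer_src = Matrix{Float64}(undef, prod(sz_src), length(blocks_src))
for (i, b) in enumerate(blocks_src)
    buffer_src[:, i] .= vec(b)
end

# 2. Recoupling: every destination column is a linear combination of source columns.
buffer_dst = buffer_src * transpose(U)

# 3. Unpack: reshape each destination column and apply the actual permutation.
blocks_dst = [permutedims(reshape(buffer_dst[:, j], sz_src), perm) for j in 1:size(U, 1)]
```

In this sketch the trivial (copy-only) step touches 3 blocks and the permuting step only 2, which is the asymmetry the question above is about.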

Jutho · 2026-05-29T23:13:02Z

+                    p, false, α, β, backend, allocator
+                )
            end
+            @lock buffer_lock TO.tensorfree!(buffer, allocator)


I have forgotten how the allocators work. Is it now fine that we free buffers in an order that is not the exact reverse of the order in which they were allocated?

Jutho · 2026-05-29T23:27:29Z

+    # buffers have to be created without race condition: err on the side of caution with a lock
+    buffer_lock = Threads.ReentrantLock()
+
+    OhMyThreads.@tasks for src in fusionblocks(tsrc)


There is some code duplication between this generic implementation, and the one for GenericTreeTransformer below (hence my questions there also apply here). But I think the code duplication is unavoidable (and anyway quite limited).
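
The locking pattern in the hunk above can be illustrated in isolation. The following is a minimal sketch using only Base Julia, assuming a non-thread-safe buffer source; `pool` and `with_buffer` are hypothetical names and do not correspond to the TensorOperations allocator interface.

```julia
# Hypothetical pool standing in for an allocator that is not thread-safe.
pool = Vector{Vector{Float64}}()
pool_lock = ReentrantLock()

function with_buffer(f, n)
    # Acquiring a buffer is serialized through the lock.
    buffer = lock(pool_lock) do
        isempty(pool) ? Vector{Float64}(undef, n) : resize!(pop!(pool), n)
    end
    try
        return f(buffer)          # the actual work runs outside the lock
    finally
        lock(pool_lock) do        # releasing the buffer is serialized as well
            push!(pool, buffer)
        end
    end
end

tasks = map(1:8) do k
    Threads.@spawn with_buffer(buf -> sum(fill!(buf, k)), 4)
end
results = fetch.(tasks)
```

Only the allocation and release are guarded, so the per-block work can still run concurrently across tasks.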

Jutho · 2026-05-29T23:29:42Z

+    return maximum(transformer.data; init = 0) do (basistransform, structures_dst, _)
+        return prod(structures_dst[1]) * size(basistransform, 1)
+    end
+end


Is this function used anywhere? I couldn't find a single call site. The add_transform_kernel! computes the buffer size manually.

Jutho

Ok, this looks great. I've left a few questions, but mostly just to get better understanding.

lkdvos added the documentation label Apr 26, 2026

lkdvos force-pushed the ld-indexmanipulations branch from f45ab42 to 2841304 Compare April 26, 2026 12:57

lkdvos force-pushed the ld-indexmanipulations branch 3 times, most recently from ff15f1e to c4408b3 Compare April 26, 2026 21:33

lkdvos marked this pull request as ready for review April 26, 2026 22:02

lkdvos requested review from Jutho and kshyatt April 26, 2026 22:03

lkdvos linked an issue Apr 26, 2026 that may be closed by this pull request

cond test fails depending on whether --fast is used in the testsuite #413

Closed

lkdvos removed a link to an issue Apr 26, 2026

cond test fails depending on whether --fast is used in the testsuite #413

Closed

lkdvos force-pushed the ld-indexmanipulations branch 3 times, most recently from a0bc84b to 2c44dca Compare April 28, 2026 20:41

lkdvos force-pushed the ld-indexmanipulations branch 2 times, most recently from b833340 to 5a576f0 Compare May 12, 2026 12:46

Jutho reviewed May 13, 2026

Comment thread src/tensors/braidingtensor.jl Outdated

Jutho reviewed May 13, 2026

Comment thread src/tensors/indexmanipulations.jl Outdated

Jutho reviewed May 13, 2026

Comment thread src/tensors/indexmanipulations.jl Outdated