We need to redesign the internals of QuantumInputOutput.jl. The current implementation works but leaves a lot of performance on the table due to type instabilities, unnecessary allocations, and duplicated code paths. Since systems are small (1-4 ports) and the ODE inner loop evaluates our closures thousands of times, every allocation and type-instability compounds.
Problem 1: Two parallel SLH types with duplicated logic
We currently have SLH (symbolic) and SLHqo (numeric/time-dependent). Both represent the same mathematical object — an (S, L, H) triple — and the composition rules (cascade, concatenate, feedback) are mathematically identical. Yet we maintain two complete implementations of ▷, ⊞, and feedback, each ~100 lines, because the symbolic path needs SQA._adjoint and simplify while the numeric path needs adjoint and handles Function types.
This duplication is a maintenance burden and a source of subtle divergence bugs.
Solution: A single parametric SLH{N, ST, LT, HT} type. The symbolic-vs-numeric difference reduces to four lines of dispatch:
```julia
_adj(x::SQA.QNumber) = SQA._adjoint(x)
_adj(x) = adjoint(x)
_post(x::BasicSymbolic) = simplify(x)
_post(x) = x
```
One implementation of cascade, concatenate, and feedback uses _adj and _post instead of hardcoding the symbolic or numeric variant. Julia's dispatch handles the rest.
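As a minimal sketch of how the shared implementation could look, here is the standard SLH series product written against the `_adj`/`_post` helpers above, using plain numeric 1-port triples for brevity (the real version would operate on `SLH` structs with static storage):

```julia
using LinearAlgebra

# Numeric fallbacks only; the symbolic methods would dispatch to
# SQA._adjoint and simplify as shown above.
_adj(x) = adjoint(x)
_post(x) = x

# Series (cascade) rule: S = S2*S1, L = L2 + S2*L1,
# H = H1 + H2 + (L2† S2 L1 − (L2† S2 L1)†) / 2i
function series((S1, L1, H1), (S2, L2, H2))
    S = _post.(S2 * S1)
    L = _post.(L2 + S2 * L1)
    cross = sum(_adj.(L2) .* (S2 * L1))   # scalar L2† S2 L1
    H = _post(H1 + H2 + (cross - _adj(cross)) / (2im))
    return (S, L, H)
end

G1 = (ones(ComplexF64, 1, 1), [0.5 + 0im], 0.0 + 0im)
G2 = (ones(ComplexF64, 1, 1), [0.25im], 0.0 + 0im)
S, L, H = series(G1, G2)
```

The same function body serves both paths; only the `_adj`/`_post` methods differ.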
Problem 2: Heap-allocated scattering matrices
The scattering matrix is currently a Matrix (heap-allocated). In concatenation it's even Matrix{Any}:
```julia
S_t = Matrix{Any}(undef, lS1+lS2, lS1+lS2)   # SLH.jl line 262
```
Matrix{Any} means every element access boxes/unboxes, killing type inference for everything downstream. Even the non-Any paths use heap-allocated Matrix and Vector for objects that are 1x1 to 4x4 in practice.
Why this matters: Scattering matrices are accessed during cascade and feedback — operations that happen at model construction time. More importantly, the Lindblad vector is also a plain Vector, and its elements get composed into closures that run at every ODE timestep. Heap allocation for a 2-element vector is pure overhead.
Solution: Use StaticArrays. The port count N becomes a type parameter:
```julia
struct SLH{N, ST, LT, HT}
    scattering::SMatrix{N, N, ST}
    lindblad::SVector{N, LT}
    hamiltonian::HT
end
```
All scattering and Lindblad storage lives on the stack. N is 1-4 in all real use cases (confirmed by scanning every test and example) — well within StaticArrays' sweet spot. Concatenation produces SLH{N1+N2} at compile time. Cascade enforces matching N at compile time (no more runtime @assert).
Convenience constructors preserve the current API:
```julia
SLH(1, √γ * a, H)            # scalar → SLH{1}
SLH([-r η; η r], [0, 0], 0)  # matrix/vector → SLH{2}
```
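Assuming the proposed StaticArrays dependency, a small sketch of what compile-time concatenation buys: the block-diagonal scattering matrix of `SLH{2} ⊞ SLH{2}` is itself a stack-allocated `SMatrix{4,4}`, with the port count visible in the type (illustrative values, not package code):

```julia
using StaticArrays

S1 = @SMatrix [0 1; 1 0]
S2 = @SMatrix [1 0; 0 1]
Z  = @SMatrix zeros(Int, 2, 2)

# block-diagonal concatenation; hcat/vcat of SMatrix stays static,
# so the result is an SMatrix{4,4} known at compile time
S = hcat(vcat(S1, Z), vcat(Z, S2))
```

This is how `SLH{N1} ⊞ SLH{N2}` can produce `SLH{N1+N2}` without any runtime dimension checks.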
Problem 3: Time-dependent closures are type-unstable
The current SLHqo cascade creates closures like:
```julia
return t -> begin
    L1_vec = [f(t) for f in L1f]   # allocates a new Vector every call
    L2_vec = [f(t) for f in L2f]   # allocates another Vector every call
    (L2_vec + S2 * L1_vec)[i]      # allocates S2*L1_vec, then the sum, then indexes
end
```
These closures run at every ODE timestep. Each call allocates 3+ vectors that immediately become garbage. Worse, because each closure is a unique anonymous type, storing multiple of them in a Vector{Function} is type-erased — the compiler can't specialize on the element type.
Solution: Two changes. First, composition uses `_add`/`_mul` helpers that build composed closures without intermediate vector allocation:
```julia
_add(f::Function, g::Function) = t -> f(t) + g(t)
_add(f::Function, x) = t -> f(t) + x
_add(x, f::Function) = t -> x + f(t)
_add(x, y) = x + y
_mul(s, f::Function) = t -> s * f(t)
_mul(f::Function, g::Function) = t -> f(t) * g(t)
_mul(x, y) = x * y
```
Second, FunctionWrappers.jl gives all time-dependent elements a single concrete type:
```julia
const TimeDep{T} = FunctionWrapper{T, Tuple{Float64}}
```
When any Lindblad element is a Function, all elements are wrapped uniformly as TimeDep{Operator}. This makes SVector{N, TimeDep{Operator}} fully concrete — the compiler sees one type, not N different anonymous closure types. For fully static systems (symbolic or numeric without time-dependence), no wrapping happens.
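A usage sketch of the `_add`/`_mul` helpers above (plain Julia, no package code): composing `2·sin(t) + 1` builds one nested closure up front and allocates nothing per evaluation, instead of materializing intermediate vectors on every call.

```julia
_add(f::Function, g::Function) = t -> f(t) + g(t)
_add(f::Function, x) = t -> f(t) + x
_add(x, f::Function) = t -> x + f(t)
_add(x, y) = x + y
_mul(s, f::Function) = t -> s * f(t)
_mul(x, y) = x * y

# built once at composition time; each h(t) call is just two arithmetic ops
h = _add(_mul(2.0, sin), 1.0)   # t ↦ 2·sin(t) + 1
h(0.0)
```

Constant-with-constant falls through to plain arithmetic, so fully static systems never pay for a closure at all.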
Problem 4: Pulse functions return opaque closures
Every coupling_input, coupling_output, etc. (currently u_to_gu, v_to_gv, ...) computes a LinearInterpolation then wraps it:
```julia
gu_int = LinearInterpolation(gu_t, T; extrapolation = _extrapolate)
return t -> gu_int(t)   # why?
```
LinearInterpolation is already callable. The wrapper creates an opaque closure type that prevents the compiler from specializing downstream code (e.g., coupling_matrix receiving a concrete LinearInterpolation vs. a generic Function).
Solution: Return the LinearInterpolation directly. All downstream code calls g(t) — works unchanged, but now Julia knows the concrete type.
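A stdlib-only analogy for why returning the callable directly helps: a callable struct (like `LinearInterpolation`) has a concrete type the compiler can specialize on, while wrapping it in `t -> itp(t)` hides it behind an opaque anonymous closure type. The `Lerp` type below is a hypothetical stand-in, not the DataInterpolations API:

```julia
struct Lerp
    ts::Vector{Float64}
    ys::Vector{Float64}
end

# making the struct callable: downstream code just writes g(t),
# and typeof(g) === Lerp stays fully concrete
function (itp::Lerp)(t)
    i = clamp(searchsortedlast(itp.ts, t), 1, length(itp.ts) - 1)
    t0, t1 = itp.ts[i], itp.ts[i+1]
    return itp.ys[i] + (itp.ys[i+1] - itp.ys[i]) * (t - t0) / (t1 - t0)
end

g = Lerp([0.0, 1.0, 2.0], [0.0, 2.0, 0.0])
g(0.5)   # already callable; no wrapper closure needed
```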
Problem 5: coupling_matrix (currently interaction_picture_A) allocates every ODE step
```julia
function A(t)
    g = [gf(t) for gf in gfs]    # allocates
    M = zeros(ComplexF64, n, n)  # allocates
    # ... fill M ...
    return 0.5 * M               # allocates again
end
```
This function is the RHS of the mode evolution ODE. It's called at every timestep. Three allocations per call, thousands of calls.
Solution: Pre-allocate buffers in the closure:
```julia
function coupling_matrix(gs)
    gfs = _as_time_function.(gs)
    n = length(gfs)
    M = zeros(ComplexF64, n, n)
    g = Vector{ComplexF64}(undef, n)
    function A(t)
        for k in 1:n; g[k] = gfs[k](t); end
        fill!(M, zero(ComplexF64))
        for i in 1:n, j in (i+1):n
            val = 0.5 * g[i] * conj(g[j])
            M[i, j] = val
            M[j, i] = -conj(val)
        end
        return M
    end
    return A
end
```
Zero allocations per call.
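The buffer-reuse pattern in isolation, runnable without the package (the coupling functions are stand-ins and `_as_time_function` is omitted): every call fills and returns the same preallocated matrix, which is exactly why no garbage is produced per step.

```julia
function make_rhs(gfs)
    n = length(gfs)
    M = zeros(ComplexF64, n, n)
    g = Vector{ComplexF64}(undef, n)
    function A(t)
        for k in 1:n; g[k] = gfs[k](t); end
        fill!(M, zero(ComplexF64))
        for i in 1:n, j in (i+1):n
            val = 0.5 * g[i] * conj(g[j])
            M[i, j] = val        # upper triangle
            M[j, i] = -conj(val) # anti-Hermitian lower triangle
        end
        return M
    end
    return A
end

A = make_rhs([t -> cis(t), t -> cis(2t)])
M1 = A(0.1)
A(0.2) === M1   # same preallocated buffer on every call
```

Note the trade-off this sketch makes explicit: callers must not hold a reference to a previous `A(t)` result across calls, since it is overwritten in place.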
Problem 6: solve_mode_evolution (currently interaction_picture_M) uses out-of-place ODE
```julia
f_M(u, p, t) = A(t) * u   # allocates a new matrix every step
```
Solution: In-place form, and return the ODE solution directly (it's already callable as sol(t)):
```julia
function solve_mode_evolution(A::Function, T; alg = Tsit5(), kwargs...)
    T0, Tend = T[1], T[end]
    n = size(A(T0), 1)
    M0 = Matrix{ComplexF64}(I, n, n)
    function f_M!(du, u, p, t)
        mul!(du, A(t), u)
    end
    prob = ODEProblem(f_M!, M0, (T0, Tend))
    sol = solve(prob, alg; saveat = T, kwargs...)
    return sol
end
```
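The in-place idea in isolation, using only the standard library: `mul!` writes `A(t) * u` into a preallocated `du` instead of allocating a fresh product matrix on every right-hand-side call. `A` here is a stand-in coefficient matrix, not the package's `coupling_matrix` (and, unlike the buffered version above, this stand-in still allocates its own result).

```julia
using LinearAlgebra

A(t) = cos(t) * [0.0 1.0; -1.0 0.0]   # stand-in time-dependent coefficient matrix
u  = Matrix{Float64}(I, 2, 2)
du = similar(u)                        # preallocated once

# in-place RHS body: mul! overwrites du with A(t)*u, no result allocation
mul!(du, A(0.0), u)
```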
Problem 7: Non-const module globals
```julia
_tol_div = 1e-10
_extrapolate = ExtrapolationType.Extension
_ϵu = 1e-10
_ϵv = 1e-10
```
Non-const globals in Julia are type-unstable. The compiler inserts type checks at every access because the value could change type at any time.
Solution: Add const.
Problem 8: translate (currently translate_qo) redundantly normalizes time parameters
_normalize_time_parameter(time_parameter) is called in every translate method, including recursive calls from QAdd → its arguments. For an expression with 20 terms, it runs 20 times doing the same work.
Solution: Normalize once at the public entry point, then pass the result through internal _translate methods:
```julia
function translate(op, b::Basis; parameter = Dict(), time_parameter = Dict(), kwargs...)
    tp = _normalize_time_parameter(time_parameter)
    return _translate(op, b, parameter, tp; kwargs...)
end
```
Also type the internal dict (Dict{KT, Function} instead of Dict{Any, Any}).
Problem 9: Correlation matrix allocates intermediate vectors
```julia
g1 = [expect(Ls_ls_dag[it+i-1], ρ_bar_τ[i]) for i = 1:length(τ_)]  # new vector per outer iteration
g1_m[it, it:end] = g1
```
Solution: Write directly into the output matrix:
```julia
for i in eachindex(ρ_bar)
    val = expect(Ls_dag, ρ_bar[i])
    g1_m[it, it+i-1] = val
    g1_m[it+i-1, it] = conj(val)
end
```
Problem 10: Cryptic API names
The current function names (u_to_gu, v_to_gv, u_eff, interaction_picture_A, etc.) are shorthand from specific papers. They mean nothing without reading Kiilerich & Mølmer. A package API should be self-documenting.
Solution: Rename everything to describe what it computes, not what variable letter it uses.
Full rename table

| Current | New | Why |
|---|---|---|
| **Types** | | |
| `SLH` | `SLH` | Well-established in quantum optics |
| `SLHqo` | removed | Unified into `SLH` |
| (new) | `Gaussian` | Pulse shape descriptor — enables dispatch instead of `_Gauss` suffix functions |
| **Accessors** | | |
| `get_scattering(G)` | `scattering(G)` | Julia convention: no `get_` prefix (`length`, `size`, `basis`, not `get_length`) |
| `get_lindblad(G)` | `lindblad(G)` | same |
| `get_hamiltonian(G)` | `hamiltonian(G)` | same |
| **Composition** | | |
| `▷` / `cascade` | `▷` / `series` | "cascade" is jargon; `series` is universally understood |
| `⊞` / `concatenate` | `⊞` / `parallel` | "concatenate" suggests appending strings; `parallel` matches the physics |
| `feedback` | `feedback` | already clear |
| **Pulse coupling** | | |
| `u_to_gu(u, T)` | `coupling_input(mode, T)` | describes what you compute, not variable letters |
| `v_to_gv(v, T)` | `coupling_output(mode, T)` | same |
| `u_to_gu_Gauss(τ, σ; δ)` | `coupling_input(g::Gaussian)` | dispatch on pulse type instead of name suffix |
| `v_to_gv_Gauss(τ, σ; δ)` | `coupling_output(g::Gaussian)` | same |
| `uv_to_gin(u, v, T)` | `coupling_delay_in(u, v, T)` | explicit about the delay cavity context |
| `uv_to_gout(u, v, T)` | `coupling_delay_out(u, v, T)` | same |
| **Effective modes** | | |
| `u_eff(modes, T, i)` | `effective_input_mode(modes, T, i)` | says what it returns |
| `v_eff(modes, T, i)` | `effective_output_mode(modes, T, i)` | same |
| **Interaction picture** | | |
| `interaction_picture_A(gs)` | `coupling_matrix(gs)` | describes what it builds; the interaction picture context is implicit |
| `interaction_picture_M(A, T)` | `solve_mode_evolution(A, T)` | describes what it does, not the textbook symbol |
| `interaction_picture_M_2modes_equal(u, T)` | `solve_mode_evolution_symmetric(u, T)` | explicit about the u = v symmetry assumption |
| **Translation** | | |
| `translate_qo(op, b)` | `translate(op, b)` | shorter; the `_qo` suffix is redundant since the basis argument already implies QuantumOptics |
| `substitute_operators(op, dict)` | `substitute_operators(op, dict)` | already clear |
| **Correlations** | | |
| `two_time_corr_matrix(T, ρt, ...)` | `correlation_matrix(T, ρt, ...)` | shorter; "two-time" is implied by returning a matrix over a time grid |
Gaussian pulse type
Instead of separate _Gauss functions, introduce a lightweight pulse descriptor:
```julia
struct Gaussian{T}
    τ::T   # center time
    σ::T   # width
    δ::T   # detuning
end
Gaussian(τ, σ; δ = 0) = Gaussian(τ, σ, δ)

coupling_input(g::Gaussian)    # analytical formula
coupling_output(g::Gaussian)   # analytical formula
coupling_input(mode, T)        # numerical from samples/function
coupling_output(mode, T)       # numerical from samples/function
```
This is extensible — adding Exponential or SechSquared pulse shapes later just means adding new types and methods, not new function names.
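A self-contained sketch of that extensibility (the `pulse_mode` function and the `SechSquared` formula are illustrative placeholders, not the package's analytic coupling results; only the normalized Gaussian mode function is a standard fact):

```julia
struct Gaussian{T}
    τ::T   # center time
    σ::T   # width
    δ::T   # detuning
end
Gaussian(τ, σ; δ = zero(τ)) = Gaussian(promote(τ, σ, δ)...)

# adding a new shape later is just a new type...
struct SechSquared{T}
    τ::T
    σ::T
end

# ...plus new methods on the same function names, not new _Suffix functions.
# Normalized Gaussian mode: ∫ |u(t)|² dt = 1 (detuning phase omitted here).
pulse_mode(g::Gaussian)    = t -> exp(-(t - g.τ)^2 / (4 * g.σ^2)) / (2π * g.σ^2)^(1/4)
pulse_mode(g::SechSquared) = t -> sech((t - g.τ) / g.σ)^2   # placeholder shape

u = pulse_mode(Gaussian(0.0, 1.0))
```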
File organization
Split the current files by concern:
```text
src/
├── QuantumInputOutput.jl     # module, exports, imports
├── types.jl                  # SLH, Gaussian, constructors, _adj, _post, _add, _mul
├── compose.jl                # ▷, ⊞, feedback (one implementation each)
├── translate.jl              # translate, _translate internal dispatch
├── pulses.jl                 # coupling_input, coupling_output, coupling_delay_in, coupling_delay_out
├── modes.jl                  # effective_input_mode, effective_output_mode
├── interaction_picture.jl    # coupling_matrix, solve_mode_evolution, solve_mode_evolution_symmetric
├── correlations.jl           # correlation_matrix
└── utils.jl                  # substitute_operators, numeric_average, expect, constants
```
The current pulses.jl mixes coupling computation with ODE-based effective modes — these are separate concerns. The current utils.jl mixes virtual cavity code with operator substitution.
New dependencies
- StaticArrays — `SMatrix`/`SVector` for scattering and Lindblad storage
- FunctionWrappers — `FunctionWrapper` for concrete time-dependent element types
Both lightweight, widely used, zero transitive dependencies.
Breaking changes
| What changes | Why | Migration |
|---|---|---|
| `SLHqo` removed | Unified into `SLH` | `SLHqo(...)` → `SLH(...)`, same arguments |
| `SLH.scattering` is `SMatrix` | Stack allocation, compile-time dimension safety | `Matrix(G.scattering)` if a plain matrix is needed |
| `SLH.lindblad` is `SVector` | Stack allocation, concrete element types | Indexing/iteration unchanged |
| All function renames | Self-documenting API | See rename table above |
| `coupling_input` etc. return `LinearInterpolation` | Removes opaque closure wrapper | Already callable as `f(t)`, transparent |
| `solve_mode_evolution` returns ODE solution | Removes redundant wrapper | Already callable as `sol(t)`, transparent |
| `cascade` → `series` | Clearer terminology | Find-and-replace |
| `concatenate` → `parallel` | Clearer terminology | Find-and-replace |
| `get_*` → bare name | Julia convention | Drop `get_` prefix |
Exports
```julia
export SLH, Gaussian,
    scattering, lindblad, hamiltonian,
    ▷, series, ⊞, parallel, feedback,
    translate,
    coupling_input, coupling_output,
    coupling_delay_in, coupling_delay_out,
    effective_input_mode, effective_output_mode,
    coupling_matrix, solve_mode_evolution, solve_mode_evolution_symmetric,
    correlation_matrix,
    substitute_operators
```