
(feat): new AutoSearch policy as an adaptive default search policy #44

Merged
mgyoo86 merged 16 commits into master from feat/AutoSearch_policy
Feb 25, 2026

Conversation


mgyoo86 (Collaborator) commented Feb 25, 2026

Summary

Introduce AutoSearch — an adaptive default search policy that resolves to Binary() for scalar queries and LinearBinary() for vector queries at call time. Also optimizes LinearBinary internals (branchless binary core; default window changed from 8 to 2, the measured sweet spot) and ships comprehensive tests and documentation.

Motivation

Previously, Binary() was the unconditional default. This meant vector queries over sorted data — the most common batch use case — silently used O(log n) binary search instead of the much faster O(1)-amortized LinearBinary. Users had to opt in manually, and many didn't know to.

The new AutoSearch default removes this decision entirely for 95% of users:

| Query type | Old behavior | New behavior | Effect |
| --- | --- | --- | --- |
| Scalar (e.g., itp(0.5)) | Binary() (was default) | Binary() via AutoSearch | Unchanged |
| Vector (e.g., itp(Vector)) | Binary() (was default) | LinearBinary() via AutoSearch | Up to ~5× faster on sorted data |

Key Changes

AutoSearch (new type)

itp = linear_interp(x, y)     # stores AutoSearch — the new default
itp(0.5)                       # → Binary() (scalar)
itp([0.1, 0.5, 0.9])           # → LinearBinary() (vector)
itp(0.5; search=Binary())      # explicit override still works

Resolution happens once per call, at the eval entry point, with negligible overhead.
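The resolution step can be sketched with plain multiple dispatch. This is an illustrative, self-contained sketch: the names (`AutoSearch`, `Binary`, `LinearBinary`, `_resolve_search`) mirror the PR, but the bodies are simplified stand-ins, not the package's code.

```julia
# Illustrative sketch of call-time resolution via multiple dispatch.
struct AutoSearch end
struct Binary end
struct LinearBinary{W} end
LinearBinary() = LinearBinary{2}()   # new default window

_resolve_search(::AutoSearch, ::Real)           = Binary()        # scalar query
_resolve_search(::AutoSearch, ::AbstractVector) = LinearBinary()  # vector query
_resolve_search(s, _)                           = s               # explicit override passes through

_resolve_search(AutoSearch(), 0.5)         # Binary()
_resolve_search(AutoSearch(), [0.1, 0.9])  # LinearBinary{2}()
_resolve_search(Binary(), [0.1, 0.9])      # Binary()
```

Because the query type is known at compile time, each of these dispatches is resolved statically, which is consistent with the "negligible overhead" claim.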

LinearBinary optimization

  • Branchless binary core: replaced the while loop with for _ in 1:iters, where iters = 64 - leading_zeros(n - 2) precomputes the exact trip count. The constant-iteration loop plus ifelse compiles to ARM64 csel (branch-free), making it ~25–55% faster than the old loop on random queries.
  • Default window changed: LinearBinary() now defaults to linear_window=2 (was 8). Window=2 minimizes overhead for mixed/unknown patterns while still exploiting locality for sorted sequences.
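The branchless core can be illustrated with a minimal standalone bisection. This is a sketch of the technique, not the package's `_search_binary`; it assumes a sorted grid with at least two points and an in-range query.

```julia
# Minimal sketch of the branchless bisection idea (not the package's code).
# The trip count is precomputed, so the loop has a constant, predictable
# iteration count; `ifelse` evaluates both arms and compiles to a
# conditional select (csel on ARM64) instead of a data-dependent branch.
function branchless_bisect(xs::AbstractVector, q::Real)
    lo, hi = 1, length(xs)
    # exactly ⌈log₂(hi - lo)⌉ iterations; `% UInt64` is a wrap-around reinterpret
    iters = 64 - leading_zeros((hi - lo - 1) % UInt64)
    for _ in 1:iters
        mid = (lo + hi) >>> 1
        below = xs[mid] <= q
        lo = ifelse(below, mid, lo)
        hi = ifelse(below, hi, mid)
    end
    return lo  # left bracket: xs[lo] <= q < xs[lo + 1] for in-range q
end
```

A real implementation would also guard n < 2 and out-of-range queries; those cases are omitted here for clarity.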

Safety fix

_search_linear_binary! now clamps the hint before first use (ix = clamp(ix, 1, n-1)), guarding against user-provided out-of-range hints (e.g., Ref(0), or a stale hint carried over from a different grid).
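The clamp-then-walk flow can be sketched as follows. This is illustrative only: `linear_binary_find` and its fallback to the stdlib `searchsortedlast` are stand-ins for the package's `_search_linear_binary!`.

```julia
# Sketch of the hint clamp + bounded linear walk + binary fallback idea.
function linear_binary_find(xs::AbstractVector, q::Real, hint::Int; window::Int=2)
    n = length(xs)
    ix = clamp(hint, 1, n - 1)                 # guard stale/out-of-range hints
    for _ in 1:window
        xs[ix] <= q < xs[ix + 1] && return ix  # bracket found near the hint
        ix = q < xs[ix] ? ix - 1 : ix + 1      # single comparison per step
        ix = clamp(ix, 1, n - 1)
    end
    return searchsortedlast(xs, q)             # stdlib binary fallback
end

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
linear_binary_find(xs, 2.5, 0)   # out-of-range hint is clamped; returns 2
```

For sorted query streams the hint lands within the window almost every time, which is where the amortized O(1) behavior comes from; random streams exhaust the window and pay for the walk plus the fallback.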

New Export

export AutoSearch

Impact

  • Zero breaking changes for existing code: explicit search=Binary() or search=LinearBinary() at constructor or call site is honored as-is.
  • Behavior change for users who relied on the default being Binary() for vector queries. Vector queries now use LinearBinary() via AutoSearch. For random vector queries, this is ~2–3× slower than Binary() — use search=Binary() explicitly to restore the old behavior.
  • All 1D, ND, series, and integration paths updated. 440+ new test assertions.

Optimize the LinearBinary search hot path:
- Remove redundant hint clamp (internal hints are always valid)
- Skip hint write on direct hit (ix unchanged)
- Use single comparison per linear step (direction already bounds one side)
- Change default linear_window from 8 to 2 (sweet spot for minimal
  random overhead while retaining sorted/clustered locality gains)

Benchmark results (Grid=2000, vector call):
- Random queries: ~2x slower than Binary (structural; branch predictor)
- Sorted queries: ~5x faster than Binary
- Clustered: ~4.7x faster | DenseLocal: ~5.9x faster

Update all interpolation constructors and docstrings across linear,
cubic, quadratic, and constant modules (1D and ND) to use
LinearBinary() as the default search policy instead of Binary().

This is a breaking change for code that relied on the default being
Binary() — most users won't notice since LinearBinary automatically
falls back to binary search when the linear walk misses.

Affected: 25 source files across all interpolation types.

Update test assertions to reflect the new default search policy:
- LinearBinary() now produces LinearBinary{2} (was {8})
- Default interpolant search_policy is LinearBinary{2} (was Binary)
- Show/format tests updated accordingly

Tests all 4 interpolation types × 5 query patterns × 2 grid sizes
in 3 calling modes (vector, scalar+hint, scalar no-hint).
Validates the Binary→LinearBinary default change with real interpolation
workloads rather than isolated search benchmarks.

AutoSearch resolves at call time: scalar→Binary(), vector→LinearBinary().
Includes _resolve_search dispatch, _to_searcher fallbacks, export, and show format.

Updates keyword defaults in constructors, oneshot functions, series constructors,
and ND constructors across all 4 interpolant types (linear, cubic, quadratic, constant).
Docstring signatures updated accordingly. Explicit user examples preserved.

Adds AutoSearch resolution before _to_searcher in all interpolant callables,
oneshot hot paths, and series eval methods. Scalar queries now resolve to
Binary(), vector queries to LinearBinary().

Adds per-axis AutoSearch resolution via map after _resolve_search_nd in all
ND eval callables and oneshot functions. Scalar ND queries resolve to Binary(),
SoA/AoS batch queries resolve to LinearBinary().

… assertions

- Add _resolve_search(::Searcher, _) passthrough after Searcher struct definition
  so pre-built Searcher objects injected via `search=` keyword skip resolution
- Update test_search.jl default policy assertion from LinearBinary{2} to AutoSearch
- Replace "uses LinearBinary() by default" with AutoSearch resolution docs
- Add AutoSearch format test and default interpolant show test
- Update Binary() docstring default reference

…ve_search_nd

Add _resolve_search_nd(search, Val(N), query_sample) overload that combines
dimension broadcast + AutoSearch resolution in one call. Replaces 26 two-line
patterns across all ND eval and oneshot files with a single-line call.

Zero allocation verified for both interpolant eval and oneshot paths.

… weak tests

- search.jl: clamp hint in _search_linear_binary! to guard against user-provided
  out-of-range hints (Ref(0), stale hints from shorter grids); mirrors _search_linear!
- vector_calculus.jl: all 5 @generated fns (gradient/gradient!/hessian/hessian!/laplacian)
  now use 3-arg _resolve_search_nd with first(query) so AutoSearch resolves correctly
- integrate_api.jl: all 8 integrate fns now call _resolve_search before _to_searcher,
  routing AutoSearch through the intended path instead of the safety-net fallback
- nd_utils.jl: document AoS batch dispatch semantics in _resolve_search_nd docstring
- test_search.jl: replace weak isfinite hint test with value+position assertions;
  add OOB clamp safety test; add gradient/hessian/laplacian AutoSearch resolution tests

…s + tests

- integrate_common_nd.jl: _integrate_nd_preamble now uses 3-arg
  _resolve_search_nd(search, Val(N), first(lo)), routing AutoSearch through
  the intended resolution path instead of the _to_searcher safety-net fallback
- search.jl: fix misleading dispatch ordering comment (Tuple{Vararg{Real}}
  never existed; the fallback is bare ::Tuple); add n>=2 precondition note
  on the clamp line
- vector_calculus.jl: add inline comment on all 5 @generated sites explaining
  why first(query)::Real is correct (scalar-point-only API)
- 9 ND eval/oneshot files: add inline comment at each AoS dispatch site
  documenting the AbstractVector{<:Tuple} <: AbstractVector trick
- test_search.jl: document expected indices in hint-tracking assertions;
  widen series bound from >=170 to >=160 (margin 11→21); improve clamp
  test comment to note valid range

Pass query containers directly to _resolve_search_nd instead of
extracting a sample element with first(). Julia dispatch on the
container type already gives the correct resolution:
  NTuple{N,Real}           → Tuple arm            → Binary/axis
  NTuple{N,AbstractVector} → Tuple{Vararg{...}} arm → LinearBinary/axis
  AbstractVector{<:Tuple}  → AbstractVector arm    → LinearBinary/axis

24 sites across 11 files updated. nd_utils.jl docstring updated to
document the direct-container passing convention.
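The three dispatch arms above can be sketched as standalone Julia. The bodies are illustrative simplifications: the real `_resolve_search_nd` also handles non-Auto policies and per-axis broadcasting, which this sketch omits.

```julia
# Illustrative sketch of container-type dispatch for ND resolution.
struct Binary end
struct LinearBinary{W} end
LinearBinary() = LinearBinary{2}()
struct AutoSearch end

# NTuple{N,Real}: a single ND point → Binary on every axis
_resolve_search_nd(::AutoSearch, ::Val{N}, ::NTuple{N,Real}) where {N} =
    ntuple(_ -> Binary(), Val(N))
# Tuple of vectors (SoA batch) → LinearBinary on every axis
_resolve_search_nd(::AutoSearch, ::Val{N}, ::Tuple{Vararg{AbstractVector}}) where {N} =
    ntuple(_ -> LinearBinary(), Val(N))
# Vector of tuples (AoS batch) → LinearBinary on every axis
_resolve_search_nd(::AutoSearch, ::Val{N}, ::AbstractVector{<:Tuple}) where {N} =
    ntuple(_ -> LinearBinary(), Val(N))

_resolve_search_nd(AutoSearch(), Val(2), (0.5, 0.5))    # (Binary(), Binary())
_resolve_search_nd(AutoSearch(), Val(2), [(0.1, 0.2)])  # (LinearBinary{2}(), LinearBinary{2}())
```

Passing the container itself (rather than `first(container)`) lets the method table do the classification, with no runtime work beyond dispatch.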

@github-actions github-actions bot left a comment


FastInterpolations.jl Benchmarks

| Benchmark suite | Current: 4637ff6 | Previous: 2bb287f | Ratio |
| --- | --- | --- | --- |
| 10_nd_construct/bicubic_2d | 53871 ns | 60974 ns | 0.88 |
| 10_nd_construct/bilinear_2d | 1542.08 ns | 1972.88 ns | 0.78 |
| 10_nd_construct/tricubic_3d | 361834 ns | 394165 ns | 0.92 |
| 10_nd_construct/trilinear_3d | 3792.08 ns | 4669.5 ns | 0.81 |
| 11_nd_eval/bicubic_2d_batch | 1659.1 ns | 1749.2 ns | 0.95 |
| 11_nd_eval/bicubic_2d_scalar | 29.55 ns | 31.96 ns | 0.92 |
| 11_nd_eval/bilinear_2d_scalar | 25.95 ns | 27.16 ns | 0.96 |
| 11_nd_eval/tricubic_3d_batch | 3357.2 ns | 3613.7 ns | 0.93 |
| 11_nd_eval/tricubic_3d_scalar | 54.01 ns | 56.1 ns | 0.96 |
| 11_nd_eval/trilinear_3d_scalar | 30.36 ns | 32.16 ns | 0.94 |
| 1_cubic_oneshot/q00001 | 525.78 ns | 555.62 ns | 0.95 |
| 1_cubic_oneshot/q10000 | 59598.1 ns | 64381 ns | 0.93 |
| 2_cubic_construct/g0100 | 1399.02 ns | 1519.82 ns | 0.92 |
| 2_cubic_construct/g1000 | 14107.3 ns | 15638.2 ns | 0.90 |
| 3_cubic_eval/q00001 | 36.97 ns | 38.97 ns | 0.95 |
| 3_cubic_eval/q00100 | 470.48 ns | 507.74 ns | 0.93 |
| 3_cubic_eval/q10000 | 41837 ns | 45413.6 ns | 0.92 |
| 4_linear_oneshot/q00001 | 40.88 ns | 41.68 ns | 0.98 |
| 4_linear_oneshot/q10000 | 34354.1 ns | 37083.1 ns | 0.93 |
| 5_linear_construct/g0100 | 14.12 ns | 12.73 ns | 1.11 |
| 5_linear_construct/g1000 | 13.23 ns | 13.03 ns | 1.02 |
| 6_linear_eval/q00001 | 22.64 ns | 21.24 ns | 1.07 |
| 6_linear_eval/q00100 | 378.7 ns | 405.34 ns | 0.93 |
| 6_linear_eval/q10000 | 33342.4 ns | 36142.3 ns | 0.92 |
| 7_cubic_range/scalar_query | 23.44 ns | 25.85 ns | 0.91 |
| 7_cubic_vec/scalar_query | 15.22 ns | 16.33 ns | 0.93 |
| 8_cubic_multi/construct_s001_q100 | 1248.72 ns | 1404.42 ns | 0.89 |
| 8_cubic_multi/construct_s010_q100 | 6198.52 ns | 6296.14 ns | 0.98 |
| 8_cubic_multi/construct_s100_q100 | 45937.6 ns | 47930.3 ns | 0.96 |
| 8_cubic_multi/eval_s001_q100 | 1963.26 ns | 2097.9 ns | 0.94 |
| 8_cubic_multi/eval_s010_q100 | 3059.7 ns | 3467.06 ns | 0.88 |
| 8_cubic_multi/eval_s010_q100_scalar_loop | 3399.94 ns | 3654.2 ns | 0.93 |
| 8_cubic_multi/eval_s100_q100 | 14745.5 ns | 17088.8 ns | 0.86 |
| 8_cubic_multi/eval_s100_q100_scalar_loop | 4537.4 ns | 4782.9 ns | 0.95 |
| 9_nd_oneshot/bicubic_2d | 38124.4 ns | 42177.5 ns | 0.90 |
| 9_nd_oneshot/bilinear_2d | 1559.7 ns | 1679.54 ns | 0.93 |
| 9_nd_oneshot/tricubic_3d | 343814.3 ns | 379146.6 ns | 0.91 |
| 9_nd_oneshot/trilinear_3d | 2997.6 ns | 3157.9 ns | 0.95 |

This comment was automatically generated by workflow using github-action-benchmark.


Copilot AI left a comment


Pull request overview

Adds an adaptive default search policy (AutoSearch) across the FastInterpolations API so scalar queries use Binary() while vector/batch queries use LinearBinary() (now defaulting to linear_window=2), alongside internal search optimizations, documentation updates, and expanded test coverage.

Changes:

  • Introduce AutoSearch and integrate call-time resolution (_resolve_search, _resolve_search_nd(..., query_sample)) throughout 1D/ND/series/oneshot/integration/vector-calculus paths.
  • Optimize search internals (branchless _search_binary, safer/faster _search_linear_binary!, change LinearBinary() default window 8 → 2).
  • Update display/exports and add extensive tests + a benchmark harness.

Reviewed changes

Copilot reviewed 40 out of 40 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/test_show.jl | Updates display expectations for LinearBinary{2} and adds AutoSearch show coverage. |
| test/test_search.jl | Adds/updates tests for AutoSearch resolution, new defaults, and hint clamping. |
| src/vector_calculus.jl | Ensures ND vector-calculus entry points resolve AutoSearch based on query type. |
| src/quadratic/quadratic_types.jl | Switches quadratic interpolant default search policy to AutoSearch and updates docs. |
| src/quadratic/quadratic_series_interp.jl | Propagates AutoSearch defaults + resolves policy before building searchers in series eval. |
| src/quadratic/quadratic_oneshot.jl | Applies AutoSearch default and resolves search based on scalar vs vector queries. |
| src/quadratic/quadratic_interpolant.jl | Resolves search policy at call-time before _to_searcher for scalar/vector calls. |
| src/quadratic/nd/quadratic_nd_oneshot.jl | Updates ND oneshot APIs to default to AutoSearch and resolve per query container. |
| src/quadratic/nd/quadratic_nd_interpolant.jl | Updates ND quadratic interpolant API docs/defaults to AutoSearch. |
| src/quadratic/nd/quadratic_nd_eval.jl | Resolves ND search tuple using query sample for correct scalar vs batch policy selection. |
| src/linear/nd/linear_nd_oneshot.jl | Updates ND linear oneshot defaults and resolves AutoSearch using the query container. |
| src/linear/nd/linear_nd_interpolant.jl | Updates ND linear interpolant docs/default search policy to AutoSearch. |
| src/linear/nd/linear_nd_eval.jl | Ensures ND linear eval resolves search tuple based on query type. |
| src/linear/linear_types.jl | Updates linear interpolant defaults/docs to AutoSearch. |
| src/linear/linear_series_interp.jl | Applies AutoSearch defaults and resolves policy before anchor creation/fill. |
| src/linear/linear_oneshot.jl | Updates oneshot linear APIs to default to AutoSearch and resolve before _to_searcher. |
| src/linear/linear_interpolant.jl | Resolves AutoSearch at the main 1D interpolant call entry points. |
| src/integral/integrate_common_nd.jl | Resolves ND search tuple using a scalar query sample (lo) for integration domain setup. |
| src/integral/integrate_api.jl | Resolves search before _to_searcher in integrate entry points (scalar sample x0). |
| src/derivative_view.jl | Adjusts docs/examples to reflect new default policy behavior. |
| src/cubic/nd/cubic_nd_oneshot.jl | Updates ND cubic oneshot defaults and resolves search tuple based on query container. |
| src/cubic/nd/cubic_nd_interpolant.jl | Updates ND cubic interpolant docs/defaults to AutoSearch. |
| src/cubic/nd/cubic_nd_eval.jl | Resolves ND cubic search tuple based on query type. |
| src/cubic/cubic_types.jl | Updates cubic interpolant default search policy/docs to AutoSearch. |
| src/cubic/cubic_series_interp.jl | Applies AutoSearch defaults and resolves policy before anchor creation/fill. |
| src/cubic/cubic_oneshot.jl | Updates cubic oneshot defaults and resolves search prior to creating a searcher. |
| src/cubic/cubic_interpolant.jl | Resolves AutoSearch at cubic interpolant call boundaries. |
| src/cubic/cubic_eval.jl | Resolves search policy for scalar eval path before _to_searcher. |
| src/core/show.jl | Adds formatting for AutoSearch in show output. |
| src/core/search.jl | Implements AutoSearch, updates LinearBinary default window, and optimizes/clamps search internals. |
| src/core/nd_utils.jl | Adds a 3-arg _resolve_search_nd that resolves AutoSearch using a query sample. |
| src/constant/nd/constant_nd_oneshot.jl | Updates ND constant oneshot defaults and resolves per query container. |
| src/constant/nd/constant_nd_interpolant.jl | Updates ND constant interpolant docs/defaults to AutoSearch. |
| src/constant/nd/constant_nd_eval.jl | Resolves ND constant search tuple based on query type. |
| src/constant/constant_types.jl | Updates constant interpolant default policy/docs to AutoSearch. |
| src/constant/constant_series_interp.jl | Applies AutoSearch defaults and resolves policy before anchor creation/fill. |
| src/constant/constant_oneshot.jl | Updates constant oneshot defaults and resolves search before _to_searcher. |
| src/constant/constant_interpolant.jl | Resolves AutoSearch at constant interpolant call boundaries. |
| src/FastInterpolations.jl | Exports AutoSearch. |
| benchmark/default_search_comparison.jl | Adds a benchmark script comparing old vs new default behaviors end-to-end. |



codecov bot commented Feb 25, 2026

Codecov Report

❌ Patch coverage is 98.64865% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.81%. Comparing base (2bb287f) to head (4637ff6).
⚠️ Report is 17 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/search.jl | 93.10% | 2 Missing ⚠️ |
Additional details and impacted files


@@            Coverage Diff             @@
##           master      #44      +/-   ##
==========================================
+ Coverage   97.72%   97.81%   +0.09%     
==========================================
  Files          71       71              
  Lines        5679     5731      +52     
==========================================
+ Hits         5550     5606      +56     
+ Misses        129      125       -4     
| Files with missing lines | Coverage Δ |
| --- | --- |
| src/FastInterpolations.jl | 100.00% <ø> (ø) |
| src/constant/constant_interpolant.jl | 100.00% <100.00%> (ø) |
| src/constant/constant_oneshot.jl | 98.52% <100.00%> (+0.04%) ⬆️ |
| src/constant/constant_series_interp.jl | 95.93% <100.00%> (+0.04%) ⬆️ |
| src/constant/constant_types.jl | 100.00% <ø> (ø) |
| src/constant/nd/constant_nd_eval.jl | 98.86% <100.00%> (ø) |
| src/constant/nd/constant_nd_interpolant.jl | 100.00% <ø> (ø) |
| src/constant/nd/constant_nd_oneshot.jl | 92.78% <100.00%> (ø) |
| src/core/nd_utils.jl | 97.47% <100.00%> (+0.03%) ⬆️ |
| src/core/show.jl | 98.48% <100.00%> (+<0.01%) ⬆️ |

... and 27 more

- linear_types.jl: replace AutoSearch() example with Binary() in "custom
  search policy" docstring (was showing the default, not an override)
- test_search.jl: simplify redundant test expression — collapse double
  sin.(2π .* xq) computation + redundant .|| into single atol check

Note: did NOT apply the search.jl suggestion — `(hi - lo - 1) % UInt64` is
valid Julia (a rem-based unchecked reinterpret, not the UInt64() constructor).
Switching to UInt64() would reintroduce the InexactError cold path in the
generated assembly.

@github-actions github-actions bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'FastInterpolations.jl Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.10.

| Benchmark suite | Current: 4637ff6 | Previous: 2bb287f | Ratio |
| --- | --- | --- | --- |
| 5_linear_construct/g0100 | 14.12 ns | 12.73 ns | 1.11 |

This comment was automatically generated by workflow using github-action-benchmark.

- Random LB penalty: ~2-3x → ~2.5-3x (measured: 2.4-2.8x for vector batch)
- Monotonic LB gain: ~5x → ~4-6x (measured: 3.7-4.8x at 500pt, 4.5-6.1x at 2000pt)
- Expand note: clarify "vector batch calls" context, mention scalar-no-hint
  is only ~1.2x slower (hint walk overhead absent without persistence)

mgyoo86 (Collaborator, Author) commented Feb 25, 2026

Performance Benchmarks

Measured on Apple M-series (ARM64), Julia 1.12.5, non-uniform AbstractVector grids. All times are median ns/query.

1. Branchless binary: no regression, genuine improvement

The new _search_binary (branchless for+ifelse) vs master's while-loop:

| Grid size | Random (old → new) | Sorted (old → new) | Speedup |
| --- | --- | --- | --- |
| n=20 | 3.3 → 2.9 ns | 3.2 → 2.9 ns | 1.12–1.14× |
| n=50 | 7.1 → 3.8 ns | 6.0 → 3.8 ns | 1.60–1.87× |
| n=100 | 10.4 → 4.8 ns | 8.7 → 4.8 ns | 1.81–2.16× |
| n=200 | 15.7 → 5.8 ns | 10.5 → 5.8 ns | 1.81–2.72× |
| n=500 | 9.6 → 7.1 ns | 9.2 → 7.1 ns | 1.30–1.36× |

The largest gains are at medium grids (n=50–200) — the typical use range. The branchless version never over-iterates: iters = 64 - leading_zeros(n-2) computes exactly ⌈log₂(n-1)⌉, identical to the while-loop's worst case. The speedup comes from the loop-exit branch being predictable (constant trip count) rather than data-dependent.
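The claim that the trip count is exact can be spot-checked directly (a quick standalone check, not package code):

```julia
# Verify 64 - leading_zeros(n - 2) == ⌈log₂(n - 1)⌉ across realistic grid sizes.
iters(n) = 64 - leading_zeros((n - 2) % UInt64)
for n in 3:10_000
    @assert iters(n) == ceil(Int, log2(n - 1))
end
iters(20), iters(200)   # (5, 8): the trip counts for the n=20 and n=200 rows above
```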

2. LinearBinary: window=2 vs master's window=8

Comparing master LB{8} vs new LB{2} (end-to-end vector call, itp(out, queries)):

| Grid | Pattern | Old LB{8} | New LB{2} | Speedup |
| --- | --- | --- | --- | --- |
| n=50 | Random | 12.7 ns | 11.2 ns | 1.14× |
| n=200 | Random | 25.3 ns | 19.7 ns | 1.29× |
| n=500 | Random | 30.3 ns | 24.6 ns | 1.23× |
| n=50 | Sorted | 2.2 ns | 2.2 ns | ≈1.0× |
| n=200 | Sorted | 1.9 ns | 1.9 ns | ≈1.0× |
| n=500 | Sorted | 1.9 ns | 1.9 ns | ≈1.0× |

For sorted/monotonic queries the window never matters — the hint hits on the first or second step regardless of window size. For random queries, window=8 wastes 8 walk steps before falling back to binary; window=2 minimizes that overhead. Net: ~20–29% faster worst case, no regression on sorted.

3. AutoSearch overhead: negligible for scalar

Scalar itp(q) comparison (AutoSearch default vs explicit Binary, random queries):

| Grid | AutoSearch | Binary | Difference |
| --- | --- | --- | --- |
| n=50 | 5.6 ns | 5.6 ns | 0% |
| n=200 | 7.8 ns | 7.8 ns | 0% |
| n=500 | 9.0 ns | 9.0 ns | 0% |
| n=2000 | 11.5 ns | 11.5 ns | 0% |

Resolution to Binary() happens at the call-site dispatch level and disappears entirely after JIT compilation — zero overhead.

4. AutoSearch vector call: trade-off summary

End-to-end vector call itp(out, queries) on n=500 grid:

| Pattern | Binary | AutoSearch (→ LB{2}) | Effect |
| --- | --- | --- | --- |
| Random | 8.7 ns | 24.6 ns | 2.8× slower — use search=Binary() |
| Sorted | 8.7 ns | 1.9 ns | 4.5× faster |
| Clustered | 8.7 ns | 2.5 ns | 3.5× faster |
| Reverse | 8.7 ns | 1.9 ns | 4.5× faster |
| DenseLocal | 8.7 ns | 1.8 ns | 4.8× faster |

The random penalty is structural — LinearBinary always walks before the binary fallback. For random batch queries, override with search=Binary() at the call site or constructor.

mgyoo86 merged commit ee1fdff into master on Feb 25, 2026
11 checks passed
