
perf: PikeVM sparse-dispatch for dot patterns — 2.8-4.8x speedup (#132)#134

Merged: kolkov merged 1 commit into main from feature/pikevm-sparse-dispatch on Mar 10, 2026

Conversation

@kolkov (Contributor) commented Mar 10, 2026

Summary

  • The NFA compiler generated ~9 split states chaining the UTF-8 byte-range alternation branches for each '.' (AnyCharNotNL). PikeVM had to DFS-traverse the entire split chain at every byte position: O(branches) work per byte.
  • New compileUTF8AnySparse() compiles '.' as a single sparse state that maps each leading byte range directly to its continuation chain, giving O(1) dispatch instead of O(branches) split-chain traversal. This is the same approach as State::Sparse in Rust's regex crate.
  • The sparse NFA (runeNFA) is compiled only for patterns containing '.'. PikeVM and BoundedBacktracker use it; the DFA continues to use the byte-level NFA and is unaffected.
  • PikeVM speedup: 2.8-4.8x on dot-heavy patterns (.*?, .+, .*).

Reported by @kostya via LangArena benchmarks (Issue #124). Closes #132.

Test plan

  • go test ./... — all 11 packages pass
  • gofmt -l . — clean
  • golangci-lint run — 0 issues
  • BenchmarkFind — no regressions
  • Full meta benchmark suite — PASS, no regressions
  • CI: tests + benchmark comparison
  • regex-bench validation (post-merge)

@github-actions

Benchmark Comparison

Comparing main → PR #134

Summary: geomean 121.6n (main) → 119.0n (PR #134), -2.08%

⚠️ Potential regressions detected:

Benchmark                                        main            PR #134         Delta
LazyDFASimpleLiteral-4                           59.95n ± ∞ ¹    62.21n ± ∞ ¹     +3.77% (p=0.008 n=5)
LazyDFAAlternation-4                             66.12n ± ∞ ¹    66.79n ± ∞ ¹     +1.01% (p=0.008 n=5)
LazyDFARepetition-4                              85.48n ± ∞ ¹    87.51n ± ∞ ¹     +2.37% (p=0.008 n=5)
AhoCorasickManyPatterns/coregex_10_patterns-4    59.16n ± ∞ ¹    67.34n ± ∞ ¹    +13.83% (p=0.008 n=5)
AhoCorasickManyPatterns/coregex_25_patterns-4    72.55n ± ∞ ¹    81.59n ± ∞ ¹    +12.46% (p=0.008 n=5)
AhoCorasickManyPatterns/coregex_50_patterns-4    88.08n ± ∞ ¹    91.52n ± ∞ ¹     +3.91% (p=0.008 n=5)

Full results available in workflow artifacts. CI runners have ~10-20% variance.
For accurate benchmarks, run locally: ./scripts/bench.sh --compare

@kolkov merged commit 5d43429 into main on Mar 10, 2026 (8 checks passed)
@kolkov deleted the feature/pikevm-sparse-dispatch branch on March 10, 2026 at 16:09
codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 44.57831% with 46 lines in your changes missing coverage. Please review.

Files with missing lines    Patch %    Lines
nfa/compile.go              0.00%      38 Missing and 2 partials ⚠️
meta/compile.go             86.04%     5 Missing and 1 partial ⚠️

Development

Successfully merging this pull request may close these issues.

perf: PikeVM UTF-8 dot alternation overhead (3x slower than stdlib on Template::Regex)
