
Apply GCD bound transform to sorted numeric rangeIntoBitSet #16285

Open: costin wants to merge 2 commits into apache:main from costin:lucene/sorted-numeric-gcd-range

costin commented Jun 23, 2026

Contributor

Range evaluation over GCD- and delta-encoded multi-value SortedNumericDocValues currently decodes every packed value. This change transforms the query bounds into the encoded domain and compares raw values directly, matching the recipe used for single-value NumericDocValues.

This uses the same approach as #16160. The same utility methods are used here and will be removed once that PR is merged.
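To illustrate what transforming bounds into the encoded domain can look like, here is a minimal sketch. It is not the code from this PR or from #16160: the class `GcdBoundTransform`, its method names, and the `maxRaw` parameter are hypothetical. It assumes each stored value decodes as `min + gcd * raw` with `gcd >= 1` (delta-only encoding being the `gcd == 1` case), and it maps an inclusive value range to an inclusive raw range once, so that each packed value can then be compared without decoding.

```java
/** Hypothetical helper for illustration only; not Lucene's API. */
final class GcdBoundTransform {
  private GcdBoundTransform() {}

  /**
   * Assumes every stored value decodes as min + gcd * raw, with gcd >= 1 and
   * raw in [0, maxRaw]. Returns {rawLo, rawHi}, both inclusive; rawLo > rawHi
   * means that no encoded value can fall inside [lower, upper].
   */
  static long[] toRaw(long lower, long upper, long min, long gcd, long maxRaw) {
    if (gcd < 1 || maxRaw < 0) {
      throw new IllegalArgumentException("gcd must be >= 1 and maxRaw >= 0");
    }
    // Largest decodable value. The exact-arithmetic calls throw
    // ArithmeticException if the assumed encoding does not fit in a long.
    long max = Math.addExact(min, Math.multiplyExact(gcd, maxRaw));
    if (lower > upper || upper < min || lower > max) {
      return new long[] {1, 0}; // empty raw range
    }
    long rawLo = 0;
    if (lower > min) {
      long d = Math.subtractExact(lower, min); // strictly positive here
      rawLo = d / gcd + (d % gcd != 0 ? 1 : 0); // round up to the next multiple
    }
    long rawHi = maxRaw;
    if (upper < max) {
      // upper >= min here, so truncating division is a floor.
      rawHi = Math.subtractExact(upper, min) / gcd;
    }
    return new long[] {rawLo, rawHi};
  }

  /** Compares a raw (still encoded) value against the transformed bounds. */
  static boolean matches(long raw, long[] rawRange) {
    return raw >= rawRange[0] && raw <= rawRange[1];
  }
}
```

For example, with `min = 100`, `gcd = 1000`, and `maxRaw = 10`, the value range `[1500, 3100]` maps to the raw range `[2, 3]`, which decodes to 2100 and 3100. A range such as `[1101, 2099]` that falls between two multiples maps to an empty raw range. The rounding direction matters: the lower bound rounds up and the upper bound rounds down, otherwise values just outside the query would match.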

Benchmark

SortedNumericGcdRangeIntoBitSetBenchmark, 1M docs, JDK 25.0.3.

AMD EPYC 7R32 (c5a.2xlarge) — AVX2, 256-bit

| cardinality | encoding | selectivity | baseline (ops/s) | candidate (ops/s) | ratio |
|---|---|---|---|---|---|
| 3 | delta_only | 0.01 | 71.5 | 86.4 | 1.21x |
| 3 | delta_only | 0.1 | 67.3 | 80.8 | 1.20x |
| 3 | delta_only | 0.5 | 69.2 | 84.2 | 1.22x |
| 3 | gcd_1000 | 0.01 | 69.7 | 74.7 | 1.07x |
| 3 | gcd_1000 | 0.1 | 64.1 | 80.7 | 1.26x |
| 3 | gcd_1000 | 0.5 | 64.6 | 68.9 | 1.07x |
| 3 | gcd_100_delta | 0.01 | 69.0 | 85.9 | 1.24x |
| 3 | gcd_100_delta | 0.1 | 71.8 | 81.1 | 1.13x |
| 3 | gcd_100_delta | 0.5 | 72.0 | 71.3 | 0.99x |
| 5 | delta_only | 0.01 | 72.0 | 79.2 | 1.10x |
| 5 | delta_only | 0.1 | 59.0 | 68.5 | 1.16x |
| 5 | delta_only | 0.5 | 68.3 | 80.1 | 1.17x |
| 5 | gcd_1000 | 0.01 | 47.7 | 79.9 | 1.68x |
| 5 | gcd_1000 | 0.1 | 47.3 | 68.7 | 1.45x |
| 5 | gcd_1000 | 0.5 | 58.2 | 79.5 | 1.37x |
| 5 | gcd_100_delta | 0.01 | 48.4 | 73.2 | 1.51x |
| 5 | gcd_100_delta | 0.1 | 47.6 | 68.4 | 1.44x |
| 5 | gcd_100_delta | 0.5 | 58.0 | 67.7 | 1.17x |

Intel Xeon 8375C (c6i.2xlarge) — AVX-512, 512-bit

| cardinality | encoding | selectivity | baseline (ops/s) | candidate (ops/s) | ratio |
|---|---|---|---|---|---|
| 3 | delta_only | 0.01 | 86.3 | 93.0 | 1.08x |
| 3 | delta_only | 0.1 | 80.1 | 84.0 | 1.05x |
| 3 | delta_only | 0.5 | 84.6 | 92.2 | 1.09x |
| 3 | gcd_1000 | 0.01 | 74.9 | 92.9 | 1.24x |
| 3 | gcd_1000 | 0.1 | 76.3 | 86.3 | 1.13x |
| 3 | gcd_1000 | 0.5 | 78.9 | 88.9 | 1.13x |
| 3 | gcd_100_delta | 0.01 | 74.9 | 93.1 | 1.24x |
| 3 | gcd_100_delta | 0.1 | 76.1 | 86.4 | 1.13x |
| 3 | gcd_100_delta | 0.5 | 72.2 | 89.4 | 1.24x |
| 5 | delta_only | 0.01 | 78.9 | 85.4 | 1.08x |
| 5 | delta_only | 0.1 | 78.1 | 79.4 | 1.02x |
| 5 | delta_only | 0.5 | 81.8 | 84.6 | 1.03x |
| 5 | gcd_1000 | 0.01 | 48.6 | 86.1 | 1.77x |
| 5 | gcd_1000 | 0.1 | 47.6 | 79.0 | 1.66x |
| 5 | gcd_1000 | 0.5 | 56.4 | 83.4 | 1.48x |
| 5 | gcd_100_delta | 0.01 | 48.7 | 83.3 | 1.71x |
| 5 | gcd_100_delta | 0.1 | 47.9 | 79.1 | 1.65x |
| 5 | gcd_100_delta | 0.5 | 56.6 | 84.3 | 1.49x |

github-actions bot added this to the 10.6.0 milestone Jun 23, 2026

sgup432 commented Jun 25, 2026

Contributor

Nice! I was also working on this and had a PR ready before I found out that you already did this! 😄
