[Improvement] Optimize CharRange hashCode: Benchmark Objects.hash vs bitwise operations #1526
Hello @garydgregory
I have submitted a new PR that addresses this
I’ve conducted a detailed performance comparison of hashCode implementations for the CharRange class (package: org.apache.commons.lang3) and wanted to share the results:
1. Test Overview
I benchmarked two hashCode implementations for CharRange (the core org.apache.commons.lang3.CharRange class):
Baseline: hashCodeObjects() (using Objects.hash(end, negated, start), the standard general-purpose implementation)
Optimized: hashCodeBitwise() (bitwise splicing of start, end, and negated)
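To make the two variants concrete, here is a minimal sketch. The class `CharRangeHashSketch` is a hypothetical stand-in for CharRange that models only the three hashed fields, and the exact bit layout in `hashCodeBitwise()` is an assumption for illustration; the PR's actual splicing may differ.

```java
import java.util.Objects;

// Hypothetical stand-in for CharRange: only the three hashed fields are modeled.
final class CharRangeHashSketch {
    private final char start;
    private final char end;
    private final boolean negated;

    CharRangeHashSketch(char start, char end, boolean negated) {
        this.start = start;
        this.end = end;
        this.negated = negated;
    }

    // Baseline: the fields are autoboxed and passed through a varargs Object[].
    int hashCodeObjects() {
        return Objects.hash(end, negated, start);
    }

    // Assumed bitwise variant (not confirmed by the PR): start fills the high
    // 16 bits, end fills the low 16 bits, and negated inverts every bit.
    int hashCodeBitwise() {
        int packed = (start << 16) | end;
        return negated ? ~packed : packed;
    }
}
```

Two 16-bit chars already fill all 32 bits of an int, so the negated flag cannot get a bit of its own. Any layout has to fold it into the packed value, which means some collisions between negated and non-negated ranges are unavoidable.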
2. Key Test Results
2.1 Performance Benchmark (100 million iterations/scenario)
The bitwise implementation achieves a 98.17% reduction in execution time compared to the Objects.hash version.
2.2 Hash Collision Rate Test (1 million unique CharRange instances)
CharRangeTest#testHashCodeCollisionRate
To verify the hash distribution quality (a critical factor for hash table performance), I conducted a collision rate test with 1 million unique CharRange instances (covering normal/negated ranges, single-character ranges, and extreme value ranges). The results are as follows:
The bitwise implementation reduces the hash collision rate by approximately 98.88% (from 1.0281% to 0.0115%) compared to the Objects.hash version, which substantially improves hash distribution quality.
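For reference, one way such a collision rate and the relative reduction could be computed is sketched below. The class `CollisionRateSketch` and both method names are hypothetical. The definition of "collision rate" used here (the fraction of instances whose hash value was already produced by an earlier instance) is an assumption, since the test's exact definition is not shown.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of CharRangeTest.
final class CollisionRateSketch {

    // Assumed definition: the share of hashes that repeat an earlier value.
    static double collisionRate(int[] hashes) {
        if (hashes.length == 0) {
            return 0.0;
        }
        Set<Integer> seen = new HashSet<>();
        int collisions = 0;
        for (int h : hashes) {
            if (!seen.add(h)) { // add() returns false when h was already present
                collisions++;
            }
        }
        return (double) collisions / hashes.length;
    }

    // Relative reduction of a rate: (baseline - optimized) / baseline.
    static double relativeReduction(double baseline, double optimized) {
        return (baseline - optimized) / baseline;
    }
}
```

Plugging the reported rates into `relativeReduction(1.0281, 0.0115)` gives roughly 0.9888, which matches the 98.88% figure quoted above.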
3. Why the Bitwise Version is Superior