Skip to content

Conversation

@IcoreE
Copy link
Contributor

@IcoreE IcoreE commented Dec 15, 2025

Hello @garydgregory
I have submitted a new PR that addresses this

  • Full unit tests are included, and a comparison of two hashCode algorithms (Objects.hash vs. bitwise operations) is added with both contract validation and performance benchmarks.

I’ve conducted a detailed performa hashCode implementations for the CharRange class (package: org.apache.commons.lang3) and wanted to share the results:
1. Test Overview
I benchmarked two hashCode implementations for CharRange (coreorg.apache.commons.lang3.CharRange class):
Baseline: hashCodeObjects() (using Objects.hash(end, negated, start) – standard general-purpose implementat
Optimized: hashCodeBitwise() (bitwise splicing of startendnegated
2. Key Test Results
2.1 Performance Benchmark (100 million iterations/scenario)

CharRangeTest#testHashCode_ContractCompliance
CharRangeTest#testHashCode_PerformanceComparison
===== CharRange hashCode Efficiency Comparison (Execution count: 100000000 iterations/scenario) =====
Total time for Objects.hash version: 5139 ms, accumulated result: 6360454000000000
Total time for bitwise operation version: 94 ms, accumulated result: 1087927400000000
Efficiency improvement of bitwise version over Objects.hash version: 98.17%

The bitwise implementation achieves a 98.17% reduction in execution time compared to the Objects.hash

2.2 Hash Collision Rate Test (1 million unique CharRange instances)
CharRangeTest#testHashCodeCollisionRate
To verify the hash distribution quality (a critical factor for hash table performance), I conducted a collision rate test with 1 million unique CharRange instances (covering normal/negated ranges, single-character ranges, and extreme value ranges). The results are as follows:

Generated 1000000 unique CharRange instances
===== HashCode Collision Rate Comparison =====
Objects.hash implementation:
  Number of unique hash codes: 989719
  Total colliding instances: 10281
  Collision rate: 1.028100%
Bitwise implementation:
  Number of unique hash codes: 999885
  Total colliding instances: 115
  Collision rate: 0.011500%

The bitwise implementation reduces the hash collision rate by approximately 98.88% (from 1.0281% to 0.0115%) compared to the Objects.hash version, representing a near 99% reduction in colliding instances and drastically improving hash distribution quality.

3. Why the Bitwise Version is Superior

  1. Extreme efficiency: Uses only 3 native bitwise operations (left shift, bitwise OR, XOR) – no temporary array allocation, autoboxing, loops, or multiplication (all sources of overhead in Objects.hash)
  2. Contract compliance: Strictly adheres to Java’s hashCode/equals rules (equal instances have identical hashes; unequal instances rarely collide).
  3. Memory-friendly: No GC overhead (unlike Objects.hash, which creates temporary Object[] arrays).

@garydgregory
Copy link
Member

@IcoreE
Why is there a new PR instead of keeping #1524? Now we have to track 2 discussions?

@IcoreE
Copy link
Contributor Author

IcoreE commented Dec 15, 2025

@IcoreE Why is there a new PR instead of keeping #1524? Now we have to track 2 discussions?

@garydgregory I’m so sorry for the confusion caused by the new PR – I’m a beginner with open-source PR workflows and made a misstep here. In #1524, I initially added the hashCode tests in a new CharRangeHashCodeTest.java file, but later realized it’s better to supplement these tests directly in CharRangeTest.java (to keep all CharRange tests in one place), and I have since closed PR #1524. That’s why I created the new PR, which was an incorrect approach. Apologies again for the inconvenience!

@IcoreE IcoreE closed this Dec 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants