levenshtein: wrap to sha256 for state hashing #24

Closed
tmc wants to merge 3 commits into blevesearch:master from tmc:fips140

@tmc commented Jun 9, 2025

This modernizes dependency management and replaces the MD5 hash used for state hashing in levenshtein, since MD5 is not allowed in Go's FIPS 140 configuration.

This also switches to the upstream mmap-go implementation, which has a few fixes not yet in the blevesearch mmap-go fork.

This change has negligible performance impact (geomean -3.62% sec/op across the benchmarks):

benchstat md5.txt sha256.txt
                                         │   md5.txt   │             sha256.txt              │
                                         │   sec/op    │   sec/op     vs base                │
NewEvalEditDistance1-16                    23.05µ ± 1%   21.95µ ± 1%   -4.76% (p=0.000 n=20)
NewEvalEditDistance2-16                    87.20µ ± 1%   87.35µ ± 5%        ~ (p=0.659 n=20)
NewEditDistance1-16                        23.29µ ± 1%   23.47µ ± 1%        ~ (p=0.174 n=20)
NewEditDistance2-16                        87.40µ ± 0%   89.08µ ± 0%   +1.92% (p=0.000 n=20)
LevenshteinAutomatonBuilder/Distance0-16   1.612µ ± 2%   1.535µ ± 3%   -4.78% (p=0.000 n=20)
LevenshteinAutomatonBuilder/Distance1-16   14.98µ ± 1%   12.16µ ± 1%  -18.82% (p=0.000 n=20)
LevenshteinAutomatonBuilder/Distance2-16   467.4µ ± 1%   372.7µ ± 1%  -20.25% (p=0.000 n=20)
BuildDfa/Distance1-Wordtest-16             14.56µ ± 2%   13.91µ ± 3%   -4.46% (p=0.000 n=20)
BuildDfa/Distance1-Wordfuzzy-16            16.86µ ± 2%   16.51µ ± 2%        ~ (p=0.056 n=20)
BuildDfa/Distance1-Wordsearch-16           21.12µ ± 1%   21.50µ ± 3%        ~ (p=0.199 n=20)
BuildDfa/Distance2-Wordtest-16             30.73µ ± 2%   31.01µ ± 2%        ~ (p=0.794 n=20)
BuildDfa/Distance2-Wordfuzzy-16            35.68µ ± 5%   36.11µ ± 2%        ~ (p=0.565 n=20)
BuildDfa/Distance2-Wordsearch-16           49.75µ ± 3%   52.48µ ± 4%   +5.49% (p=0.000 n=20)
geomean                                    29.54µ        28.47µ        -3.62%

Fixes #23

tmc added 3 commits June 8, 2025 19:05

Given modern go module management, this is not necessary

Replace cryptographically broken MD5 with more secure SHA256 hash function for state identification in parametric DFA implementation. Required for FIPS 140 compliance in environments where MD5 is not permitted.

Replace blevesearch/mmap-go with upstream edsrzf/mmap-go.

@CascadingRadium
Member

Superseded by #25.

Development

Successfully merging this pull request may close these issues.

Go 1.24 FIPS-only mode panics