Skip to content

tests(utils): add unit tests for murmurhash3 implementation#1589

Open
shivansh-gohem wants to merge 2 commits intokmesh-net:mainfrom
shivansh-gohem:add-murmur3-tests
Open

tests(utils): add unit tests for murmurhash3 implementation#1589
shivansh-gohem wants to merge 2 commits intokmesh-net:mainfrom
shivansh-gohem:add-murmur3-tests

Conversation

@shivansh-gohem
Copy link
Copy Markdown
Contributor

Description:
Adds comprehensive table-driven unit tests for the MurmurHash3 implementation in pkg/utils/hash.

Test Coverage includes:

(i)Empty inputs

(ii)Inputs smaller than the chunk size

(iii)Inputs matching exact chunk boundaries

(iv)Large inputs with special characters

Fixes #1528

Copilot AI review requested due to automatic review settings February 23, 2026 18:58
@kmesh-bot
Copy link
Copy Markdown
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @shivansh-gohem, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the testing suite for the MurmurHash3 implementation. By introducing a variety of new test cases, including edge cases like empty inputs and large data sets, and refining existing tests for core components, the changes ensure the hash function behaves correctly and predictably across different scenarios. This enhancement contributes to the overall reliability and stability of the hashing utility.

Highlights

  • Enhanced Test Coverage: Comprehensive table-driven unit tests were added for the MurmurHash3 implementation, addressing various input scenarios.
  • New Standard Vector Test: A new test case, TestHash128_StandardVector, was introduced to verify the Murmur3 x64 128-bit hash for an empty string with a zero seed, ensuring adherence to standard expectations.
  • Improved fmix64 Testing: The TestFmix64 function was refined to explicitly check for deterministic behavior and the correct output when the input is zero, enhancing the robustness of the hash finalization step.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • pkg/utils/hash/murmur3_test.go
    • Added TestHash128_StandardVector to validate the hash of an empty string with a zero seed.
    • Modified TestFmix64 to remove the expected field from test cases and explicitly verify hash determinism and the zero input case.
    • Updated a comment in TestHash128_LargeInput to clarify testing for chunking.
    • Adjusted the formatting of the license header URL.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Hash bits swiftly flow, Tests ensure the truth holds fast, Code's strength now confirmed.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds unit tests for the MurmurHash3 implementation. The changes are generally good, but I've found a critical issue in the TestFmix64 test. The test has been modified to only check for determinism and the zero-input case, removing the check for correctness against expected values. This significantly weakens the test and should be addressed by converting it into a proper table-driven test with known-good output values.

Comment on lines 125 to 159
func TestFmix64(t *testing.T) {
tests := []struct {
name string
input uint64
expected uint64
name string
input uint64
}{
{
name: "zero",
input: 0,
expected: 0,
name: "zero",
input: 0,
},
{
name: "max value",
input: 0xffffffffffffffff,
expected: 0x49a3af502d8a9f23, // This would need to be calculated/verified
name: "max value",
input: 0xffffffffffffffff,
},
{
name: "test value",
input: 0x123456789abcdef0,
expected: 0, // This would need to be calculated/verified - testing for determinism instead
name: "test value",
input: 0x123456789abcdef0,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := fmix64(tt.input)

// Test deterministic behavior
result1 := fmix64(tt.input)
result2 := fmix64(tt.input)
if result != result2 {
t.Errorf("fmix64(0x%x) is not deterministic: 0x%x != 0x%x", tt.input, result, result2)

// 1. Test deterministic behavior (Hash stability)
if result1 != result2 {
t.Errorf("fmix64(0x%x) is not deterministic: 0x%x != 0x%x", tt.input, result1, result2)
}

// 2. Test mathematical zero behavior
if tt.input == 0 && result1 != 0 {
t.Errorf("fmix64(0) should be 0, got 0x%x", result1)
}
})
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current implementation of TestFmix64 only checks for determinism and the zero-input case. It does not verify the correctness of the fmix64 function's output against known values. This significantly weakens the test suite and could allow incorrect changes to the fmix64 implementation to pass the tests.

To make this a comprehensive unit test, it should be a table-driven test that compares the function's output with pre-calculated, known-good expected values.

func TestFmix64(t *testing.T) {
	tests := []struct {
		name     string
		input    uint64
		expected uint64
	}{
		{
			name:     "zero input",
			input:    0,
			expected: 0,
		},
		{
			name:     "all ones",
			input:    0xffffffffffffffff,
			expected: 0x64b5720b4b825f21,
		},
		{
			name:     "test value 1",
			input:    0x0123456789abcdef,
			expected: 0x87cbfbfe89022cea,
		},
		{
			name:     "test value 2",
			input:    0xfedcba9876543210,
			expected: 0x03ebebcc1f4a6fd7,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			result := fmix64(tt.input)
			if result != tt.expected {
				t.Errorf("fmix64(0x%x) = 0x%x, want 0x%x",
					tt.input, result, tt.expected)
			}
		})
	}
}

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds/updates unit tests for the MurmurHash3 x64 128-bit implementation in pkg/utils/hash, aiming to improve correctness and coverage for hashing helpers.

Changes:

  • Added a new “standard vector” test for Hash128 on empty input with seed 0.
  • Tweaked commentary for the large-input Hash128 test.
  • Modified TestFmix64 to remove expected-output assertions and instead check determinism and 0 -> 0 behavior.
Comments suppressed due to low confidence (1)

pkg/utils/hash/murmur3_test.go:74

  • PR description mentions adding “comprehensive table-driven unit tests” including exact chunk boundaries and large inputs with special characters, but the new/updated tests in this file are mostly property-style (determinism / non-zero) and don’t add the specific vectors described. Please either expand these tests to include the described table-driven cases with explicit expected outputs, or adjust the PR description to match what’s actually being added.
// Test standard Murmur3 universal vector
func TestHash128_StandardVector(t *testing.T) {
	// The Murmur3 x64 128-bit hash of an empty string with seed 0 is guaranteed to be (0, 0)
	h1, h2 := Hash128([]byte(""), 0)
	if h1 != 0 || h2 != 0 {
		t.Errorf("Standard vector for empty string failed. Expected (0, 0), got (%d, %d)", h1, h2)
	}
}

func TestHash128_DifferentSeeds(t *testing.T) {
	input := []byte("test input")

	h1_seed0, h2_seed0 := Hash128(input, 0)
	h1_seed1, h2_seed1 := Hash128(input, 1)

	if (h1_seed0 == h1_seed1) && (h2_seed0 == h2_seed1) {
		t.Errorf("Different seeds produced same hash: (%d, %d)", h1_seed0, h2_seed0)
	}
}

func TestHash128_TailCases(t *testing.T) {
	seed := uint32(0)

	// Test all possible tail lengths (1-15 bytes after 16-byte blocks)
	for i := 1; i <= 15; i++ {
		input := make([]byte, 16+i) // 16 bytes + i tail bytes
		for j := range input {
			input[j] = byte(j)
		}

		h1, h2 := Hash128(input, seed)

		// Test deterministic behavior
		h1_2, h2_2 := Hash128(input, seed)
		if h1 != h1_2 || h2 != h2_2 {
			t.Errorf("Hash128 with tail length %d is not deterministic", i)
		}
	}
}

func TestHash128_LargeInput(t *testing.T) {
	// Test with large input to ensure no issues with memory/chunking
	largeInput := make([]byte, 10000)
	for i := range largeInput {
		largeInput[i] = byte(i % 256)
	}

	h1, h2 := Hash128(largeInput, 0)
	if h1 == 0 && h2 == 0 {
		t.Errorf("Hash128 with large input returned (0, 0)")
	}
}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 125 to 158
func TestFmix64(t *testing.T) {
tests := []struct {
name string
input uint64
expected uint64
name string
input uint64
}{
{
name: "zero",
input: 0,
expected: 0,
name: "zero",
input: 0,
},
{
name: "max value",
input: 0xffffffffffffffff,
expected: 0x49a3af502d8a9f23, // This would need to be calculated/verified
name: "max value",
input: 0xffffffffffffffff,
},
{
name: "test value",
input: 0x123456789abcdef0,
expected: 0, // This would need to be calculated/verified - testing for determinism instead
name: "test value",
input: 0x123456789abcdef0,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := fmix64(tt.input)

// Test deterministic behavior
result1 := fmix64(tt.input)
result2 := fmix64(tt.input)
if result != result2 {
t.Errorf("fmix64(0x%x) is not deterministic: 0x%x != 0x%x", tt.input, result, result2)

// 1. Test deterministic behavior (Hash stability)
if result1 != result2 {
t.Errorf("fmix64(0x%x) is not deterministic: 0x%x != 0x%x", tt.input, result1, result2)
}

// 2. Test mathematical zero behavior
if tt.input == 0 && result1 != 0 {
t.Errorf("fmix64(0) should be 0, got 0x%x", result1)
}
})
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestFmix64 currently only checks determinism (calling fmix64 twice) and the 0 -> 0 behavior. Determinism is a very weak property test here and won’t catch many incorrect implementations (e.g., returning a constant for all non-zero inputs). Either add known-good expected outputs (test vectors) for the non-zero inputs, or remove/rename this test since pkg/utils/hash/hash_test.go already has stronger fmix64 vector coverage.

Copilot uses AI. Check for mistakes.
Comment thread pkg/utils/hash/murmur3_test.go Outdated
* You may obtain a copy of the License at:
*
* http://www.apache.org/licenses/LICENSE-2.0
* http://www.apache.org/licenses/LICENSE-2.0
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License header URL indentation was changed to * http://..., but other files in this package keep it aligned as * http://... (e.g., pkg/utils/hash/murmur3.go:8, pkg/utils/hash/hash_test.go:8). Please keep the standard header formatting consistent across the repo.

Suggested change
* http://www.apache.org/licenses/LICENSE-2.0
* http://www.apache.org/licenses/LICENSE-2.0

Copilot uses AI. Check for mistakes.
Comment thread pkg/utils/hash/murmur3_test.go Outdated
Comment on lines +23 to +31
// Test standard Murmur3 universal vector
func TestHash128_StandardVector(t *testing.T) {
// The Murmur3 x64 128-bit hash of an empty string with seed 0 is guaranteed to be (0, 0)
h1, h2 := Hash128([]byte(""), 0)
if h1 != 0 || h2 != 0 {
t.Errorf("Standard vector for empty string failed. Expected (0, 0), got (%d, %d)", h1, h2)
}
}

Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestHash128_StandardVector duplicates existing vector coverage already asserted in pkg/utils/hash/hash_test.go (it includes the empty input/seed 0 case with expected (0,0), plus additional known vectors). To avoid duplicated test cases drifting over time, consider either removing this test or folding any new vectors into the existing table-driven tests instead.

Suggested change
// Test standard Murmur3 universal vector
func TestHash128_StandardVector(t *testing.T) {
// The Murmur3 x64 128-bit hash of an empty string with seed 0 is guaranteed to be (0, 0)
h1, h2 := Hash128([]byte(""), 0)
if h1 != 0 || h2 != 0 {
t.Errorf("Standard vector for empty string failed. Expected (0, 0), got (%d, %d)", h1, h2)
}
}

Copilot uses AI. Check for mistakes.
@shivansh-gohem shivansh-gohem force-pushed the add-murmur3-tests branch 2 times, most recently from 0bde398 to e0620b5 Compare February 23, 2026 19:13
Copilot AI review requested due to automatic review settings February 23, 2026 19:13
@kmesh-bot kmesh-bot added size/S and removed size/M labels Feb 23, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@LiZhenCheng9527
Copy link
Copy Markdown
Contributor

/retest

Comment thread pkg/utils/hash/murmur3_test.go Outdated
Signed-off-by: Shivansh Sahu <sahushivansh142@gmail.com>
Copilot AI review requested due to automatic review settings February 24, 2026 12:46
@shivansh-gohem
Copy link
Copy Markdown
Contributor Author

@LiZhenCheng9527 I've removed those extra test cases as suggested and updated the PR. Could you please take another look?

Also, as a heads-up, the CI build keeps failing on Test_linkXdp in the pkg/controller/manage package, which seems like an unrelated flaky test in the infrastructure. The local pkg/utils/hash tests are passing cleanly!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/utils/hash/murmur3_test.go Outdated
Comment on lines 128 to 129
name: "all ones",
input: 0xffffffffffffffff,
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expected hash value for the 'all ones' input should be verified against a reference implementation or documented calculation. Without verification, this test may pass with an incorrect expected value, reducing test reliability.

Suggested change
name: "all ones",
input: 0xffffffffffffffff,
name: "all ones",
input: 0xffffffffffffffff,
// Expected value verified against the reference MurmurHash3 fmix64
// implementation (see MurmurHash3.cpp in the smhasher project).

Copilot uses AI. Check for mistakes.
Comment on lines +137 to +139
if result != tt.expected {
t.Errorf("fmix64(0x%x) = 0x%x, want 0x%x",
tt.input, result, tt.expected)
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the determinism check eliminates validation that the function produces consistent results across multiple calls. Consider adding a separate test to verify deterministic behavior, as this is an important property for hash functions.

Copilot uses AI. Check for mistakes.
@shivansh-gohem
Copy link
Copy Markdown
Contributor Author

@LiZhenCheng9527 i think the flaky behaviour by build is not curable from my side!! i have resolved all the required changes!

@LiZhenCheng9527
Copy link
Copy Markdown
Contributor

@LiZhenCheng9527 i think the flaky behaviour by build is not curable from my side!! i have resolved all the required changes!

Then I don't understand what the point of this PR is.
In my opinion, "max value" is more appropriate than "all one."

@shivansh-gohem
Copy link
Copy Markdown
Contributor Author

Hi @LiZhenCheng9527, my apologies for the back-and-forth. I have reverted the name back to 'max value' in the latest commit as you requested! (I am ignoring the AI bot suggestions here to focus on your feedback)

Signed-off-by: Shivansh Sahu <sahushivansh142@gmail.com>
@shivansh-gohem
Copy link
Copy Markdown
Contributor Author

@LiZhenCheng9527 To clarify the overall goal of this PR: it aims to improve the unit test coverage for pkg/utils/hash by covering edge cases like empty inputs, specific chunk boundaries, and large inputs, ensuring the MurmurHash3 implementation remains stable against regressions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tests(utils): add unit tests for murmurhash3 implementation

4 participants