bpf/restart: use Update instead of Restart when versions match#1606

Open
Eeshu-Yadav wants to merge 1 commit intokmesh-net:mainfrom
Eeshu-Yadav:fix/restart-status-version-check-794

@Eeshu-Yadav
Contributor

What type of PR is this?

/kind bug

What this PR does / why we need it:

SetStartStatus incorrectly sets the start type to Restart when GitVersion == oldGitVersion. The version hash only tracks the binary version, not BPF map state — maps can change independently at runtime. Equal versions don't prove maps are unchanged.

Changed to Update, which reconciles partial state via UpdateMapHandler, rather than Restart, which blindly reuses everything.

-SetStartType(Restart)
+SetStartType(Update)

Which issue(s) this PR fixes:

Fixes #794

Copilot AI review requested due to automatic review settings March 3, 2026 18:30
@kmesh-bot kmesh-bot added the kind/bug Something isn't working label Mar 3, 2026
@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

This pull request resolves a bug where the system would perform a full Restart operation based solely on matching binary versions, neglecting the independent state of BPF maps. By changing this behavior to an Update operation, the system now correctly reconciles BPF map states, ensuring data consistency and preventing unintended side effects from stale map data.

Highlights

  • Incorrect Start Type Logic: The SetStartStatus function previously used Restart when the current GitVersion matched the oldGitVersion. This incorrectly assumed that BPF map states were consistent, even though map states can change independently of the binary version.
  • Corrected BPF Map Reconciliation: The logic has been updated to use Update instead of Restart when Git versions match. This ensures that partial BPF map states are properly reconciled via UpdateMapHandler, preventing potential issues from stale or mismatched map data.

Changelog
  • pkg/bpf/restart/bpf_restart.go
    • Modified the conditional logic to set the start type to Update instead of Restart when the current Git version matches the old Git version.
    • Updated the associated log message to reflect the change from Restart to Update.

Versions may align, But maps can still diverge, Update sets things right.


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request changes the start type from "Restart" to "Update" when the binary version matches the previous version, aiming to improve the robustness of the restart process by ensuring BPF maps are reconciled. However, this change introduces a critical nil pointer dereference vulnerability in the UpdateMapHandler function, which can lead to a daemon panic and denial of service if the BPF map state on disk is inconsistent. Additionally, a potential refactoring is needed to address duplicated logic introduced by this change. It is recommended to fix the error handling in UpdateMapHandler before merging.

Comment on lines +103 to +104
log.Infof("kmesh start with Update")
SetStartType(Update)

security-high high

The change on line 104, which sets the start type to Update when versions match, introduces a critical denial-of-service vulnerability. This causes the daemon to execute UpdateMapHandler during startup, where a nil pointer dereference can occur. Specifically, the error from ebpf.LoadPinnedMap is ignored, meaning oldMap could be nil, leading to a panic when oldMap.Unpin() is called. This can crash the Kmesh daemon if the BPF map state is inconsistent. To remediate this, ensure proper error handling for ebpf.LoadPinnedMap in pkg/bpf/restart/bpf_update.go before accessing oldMap. Additionally, this change introduces code duplication, as the logic in this if block is now identical to the else block on lines 109-112. Consider refactoring the conditional logic to remove this duplication for better maintainability.

Copilot AI left a comment

Pull request overview

This PR fixes incorrect startup mode selection for BPF state handling by switching the “same GitVersion” path in SetStartStatus from Restart (blind reuse) to Update (reconcile pinned map state via UpdateMapHandler), aligning behavior with the fact that BPF maps can change independently of the binary version.

Changes:

  • Change SetStartStatus to set start type to Update (not Restart) when GitVersion == oldGitVersion.
  • Update log message for the “versions match” startup path accordingly.
Comments suppressed due to low confidence (1)

pkg/bpf/restart/bpf_restart.go:112

  • GitVersion == oldGitVersion no longer affects control flow: both the == branch and the final else set Update. This makes the version comparison (and related branching) redundant and potentially confusing for future changes. Consider simplifying to a single if oldGitVersion == 0 { Normal } else { Update } (while keeping any desired logging) so the intent is explicit.
	if GitVersion == oldGitVersion {
		log.Infof("kmesh start with Update")
		SetStartType(Update)
	} else if oldGitVersion == 0 {
		// version not found, it is a fresh start
		log.Infof("kmesh start with Normal")
		SetStartType(Normal)
	} else {
		log.Infof("kmesh start with Update")
		SetStartType(Update)
	}
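
The simplification suggested in this comment could look roughly like the sketch below. The `StartType` definition, the `uint32` version type, and the helper name `decideStartType` are assumptions made for illustration; the real declarations in pkg/bpf/restart may differ.

```go
package main

import "fmt"

// StartType mirrors the three start modes named in this PR. The underlying
// type and constant values are assumptions for illustration.
type StartType int

const (
	Normal StartType = iota
	Restart
	Update
)

func (s StartType) String() string {
	switch s {
	case Normal:
		return "Normal"
	case Restart:
		return "Restart"
	case Update:
		return "Update"
	}
	return "Unknown"
}

// decideStartType is a hypothetical pure helper. A zero old version means no
// version map was found (fresh start); any other value goes through Update.
func decideStartType(oldGitVersion uint32) StartType {
	if oldGitVersion == 0 {
		return Normal
	}
	return Update
}

func main() {
	for _, old := range []uint32{0, 42} {
		fmt.Printf("oldGitVersion=%d -> %s\n", old, decideStartType(old))
	}
}
```

One difference from the quoted code is worth noting: there, the equality check runs first, so a GitVersion of 0 with an oldGitVersion of 0 would select Update, whereas this sketch selects Normal in that case.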

Comment on lines 102 to 105
 if GitVersion == oldGitVersion {
-log.Infof("kmesh start with Restart, load bpf maps and prog from last")
-SetStartType(Restart)
+log.Infof("kmesh start with Update")
+SetStartType(Update)
 } else if oldGitVersion == 0 {
Copilot AI Mar 3, 2026

With this change, SetStartStatus will never set StartType to Restart when a version map exists (it will be Update for any non-zero oldGitVersion). Downstream code has a case restart.Restart path (e.g., NewVersionMap returns early for Restart), and the existing unit test TestRestart in pkg/bpf/bpf_test.go asserts GetStartType() == restart.Restart on the second start; that will now fail. If the new semantics are intended, please update/remove the Restart start path and adjust tests accordingly (or introduce a different condition that still legitimately yields Restart).

@Eeshu-Yadav Eeshu-Yadav force-pushed the fix/restart-status-version-check-794 branch from e8c5241 to 2b51151 Compare March 3, 2026 18:58
@kmesh-bot kmesh-bot added size/S and removed size/XS labels Mar 3, 2026
@Eeshu-Yadav Eeshu-Yadav force-pushed the fix/restart-status-version-check-794 branch from 2b51151 to ad62761 Compare March 4, 2026 06:26
Copilot AI review requested due to automatic review settings March 4, 2026 10:58
@Eeshu-Yadav Eeshu-Yadav force-pushed the fix/restart-status-version-check-794 branch from ad62761 to 8b09b57 Compare March 4, 2026 10:58
Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

pkg/bpf/workload/sendmsg.go:104

  • With SetStartStatus now setting StartType=Update for version matches, this branch will run on common restarts. If the old pinned program at pinPath doesn't exist (e.g., partial/failed previous pin state), LoadPinnedProgram will return an error and abort startup even though there is nothing to unpin. Consider treating a not-exist error as non-fatal (log and continue), so Update can reconcile partial pin state instead of failing hard.
	if restart.GetStartType() == restart.Restart || restart.GetStartType() == restart.Update {
		pinPath := filepath.Join(sm.Info.BpfFsPath, "sendmsg_prog")
		oldSkMsg, err := ebpf.LoadPinnedProgram(pinPath, nil)
		if err != nil {
			log.Errorf("LoadPinnedProgram failed: %v", err)
			return nil, err
		}

pkg/bpf/restart/bpf_restart.go:112

  • SetStartStatus now sets StartType=Update when the git version matches. Several call sites still gate “restart” behavior on StartType==Restart only (e.g., restoring endpoint keys and preloading cached xDS resources), so those paths will no longer run on version-match restarts. Please audit and either (a) treat Update as restart-like in those checks (e.g., Restart || Update), or (b) rework the start-type semantics so Update triggers the same initialization that previously relied on Restart while still running UpdateMapHandler for map reconciliation.
	if GitVersion == oldGitVersion {
		log.Infof("kmesh start with Update")
		SetStartType(Update)
	} else if oldGitVersion == 0 {
		// version not found, it is a fresh start
		log.Infof("kmesh start with Normal")
		SetStartType(Normal)
	} else {
		log.Infof("kmesh start with Update")
		SetStartType(Update)
	}
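
Option (a) above, treating Update as restart-like at the call sites, could be centralized in a single predicate so the checks do not drift apart. The sketch below is illustrative only: `StartType` is redeclared with assumed values and `reusesPinnedState` is a hypothetical helper, not an existing function in the repository.

```go
package main

import "fmt"

// StartType mirrors the start modes named in this PR; the underlying type
// and constant values are assumptions for illustration.
type StartType int

const (
	Normal StartType = iota
	Restart
	Update
)

// reusesPinnedState is a hypothetical predicate reporting whether a start
// should run the restore steps that were previously gated on Restart only.
func reusesPinnedState(t StartType) bool {
	return t == Restart || t == Update
}

func main() {
	for _, t := range []StartType{Normal, Restart, Update} {
		fmt.Printf("start type %d reuses pinned state: %v\n", t, reusesPinnedState(t))
	}
}
```

Call sites would then test `reusesPinnedState(restart.GetStartType())` instead of comparing against Restart directly, assuming the maintainers want Update to trigger the same initialization.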

@codecov
codecov Bot commented Mar 4, 2026

Codecov Report

❌ Patch coverage is 0% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.48%. Comparing base (9106d3e) to head (96669db).
⚠️ Report is 13 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/bpf/restart/bpf_update.go | 0.00% | 8 Missing ⚠️ |
| pkg/bpf/restart/bpf_restart.go | 0.00% | 2 Missing ⚠️ |
| pkg/bpf/workload/sendmsg.go | 0.00% | 1 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

| Files with missing lines | Coverage Δ |
|---|---|
| pkg/bpf/workload/sendmsg.go | 0.00% <0.00%> (ø) |
| pkg/bpf/restart/bpf_restart.go | 0.00% <0.00%> (ø) |
| pkg/bpf/restart/bpf_update.go | 0.00% <0.00%> (ø) |

... and 2 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5ba31c9...96669db. Read the comment docs.

@Eeshu-Yadav Eeshu-Yadav force-pushed the fix/restart-status-version-check-794 branch from 8b09b57 to c88151e Compare March 4, 2026 11:23
The version hash only tracks the binary version, not BPF map state.
Maps can change independently at runtime, so equal versions do not
guarantee maps are unchanged. Use Update which reconciles partial
state rather than Restart which blindly reuses everything.

Also fix a nil pointer dereference in UpdateMapHandler where
LoadPinnedMap error was discarded but oldMap was used unconditionally
for Unpin/Close.

Signed-off-by: Eeshu-Yadav <eeshuyadav123@gmail.com>
@Eeshu-Yadav Eeshu-Yadav force-pushed the fix/restart-status-version-check-794 branch from c88151e to 96669db Compare March 4, 2026 11:24
Labels

kind/bug Something isn't working size/S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SetStartStatus may sets to restart incorrectly

3 participants