Skip to content

Increasing log level for logs that are not necessary for debugging on prod#211

Merged
nakshah87 merged 1 commit intoaws:mainfrom
nakshah87:log-verbosity
Feb 26, 2026
Merged

Increasing log level for logs that are not necessary for debugging on prod#211
nakshah87 merged 1 commit intoaws:mainfrom
nakshah87:log-verbosity

Conversation

@nakshah87
Copy link
Copy Markdown
Contributor

@nakshah87 nakshah87 commented Feb 17, 2026

What type of PR is this?
With an increase in pod scale, and high pod churn the "Pod and policy namespace mismatch" log was causing an increase in disk utilization.

  • The code logic walks every policy for every pod CRUD operation and prints this during mismatch.

Optimizing the following logs to not print in a loop but consolidate the info in a list and print once at the end of the loop.

  • skipping headless service when populating EndpointInfo
  • skipping pod selector mismatched service when populating EndpointInfo

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add steps to reproduce and the controller logs:

Testing done on this change:

Automation added to e2e:

Will this PR introduce any new dependencies?:

Will this break upgrades or downgrades. Has updating a running cluster been tested?:

Does this PR introduce any user-facing change?: No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@nakshah87 nakshah87 requested a review from a team as a code owner February 17, 2026 04:28
Comment thread pkg/resolvers/endpoints.go
Comment thread pkg/resolvers/endpoints.go
Comment thread pkg/resolvers/endpoints.go Outdated
Copy link
Copy Markdown
Member

@parlakisik parlakisik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide more context in the PR details?
Why do you characterize this as noisy? How much did the log volume change after this improvement? Are there any additional logs being generated?

@nakshah87
Copy link
Copy Markdown
Contributor Author

Could you provide more context in the PR details? Why do you characterize this as noisy? How much did the log volume change after this improvement? Are there any additional logs being generated?

@parlakisik I updated the Description and provided more context.

Comment thread pkg/resolvers/endpoints.go Outdated
Ports: portList,
})
}
r.logger.Info("headlessServiceList", headlessServiceList)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: with consolidating log, we can just remove L429. can add more clear log message here like r.logger.Info("skipping headless services: ", headlessServiceList)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread pkg/resolvers/endpoints.go Outdated
}
r.logger.Info("headlessServiceList", headlessServiceList)
r.logger.Info("nonMatchingPodSelectorList", nonMatchingPodSelectorList)
r.logger.Info("Network peers appended ", networkPeers)
Copy link
Copy Markdown
Contributor

@oliviassss oliviassss Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L467 is not needed, since it may add a very large chunk of log lines in every reconcile. we have the skipping list, so it should be fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread pkg/resolvers/endpoints.go Outdated
})
}
r.logger.Info("headlessServiceList", headlessServiceList)
r.logger.Info("nonMatchingPodSelectorList", nonMatchingPodSelectorList)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. remove L435 and add more clear log message here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@nakshah87 nakshah87 force-pushed the log-verbosity branch 2 times, most recently from 461463f to 26696f0 Compare February 24, 2026 03:40
oliviassss
oliviassss previously approved these changes Feb 24, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces production log volume in the resolvers by lowering verbosity for high-churn informational logs and consolidating per-item skip logging into aggregated output.

Changes:

  • Downgrade the “Pod and policy namespace mismatch” log to verbosity level V(1) in the policy reference resolver.
  • Accumulate “skipped service” reasons while resolving service ClusterIPs and emit them once after iterating services.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pkg/resolvers/policies_for_pod.go Lowers verbosity for namespace-mismatch logging during pod/policy evaluation.
pkg/resolvers/endpoints.go Aggregates skip details for headless and selector-mismatched services during service ClusterIP resolution.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/resolvers/endpoints.go Outdated
Comment thread pkg/resolvers/endpoints.go Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

headlessServiceList = append(headlessServiceList, fmt.Sprintf("%s/%s", svc.Namespace, svc.Name))
}
headlessServiceCount++
r.logger.V(1).Info("skipping headless service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace)
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The V(1) log inside the loop should be removed. The purpose of this PR is to reduce logging overhead by consolidating logs, but this line still logs for every headless service encountered (at V(1) verbosity). Since the consolidated summary is logged at lines 475-476, this per-iteration log is redundant and should be removed to achieve the intended optimization.

Suggested change
r.logger.V(1).Info("skipping headless service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace)

Copilot uses AI. Check for mistakes.
nonMatchingPodSelectorList = append(nonMatchingPodSelectorList, fmt.Sprintf("%s/%s (expectedPS: %s)", svc.Namespace, svc.Name, svcSelector.String()))
}
nonMatchingPodSelectorCount++
r.logger.V(1).Info("skipping pod selector mismatched service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace, "expectedPS", svcSelector)
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The V(1) log inside the loop should be removed. Similar to the headless service case, this line still logs for every mismatched service encountered (at V(1) verbosity), which defeats the purpose of consolidating logs. Since the consolidated summary is logged at lines 479-480, this per-iteration log is redundant and should be removed to achieve the intended optimization.

Suggested change
r.logger.V(1).Info("skipping pod selector mismatched service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace, "expectedPS", svcSelector)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V(1) log is retained for testing will not affect production. should be fine.

Comment on lines +408 to +412
var headlessServiceList []string
var nonMatchingPodSelectorList []string
var headlessServiceCount int
var nonMatchingPodSelectorCount int
const maxSampleSize = 50
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider initializing these slices with a capacity (e.g., make([]string, 0, maxSampleSize)) to avoid multiple allocations as items are appended. This is a minor optimization but aligns with Go best practices when the maximum size is known in advance.

Suggested change
var headlessServiceList []string
var nonMatchingPodSelectorList []string
var headlessServiceCount int
var nonMatchingPodSelectorCount int
const maxSampleSize = 50
const maxSampleSize = 50
headlessServiceList := make([]string, 0, maxSampleSize)
nonMatchingPodSelectorList := make([]string, 0, maxSampleSize)
var headlessServiceCount int
var nonMatchingPodSelectorCount int

Copilot uses AI. Check for mistakes.
@nakshah87 nakshah87 merged commit d977c0e into aws:main Feb 26, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants