Increasing log level for logs that are not necessary for debugging on prod #211
Conversation
parlakisik
left a comment
Could you provide more context in the PR details?
Why do you characterize this as noisy? How much did the log volume change after this improvement? Are there any additional logs being generated?
Force-pushed from 92759a5 to 4752755
@parlakisik I updated the description and provided more context.
        Ports: portList,
    })
}
r.logger.Info("headlessServiceList", headlessServiceList)
nit: with the consolidated log we can just remove L429, and add a clearer log message here, e.g. r.logger.Info("skipping headless services: ", headlessServiceList)
}
r.logger.Info("headlessServiceList", headlessServiceList)
r.logger.Info("nonMatchingPodSelectorList", nonMatchingPodSelectorList)
r.logger.Info("Network peers appended ", networkPeers)
L467 is not needed, since it may add a very large chunk of log lines on every reconcile. We already have the skipping list, so it should be fine.
    })
}
r.logger.Info("headlessServiceList", headlessServiceList)
r.logger.Info("nonMatchingPodSelectorList", nonMatchingPodSelectorList)
Same here: remove L435 and add a clearer log message here.
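The pattern the reviewers are asking for can be sketched roughly as follows. This is a minimal, self-contained illustration, not the controller's actual code: the `service` struct and `collectSkippedHeadless` helper are hypothetical stand-ins for the corev1.Service objects and resolver loop in endpoints.go.

```go
package main

import (
	"fmt"
	"log"
)

// service is a minimal stand-in for the Kubernetes Service fields the
// resolver inspects; the real code iterates corev1.Service objects.
type service struct {
	Namespace, Name string
	Headless        bool
}

// collectSkippedHeadless records skipped services in a list instead of
// logging each one inside the loop.
func collectSkippedHeadless(services []service) []string {
	var skipped []string
	for _, svc := range services {
		if svc.Headless {
			// No per-iteration log here; just record the item.
			skipped = append(skipped, fmt.Sprintf("%s/%s", svc.Namespace, svc.Name))
		}
	}
	return skipped
}

func main() {
	svcs := []service{
		{Namespace: "default", Name: "db", Headless: true},
		{Namespace: "default", Name: "web", Headless: false},
		{Namespace: "kube-system", Name: "dns", Headless: true},
	}
	// One consolidated log line after the loop, per the review feedback.
	skipped := collectSkippedHeadless(svcs)
	log.Printf("skipping headless services: %v (count=%d)", skipped, len(skipped))
}
```

The key point is that the loop body only appends; the single `log.Printf` after the loop replaces N per-item log lines with one line per reconcile.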
Force-pushed from 461463f to 26696f0
Pull request overview
This PR reduces production log volume in the resolvers by lowering verbosity for high-churn informational logs and consolidating per-item skip logging into aggregated output.
Changes:
- Downgrade the "Pod and policy namespace mismatch" log to verbosity level V(1) in the policy reference resolver.
- Accumulate "skipped service" reasons while resolving service ClusterIPs and emit them once after iterating services.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/resolvers/policies_for_pod.go | Lowers verbosity for namespace-mismatch logging during pod/policy evaluation. |
| pkg/resolvers/endpoints.go | Aggregates skip details for headless and selector-mismatched services during service ClusterIP resolution. |
Force-pushed from 26696f0 to aae21b5
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
        headlessServiceList = append(headlessServiceList, fmt.Sprintf("%s/%s", svc.Namespace, svc.Name))
    }
    headlessServiceCount++
    r.logger.V(1).Info("skipping headless service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace)
The V(1) log inside the loop should be removed. The purpose of this PR is to reduce logging overhead by consolidating logs, but this line still logs for every headless service encountered (at V(1) verbosity). Since the consolidated summary is logged at lines 475-476, this per-iteration log is redundant and should be removed to achieve the intended optimization.
r.logger.V(1).Info("skipping headless service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace)
        nonMatchingPodSelectorList = append(nonMatchingPodSelectorList, fmt.Sprintf("%s/%s (expectedPS: %s)", svc.Namespace, svc.Name, svcSelector.String()))
    }
    nonMatchingPodSelectorCount++
    r.logger.V(1).Info("skipping pod selector mismatched service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace, "expectedPS", svcSelector)
The V(1) log inside the loop should be removed. Similar to the headless service case, this line still logs for every mismatched service encountered (at V(1) verbosity), which defeats the purpose of consolidating logs. Since the consolidated summary is logged at lines 479-480, this per-iteration log is redundant and should be removed to achieve the intended optimization.
r.logger.V(1).Info("skipping pod selector mismatched service when populating EndpointInfo", "serviceName", svc.Name, "serviceNamespace", svc.Namespace, "expectedPS", svcSelector)
The V(1) log is retained for testing and will not affect production; it should be fine.
var headlessServiceList []string
var nonMatchingPodSelectorList []string
var headlessServiceCount int
var nonMatchingPodSelectorCount int
const maxSampleSize = 50
Consider initializing these slices with a capacity (e.g., make([]string, 0, maxSampleSize)) to avoid multiple allocations as items are appended. This is a minor optimization but aligns with Go best practices when the maximum size is known in advance.
Suggested change — replace:
    var headlessServiceList []string
    var nonMatchingPodSelectorList []string
    var headlessServiceCount int
    var nonMatchingPodSelectorCount int
    const maxSampleSize = 50
with:
    const maxSampleSize = 50
    headlessServiceList := make([]string, 0, maxSampleSize)
    nonMatchingPodSelectorList := make([]string, 0, maxSampleSize)
    var headlessServiceCount int
    var nonMatchingPodSelectorCount int
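Copilot's suggestion relies on `make([]string, 0, n)` reserving capacity up front, so appends inside the loop never reallocate once the sample cap is known. A minimal sketch under that assumption (the `sampleNames` helper and its loop body are illustrative, not the PR's actual code):

```go
package main

import "fmt"

const maxSampleSize = 50

// sampleNames appends up to maxSampleSize entries. Because the slice is
// created with capacity maxSampleSize, append never reallocates here.
func sampleNames(names []string) []string {
	sample := make([]string, 0, maxSampleSize)
	for _, n := range names {
		if len(sample) >= maxSampleSize {
			break // keep only a bounded sample; the counters track the rest
		}
		sample = append(sample, n)
	}
	return sample
}

func main() {
	s := sampleNames([]string{"default/db", "default/web", "kube-system/dns"})
	fmt.Println(len(s), cap(s)) // length 3, capacity 50
}
```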
Force-pushed from aae21b5 to 75d7aaf
What type of PR is this?
With an increase in pod scale and high pod churn, the "Pod and policy namespace mismatch" log was causing an increase in disk utilization. This PR optimizes the affected logs so that they are not printed inside a loop; instead the information is consolidated into a list and printed once at the end of the loop.
Which issue does this PR fix:
What does this PR do / Why do we need it:
If an issue # is not available please add steps to reproduce and the controller logs:
Testing done on this change:
Automation added to e2e:
Will this PR introduce any new dependencies?:
Will this break upgrades or downgrades? Has updating a running cluster been tested?:
Does this PR introduce any user-facing change?: No
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.