Summary
The activity-feed skip filter shipped in #35 passes through kind: "skip" events with the reason "no remediation target supports this rule" whenever the workload is in config/remediation-targets.json but the specific rule is not listed in that target's supported_rules. The PRD for the feature (/tmp/co-activity-prd.md, "v1 must-do" item 1) called for that reason to be aggregated as a count in engine_status, not emitted as a per-event row, because it is noise: the operator can't act on it without editing the targets file. Observed live, 43% (6 of 14) of skip events in the verification CronJob run carried this noisy reason.
Reproduction
# After the verification CronJob run that landed shortly after deploy:
curl -s 'http://127.0.0.1:18088/api/remediations/history?cluster_id=default&limit=50' \
| jq '.events[] | select(.kind=="skip") | .reason' \
| sort | uniq -c | sort -rn
# Expected: only "confidence ... below minimum ..." and similar actionable reasons.
# Actual:
#   8 "confidence \"low\" below minimum \"high\""
#   6 "no remediation target supports this rule"
The 6 noisy entries are workloads like Deployment/nightlamp-api (in the targets allowlist with supported_rules: [cpu-hpa-low-request-sensitive, runtime-modernization-candidate]) being skipped for cpu-request-over-provisioned and memory-request-over-provisioned. Neither rule is in supported_rules, so the planner correctly skips them, but per the PRD the operator shouldn't see those skips.
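The same tally can be sketched in Go for use in a scripted check. This is a minimal sketch: the response shape (an events array whose items carry kind and reason) is inferred from the jq expression above, and historyResponse, tallySkipReasons, and the "other" kind in the sample payload are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// historyResponse mirrors only the fields the jq pipeline reads.
// The real API payload may carry more fields; this shape is an assumption.
type historyResponse struct {
	Events []struct {
		Kind   string `json:"kind"`
		Reason string `json:"reason"`
	} `json:"events"`
}

// tallySkipReasons counts events with kind "skip", grouped by reason.
func tallySkipReasons(body []byte) (map[string]int, error) {
	var resp historyResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, e := range resp.Events {
		if e.Kind == "skip" {
			counts[e.Reason]++
		}
	}
	return counts, nil
}

func main() {
	// Synthetic payload; "other" is a made-up non-skip kind.
	body := []byte(`{"events":[
		{"kind":"skip","reason":"no remediation target supports this rule"},
		{"kind":"skip","reason":"confidence \"low\" below minimum \"high\""},
		{"kind":"other","reason":""}
	]}`)
	counts, err := tallySkipReasons(body)
	if err != nil {
		panic(err)
	}
	// Sort reasons so the output order is stable.
	reasons := make([]string, 0, len(counts))
	for r := range counts {
		reasons = append(reasons, r)
	}
	sort.Strings(reasons)
	for _, r := range reasons {
		fmt.Printf("%d %s\n", counts[r], r)
	}
}
```

Once the fix is deployed, a check like this would be expected to report zero events for the "no remediation target supports this rule" reason.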
Root cause
skipperEvents() in cmd/cluster-optimizer/main.go filters by cls.TargetFor(ns, workload) only: it accepts any skip for a workload that has a target row, regardless of whether the rule is in supported_rules. The PRD-intended filter is cls.IsRemediable(ruleID, ns, workload), which is what internal/plan/plan.go already uses to gate the "no remediation target supports this rule" reason.
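The gap between the two checks can be illustrated with a simplified stand-in for the classifier. The types, field names, and return signatures below are assumptions made for this sketch and may not match the real classifier.Classifier. The namespace "example-ns" is also made up.

```go
package main

import "fmt"

// target is a hypothetical stand-in for one row of config/remediation-targets.json.
type target struct {
	Namespace      string
	Workload       string
	SupportedRules []string
}

// classifier is a simplified stand-in, not the real classifier.Classifier.
type classifier struct {
	targets []target
}

// TargetFor is the workload-level check: does any target row match?
func (c *classifier) TargetFor(ns, workload string) (target, bool) {
	for _, t := range c.targets {
		if t.Namespace == ns && t.Workload == workload {
			return t, true
		}
	}
	return target{}, false
}

// IsRemediable is the rule-level check: the workload must have a target
// row AND that row must list the rule in its supported rules.
func (c *classifier) IsRemediable(ruleID, ns, workload string) bool {
	t, ok := c.TargetFor(ns, workload)
	if !ok {
		return false
	}
	for _, r := range t.SupportedRules {
		if r == ruleID {
			return true
		}
	}
	return false
}

func main() {
	cls := &classifier{targets: []target{{
		Namespace: "example-ns",
		Workload:  "Deployment/nightlamp-api",
		SupportedRules: []string{
			"cpu-hpa-low-request-sensitive",
			"runtime-modernization-candidate",
		},
	}}}

	_, hasTarget := cls.TargetFor("example-ns", "Deployment/nightlamp-api")
	remediable := cls.IsRemediable("cpu-request-over-provisioned", "example-ns", "Deployment/nightlamp-api")

	fmt.Println(hasTarget)  // true: the workload-level filter lets the skip through
	fmt.Println(remediable) // false: the rule-level filter would drop it
}
```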
Fix
Swap the filter in skipperEvents:
// cmd/cluster-optimizer/main.go
func skipperEvents(skipped []plan.SkippedReason, ts time.Time, cls *classifier.Classifier) []store.RemediationEvent {
	events := make([]store.RemediationEvent, 0, len(skipped))
	for _, skip := range skipped {
		if cls == nil {
			continue
		}
		// Only surface skips for (workload, rule) pairs the operator could
		// have remediated had the other gates (confidence, persistence,
		// safe trim) passed. Workloads whose target doesn't list the rule
		// produce a per-run count, not a per-event row.
		if !cls.IsRemediable(skip.RuleID, skip.Namespace, skip.Workload) {
			continue
		}
		events = append(events, store.RemediationEvent{...})
	}
	return events
}
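The comment in the fix mentions a per-run count for the suppressed skips. One possible way to produce that count alongside the filtered rows is sketched below. It is not taken from the repository: skippedReason, remediableFunc, and partitionSkips are hypothetical names, and how the count would reach engine_status is left out because the source does not describe it.

```go
package main

import "fmt"

// skippedReason is a simplified stand-in for plan.SkippedReason.
type skippedReason struct {
	RuleID    string
	Namespace string
	Workload  string
}

// remediableFunc abstracts the rule-level check (cls.IsRemediable in the fix).
type remediableFunc func(ruleID, ns, workload string) bool

// partitionSkips returns the skips worth a per-event row and the number
// of skips suppressed because the target does not list the rule.
// With a nil check function nothing is emitted or counted, mirroring the
// cls == nil guard in the fix.
func partitionSkips(skipped []skippedReason, isRemediable remediableFunc) ([]skippedReason, int) {
	if isRemediable == nil {
		return nil, 0
	}
	kept := make([]skippedReason, 0, len(skipped))
	suppressed := 0
	for _, s := range skipped {
		if !isRemediable(s.RuleID, s.Namespace, s.Workload) {
			suppressed++
			continue
		}
		kept = append(kept, s)
	}
	return kept, suppressed
}

func main() {
	// Hypothetical target that supports a single rule.
	supported := map[string]bool{"memory-request-below-usage": true}
	check := func(ruleID, ns, workload string) bool { return supported[ruleID] }

	skipped := []skippedReason{
		{RuleID: "memory-request-below-usage", Namespace: "example-ns", Workload: "example-workload"},
		{RuleID: "cpu-request-over-provisioned", Namespace: "example-ns", Workload: "example-workload"},
		{RuleID: "memory-request-over-provisioned", Namespace: "example-ns", Workload: "example-workload"},
	}
	kept, suppressed := partitionSkips(skipped, check)
	fmt.Printf("kept=%d suppressed=%d\n", len(kept), suppressed) // kept=1 suppressed=2
}
```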
Impact
Severity: MEDIUM. Feature works; the feed is noisier than the design intended (~43% of skip events are non-actionable).
No data loss, no crash, no incorrect remediation behavior.
Operator can ignore the rows but they push useful skip rows further down on each run.
Recommended regression test
Add a Go unit test in cmd/cluster-optimizer/main_test.go (create the file if it does not exist) that constructs a classifier with one target (rules=[memory-request-below-usage]), passes synthetic SkippedReasons for that workload with both an in-list rule and an out-of-list rule, and asserts that only the in-list one survives skipperEvents().
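The shape of that assertion can be sketched as a standalone program. Everything here is a stand-in: fakeClassifier, filterSkips, and the "example-ns"/"example-workload" names are hypothetical. In the repository the check would live in a func TestXxx(t *testing.T), call the real skipperEvents(), and build the classifier through whatever constructor the classifier package provides.

```go
package main

import "fmt"

// skippedReason is a simplified stand-in for plan.SkippedReason.
type skippedReason struct {
	RuleID    string
	Namespace string
	Workload  string
}

// fakeClassifier maps "namespace/workload" to its set of supported rules.
type fakeClassifier struct {
	rules map[string]map[string]bool
}

// IsRemediable reports whether the workload's target lists the rule.
// Lookups on a missing workload yield false because a nil inner map reads as empty.
func (c *fakeClassifier) IsRemediable(ruleID, ns, workload string) bool {
	return c.rules[ns+"/"+workload][ruleID]
}

// filterSkips stands in for skipperEvents and returns the surviving rule IDs.
func filterSkips(skipped []skippedReason, cls *fakeClassifier) []string {
	var kept []string
	for _, s := range skipped {
		if cls == nil || !cls.IsRemediable(s.RuleID, s.Namespace, s.Workload) {
			continue
		}
		kept = append(kept, s.RuleID)
	}
	return kept
}

// checkOnlyInListRuleSurvives is the body the regression test would assert.
func checkOnlyInListRuleSurvives() error {
	cls := &fakeClassifier{rules: map[string]map[string]bool{
		"example-ns/example-workload": {"memory-request-below-usage": true},
	}}
	skipped := []skippedReason{
		{RuleID: "memory-request-below-usage", Namespace: "example-ns", Workload: "example-workload"},
		{RuleID: "cpu-request-over-provisioned", Namespace: "example-ns", Workload: "example-workload"},
	}
	got := filterSkips(skipped, cls)
	if len(got) != 1 || got[0] != "memory-request-below-usage" {
		return fmt.Errorf("surviving rules = %v, want [memory-request-below-usage]", got)
	}
	return nil
}

func main() {
	if err := checkOnlyInListRuleSurvives(); err != nil {
		panic(err)
	}
	fmt.Println("ok")
}
```

A test written this way fails against the current TargetFor-based filter, because both synthetic skips belong to a workload that has a target row.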
Environment
ghcr.io/gipsychef/cluster-optimizer:197e25621fa8b8173424b1e34d4ad1cda1b68ba7
cluster-optimizer/cluster-optimizer-deploy-verify (2026-05-24T02:29:26Z)
🤖 Generated with Claude Code