OptiPod is an open-source Kubernetes operator that makes explainable recommendations for CPU and memory requests/limits, and can apply them when you explicitly opt in.
If you're new to operators or autoscaling: OptiPod is designed to be calm and safe to try.
- Recommend mode is the safe way to start.
- Nothing mutates without opt-in (you must explicitly choose `mode: Auto`).
Kubernetes resources are often set once and never revisited. Teams want to tighten requests/limits, but hesitate because:
- GitOps controllers can fight with automated mutations.
- Memory tuning can cause OOMKills and noisy rollouts.
- It’s hard to trust a tool if you can’t explain its recommendations.
OptiPod exists to provide a GitOps-safe, policy-driven way to recommend first, then apply when you’re ready.
What OptiPod will not do:
- Will not mutate workloads unless explicitly configured
- Will not override GitOps ownership
- Will not blindly reduce memory
- Will not require a SaaS backend
In Recommend mode:
- No resources are changed.
- No pods are restarted.
- Recommendations are written as annotations on individual workloads for review.
- Policy status shows aggregate counts (workloads discovered/processed) but not individual recommendations.
Recommended: Helm Installation
```bash
# Install from the OCI registry (recommended)
# Omit --version to install the latest chart.
# Replace the placeholder with a released chart version; see
# https://github.com/Sagart-cactus/optipod/releases/latest
VERSION="<latest>"
helm install optipod oci://ghcr.io/sagart-cactus/charts/optipod \
  --version "${VERSION}" \
  --namespace optipod-system \
  --create-namespace
```
Alternative: kubectl
```bash
# For ArgoCD/GitOps environments (webhook strategy)
kubectl apply -f https://github.com/Sagart-cactus/optipod/releases/latest/download/install-webhook.yaml
# For traditional Kubernetes environments (SSA strategy)
kubectl apply -f https://github.com/Sagart-cactus/optipod/releases/latest/download/install.yaml
```
Automated installation script:
```bash
curl -sSL https://raw.githubusercontent.com/Sagart-cactus/optipod/main/config/webhook/install.sh | bash
```
Save as optipod-policy.yaml:
```yaml
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: safe-recommendations
  namespace: default
spec:
  mode: Recommend
  selector:
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"
  metricsConfig:
    provider: metrics-server
    rollingWindow: 24h
    percentile: P90
    safetyFactor: 1.2
  resourceBounds:
    cpu:
      min: "100m"
      max: "4000m"
    memory:
      min: "128Mi"
      max: "8Gi"
  updateStrategy:
    allowInPlaceResize: true
    allowRecreate: false
    updateRequestsOnly: true
```
Apply it:
```bash
kubectl apply -f optipod-policy.yaml
```
```bash
# Opt a workload in by adding the label the policy selects on:
kubectl label deployment my-app optipod.io/enabled=true
# Check the policy status:
kubectl describe optimizationpolicy safe-recommendations -n default
# View individual workload recommendations:
kubectl get deployment my-app -o yaml | grep -A5 -B5 "optipod.io/recommendation"
```
Safety confirmation: as long as `spec.mode: Recommend`, OptiPod will not change workload specs.
OptiPod stores recommendations in two places:
Each workload (Deployment, StatefulSet, DaemonSet) gets recommendation annotations:
```yaml
metadata:
  annotations:
    optipod.io/managed: "true"
    optipod.io/policy: "safe-recommendations"
    optipod.io/last-recommendation: "2025-01-04T10:30:00Z"
    optipod.io/recommendation.app-container.cpu-request: "250m"
    optipod.io/recommendation.app-container.memory-request: "512Mi"
    optipod.io/recommendation.app-container.cpu-limit: "500m" # Present when updateRequestsOnly=false
    optipod.io/recommendation.app-container.memory-limit: "1Gi" # Present when updateRequestsOnly=false
```
The OptimizationPolicy status shows aggregate counts, not individual recommendations:
```yaml
status:
  workloadsDiscovered: 150
  workloadsProcessed: 145
  workloadsByType:
    deployment: 120
    statefulset: 25
    daemonset: 5
  lastReconciliation: "2025-01-04T10:30:00Z"
```
This design keeps the policy status lightweight while making individual recommendations visible on each workload for GitOps workflows.
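If you want to consume the workload annotations from a script, the key layout shown above (`optipod.io/recommendation.<container>.<resource>-<request|limit>`) can be grouped per container. The following is a minimal Python sketch, not part of OptiPod: the function name is hypothetical, and it assumes the annotation keys follow exactly the pattern in the example above.

```python
PREFIX = "optipod.io/recommendation."


def recommendations_by_container(annotations):
    """Group OptiPod recommendation annotations by container name.

    `annotations` is a plain dict of annotation key -> value, for example
    the `metadata.annotations` mapping of a Deployment.
    """
    grouped = {}
    for key, value in annotations.items():
        if not key.startswith(PREFIX):
            continue  # bookkeeping keys such as optipod.io/policy are skipped
        # Assumed layout of the remainder: "<container>.<resource>-<request|limit>"
        container, _, field = key[len(PREFIX):].rpartition(".")
        if not container:
            continue  # key does not match the assumed layout
        grouped.setdefault(container, {})[field] = value
    return grouped


example = {
    "optipod.io/managed": "true",
    "optipod.io/policy": "safe-recommendations",
    "optipod.io/last-recommendation": "2025-01-04T10:30:00Z",
    "optipod.io/recommendation.app-container.cpu-request": "250m",
    "optipod.io/recommendation.app-container.memory-request": "512Mi",
}
print(recommendations_by_container(example))
```

Splitting on the last dot keeps container names that themselves contain dots intact, at the cost of assuming the resource field never contains one.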
How OptiPod works:
- Discovers workloads selected by your policy (namespaces, labels, workload types)
- Reads CPU/memory usage from your metrics backend
- Computes recommendations (percentiles over a rolling window + safety factor)
- Applies policy-driven safety (bounds, change controls, memory safeguards)
- Either:
  - writes explainable recommendations as annotations on individual workloads (Recommend mode), or
  - applies changes using the configured strategy, Server-Side Apply (SSA) or webhook (Auto mode)
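The computation step in the list above (a percentile over the rolling window, multiplied by a safety factor, then clamped to the policy bounds) can be illustrated with a short Python sketch. This is not OptiPod's implementation: the nearest-rank percentile method, the function names, and the choice of millicores as the unit are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no usage samples in the window")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend(samples, pct, safety_factor, lower, upper):
    """Percentile of usage times safety factor, clamped to [lower, upper]."""
    raw = percentile(samples, pct) * safety_factor
    return min(max(raw, lower), upper)


# Hypothetical CPU usage samples in millicores over the rolling window.
cpu_usage = [110, 150, 130, 200, 170, 160, 140, 180, 190, 120]

# Mirrors the example policy: P90, safetyFactor 1.2, bounds 100m to 4000m.
print(recommend(cpu_usage, 90, 1.2, 100, 4000))
```

With these ten samples the 90th percentile is 190m, so the recommendation is about 228m. A workload that uses almost nothing would be raised to the 100m minimum rather than recommended below it, which is the role of the bounds.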
| Term | Meaning (one line) |
|---|---|
| Operator | A controller that continuously reconciles desired state in Kubernetes. |
| GitOps | A workflow where cluster state is driven from Git (e.g. ArgoCD/Flux). |
| Server-Side Apply (SSA) | Kubernetes apply mode that tracks field ownership to avoid conflicts. |
| VPA | Vertical Pod Autoscaler; recommends (and can apply) resource changes. |
- Recommend: Compute and record recommendations; do not mutate workloads.
- Auto: Apply recommendations (within your safety policy) using the configured strategy.
- Disabled: Stop processing workloads under the policy.
OptiPod supports two strategies for applying resource recommendations:
Webhook strategy
Best for: ArgoCD, GitOps workflows, environments without SSA permissions
- ✅ ArgoCD Compatible: Works seamlessly with GitOps tools
- ✅ No SSA Required: Doesn't need Server-Side Apply permissions
- ✅ Automatic Application: Applies changes during pod creation
- ❌ Infrastructure Required: Needs webhook server and certificates
```yaml
spec:
  updateStrategy:
    strategy: webhook # Use webhook strategy
    rolloutStrategy: onNextRestart # Control when changes take effect
```
Server-Side Apply (SSA) strategy
Best for: Direct Kubernetes API access, environments with full SSA permissions
- ✅ Direct Updates: Immediate API updates with Server-Side Apply
- ✅ No Infrastructure: No additional components required
- ❌ ArgoCD Conflicts: May conflict with GitOps tools
- ❌ SSA Required: Needs Server-Side Apply permissions
```yaml
spec:
  updateStrategy:
    strategy: ssa # Use SSA strategy
    useServerSideApply: true # Enable SSA features
```
OptiPod is built around conservative defaults and explicit, policy-driven controls:
- Policy-driven safety: min/max bounds, safety factors, and (where configured) change-rate limits.
- Conservative memory handling: avoids “blind” memory reductions; requires explicit bounds/constraints.
- GitOps-safe: supports both webhook strategy (ArgoCD compatible) and Server-Side Apply (SSA) for different environments.
- Explainable recommendations: usage window, percentile choice, and safety margin are visible before applying.
- Update strategy control: allow/disallow in-place resize; block disruptive recreation unless you opt in.
Legend: ✅ supported, ❌ not supported
| Capability | OptiPod | Kubernetes VPA |
|---|---|---|
| GitOps-safe (ArgoCD compatible) | ✅ | ❌ |
| Server-Side Apply (SSA) support | ✅ | ❌ |
| Safe by default (Recommend mode) | ✅ | |
| Explainable recommendations | ✅ | |
| Policy-driven safety | ✅ | |
| Multiple update strategies | ✅ | ❌ |
- Platform teams running GitOps-managed clusters
- SREs who want safer, Recommend-first workflows
- FinOps partners who need guardrails and visibility (without a SaaS dependency)
- Teams who want to start with recommendations and adopt automation gradually
- Metrics backends: `metrics-server` and `prometheus`
- Exposes Prometheus metrics from the controller
- Emits Kubernetes events for important actions and failures
- Writes recommendations as annotations on workloads and aggregate results to `OptimizationPolicy` status
Before you change a policy to mode: Auto, you can generate a report that summarizes the replica-weighted impact (total CPU/memory request deltas across pods) based on OptiPod’s recommendations.
This is especially useful to answer: “If I opt in to Auto, what will change, and by how much?”
Generate an HTML report:
```bash
curl -fsSL https://raw.githubusercontent.com/Sagart-cactus/optipod/main/scripts/optipod-recommendation-report.sh -o optipod-recommendation-report.sh
chmod +x optipod-recommendation-report.sh
./optipod-recommendation-report.sh -o html -f optipod-impact.html
```
Generate a JSON report for programmatic analysis:
```bash
./optipod-recommendation-report.sh -o json -f optipod-impact.json
```
The report only includes workloads that have OptiPod recommendation annotations (generated in Recommend mode), and highlights warnings (for example: when `updateRequestsOnly=true` but recommended requests would exceed existing limits).
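To make the example warning concrete: with `updateRequestsOnly=true` the limits stay as they are, so a recommended request that is larger than the existing limit would produce an invalid spec. The Python sketch below shows one way such a check could look. It is an illustration only, not the report script's logic; the helper names are hypothetical, and the quantity parsers handle only a small subset of Kubernetes quantity notation (`m` for CPU, and `Ki`/`Mi`/`Gi`/`Ti` or plain bytes for memory).

```python
def cpu_millicores(quantity):
    """Parse a CPU quantity such as "250m" or "0.5" into millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity):
    """Parse a memory quantity such as "512Mi" or "1Gi" into bytes."""
    units = (("Ki", 2**10), ("Mi", 2**20), ("Gi", 2**30), ("Ti", 2**40))
    for suffix, factor in units:
        if quantity.endswith(suffix):
            return int(float(quantity[:-2]) * factor)
    return int(quantity)  # assumed to be a plain byte count


def requests_exceeding_limits(recommended_requests, existing_limits):
    """Return the resources whose recommended request is above the existing limit."""
    parsers = {"cpu": cpu_millicores, "memory": memory_bytes}
    exceeded = []
    for resource, parse in parsers.items():
        request = recommended_requests.get(resource)
        limit = existing_limits.get(resource)
        if request is not None and limit is not None and parse(request) > parse(limit):
            exceeded.append(resource)
    return exceeded


print(requests_exceeding_limits(
    {"cpu": "750m", "memory": "512Mi"},
    {"cpu": "500m", "memory": "1Gi"},
))
```

Here only CPU is flagged, because 750m is above the 500m limit while 512Mi fits under 1Gi.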
How the impact report works:
- Scans all workloads in the cluster for OptiPod recommendation annotations
- Consolidates individual workload recommendations into aggregate totals
- Calculates replica-weighted impact (recommendation delta × number of pods)
- Provides both per-workload details and cluster-wide summaries
- Available in JSON format for automation or HTML for human review
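The replica-weighted calculation described above (recommendation delta × number of pods) can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the report script itself: the input structure, field names, and units (millicores and bytes per pod) are hypothetical.

```python
def replica_weighted_delta(workloads):
    """Sum (recommended - current) * replicas per resource across workloads.

    A negative total means the recommendations would reduce the
    cluster-wide requests; a positive total means they would raise them.
    """
    totals = {"cpu_millicores": 0, "memory_bytes": 0}
    for workload in workloads:
        for resource in totals:
            delta = workload["recommended"][resource] - workload["current"][resource]
            totals[resource] += delta * workload["replicas"]
    return totals


MI = 2**20  # bytes in one mebibyte

workloads = [
    {   # over-provisioned: requests would shrink on all 3 pods
        "replicas": 3,
        "current": {"cpu_millicores": 500, "memory_bytes": 1024 * MI},
        "recommended": {"cpu_millicores": 250, "memory_bytes": 512 * MI},
    },
    {   # under-provisioned on CPU: requests would grow on both pods
        "replicas": 2,
        "current": {"cpu_millicores": 100, "memory_bytes": 256 * MI},
        "recommended": {"cpu_millicores": 200, "memory_bytes": 256 * MI},
    },
]

print(replica_weighted_delta(workloads))
```

For this input the CPU total is (250 - 500) × 3 + (200 - 100) × 2 = -550 millicores, and the memory total is -1536 MiB, all of it from the first workload.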
OptiPod has core functionality implemented and tested, and is in active development.
- Production-ready features
  - GitOps-safe Server-Side Apply (SSA): OptiPod claims ownership only of CPU/memory request/limit fields.
  - Multiple operational modes: Recommend / Auto / Disabled.
  - Policy-driven safety: bounds, safety factors, and controlled application strategies.
  - Explainable recommendations: visible inputs and margins before applying.
  - Observability: Prometheus metrics, Kubernetes events, and detailed policy status.
- Work in progress
  - Per-policy metrics provider selection (currently configured globally)
  - Custom metrics provider plugin framework

Documentation
- Docs: `docs/` (source) and `website/docs/` (rendered site)
- Roadmap: ROADMAP.md
- Contributing guide: CONTRIBUTING.md
- Governance: GOVERNANCE.md
- Code of Conduct: CODE_OF_CONDUCT.md
```bash
git clone https://github.com/Sagart-cactus/optipod.git
cd optipod
make setup-pre-commit
make test

make build
make run
```
See ROADMAP.md for planned work and explicit non-goals.
Apache 2.0. See LICENSE.
Start in Recommend mode, set conservative bounds, and validate recommendations in a non-production environment before enabling mode: Auto.
