OptiPod

OptiPod is an open-source Kubernetes operator that makes explainable recommendations for CPU and memory requests/limits, and can apply them when you explicitly opt in.

If you're new to operators or autoscaling: OptiPod is designed to be calm and safe to try.

Recommend mode is the safe way to start.
Nothing mutates without opt-in (you must explicitly choose mode: Auto).

Why OptiPod exists

Kubernetes resource requests and limits are often set once and never revisited. Teams want to tighten them, but hesitate because:

GitOps controllers can fight with automated mutations.
Memory tuning can cause OOMKills and noisy rollouts.
It’s hard to trust a tool if you can’t explain its recommendations.

OptiPod exists to provide a GitOps-safe, policy-driven way to recommend first, then apply when you’re ready.

What OptiPod will NOT do

Will not mutate workloads unless explicitly configured
Will not override GitOps ownership
Will not blindly reduce memory
Will not require a SaaS backend

Quick Start (safe Recommend mode)

In Recommend mode:

No resource requests or limits are changed.
No pods are restarted.
Recommendations are written as annotations on individual workloads for review.
Policy status shows aggregate counts (workloads discovered/processed) but not individual recommendations.

1) Install

Recommended: Helm Installation

# Install from the OCI registry (recommended)
# Omit --version to install the latest chart.
VERSION="<latest>" # placeholder: replace with a released chart version, see https://github.com/Sagart-cactus/optipod/releases/latest
helm install optipod oci://ghcr.io/sagart-cactus/charts/optipod \
  --version "${VERSION}" \
  --namespace optipod-system \
  --create-namespace

Alternative: kubectl

# For ArgoCD/GitOps environments (webhook strategy)
kubectl apply -f https://github.com/Sagart-cactus/optipod/releases/latest/download/install-webhook.yaml

# For traditional Kubernetes environments (SSA strategy)
kubectl apply -f https://github.com/Sagart-cactus/optipod/releases/latest/download/install.yaml

Automated installation script:

curl -sSL https://raw.githubusercontent.com/Sagart-cactus/optipod/main/config/webhook/install.sh | bash

2) Create a minimal policy (Recommend mode)

Save as optipod-policy.yaml:

apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: safe-recommendations
  namespace: default
spec:
  mode: Recommend

  selector:
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"

  metricsConfig:
    provider: metrics-server
    rollingWindow: 24h
    percentile: P90
    safetyFactor: 1.2

  resourceBounds:
    cpu:
      min: "100m"
      max: "4000m"
    memory:
      min: "128Mi"
      max: "8Gi"

  updateStrategy:
    allowInPlaceResize: true
    allowRecreate: false
    updateRequestsOnly: true

Apply it:

kubectl apply -f optipod-policy.yaml

3) Label a workload and review recommendations

kubectl label deployment my-app optipod.io/enabled=true
kubectl describe optimizationpolicy safe-recommendations -n default
# View individual workload recommendations:
kubectl get deployment my-app -o yaml | grep -A5 -B5 "optipod.io/recommendation"

Safety confirmation: as long as spec.mode is set to Recommend, OptiPod will not change workload specs; it only writes recommendation annotations.

Where recommendations are stored

OptiPod stores recommendations in two places:

Individual Workload Annotations

Each workload (Deployment, StatefulSet, DaemonSet) gets recommendation annotations:

metadata:
  annotations:
    optipod.io/managed: "true"
    optipod.io/policy: "safe-recommendations"
    optipod.io/last-recommendation: "2025-01-04T10:30:00Z"
    optipod.io/recommendation.app-container.cpu-request: "250m"
    optipod.io/recommendation.app-container.memory-request: "512Mi"
    optipod.io/recommendation.app-container.cpu-limit: "500m"     # Present when updateRequestsOnly=false
    optipod.io/recommendation.app-container.memory-limit: "1Gi"   # Present when updateRequestsOnly=false
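
For scripted review, the annotation keys above can be grouped per container. The following Python sketch assumes the key layout shown in the example (optipod.io/recommendation.<container>.<field>); the function name parse_recommendations is illustrative and not part of OptiPod.

```python
PREFIX = "optipod.io/recommendation."


def parse_recommendations(annotations):
    """Group recommendation annotations as {container: {field: value}}."""
    grouped = {}
    for key, value in annotations.items():
        if not key.startswith(PREFIX):
            continue  # e.g. optipod.io/managed, optipod.io/policy
        # Container names are DNS labels (no dots), so the last dot
        # separates the container name from the field name.
        container, _, field = key[len(PREFIX):].rpartition(".")
        if container and field:
            grouped.setdefault(container, {})[field] = value
    return grouped


annotations = {
    "optipod.io/managed": "true",
    "optipod.io/policy": "safe-recommendations",
    "optipod.io/last-recommendation": "2025-01-04T10:30:00Z",
    "optipod.io/recommendation.app-container.cpu-request": "250m",
    "optipod.io/recommendation.app-container.memory-request": "512Mi",
}
print(parse_recommendations(annotations))
# {'app-container': {'cpu-request': '250m', 'memory-request': '512Mi'}}
```

The annotations dictionary here is hand-written sample data; in practice it would come from the metadata.annotations field of a workload fetched with kubectl or a Kubernetes client.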

Policy Status (Aggregate Only)

The OptimizationPolicy status shows aggregate counts, not individual recommendations:

status:
  workloadsDiscovered: 150
  workloadsProcessed: 145
  workloadsByType:
    deployment: 120
    statefulset: 25
    daemonset: 5
  lastReconciliation: "2025-01-04T10:30:00Z"

This design keeps the policy status lightweight while making individual recommendations visible on each workload for GitOps workflows.

What OptiPod actually does (high-level flow)

Discovers workloads selected by your policy (namespaces, labels, workload types)
Reads CPU/memory usage from your metrics backend
Computes recommendations (percentiles over a rolling window + safety factor)
Applies policy-driven safety (bounds, change controls, memory safeguards)
Either:
- writes explainable recommendations as annotations on individual workloads (Recommend mode), or
- applies changes using the configured update strategy, webhook or Server-Side Apply (SSA) (Auto mode)
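
The compute and bounds steps above can be illustrated with a small Python sketch. It is not OptiPod's implementation: the nearest-rank percentile method, the function names, and the sample data are assumptions chosen for illustration. The defaults mirror the Quick Start policy (P90, safetyFactor 1.2, CPU bounds of 100m to 4000m).

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend(samples, pct=90, safety_factor=1.2, lower=100, upper=4000):
    """Percentile of usage, scaled by the safety factor, clamped to [lower, upper]."""
    raw = percentile(samples, pct) * safety_factor
    return round(min(max(raw, lower), upper))


# Hypothetical CPU usage samples in millicores over the rolling window.
cpu_millicores = [120, 150, 180, 200, 210, 230, 250, 260, 300, 900]
print(recommend(cpu_millicores))  # P90 is 300m; 300 * 1.2 = 360
```

Note how the single 900m spike does not drive the result: the percentile discards it, and the safety factor adds headroom on top of typical usage instead.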

Key concepts / terminology

Term	Meaning (one line)
Operator	A controller that continuously reconciles desired state in Kubernetes.
GitOps	A workflow where cluster state is driven from Git (e.g. ArgoCD/Flux).
Server-Side Apply (SSA)	Kubernetes apply mode that tracks field ownership to avoid conflicts.
VPA	Vertical Pod Autoscaler; recommends (and can apply) resource changes.

Operational modes

Recommend: Compute and record recommendations; do not mutate workloads.
Auto: Apply recommendations (within your safety policy) using the configured strategy.
Disabled: Stop processing workloads under the policy.

Update Strategies

OptiPod supports two strategies for applying resource recommendations:

Webhook Strategy (Default)

Best for: ArgoCD, GitOps workflows, environments without SSA permissions

✅ ArgoCD Compatible: Works seamlessly with GitOps tools
✅ No SSA Required: Doesn't need Server-Side Apply permissions
✅ Automatic Application: Applies changes during pod creation
❌ Infrastructure Required: Needs webhook server and certificates

spec:
  updateStrategy:
    strategy: webhook                    # Use webhook strategy
    rolloutStrategy: onNextRestart      # Control when changes take effect
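
To make "applies changes during pod creation" concrete, here is a Python sketch of how a Kubernetes mutating admission webhook can answer with a JSON Patch that sets container requests. The AdmissionReview response shape (uid, allowed, patchType, base64-encoded patch) is standard Kubernetes; the function names and the way recommendations are passed in are assumptions, and this is not OptiPod's actual webhook code.

```python
import base64
import json


def build_resource_patch(pod, recommendations):
    """Return JSON Patch operations that set recommended requests per container."""
    ops = []
    for index, container in enumerate(pod["spec"]["containers"]):
        recommended = recommendations.get(container["name"])
        if not recommended:
            continue
        # Merge into the existing resources so current limits are preserved.
        resources = dict(container.get("resources", {}))
        requests = dict(resources.get("requests", {}))
        requests.update(recommended)
        resources["requests"] = requests
        # "add" replaces the member if it already exists (RFC 6902).
        ops.append({
            "op": "add",
            "path": f"/spec/containers/{index}/resources",
            "value": resources,
        })
    return ops


def admission_response(uid, ops):
    """Wrap patch operations in an admission.k8s.io/v1 AdmissionReview response."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(ops).encode()).decode(),
        },
    }


pod = {"spec": {"containers": [
    {"name": "app-container", "resources": {"limits": {"cpu": "500m"}}},
]}}
ops = build_resource_patch(pod, {"app-container": {"cpu": "250m", "memory": "512Mi"}})
review = admission_response("example-uid", ops)
```

Because the patch is applied to the Pod at admission time, the workload manifest tracked in Git is left untouched, which is why this approach avoids drift in GitOps tools.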

SSA Strategy (Traditional)

Best for: Direct Kubernetes API access, environments with full SSA permissions

✅ Direct Updates: Immediate API updates with Server-Side Apply
✅ No Infrastructure: No additional components required
❌ ArgoCD Conflicts: May conflict with GitOps tools
❌ SSA Required: Needs Server-Side Apply permissions

spec:
  updateStrategy:
    strategy: ssa                       # Use SSA strategy
    useServerSideApply: true           # Enable SSA features
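
With Server-Side Apply, a client sends a partial manifest that names only the fields it wants to own. The Python sketch below builds such a body for a Deployment's container requests. The function name and sample values are illustrative assumptions, and the exact manifest OptiPod sends is not shown in this README.

```python
import json


def build_apply_body(name, namespace, container_requests):
    """Build a partial Deployment manifest naming only the request fields to own."""
    containers = [
        {"name": container, "resources": {"requests": dict(requests)}}
        for container, requests in sorted(container_requests.items())
    ]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"template": {"spec": {"containers": containers}}},
    }


body = build_apply_body(
    "my-app", "default", {"app-container": {"cpu": "250m", "memory": "512Mi"}}
)
print(json.dumps(body, indent=2))
```

A body like this is sent as a PATCH request with Content-Type application/apply-patch+yaml (JSON is valid YAML) and a fieldManager query parameter naming the applier. Kubernetes merges the containers list by container name, so fields such as image remain owned by whoever set them.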

Safety model

OptiPod is built around conservative defaults and explicit, policy-driven controls:

Policy-driven safety: min/max bounds, safety factors, and (where configured) change-rate limits.
Conservative memory handling: avoids “blind” memory reductions; requires explicit bounds/constraints.
GitOps-safe: supports both webhook strategy (ArgoCD compatible) and Server-Side Apply (SSA) for different environments.
Explainable recommendations: usage window, percentile choice, and safety margin are visible before applying.
Update strategy control: allow/disallow in-place resize; block disruptive recreation unless you opt in.
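
As an illustration of a change-rate limit, the Python sketch below caps how far a single update may move a value from its current setting. The function name and the max_fraction parameter are hypothetical; this README does not document how OptiPod configures or enforces such limits.

```python
def limit_step(current, proposed, max_fraction):
    """Cap proposed so it moves at most max_fraction of current in either direction."""
    max_delta = current * max_fraction
    lower, upper = current - max_delta, current + max_delta
    return min(max(proposed, lower), upper)


# A memory request of 1000 MiB with a proposed drop to 400 MiB and a 25% cap
# is reduced only to 750 MiB in this step.
print(limit_step(1000, 400, 0.25))
```

Stepping down gradually like this gives an operator time to notice memory pressure before the next reduction, which is the intent behind avoiding "blind" memory cuts.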

OptiPod vs VPA comparison

Legend: ✅ supported, ❌ not supported, ⚠️ supported with caveats.

Capability	OptiPod	Kubernetes VPA
GitOps-safe (ArgoCD compatible)	✅	❌
Server-Side Apply (SSA) support	✅	❌
Safe by default (Recommend mode)	✅	⚠️
Explainable recommendations	✅	⚠️
Policy-driven safety	✅	⚠️
Multiple update strategies	✅	❌

⚠️ Typically means you can achieve the outcome, but it’s easier to run into GitOps conflicts and/or less predictable rollouts depending on configuration and workload constraints.

Who OptiPod is for

Platform teams running GitOps-managed clusters
SREs who want safer, Recommend-mode-first workflows
FinOps partners who need guardrails and visibility (without a SaaS dependency)
Teams who want to start with recommendations and adopt automation gradually

Metrics & observability

Metrics backends: metrics-server and prometheus
Exposes Prometheus metrics from the controller
Emits Kubernetes events for important actions and failures
Writes recommendations as annotations on workloads and aggregate results to OptimizationPolicy status

Estimate impact before switching to Auto

Before you change a policy to mode: Auto, you can generate a report that summarizes the replica-weighted impact (total CPU/memory request deltas across pods) based on OptiPod’s recommendations.

This is especially useful to answer: “If I opt in to Auto, what will change, and by how much?”

Generate an HTML report:

curl -fsSL https://raw.githubusercontent.com/Sagart-cactus/optipod/main/scripts/optipod-recommendation-report.sh -o optipod-recommendation-report.sh
chmod +x optipod-recommendation-report.sh
./optipod-recommendation-report.sh -o html -f optipod-impact.html

Generate a JSON report for programmatic analysis:

./optipod-recommendation-report.sh -o json -f optipod-impact.json

The report only includes workloads that have OptiPod recommendation annotations (generated in Recommend mode), and highlights warnings (for example: when updateRequestsOnly=true but recommended requests would exceed existing limits).
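
The warning in that example matters because Kubernetes rejects a container whose request is larger than its limit. A minimal Python sketch of such a check follows; the data layout (CPU in millicores, memory in MiB) and the function name are assumptions for illustration, not the report script's actual logic.

```python
def requests_over_limits(containers):
    """Return warnings for recommended requests that exceed the current limit."""
    warnings = []
    for c in containers:
        for resource in ("cpu", "memory"):
            limit = c["limits"].get(resource)
            recommended = c["recommended_requests"].get(resource)
            if limit is not None and recommended is not None and recommended > limit:
                warnings.append(
                    f"{c['name']}: recommended {resource} request {recommended} "
                    f"exceeds limit {limit}"
                )
    return warnings


containers = [
    {"name": "app-container",
     "limits": {"cpu": 500, "memory": 1024},
     "recommended_requests": {"cpu": 600, "memory": 512}},
    {"name": "sidecar",
     "limits": {},
     "recommended_requests": {"cpu": 100}},
]
for warning in requests_over_limits(containers):
    print(warning)
```

Containers without a limit for a resource are skipped, since there is nothing for the request to exceed.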

How the impact report works:

Scans all workloads in the cluster for OptiPod recommendation annotations
Consolidates individual workload recommendations into aggregate totals
Calculates replica-weighted impact (recommendation delta × number of pods)
Provides both per-workload details and cluster-wide summaries
Available in JSON format for automation or HTML for human review
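
The replica-weighted calculation described above can be sketched in Python as follows. The quantity parsers handle only the simple forms used in this README ("250m", "512Mi", "1Gi"), and the workload records are hypothetical sample data rather than the report script's real input format.

```python
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity):
    """Return CPU in millicores for values like "250m" or "1.5"."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory(quantity):
    """Return bytes for values like "512Mi" or "1Gi" (binary suffixes only)."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)


def replica_weighted_totals(workloads):
    """Sum (recommended - current) * replicas for CPU (millicores) and memory (bytes)."""
    cpu_total = 0
    memory_total = 0
    for w in workloads:
        cpu_delta = parse_cpu(w["recommended_cpu"]) - parse_cpu(w["current_cpu"])
        memory_delta = parse_memory(w["recommended_memory"]) - parse_memory(w["current_memory"])
        cpu_total += cpu_delta * w["replicas"]
        memory_total += memory_delta * w["replicas"]
    return cpu_total, memory_total


workloads = [
    {"name": "web", "replicas": 3,
     "current_cpu": "500m", "recommended_cpu": "250m",
     "current_memory": "1Gi", "recommended_memory": "512Mi"},
    {"name": "worker", "replicas": 2,
     "current_cpu": "200m", "recommended_cpu": "300m",
     "current_memory": "256Mi", "recommended_memory": "256Mi"},
]
cpu_delta, memory_delta = replica_weighted_totals(workloads)
print(cpu_delta, memory_delta // 2**20)  # millicores, MiB
```

Weighting by replicas is what makes the totals meaningful: a 250m saving on a three-replica Deployment frees 750m of requested CPU, while the same change on a single pod frees only 250m.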

Project status

OptiPod has core functionality implemented and tested, and is in active development.

Production-ready features

- GitOps-safe Server-Side Apply (SSA): OptiPod claims ownership only of CPU/memory request/limit fields.
- Multiple operational modes: Recommend / Auto / Disabled.
- Policy-driven safety: bounds, safety factors, and controlled application strategies.
- Explainable recommendations: visible inputs and margins before applying.
- Observability: Prometheus metrics, Kubernetes events, and detailed policy status.

Work in progress

- Per-policy metrics provider selection (currently configured globally)
- Custom metrics provider plugin framework

Documentation

Docs: docs/ (source) and website/docs/ (rendered site)
Roadmap: ROADMAP.md

Contributing & governance

Contributing guide: CONTRIBUTING.md
Governance: GOVERNANCE.md
Code of Conduct: CODE_OF_CONDUCT.md

Quick Start for contributors

git clone https://github.com/Sagart-cactus/optipod.git
cd optipod
make setup-pre-commit
make test

Building from source (development)

make build
make run

Roadmap

See ROADMAP.md for planned work and explicit non-goals.

License

Apache 2.0. See LICENSE.

Final safety note

Start in Recommend mode, set conservative bounds, and validate recommendations in a non-production environment before enabling mode: Auto.
