Skip to content

Releases: defilantech/infercost

v0.2.1

21 Apr 00:38
a3d0aab

Choose a tag to compare

0.2.1 (2026-04-21)

Bug Fixes

  • bump Chart.yaml to 0.2.0 and wire release-please to maintain it (#34) (59e9de0)

v0.2.0

21 Apr 00:22
10c050d

Choose a tag to compare

0.2.0 (2026-04-21)

Features

  • add team attribution, budget tracking, and alerting (M2) (#17) (c475fa9)
  • add vLLM scraper with per-pod backend selection (#26) (30be439)
  • FOCUS-compatible CSV export for UsageReport data (#31) (277423c)
  • ship PodMonitor template for automatic Prometheus discovery (#24) (57e8eb8), closes #18
  • surface DCGMReachable status condition on CostProfile (#28) (f44b538)
  • unify cloud pricing on canonical YAML with refresh workflow (#29) (aed344a)

Bug Fixes

  • bust Go Report Card badge cache (38ee787)
  • stop UsageReport reconcile hot-loop on status updates (#25) (98ee49e)

Documentation

  • add CostProfile sample library for common GPUs (#27) (8d9418d)
  • batch-fill missing godoc comments (#32) (b92b57b)
  • write launch runbooks (CNCF, OpenCost, HN/Reddit, FinOps) — not executed yet (#30) (4525822)

infercost-0.2.1

21 Apr 00:38
a3d0aab

Choose a tag to compare

A Helm chart for InferCost - Kubernetes-native cost intelligence for on-premises AI inference

infercost-0.1.0

21 Apr 00:22
10c050d

Choose a tag to compare

A Helm chart for InferCost - Kubernetes-native cost intelligence for on-premises AI inference

v0.1.1

23 Mar 06:51

Choose a tag to compare

Patch release: Docker images on GHCR, Homebrew formula update, OCI labels for repo linking.

v0.1.0

22 Mar 23:37

Choose a tag to compare

InferCost v0.1.0

The first release of InferCost: Kubernetes-native cost intelligence for on-premises AI inference.

Features

  • CostProfile CRD: Declare GPU hardware economics (purchase price, amortization, electricity rate, PUE)
  • UsageReport CRD: Auto-populated cost reports with per-model/namespace breakdown
  • Cost engine: Scrapes DCGM for GPU power draw and llama.cpp/vLLM for token counts
  • 12 Prometheus metrics: Cost-per-token, hourly cost, cloud comparison, GPU power, savings
  • Cloud comparison: Verified pricing for 9 models across OpenAI, Anthropic, and Google
  • CLI: infercost status and infercost compare with monthly projections
  • REST API: /api/v1/costs/current, /api/v1/models, /api/v1/compare, /api/v1/status
  • Grafana dashboard: 13 panels shipped as JSON
  • Helm chart: Single-command installation with configurable DCGM endpoint and API server
  • Honest results: Shows when cloud is cheaper than on-prem

Install

Homebrew (macOS):

brew install defilantech/tap/infercost

Helm:

helm install infercost infercost/infercost \
  --set dcgm.endpoint=http://nvidia-dcgm-exporter:9400/metrics

Binary: Download from the assets below.