Releases: defilantech/infercost
v0.2.1
v0.2.0
0.2.0 (2026-04-21)
Features
- add team attribution, budget tracking, and alerting (M2) (#17) (c475fa9)
- add vLLM scraper with per-pod backend selection (#26) (30be439)
- FOCUS-compatible CSV export for UsageReport data (#31) (277423c)
- ship PodMonitor template for automatic Prometheus discovery (#24) (57e8eb8), closes #18
- surface DCGMReachable status condition on CostProfile (#28) (f44b538)
- unify cloud pricing on canonical YAML with refresh workflow (#29) (aed344a)
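As a rough illustration of what a FOCUS-style CSV export of usage data could look like, the sketch below writes a few per-model cost rows using a subset of FOCUS column names. The input row shape (`start`, `end`, `model`, `cost_usd`, `tokens`), the `on-prem-inference` service name, and the choice of columns are assumptions made for this example; the release notes do not state which columns InferCost emits.

```python
import csv
import io

# A subset of FOCUS column names. Which columns InferCost actually
# exports is an assumption made for this sketch.
FOCUS_COLUMNS = [
    "ChargePeriodStart",
    "ChargePeriodEnd",
    "ServiceName",
    "ResourceName",
    "BilledCost",
    "BillingCurrency",
    "ConsumedQuantity",
    "ConsumedUnit",
]


def to_focus_csv(rows):
    """Render hypothetical usage rows as CSV text with FOCUS-style headers."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FOCUS_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "ChargePeriodStart": row["start"],
            "ChargePeriodEnd": row["end"],
            "ServiceName": "on-prem-inference",  # hypothetical label
            "ResourceName": row["model"],
            "BilledCost": f"{row['cost_usd']:.6f}",
            "BillingCurrency": "USD",
            "ConsumedQuantity": row["tokens"],
            "ConsumedUnit": "Tokens",
        })
    return buf.getvalue()


example_rows = [
    {
        "start": "2026-04-01T00:00:00Z",
        "end": "2026-04-02T00:00:00Z",
        "model": "llama-3-8b",
        "cost_usd": 1.5,
        "tokens": 2000000,
    },
]
focus_csv = to_focus_csv(example_rows)
```

Keeping the column list in one place makes it easy to compare against whatever schema a downstream FinOps tool expects.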
Bug Fixes
- bust Go Report Card badge cache (38ee787)
- stop UsageReport reconcile hot-loop on status updates (#25) (98ee49e)
Documentation
infercost-0.2.1
A Helm chart for InferCost - Kubernetes-native cost intelligence for on-premises AI inference
infercost-0.1.0
A Helm chart for InferCost - Kubernetes-native cost intelligence for on-premises AI inference
v0.1.1
v0.1.0
InferCost v0.1.0
The first release of InferCost: Kubernetes-native cost intelligence for on-premises AI inference.
Features
- CostProfile CRD: Declare GPU hardware economics (purchase price, amortization, electricity rate, PUE)
- UsageReport CRD: Auto-populated cost reports with per-model/namespace breakdown
- Cost engine: Scrapes DCGM for GPU power draw and llama.cpp/vLLM for token counts
- 12 Prometheus metrics: Cost-per-token, hourly cost, cloud comparison, GPU power, savings
- Cloud comparison: Verified pricing for 9 models across OpenAI, Anthropic, and Google
- CLI: infercost status and infercost compare with monthly projections
- REST API: /api/v1/costs/current, /api/v1/models, /api/v1/compare, /api/v1/status
- Grafana dashboard: 13 panels shipped as JSON
- Helm chart: Single-command installation with configurable DCGM endpoint and API server
- Honest results: Shows when cloud is cheaper than on-prem
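The inputs named in the CostProfile bullet (purchase price, amortization, electricity rate, PUE) suggest a simple cost model. The sketch below is one plausible way to combine them with a measured GPU power draw and a token count; it assumes straight-line amortization and is not taken from InferCost's source. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class CostProfile:
    """Hypothetical mirror of the hardware economics a CostProfile declares."""
    purchase_price_usd: float       # total GPU hardware purchase price
    amortization_years: float       # straight-line amortization period (assumed)
    electricity_usd_per_kwh: float  # electricity rate
    pue: float                      # power usage effectiveness, >= 1.0


def hourly_cost_usd(profile, gpu_power_watts):
    """Amortized hardware cost plus PUE-adjusted electricity cost per hour."""
    amortization = profile.purchase_price_usd / (
        profile.amortization_years * HOURS_PER_YEAR
    )
    # Facility energy per hour: GPU draw in kW scaled by PUE.
    facility_kwh = gpu_power_watts / 1000.0 * profile.pue
    electricity = facility_kwh * profile.electricity_usd_per_kwh
    return amortization + electricity


def cost_per_million_tokens(hourly_usd, tokens_per_hour):
    """Convert an hourly cost into a cost per one million tokens."""
    if tokens_per_hour <= 0:
        raise ValueError("tokens_per_hour must be positive")
    return hourly_usd / tokens_per_hour * 1_000_000


profile = CostProfile(
    purchase_price_usd=8760.0,
    amortization_years=1.0,
    electricity_usd_per_kwh=0.10,
    pue=1.2,
)
hourly = hourly_cost_usd(profile, gpu_power_watts=500.0)
per_million = cost_per_million_tokens(hourly, tokens_per_hour=2_000_000)
```

Comparing `per_million` against a cloud provider's per-million-token price gives the kind of on-prem versus cloud comparison the feature list describes, including cases where cloud turns out cheaper.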
Install
Homebrew (macOS):
brew install defilantech/tap/infercost
Helm:
helm install infercost infercost/infercost \
  --set dcgm.endpoint=http://nvidia-dcgm-exporter:9400/metrics
Binary: Download from the assets below.