Summary
Teach users how to track token usage, estimate costs, and enforce budgets to prevent runaway agent loops and enable enterprise chargeback models. Cost visibility is critical for production deployments where multiple teams share inference infrastructure.
Course Section Outline
- Why cost tracking matters in enterprise deployments — runaway loops, shared infrastructure, chargeback
- Configuring budget limits in agent.yaml — per-request and per-session ceilings
- Custom pricing tables for self-hosted models vs. cloud API pricing
- Reading usage data from the API response — token counts and cost estimates
- Prometheus metrics for cost monitoring and alerting
- Per-tenant budgets and chargeback patterns using tenant context
- Handling budget-exceeded scenarios gracefully — warnings, soft limits, hard stops
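A minimal sketch of how the pricing-table, cost-estimate, and warning/hard-stop ideas in the outline could fit together. Everything here is an assumption for illustration: the model names, the per-1,000-token rates, the `SessionBudget` class, and the 80% warning fraction are hypothetical and do not reflect the actual agent.yaml schema or the template's API.

```python
from dataclasses import dataclass

# Hypothetical pricing table in USD per 1,000 tokens (illustrative values only).
PRICING = {
    "self-hosted-model": {"input": 0.0002, "output": 0.0004},
    "cloud-model": {"input": 0.003, "output": 0.015},
}


def estimate_cost(model, input_tokens, output_tokens, pricing=PRICING):
    """Estimate the cost of one request from its token counts."""
    rates = pricing[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]


@dataclass
class SessionBudget:
    """Tracks spend against a session ceiling with a soft warning threshold."""
    limit_usd: float
    warn_fraction: float = 0.8  # assumed soft-limit fraction
    spent_usd: float = 0.0

    def record(self, cost_usd):
        """Return "ok", "warning", or "exceeded" for a proposed charge."""
        if self.spent_usd + cost_usd > self.limit_usd:
            # Hard stop: reject the request and leave the recorded spend unchanged.
            return "exceeded"
        self.spent_usd += cost_usd
        if self.spent_usd >= self.limit_usd * self.warn_fraction:
            # Soft limit: the request proceeds, but the caller should surface a warning.
            return "warning"
        return "ok"


if __name__ == "__main__":
    budget = SessionBudget(limit_usd=0.05)
    request_cost = estimate_cost("cloud-model", input_tokens=1000, output_tokens=1000)
    for cost in (request_cost, request_cost, 0.01, request_cost):
        print(f"{cost:.3f} USD -> {budget.record(cost)}")
```

Checking the ceiling before recording the charge keeps a rejected request from inflating the session total, which matters if the same totals later feed chargeback reports.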
Lab Exercise
Configure budget limits on an agent with both a warning threshold and a hard stop. Run conversations that approach and exceed the limit. Observe the warning behavior in the API response and the hard-stop rejection. Query Prometheus to view cost metrics over time.
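For the final lab step, the following sketch builds a range query against the standard Prometheus HTTP API endpoint `/api/v1/query_range`. The metric name `agent_cost_usd_total` and the `tenant` label are assumptions made for illustration; substitute whatever names the agent actually exports.

```python
from urllib.parse import urlencode


def cost_query_url(base_url, start, end, step="60s",
                   metric="agent_cost_usd_total",  # hypothetical metric name
                   label="tenant",                 # hypothetical label name
                   window="1h"):
    """Build a Prometheus query_range URL for cost accrued per label value."""
    # increase() reports how much a counter grew over the window;
    # sum by (...) aggregates that growth per tenant.
    promql = f"sum by ({label}) (increase({metric}[{window}]))"
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Start and end are Unix timestamps; the host is a placeholder.
    print(cost_query_url("http://localhost:9090", 1700000000, 1700003600))
```

The resulting URL can be fetched with any HTTP client, or the PromQL expression can be pasted directly into the Prometheus expression browser.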
Companion Issues
Companion issues are filed on fips-agents/agent-template, fips-agents/gateway-template, fips-agents/ui-template, and fips-agents/fips-agents-cli.
Size
S