MukundaKatta/AgentBudgetPy

agentbudget-py

Token + dollar budget caps for AI agents. Raises BudgetExceededError when an LLM call would push past the ceiling. Zero deps, drop into any provider SDK.

pip install agentbudget-py

Python port of @mukundakatta/agentbudget — same API, snake_case names.

Why

You ship an agent. A bug in the planner makes it loop. Your claude-opus-4-5 bill is $300 before you notice.

agentbudget is one class. Set caps once, record usage after each call, raise the moment any cap is breached. CI catches loops; production catches runaways.

Quickstart

from agentbudget import Budget, BudgetExceededError

budget = Budget(
    max_total_tokens=200_000,   # hard token ceiling
    max_cost_usd=5.00,          # hard dollar ceiling
)

try:
    for turn in turns:
        resp = client.messages.create(...)
        budget.record_usage({
            "model": resp.model,
            "input_tokens": resp.usage.input_tokens,
            "output_tokens": resp.usage.output_tokens,
        })
except BudgetExceededError as err:
    print(f"stopped — {err.cap} cap of {err.limit} hit")
    raise

The raised BudgetExceededError carries cap, limit, attempted, overshoot, and model, so you can build human-readable messages without re-reading the budget.
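To illustrate building a message from those fields alone, here is a minimal sketch. FakeBudgetExceededError is a hypothetical stand-in that only mirrors the attribute names listed above. Its constructor signature and the definition of overshoot (attempted minus limit) are assumptions, not the library's documented behavior.

```python
class FakeBudgetExceededError(Exception):
    """Stand-in for illustration only; not the library's class."""

    def __init__(self, cap, limit, attempted, model=None):
        self.cap = cap
        self.limit = limit
        self.attempted = attempted
        # Assumption: overshoot is how far the attempted total is past the limit.
        self.overshoot = attempted - limit
        self.model = model
        super().__init__(f"{cap} cap exceeded")


def describe(err):
    """Build a human-readable message from the error's fields alone."""
    model = err.model or "unknown model"
    return (
        f"Stopped on {model}: {err.cap} reached {err.attempted} "
        f"against a limit of {err.limit} (over by {err.overshoot})."
    )


err = FakeBudgetExceededError("max_total_tokens", 200_000, 203_500, "claude-opus-4-5")
message = describe(err)
print(message)
```

Because the handler reads everything from the exception, it does not need a reference to the Budget instance that raised it.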

Caps

All optional, all checked after each record_usage. The first violation wins, in this order:

Argument            Caps
max_input_tokens    total input tokens across all calls
max_output_tokens   total output tokens across all calls
max_total_tokens    input + output combined
max_cost_usd        dollars (requires pricing; see below)
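The "first violation wins" rule can be sketched as an ordered scan over the caps. This is not the library's implementation: first_violation and CAP_ORDER are hypothetical names, and treating a cap as breached only when the total is strictly greater than the limit is an assumption.

```python
# Order taken from the table above; everything else here is illustrative.
CAP_ORDER = [
    ("max_input_tokens", "input_tokens"),
    ("max_output_tokens", "output_tokens"),
    ("max_total_tokens", "total_tokens"),
    ("max_cost_usd", "cost_usd"),
]


def first_violation(caps, totals):
    """Return the name of the first breached cap, or None.

    caps maps cap names to limits (missing or None means "not set");
    totals maps running-total names to their current values.
    """
    for cap_name, total_key in CAP_ORDER:
        limit = caps.get(cap_name)
        # Assumption: a cap is breached only when the total goes strictly past it.
        if limit is not None and totals[total_key] > limit:
            return cap_name
    return None


totals = {"input_tokens": 150_000, "output_tokens": 60_000,
          "total_tokens": 210_000, "cost_usd": 6.10}
caps = {"max_total_tokens": 200_000, "max_cost_usd": 5.00}

# Both caps are breached here; the token cap is reported because it comes first.
breached = first_violation(caps, totals)
print(breached)
```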

Auto-record with wrap

Budget.wrap adapts the Anthropic and OpenAI response shapes (object or dict) out of the box:

import anthropic
from agentbudget import Budget

client = anthropic.Anthropic()
budget = Budget(max_cost_usd=1)

create = budget.wrap(client.messages.create)

create(model="claude-sonnet-4-7", max_tokens=1024, messages=[...])
# budget.totals is updated automatically; raises if the cap is hit

For other providers, pass extract_usage:

wrapped = budget.wrap(
    my_custom_call,
    extract_usage=lambda r: {
        "model": r["model_id"],
        "input_tokens": r["tokens"]["in"],
        "output_tokens": r["tokens"]["out"],
    },
)
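For intuition about what a wrapper with extract_usage does, here is a self-contained sketch with a toy recorder. UsageLog is a hypothetical stand-in that only sums tokens, my_custom_call is a fake provider call, and the real Budget.wrap may differ (for example in async handling or cap checks).

```python
import functools


class UsageLog:
    """Toy recorder standing in for a budget; it only sums tokens."""

    def __init__(self):
        self.input_tokens = 0
        self.output_tokens = 0

    def record_usage(self, usage):
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]

    def wrap(self, fn, extract_usage):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            response = fn(*args, **kwargs)              # make the real call
            self.record_usage(extract_usage(response))  # then account for it
            return response                             # caller sees the same response
        return wrapper


def my_custom_call(prompt):
    """Hypothetical provider call returning the response shape used above."""
    return {"model_id": "my-finetune-v2", "tokens": {"in": len(prompt), "out": 5}}


log = UsageLog()
wrapped = log.wrap(
    my_custom_call,
    extract_usage=lambda r: {
        "model": r["model_id"],
        "input_tokens": r["tokens"]["in"],
        "output_tokens": r["tokens"]["out"],
    },
)
wrapped("hello")
wrapped("hi")
print(log.input_tokens, log.output_tokens)
```

The extractor is the only provider-specific piece, which is why a single callable is enough to support a new response shape.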

Pre-flight checks

Don't want to make the call when you're already near the cap? Use would_exceed (returns the cap name or None) or assert_can_spend (raises):

async def run_turn():  # placeholder name for your own async handler
    if budget.would_exceed({"input_tokens": 8000, "output_tokens": 2000}):
        return await fallback()  # skip the call entirely

# or, in batch flows where you can split work:
budget.assert_can_spend(input_tokens=estimated_tokens)  # raises if not
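A pre-flight check needs a token estimate before the call is made. The sketch below shows one rough way to produce it. estimate_tokens and would_exceed_total are hypothetical helpers, and the 4-characters-per-token ratio is a rule of thumb, not a tokenizer. For tight budgets, use your provider's own token counting.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate; the 4-characters-per-token ratio is a rule of thumb."""
    return math.ceil(len(text) / chars_per_token)


def would_exceed_total(used_tokens, limit, planned_input, planned_output):
    """True if the planned call would push the running total past the limit."""
    return used_tokens + planned_input + planned_output > limit


prompt = "Summarize the attached report in three bullet points. " * 100
planned_input = estimate_tokens(prompt)
planned_output = 2_000  # e.g. the max_tokens you intend to request

# Plenty of headroom: go ahead with the call.
print(would_exceed_total(190_000, 200_000, planned_input, planned_output))
# Nearly at the ceiling: skip the call or fall back to something cheaper.
print(would_exceed_total(197_000, 200_000, planned_input, planned_output))
```

Planning with the requested max_tokens as the output figure is deliberately pessimistic: the estimate can only overstate the real output usage.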

Pricing

max_cost_usd needs per-model rates. agentbudget ships a starter DEFAULT_PRICING table (Claude + GPT, early-2026 rates) and lets you override:

from agentbudget import Budget

budget = Budget(
    max_cost_usd=10,
    pricing={
        # override one model
        "claude-sonnet-4-7": {"input_per_1k": 0.0015, "output_per_1k": 0.0075},  # cached rate
        # add a model the default doesn't know
        "my-finetune-v2": {"input_per_1k": 0.001, "output_per_1k": 0.001},
    },
)

Always verify the default rates against the provider's current pricing page before relying on them for billing-critical work.
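As a worked example of the arithmetic the per-1k rates imply, the sketch below prices one call using the override rates shown above. call_cost_usd is a hypothetical helper, and it assumes cost is linear in tokens (tokens / 1000 × rate) with no caching or batch discounts.

```python
def call_cost_usd(input_tokens, output_tokens, rates):
    """Cost of one call given per-1,000-token rates."""
    return (
        input_tokens / 1000 * rates["input_per_1k"]
        + output_tokens / 1000 * rates["output_per_1k"]
    )


# Override rates from the example above.
rates = {"input_per_1k": 0.0015, "output_per_1k": 0.0075}

# 8 * 0.0015 + 2 * 0.0075 dollars for an 8,000-in / 2,000-out call.
cost = call_cost_usd(8_000, 2_000, rates)
print(f"${cost:.4f} per call")

# How many calls of that size fit under max_cost_usd=10.
calls_that_fit = int(10 // cost)
print(calls_that_fit)
```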

If you call a model not in either table:

Budget(max_cost_usd=1)                              # recording an unknown model raises UnknownPricingError
Budget(max_cost_usd=1, allow_unknown_pricing=True)  # unknown models cost $0
Budget(max_total_tokens=1_000_000)                  # no cost cap, so no error; pricing is irrelevant
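The lookup behavior described here can be sketched as a merge of two tables with an explicit fallback. All names and rates below are made up for illustration: DEFAULTS is not the library's DEFAULT_PRICING, and UnknownPricing merely stands in for UnknownPricingError.

```python
class UnknownPricing(Exception):
    """Stand-in for the library's UnknownPricingError."""


# Made-up default table; not the library's DEFAULT_PRICING.
DEFAULTS = {"model-a": {"input_per_1k": 0.003, "output_per_1k": 0.015}}


def resolve_rates(model, overrides, allow_unknown=False):
    """Find rates for a model, preferring user overrides over defaults."""
    table = {**DEFAULTS, **overrides}  # later keys win, so overrides take priority
    if model in table:
        return table[model]
    if allow_unknown:
        # Unknown models are priced at $0 instead of raising.
        return {"input_per_1k": 0.0, "output_per_1k": 0.0}
    raise UnknownPricing(model)


overrides = {"model-a": {"input_per_1k": 0.001, "output_per_1k": 0.005}}
print(resolve_rates("model-a", {}))
print(resolve_rates("model-a", overrides))
print(resolve_rates("model-x", {}, allow_unknown=True))
```

Pricing unknown models at $0 keeps the agent running but means the dollar cap silently undercounts, so pairing it with a token cap is a reasonable safeguard.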

Introspection

budget.totals
# {'input_tokens': 12_400, 'output_tokens': 3_100, 'total_tokens': 15_500,
#  'cost_usd': 0.084, 'calls': 7}

budget.remaining()
# {
#   'total_tokens': {'used': 15500, 'limit': 200000, 'remaining': 184500},
#   'cost_usd':     {'used': 0.084,  'limit': 5,      'remaining': 4.916},
#   'calls': 7,
# }

budget.reset() zeroes the totals but keeps caps + pricing — useful for re-using one Budget across runs.
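The remaining() shape shown above is easy to post-process, for example to find which cap is closest to being hit. This sketch works on a plain dict copied from that sample output; fraction_used is a hypothetical helper, not part of the library.

```python
def fraction_used(remaining):
    """Map each capped metric to the fraction of its limit already used."""
    return {
        name: entry["used"] / entry["limit"]
        for name, entry in remaining.items()
        if isinstance(entry, dict)  # skip plain counters such as 'calls'
    }


# Sample values copied from the remaining() output above.
snapshot = {
    "total_tokens": {"used": 15500, "limit": 200000, "remaining": 184500},
    "cost_usd": {"used": 0.084, "limit": 5, "remaining": 4.916},
    "calls": 7,
}

usage = fraction_used(snapshot)
tightest = max(usage, key=usage.get)  # the cap with the least headroom
print(tightest, f"{usage[tightest]:.1%}")
```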

Sibling libraries

Part of the @mukundakatta/agent* reliability stack:

JS sibling: @mukundakatta/agentbudget on npm.

License

MIT © Mukunda Katta
