High-level direction and rationale. Dates are targets, not commitments.
- What: Authenticated users can copy a referral invite URL (
/login?ref=…) from the header Invite button and from an Invite friends card on/dashboard/affiliates;GET /api/v1/referralsensures areferral_codesrow exists and returns flat JSON; inactive codes block copy in the header and show a clear state on the Affiliates page;403returned forForbiddenError(e.g. missing org). - Why: Referral attribution already existed (
apply, login query params, revenue splits) but users had no first-class way to discover their code or link. Colocating with Affiliates under Monetization keeps one “growth links” area without implying affiliate and referral are the same program. Why not nested JSON in GET: Reduces parser mistakes in clients and small models. - Follow-ups (later): Vanity codes, optional
intent=signupon links, shared client cache (SWR) if duplicate GETs become noisy—see referrals.md.
- What:
user_characters.settings.anthropicThinkingBudgetTokenssets thinking per cloud agent (MCP/A2A chat).ANTHROPIC_COT_BUDGETis the default when that key is omitted;ANTHROPIC_COT_BUDGET_MAXoptionally caps any effective budget. - Why: Agent owners control inference policy without redeploying; request bodies must not carry budgets (untrusted clients). Env default + max give operators baseline and cost bounds.
- Docs: docs/anthropic-cot-budget.md
- What: Shared
mockMiladyPricingMinimumDepositForRouteTests(); Milady billing cron tests use stable DB mocks;package.jsonscript paths updated for the renamed test file. - Why: Replacing
@/lib/constants/milady-pricingwith only{ MINIMUM_DEPOSIT }stripped hourly rates and warning thresholds for every later importer in the same Bun process, so billing cron assertions failed only when the full unit tree ran. Spreading real constants preserves cross-module correctness. - Docs: docs/unit-testing-milady-mocks.md
- What: POST
/api/v1/messageswith Anthropic request/response format, tools, streaming SSE. - Why: Claude Code and many integrations are built for Anthropic’s API. Supporting it lets users point those tools at elizaOS Cloud with a single API key and credit balance, instead of maintaining a separate Anthropic key and proxy.
- Outcomes: Claude Code works with
ANTHROPIC_BASE_URL+ Cloud API key; same billing and safety as chat completions.
- Dashboard / character editor — Expose
settings.anthropicThinkingBudgetTokenswith copy that explains cost vs quality tradeoffs. Why: today the field is JSON-only; most creators will not discover it from docs alone. - Room- or conversation-scoped chat — When
/api/v1/chat(or eliza runtime paths) resolve auser_charactersrow, thread the sameparseThinkingBudgetFromCharacterSettings+ merge helpers. Why: parity between “chat in app” and “chat via MCP/A2A” for the same agent.
- Streaming tool_use blocks — Emit
content_block_deltafor tool_use (partial JSON) so clients can stream tool calls. Why: some SDKs expect incremental tool payloads. - Ping interval — Optional periodic
pingevents during long streams. Why: proxies and clients often use pings to detect dead connections. - anthropic-version — Validate or document supported
anthropic-versionheader values. Why: avoid breakage when Anthropic adds new fields.
- Consistent error envelope — Align OpenAI-style endpoints with a shared
{ type, code, message }shape where possible. Why: one client-side error handler for all Cloud APIs. - OpenAPI tags — Tag Messages and Chat in OpenAPI so generators produce separate clients. Why: clearer SDKs and docs.
- Vanity referral codes — User-chosen strings with strict validation and uniqueness. Why: memorability; requires abuse review and collision handling.
- Single client cache for
GET /api/v1/referrals— e.g. React context or SWR so header and Affiliates page share one request. Why: fewer redundant GETs; today idempotent DB writes make duplicates harmless.
- Google Gemini REST compatibility — If demand exists, a Gemini-style route (e.g.
generateContent) could reuse the same credits and gateway. Why: same “one key, one bill” story for Gemini-native tools.
- Usage alerts — Notify when credits or usage cross thresholds. Why: avoid surprise exhaustion for high-volume or app credits.
- Rate limit headers — Return
X-RateLimit-*on relevant endpoints. Why: clients can back off or show “N requests left” without guessing.
- Direct Anthropic key passthrough — We do not forward to Anthropic with the user’s key; we always use our gateway and bill Cloud credits. Why: single billing, consistent safety and routing.