feat: Anthropic CoT controls by odilitime · Pull Request #417 · elizaOS/cloud

odilitime · 2026-03-27T04:12:43Z

Align check-types-split with packages/db, packages/lib, and UI components; exclude Storybook stories from the UI chunk.
Add check-types:tests to CI; add root scripts for UI, agent-server, and gateway typechecks.
packages/ui: path aliases for @/lib, @/db, @/app, @/packages/ui; styled-jsx attrs for <style jsx>.
packages/lib: JWKS uses jose KeyLike; runtime-factory imports plugin-sql default with typed adapter factory.
gateway-discord: explicit discord.js handler types, connectBot characterId, sendTyping guard, listener map cast.
services/agent-server: requireInternalAuth accepts Elysia string/number status.
Tests: mocks, RequestInit/fetch preconnect, UUID casts, toMatchObject, financial addCredits, Playwright APIRequestContext, and related fixes.
New anthropic-thinking provider helper and unit test.

Note: services/operator (Pepr) is not part of the main workspace typecheck.
Made-with: Cursor

Note

Medium Risk
Medium risk: changes LLM providerOptions and credit reservation estimates across multiple production API routes, which can impact billing and model behavior; also modifies rate limiting presets via a new env multiplier.

Overview
Introduces per-agent Anthropic extended thinking controls by adding packages/lib/providers/anthropic-thinking.ts to resolve thinking budgets from user_characters.settings.anthropicThinkingBudgetTokens with deploy-level defaults (ANTHROPIC_COT_BUDGET) and an optional global cap (ANTHROPIC_COT_BUDGET_MAX), plus helpers to deep-merge providerOptions without clobbering existing gateway/google settings.
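The resolution order described above (per-agent setting, then deploy default, then optional cap) could look roughly like the sketch below. The function name, the input shape, and the choice to treat a budget of 0 as "thinking off" are assumptions made for illustration; they are not taken from `anthropic-thinking.ts`.

```typescript
// Hypothetical sketch; names and exact semantics are assumptions, not the PR's code.
interface BudgetInputs {
  agentBudget?: number; // e.g. user_characters.settings.anthropicThinkingBudgetTokens
  envDefault?: number; // e.g. parsed ANTHROPIC_COT_BUDGET
  envMax?: number; // e.g. parsed ANTHROPIC_COT_BUDGET_MAX
}

// Returns the effective budget, or undefined when thinking should stay off.
function resolveThinkingBudget({ agentBudget, envDefault, envMax }: BudgetInputs): number | undefined {
  // Assumed precedence: the per-agent setting wins over the deploy default.
  const requested = agentBudget ?? envDefault;
  if (requested === undefined || requested <= 0) return undefined;
  // Assumed cap: clamp to the global maximum when one is configured.
  return envMax !== undefined ? Math.min(requested, envMax) : requested;
}

console.log(resolveThinkingBudget({ agentBudget: 8000, envDefault: 2000, envMax: 4000 })); // 4000
```

Keeping the clamp as the last step means an agent owner can raise a budget above the deploy default but never above the operator's ceiling.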

Wires these options into MCP/A2A agent chat (including credit reservation estimates that account for thinking tokens) and applies env-based thinking merges to several other text routes, while explicitly disabling CoT for background/promo/SEO and image-generation flows to preserve temperature control. Adds env validation for the new variables, updates rate limiting to use production-level limits in dev with a RATE_LIMIT_MULTIPLIER, expands CI/typecheck coverage, and hardens/adjusts unit+integration tests (including Milady pricing mock isolation) with new docs and changelog entries.

Written by Cursor Bugbot for commit 3b0b079.

Summary by CodeRabbit

New Features
- Per-agent Anthropic “extended thinking” token budgets for MCP/A2A chats; model calls now honor per-agent and env-derived budgets with optional max cap
- Global env toggles added to set defaults and optional ceilings
Documentation
- New docs and roadmap entries explaining Anthropic extended-thinking configuration and unit-testing guidance
Bug Fixes
- Rate-limit presets unified to consistent numeric values across environments
Chores
- CI/typecheck coverage extended to include test project types


vercel · 2026-03-27T04:12:48Z


Project	Deployment	Actions	Updated (UTC)
eliza-cloud-v2	Error		Mar 27, 2026 5:40pm

coderabbitai · 2026-03-27T04:12:51Z

Walkthrough

This PR adds Anthropic “extended thinking” (CoT) budget support with per-agent settings and env fallbacks/caps, provider-options merge helpers, widespread integration of those helpers into AI request calls, TypeScript typing and test infrastructure improvements, Milady test rework, a rate-limit preset change, and documentation updates.

Changes

Cohort / File(s)	Summary
Anthropic Extended Thinking Core `packages/lib/providers/anthropic-thinking.ts`, `packages/lib/providers/cloud-provider-options.ts`, `packages/lib/providers/index.ts`, `packages/lib/config/env-validator.ts`	New Anthropic CoT module: env + per-character parsing, resolution with optional max clamp, provider-options deep-merge helpers, new type `CloudMergedProviderOptions`, and env validation entries for `ANTHROPIC_COT_BUDGET` / `ANTHROPIC_COT_BUDGET_MAX`.
Route & Service Integrations `app/api/agents/[id]/a2a/route.ts`, `app/api/agents/[id]/mcp/route.ts`, `app/api/v1/chat/completions/route.ts`, `app/api/v1/chat/route.ts`, `app/api/v1/messages/route.ts`, `app/api/mcp/tools/generation.ts`, `app/api/v1/generate-image/route.ts`, `app/api/v1/responses/route.ts`, `packages/lib/services/...` (several files)	Spreads merged provider options (Anthropic CoT / Google modalities / gateway prefs) into `streamText`/`generateText` calls; MCP/A2A chat handlers now parse `character.settings` for per-agent budgets and pass computed options into LLM invocations.
Image Model Modalities & Google Merge `app/api/mcp/tools/generation.ts`, `app/api/v1/generate-image/route.ts`, `packages/lib/services/app-promotion-assets.ts`, `packages/lib/services/app-promotion.ts`, `packages/lib/services/seo.ts`, `packages/lib/services/twitter-automation/app-automation.ts`, `packages/lib/api/a2a/skills.ts`	Replaced inline Google `responseModalities` with helper that merges Google image/text modalities with Anthropic CoT options; image-generation calls now pick model constants and merged provider options.
Provider Options Typing `packages/lib/providers/types.ts`, `packages/lib/providers/cloud-provider-options.ts`, `packages/lib/providers/index.ts`	Introduces `CloudMergedProviderOptions` and updates request typings to use the shared provider-options shape; re-exports Anthropic helpers.
Env / Config / Scripts `.env.example`, `packages/scripts/check-types-split.ts`, `package.json`, `services/agent-server/package.json`, `tsconfig.test.json`, `packages/ui/tsconfig.json`	Documented Anthropic env vars in `.env.example`; added check scripts and typecheck entries; adjusted typecheck splitting and tsconfig path/exclude settings; added agent-server `typecheck` script.
Milady Test Infrastructure & Helpers `packages/tests/helpers/mock-milady-pricing-for-route-tests.ts`, `packages/tests/helpers/index.ts`, `packages/tests/unit/milady.test.ts`, `packages/tests/unit/z-milady-billing-route.test.ts`	New shared Milady mock helper that spreads real `MILADY_PRICING` and overrides `MINIMUM_DEPOSIT`; moved billing test to `z-*` and refactored DB mocks into `registerMiladyBillingMocks()` for stable ordering.
Unit Tests: Anthropic & Merge Helpers `packages/tests/unit/anthropic-thinking.test.ts`	Comprehensive tests covering env parsing, per-character parsing, resolve/clamp logic, `anthropicThinkingProviderOptions`, and provider-options merge behaviors across Anthropic/Google/gateway cases.
Type Safety & Test Improvements many `packages/tests/...` files (type-casting fetch mocks, typed fixtures, signature tightening)	Widespread test typing hardening: explicit casts via `unknown`, typed fixtures for OAuth/Twitter/Google, adjusted helper signatures, assert relaxations where appropriate, and preservation of `preconnect` via `Object.assign` wrappers.
Rate Limiting `packages/lib/middleware/rate-limit.ts`	Removed dev-vs-prod `isDevelopment` conditionals: `RateLimitPresets` now use fixed numeric `maxRequests` values across environments (backing store still differs by `REDIS_RATE_LIMITING`).
Auth & JWKS Typing `packages/lib/auth/jwks.ts`, `services/agent-server/src/routes.ts`	Replaced `JoseCryptoKey` alias with `KeyLike` in JWKS API; relaxed `requireInternalAuth` status typing to `number
E2E & Integration Test Adjustments `packages/tests/e2e/setup-server.ts`, integration tests under `packages/tests/integration/*`	Improved server log piping guard, reworked `fetchWithServer` to preserve `preconnect`, removed `source` from some credits calls, and relaxed/updated integration assertions and fixtures.
Fixtures & Minor App Changes `packages/tests/fixtures/mcp-test-character.ts`, `packages/lib/eliza/runtime-factory.ts`, `packages/services/gateway-discord/src/gateway-manager.ts`	Updated test fixture IDs; wrapped DB adapter import with typed local wrapper; strengthened Discord gateway typings and safe `sendTyping` checks.
Docs & Changelog `README.md`, `CHANGELOG.md`, `docs/anthropic-cot-budget.md`, `docs/unit-testing-milady-mocks.md`, `docs/ROADMAP.md`, `packages/content/changelog.mdx`	Adds Anthropic CoT documentation and internal HOWTOs, README engineering docs links, changelog entries, and roadmap notes.
Misc Tests: assertions, typings, refactors many `packages/tests/unit/*` files	Multiple tests updated for safer assertions (`toMatchObject`, guards), stricter typing, fixture helpers, and minor control-flow tweaks; many changes are test-only and type-focused.
UI JSX typings `packages/ui/src/styled-jsx.d.ts`	Adds React module augmentation for `StyleHTMLAttributes<T>` with optional `jsx` and `global` boolean props.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Route as MCP/A2A Route
    participant Resolver as CharacterResolver
    participant Anthropic as AnthropicThinking
    participant LLM as LLMProvider

    Client->>Route: POST /api/agents/{id}/mcp (chat)
    Route->>Resolver: resolve character
    Resolver-->>Route: character (includes settings)
    Route->>Anthropic: parseThinkingBudgetFromCharacterSettings(settings)
    Anthropic-->>Route: agentThinkingBudget?
    Route->>Anthropic: mergeAnthropicCotProviderOptions(model, env, budget)
    Anthropic-->>Route: providerOptions fragment
    Route->>LLM: streamText({ model, prompt, ...providerOptions })
    LLM-->>Route: stream response
    Route-->>Client: stream completion

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🧠 Extended thinking budgets bloom,
Per-agent settings fill the room,
Provider options merge with grace,
TypeScript types find their place,
Milady mocks now stable stand— ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.47% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'feat: Anthropic CoT controls' directly reflects the main feature added: per-agent Anthropic extended thinking (chain-of-thought) token budget controls and related infrastructure.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.


claude · 2026-03-27T04:16:49Z

Code Review

This PR adds Anthropic extended thinking (ANTHROPIC_COT_BUDGET) support across all AI routes, fixes TypeScript type errors in tests/packages, adds a check-types:tests CI step, and hardens the Discord gateway with explicit handler types. The implementation is generally solid with good test coverage for the new helper.

Issues:

Silent temperature stripping when thinking is enabled (medium)

app-promotion-assets.ts, app-promotion.ts, twitter-automation/app-automation.ts, and seo.ts pass explicit temperature alongside mergeProviderOptions. When ANTHROPIC_COT_BUDGET is set, @ai-sdk/anthropic silently strips temperature/topP/topK (noted in the v1/chat completions comment). Callers relying on temperature 0.7/0.8 will silently get default sampling - an invisible behavior change. Consider logging a warning at startup when both are configured.
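One way to surface the conflict the reviewer describes is a small check that reports which sampling parameters a call sets while a thinking budget is active. This is only a sketch: the helper name and settings shape are hypothetical, and it relies on the reviewer's statement that temperature, topP, and topK are dropped when thinking is enabled.

```typescript
// Hypothetical sketch; not part of the PR.
interface CallSettings {
  temperature?: number;
  topP?: number;
  topK?: number;
}

// Lists the sampling parameters that would be ignored when a thinking budget is active.
function ignoredSamplingParams(settings: CallSettings, thinkingBudget: number | undefined): string[] {
  if (thinkingBudget === undefined || thinkingBudget <= 0) return [];
  const keys: (keyof CallSettings)[] = ["temperature", "topP", "topK"];
  return keys.filter((key) => settings[key] !== undefined);
}

const conflicts = ignoredSamplingParams({ temperature: 0.7 }, 2048);
if (conflicts.length > 0) {
  console.warn(`Thinking is enabled; these settings will be ignored: ${conflicts.join(", ")}`);
}
```

A caller could run this once per call site at startup, or log it lazily the first time a route is hit.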

mergeProviderOptions silently overwrites unknown provider keys (low-medium)

The function deep-merges only gateway, anthropic, and google. Any other provider key present in both a and b (e.g. openai, azure) is overwritten by the shallow spread without merging. Worth a comment flagging this limitation.
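The limitation is easier to see in a reduced form. The sketch below is not the PR's `mergeProviderOptions`; it is a hypothetical function with the behavior the review describes (a shallow spread plus a one-level merge for three known keys), and the option values are placeholders.

```typescript
// Hypothetical sketch of a one-level merge limited to known provider keys.
type ProviderOptions = Record<string, Record<string, unknown>>;

const DEEP_MERGED_KEYS = ["gateway", "anthropic", "google"] as const;

function mergeProviderOptionsSketch(a: ProviderOptions, b: ProviderOptions): ProviderOptions {
  // Shallow spread first: for any key present in both inputs, b's value wins outright.
  const merged: ProviderOptions = { ...a, ...b };
  // Known providers get their inner objects combined instead of replaced.
  for (const key of DEEP_MERGED_KEYS) {
    const left = a[key];
    const right = b[key];
    if (left && right) merged[key] = { ...left, ...right };
  }
  return merged;
}

const mergedExample = mergeProviderOptionsSketch(
  { gateway: { order: ["anthropic"] }, openai: { user: "u1", store: true } },
  { gateway: { tag: "b" }, openai: { user: "u2" } },
);
// mergedExample.gateway keeps both `order` and `tag`;
// mergedExample.openai is b's object only, so `store` is gone.
```

Adding a provider to the list of deep-merged keys is the only change needed to give it the same treatment.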

Rate limit change may break local development (medium)

Removing the isDevelopment override drops STANDARD from 10,000 to 60 req/min in local dev. Developers running hot-reload tests or UI workflows will now hit production limits backed only by in-memory storage. Consider keeping separate dev limits or documenting the change prominently.

runtime-factory.ts default-import cast hides the type (low)

The default import is immediately cast to a manually-written signature. If the upstream @elizaos/plugin-sql signature changes, this cast will silently pass wrong types. Worth a comment explaining the workaround and a TODO to remove it when upstream fixes the type export.

Definite assignment in test teardown (low)

In x402-topup.test.ts, afterAll accesses referralsServiceForTest etc. with definite-assignment assertions and no null guards. If beforeAll throws before all three are assigned, afterAll will also throw and may swallow the original failure. The previous null-guard pattern was safer.
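The null-guard pattern the reviewer prefers can be shown without a test runner. In this sketch the `setup` and `teardown` functions stand in for `beforeAll` and `afterAll`, and the service shape is hypothetical.

```typescript
// Hypothetical sketch: restore patched methods only if setup got far enough to patch them.
type Service = { run: () => string };
type Patched = { target: Service; original: () => string };

let patchedService: Patched | undefined; // stays undefined if setup throws early

function setup(service: Service, failEarly: boolean): void {
  if (failEarly) throw new Error("setup failed before patching");
  patchedService = { target: service, original: service.run };
  service.run = () => "mocked";
}

function teardown(): void {
  // Guard instead of a definite-assignment assertion (`patchedService!`).
  if (patchedService) {
    patchedService.target.run = patchedService.original;
    patchedService = undefined;
  }
}
```

With the guard in place, a failure in setup is reported as the setup failure rather than as a secondary error thrown from teardown.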

Positives:

parseAnthropicCotBudgetFromEnv is thorough: strict digit regex, safe-integer check, and solid edge-case test coverage.
Extracting model strings to named constants before passing to mergeProviderOptions improves readability.
Discord sendTyping guard is correct - DMChannel/PartialGroupDMChannel lack that method.
Excluding Storybook stories from check-types-split is a good cleanup.
Switching to toMatchObject in server-wallets tests is the correct fix for partial-shape assertions.

Rename milady billing route test to z-milady-billing-route for script ordering, use queue-backed db mocks re-registered in beforeEach, and add mockMiladyPricingMinimumDepositForRouteTests so route tests do not replace MILADY_PRICING with only MINIMUM_DEPOSIT (which broke billing thresholds in-process). Update package.json bulk/special unit paths accordingly. Made-with: Cursor
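The pitfall this commit describes, a mock that replaces the whole pricing object with a single field, can be reproduced in a few lines. Only `MINIMUM_DEPOSIT` is named in the source; the other field names, the values, and the `isLowBalance` check are hypothetical.

```typescript
// Hypothetical pricing shape; only MINIMUM_DEPOSIT comes from the PR text.
interface Pricing {
  MINIMUM_DEPOSIT: number;
  HOURLY_RATE: number;
  LOW_BALANCE_THRESHOLD: number;
}

const REAL_PRICING: Pricing = { MINIMUM_DEPOSIT: 5, HOURLY_RATE: 0.02, LOW_BALANCE_THRESHOLD: 2 };

// Pitfall: the mock carries only the overridden field (the cast hides the missing ones).
const partialMock = { MINIMUM_DEPOSIT: 1 } as Pricing;

// Safer: spread the real constants, then override the one field under test.
const safeMock: Pricing = { ...REAL_PRICING, MINIMUM_DEPOSIT: 1 };

function isLowBalance(balance: number, pricing: Pricing): boolean {
  return balance < pricing.LOW_BALANCE_THRESHOLD;
}

console.log(isLowBalance(1, partialMock)); // false: comparing against undefined
console.log(isLowBalance(1, safeMock)); // true
```

Because module mocks can persist across test files in the same process, the partial version can change the outcome of unrelated billing tests that run later.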

Extend anthropic-thinking handling, add cloud-provider-options, update provider types and index exports, and thread safe model params through v1 chat, completions, messages, responses, generate-image, MCP/A2A, and internal services. Expand anthropic-thinking unit tests. Made-with: Cursor

Add engineering docs (unit-testing-milady-mocks, anthropic-cot-budget), root CHANGELOG.md, README links and accurate test script table, ROADMAP and site changelog entries. Expand JSDoc and .env.example with rationale for ANTHROPIC_COT_BUDGET and partial MILADY_PRICING mock pitfalls. Made-with: Cursor

claude · 2026-03-27T08:07:02Z

Code Review - PR 417

This PR is a broad type-fix/quality pass with one new functional feature (Anthropic extended thinking via ANTHROPIC_COT_BUDGET). The changes are generally clean. A few things worth discussing:

ISSUES

Rate limiting: dev/prod unification is a breaking DX change (medium)

File: packages/lib/middleware/rate-limit.ts

Removing the isDevelopment branch means local dev now hits the same strict limits as production (60/min STANDARD, 10/min STRICT, 5 per 5-min CRITICAL) backed only by in-memory storage. Developers iterating locally will now hit 429s immediately. The old 10000 req/window dev ceiling existed precisely to avoid this.

The log message change says "same numeric limits as production" explicitly. Was this intentional? If so, it should be called out in the PR description. If not, the isDevelopment guard should be restored.
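The PR overview mentions a `RATE_LIMIT_MULTIPLIER` environment variable alongside this change. A multiplier of that kind could scale the shared presets for local development roughly as follows; the preset shape, the parsing rules, and the fallback to 1 are assumptions, and only the numeric limits come from the review above.

```typescript
// Hypothetical sketch; preset shape and multiplier parsing are assumptions.
interface RateLimitPreset {
  windowMs: number;
  maxRequests: number;
}

const BASE_PRESETS = {
  STANDARD: { windowMs: 60_000, maxRequests: 60 },
  STRICT: { windowMs: 60_000, maxRequests: 10 },
  CRITICAL: { windowMs: 5 * 60_000, maxRequests: 5 },
};

function scalePresets(rawMultiplier: string | undefined) {
  const parsed = Number(rawMultiplier);
  // Fall back to 1 for unset, empty, non-numeric, or non-positive values.
  const multiplier = Number.isFinite(parsed) && parsed > 0 ? parsed : 1;
  const scale = (preset: RateLimitPreset): RateLimitPreset => ({
    ...preset,
    maxRequests: Math.max(1, Math.floor(preset.maxRequests * multiplier)),
  });
  return {
    STANDARD: scale(BASE_PRESETS.STANDARD),
    STRICT: scale(BASE_PRESETS.STRICT),
    CRITICAL: scale(BASE_PRESETS.CRITICAL),
  };
}

console.log(scalePresets("100").STANDARD.maxRequests); // 6000
```

Under these assumptions, production leaves the variable unset and gets the base limits, while a developer can set a large multiplier locally to avoid 429s during hot reload.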

Definite assignment (!) removes null-safety in teardown (low)

File: packages/tests/integration/x402-topup.test.ts

Using ! on the three service vars silences TS but means if beforeAll throws partway through, afterAll will crash accessing properties on uninitialized vars. The original null-guards were load-bearing. Consider keeping them or initializing with no-op stubs.

mergeProviderOptions shallow merge - worth a comment (low)

File: packages/lib/providers/anthropic-thinking.ts

The merge is one level deep for anthropic/google/gateway. For the thinking key specifically: if base has { thinking: { type: "disabled" } } and COT helper adds { thinking: { type: "enabled", budgetTokens: N } }, the second replaces the first entirely. This is the desired behavior, but a short comment would prevent future contributors from trying to fix it.

Google responseModalities leaks into non-Google paths (nit)

File: app/api/v1/generate-image/route.ts

The else branch in isOpenAIModel ? mergeAnthropicCot : mergeGoogleImage covers everything that is not OpenAI. Future Anthropic image models would fall into the Google path and pick up responseModalities unnecessarily. Low risk now, worth noting when new providers are added.

Runtime factory type cast is fragile (nit)

File: packages/lib/eliza/runtime-factory.ts

Casting the default import manually bypasses TS module resolution. A comment explaining the upstream typing gap would help future audits.

POSITIVES

sendTyping guard: "sendTyping" in channel is the right type-narrowing approach.
UUID fixture: changing "test-agent-simple-001" to a valid UUID format prevents silent failures.
APIRequestContext in Playwright fixture: removing the ReturnType gymnastics is cleaner.
preconnect binding on fetchWithServer: correctly preserving fetch.preconnect.
Unit tests for anthropic-thinking.ts: good boundary coverage (0, invalid, whitespace, non-anthropic model, merge collision).
Storybook exclusion from check-types-split: correct - stories use Storybook types not in the main tsconfig.
isChatInputCommand() guard: proper discord.js type guard instead of duck-typing is a correctness improvement.
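The `"sendTyping" in channel` narrowing praised above works on any union where only some members declare the method. The channel interfaces below are simplified stand-ins, not the discord.js types.

```typescript
// Hypothetical channel shapes standing in for discord.js channel types.
interface TypingChannel {
  id: string;
  sendTyping: () => Promise<void>;
}
interface SilentChannel {
  id: string;
}
type AnyChannel = TypingChannel | SilentChannel;

async function safeSendTyping(channel: AnyChannel): Promise<boolean> {
  // The `in` check narrows the union to the member that declares sendTyping.
  if ("sendTyping" in channel) {
    await channel.sendTyping();
    return true;
  }
  return false;
}
```

Inside the `if` branch TypeScript treats `channel` as `TypingChannel`, so no cast is needed.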

Overall this is a solid type cleanup pass. The rate-limit change is the most impactful behavioral difference and should be confirmed as intentional before merge.

claude · 2026-03-27T08:09:17Z

PR Review: fix(types): test project, split check-types, UI package, gateways

Overall this is well-structured work with clear documentation of design decisions. A few items worth addressing before merge.

Bugs / Correctness

x402-topup.test.ts: afterAll null safety regression - The refactor replaced null-guarded service patches with definite assignment (!) but the afterAll cleanup now runs unconditionally. If beforeAll throws before the service variables are assigned, afterAll will throw on property access, masking the root failure and potentially breaking test teardown for other suites. Restore optional chaining or null guards in afterAll even if the variable declarations no longer need the union type.

Behaviour Change Worth Calling Out

Rate limit presets: dev-mode unlimited -> production limits for local development - packages/lib/middleware/rate-limit.ts removes the isDevelopment pattern and always uses production-level limits (STANDARD: 60/min, STRICT: 10/min, etc.). The only difference now is in-memory vs Redis backing. This will hit local developers on standard routes once they exceed 60 req/min. If intentional, worth a note in dev setup docs.

Minor Issues

mergeProviderOptions silently drops unknown provider keys - The merge explicitly deep-merges gateway, anthropic, and google, but the initial spread means any other provider key present in both inputs will be shallow-overwritten by the second. Worth a comment so the next person adding a provider (mistral, cohere) knows to add it to the explicit merge block.
runtime-factory.ts: default import workaround is not grouped - The import declaration and the const re-cast are separated by an unrelated import. Grouping them or adding a comment would help future readers understand why the named binding does not work here.
ANTHROPIC_COT_BUDGET parse throws at request time, not startup - parseAnthropicCotBudgetFromEnv throws for invalid values and is called per-request, so a misconfigured env var produces 500s rather than a startup failure. Consider a module-level one-time parse or document that env-validator must run at startup.
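A one-time, module-level parse along the lines of the last item might look like this. The review says the real parser uses a strict digit regex and a safe-integer check; everything else here (the function name, treating empty or whitespace-only input as unset, the error text, the stand-in `env` object) is an assumption.

```typescript
// Hypothetical sketch; name, whitespace handling, and messages are assumptions.
function parseBudget(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined; // assumed: unset means no default
  const value = raw.trim();
  // Strict digits only: rejects "1e3", "-5", "12.5", "0x10".
  if (!/^\d+$/.test(value)) throw new Error(`Invalid budget: "${raw}"`);
  const parsed = Number(value);
  if (!Number.isSafeInteger(parsed)) throw new Error(`Budget out of range: "${raw}"`);
  return parsed;
}

// Stand-in for process.env so the sketch runs anywhere.
const env: Record<string, string | undefined> = { ANTHROPIC_COT_BUDGET: "4096" };

// Evaluated once when the module loads, so a bad value fails at startup.
const DEFAULT_BUDGET = parseBudget(env.ANTHROPIC_COT_BUDGET);
console.log(DEFAULT_BUDGET); // 4096
```

The trade-off is that a module-level constant cannot pick up env changes made later in the process, which matters mostly for tests that mutate the environment.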

What is Good

mockMiladyPricingMinimumDepositForRouteTests cleanly solves the Bun mock pollution problem; docs/unit-testing-milady-mocks.md is a valuable future reference.
Deep-merge helpers correctly preserve existing gateway.order and google.responseModalities when adding anthropic.thinking - the test suite covers the key combinations well.
Discord gateway: sendTyping guard and isChatInputCommand() check are correct defensive fixes.
Playwright APIRequestContext type fix removes the as-any cast cleanly.
fetch.preconnect binding on test fetch wrappers is a correct TypeScript conformance fix.
check-types:tests CI step closes a real coverage gap.
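The `preconnect` fix mentioned above comes down to copying a static member onto a wrapper function. The sketch uses simplified, hypothetical fetch types so that it runs outside Bun; the real tests wrap the global `fetch`.

```typescript
// Hypothetical stand-ins for a fetch type that carries a static `preconnect` member.
type FetchFn = (input: string, init?: { method?: string }) => Promise<string>;
type FetchWithPreconnect = FetchFn & { preconnect: (url: string) => void };

function wrapFetch(original: FetchWithPreconnect, before: (url: string) => void): FetchWithPreconnect {
  const wrapped: FetchFn = (input, init) => {
    before(input); // e.g. make sure the test server is up
    return original(input, init);
  };
  // Object.assign copies the static member, so the wrapper satisfies the full type.
  return Object.assign(wrapped, { preconnect: original.preconnect });
}
```

If the original `preconnect` depended on `this`, it would need to be bound before being copied; the sketch assumes it does not.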

Summary: The null-safety regression in x402-topup.test.ts afterAll is the only real bug. The rate-limit behaviour change is worth explicit acknowledgement. Everything else is minor or informational.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Spread order lets options override deploy-wide thinking config
- Swapped the spread order so mergeAnthropicCotProviderOptions comes after options, ensuring deploy-wide ANTHROPIC_COT_BUDGET cannot be overridden by caller-supplied options.

Or push these changes by commenting:

@cursor push 99924fb2eb

Preview (99924fb2eb)

diff --git a/packages/lib/api/a2a/skills.ts b/packages/lib/api/a2a/skills.ts
--- a/packages/lib/api/a2a/skills.ts
+++ b/packages/lib/api/a2a/skills.ts
@@ -91,8 +91,8 @@
         role: m.role as "user" | "assistant" | "system",
         content: m.content,
       })),
+      ...options,
       ...mergeAnthropicCotProviderOptions(model),
-      ...options,
     });
 
     let fullText = "";


packages/lib/api/a2a/skills.ts

Read user_characters.settings.anthropicThinkingBudgetTokens for agent chat; add ANTHROPIC_COT_BUDGET_MAX cap and resolution helpers. Wire MCP and A2A routes; extend tests, env-validator, and .env.example. Update CHANGELOG, site changelog, README, ROADMAP, anthropic-cot-budget doc, and route/module comments with WHYs (owner-controlled settings vs untrusted request params). Made-with: Cursor

…ridden Move mergeAnthropicCotProviderOptions spread after options spread in executeSkillChatCompletion to ensure the deploy-wide ANTHROPIC_COT_BUDGET configuration takes precedence over caller-supplied options. This aligns with all other call sites in the codebase which consistently place mergeAnthropicCotProviderOptions last to enforce operator-controlled thinking budgets.
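The fix relies on a basic property of object spread: when two sources share a key, the later one wins. The objects below are illustrative placeholders rather than the actual arguments in `skills.ts`.

```typescript
// Illustrative sketch: with object spread, the last source wins for duplicate keys.
const callerOptions = {
  providerOptions: { anthropic: { thinking: { type: "disabled" } } },
  maxRetries: 1,
};
const deployOptions = {
  providerOptions: { anthropic: { thinking: { type: "enabled", budgetTokens: 2048 } } },
};

// Before the fix: caller-supplied options come last and override the deploy config.
const callerWins = { ...deployOptions, ...callerOptions };
// After the fix: the deploy config comes last and takes precedence.
const deployWins = { ...callerOptions, ...deployOptions };

console.log(callerWins.providerOptions.anthropic.thinking.type); // "disabled"
console.log(deployWins.providerOptions.anthropic.thinking.type); // "enabled"
```

Note that the whole `providerOptions` value is replaced, not merged, so any other provider settings the caller passed under that key are dropped as well.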

odilitime · 2026-03-27T08:15:00Z

@cursor push 99924fb

…ridden Move mergeAnthropicCotProviderOptions spread after options spread in executeSkillChatCompletion to ensure the deploy-wide ANTHROPIC_COT_BUDGET configuration takes precedence over caller-supplied options. This aligns with all other call sites in the codebase which consistently place mergeAnthropicCotProviderOptions last to enforce operator-controlled thinking budgets. Applied via @cursor push command

Copilot

Pull request overview

Aligns repo typechecking and test typings across the workspace, introduces deploy-scoped Anthropic extended-thinking configuration helpers, and tightens/clarifies several gateway + test mocks/casts to reduce order-dependent failures.

Changes:

Split/expand typecheck coverage (tests project typecheck in CI, new per-package typecheck scripts, updated tsconfigs/paths).
Add Anthropic COT budget env + providerOptions merge helpers and thread them through gateway-backed routes/services.
Stabilize/retarget numerous unit/integration tests (mocks, env handling, casting, and fixture shapes), including Milady pricing mock helper + renamed cron test.

Reviewed changes

Copilot reviewed 100 out of 101 changed files in this pull request and generated 3 comments.


File	Description
tsconfig.test.json	Adds UI package path aliases and refines include/exclude for test typecheck.
services/agent-server/src/routes.ts	Broadens `set.status` typing for Elysia compatibility.
services/agent-server/package.json	Adds `typecheck` script.
packages/ui/tsconfig.json	Adds cross-workspace path aliases and excludes Storybook stories from UI typecheck.
packages/ui/src/styled-jsx.d.ts	Augments React style attributes for `styled-jsx`.
packages/tests/unit/z-milady-billing-route.test.ts	Reworks Milady billing cron test mocking to be order-stable across the suite.
packages/tests/unit/x402/facilitator-service.test.ts	Adjusts casts for stricter typing.
packages/tests/unit/wallet-auth.test.ts	Fixes mock typing/resolved values.
packages/tests/unit/waifu-bridge.test.ts	Uses a typed mutable env view for `NODE_ENV` writes.
packages/tests/unit/v1-milaidy-provision-route.test.ts	Uses shared Milady pricing mock helper.
packages/tests/unit/token-agent-linkage.test.ts	Refactors validation test into scenario loop with strict checks.
packages/tests/unit/security-validations.test.ts	Avoids `{}` prototype typing issues in pollution assertions.
packages/tests/unit/referrals-service.test.ts	Uses `toMatchObject` to reduce brittle equality.
packages/tests/unit/provisioning-jobs.test.ts	Fixes Vitest mock restore typing and fetch assignment typing.
packages/tests/unit/provisioning-jobs-followups.test.ts	Uses `toMatchObject` for partial comparisons.
packages/tests/unit/privy-sync.test.ts	Uses `toMatchObject` for hydrated user comparisons.
packages/tests/unit/pr385-round5-fixes.test.ts	Aligns fake service signature with optional id param.
packages/tests/unit/performance-optimizations.test.ts	Adds typed helpers to avoid repeated unsafe access to provider snapshots.
packages/tests/unit/oauth/token-cache.test.ts	Rewrites date normalization branch for clearer typing.
packages/tests/unit/oauth/provider-registry.test.ts	Adjusts optional chaining/non-null assertions in provider routes.
packages/tests/unit/oauth/oauth-service.test.ts	Tightens typings for provider routes/platform loops/status handling.
packages/tests/unit/oauth/errors.test.ts	Stabilizes enum value typing for expectations.
packages/tests/unit/milaidy-sandbox-bridge-security.test.ts	Fixes fetch typing and backup fixture shape updates.
packages/tests/unit/milaidy-pairing-token-route.test.ts	Stabilizes env setup/teardown for base domain; removes module mock.
packages/tests/unit/milaidy-agent-routes-followups.test.ts	Uses shared Milady pricing mock helper.
packages/tests/unit/milady-sandbox-service.test.ts	Introduces typed sandbox provider helper and updates sandbox fixture fields.
packages/tests/unit/milady-create-routes.test.ts	Uses shared Milady pricing mock helper.
packages/tests/unit/mcp-twitter-tools.test.ts	Adds typed OAuth fixtures + handler result typing; normalizes token result shape.
packages/tests/unit/mcp-proxy-affiliate-pricing.test.ts	Fixes fetch typing cast.
packages/tests/unit/mcp-lib.test.ts	Tightens auth context typing and assertions.
packages/tests/unit/mcp-hubspot-tools.test.ts	Adds typed OAuth fixtures/token helper and fixes fetch typing casts.
packages/tests/unit/mcp-google-tools.test.ts	Adds typed OAuth fixtures/handler result typing; fixes fetch typing casts.
packages/tests/unit/internal-jwt-auth.test.ts	Uses typed env writes for `NODE_ENV`.
packages/tests/unit/field-encryption.test.ts	Adds typed rows and mock signatures to match repository interfaces.
packages/tests/unit/evm-rpc-proxy-route.test.ts	Fixes fetch typing casts.
packages/tests/unit/engagement-metrics/compute-metrics-cron.test.ts	Aligns mock signatures with implementation params.
packages/tests/unit/engagement-metrics/admin-metrics-api.test.ts	Aligns mock signature with implementation params.
packages/tests/unit/eliza-app/whatsapp-auth.test.ts	Casts string literals for stricter typing compatibility.
packages/tests/unit/eliza-app/telegram-ux-helpers.test.ts	Fixes fetch typing cast.
packages/tests/unit/docker-ssh-cloud-deploy.test.ts	Adjusts logger redact import strategy.
packages/tests/unit/database-url.test.ts	Uses typed env record for safe env mutation + asserts.
packages/tests/unit/compat-envelope.test.ts	Updates Milady sandbox fixture with new billing fields.
packages/tests/unit/api/route-test-helpers.ts	Improves `routeParams` typing and fixes `File` BlobPart typing.
packages/tests/unit/api/openapi-catalog.test.ts	Refactors test structure and simplifies OPTIONS invocation.
packages/tests/unit/anthropic-thinking.test.ts	Adds unit tests for new Anthropic thinking env + merge helpers.
packages/tests/unit/admin-service-pricing-route.test.ts	Tightens auth mock typing and request construction.
packages/tests/runtime/performance.test.ts	Fixes UUID typing casts for agentId.
packages/tests/runtime/mcp-assistant-trending.test.ts	Updates debug rendering mode and failure handling structure.
packages/tests/runtime/integration/runtime-factory/oauth-cache-invalidation.test.ts	Removes `any` for MCP settings lookup with typed cast.
packages/tests/runtime/integration/performance/runtime-creation.test.ts	Fixes UUID typing casts for agentId.
packages/tests/runtime/integration/message-handler/mcp-tools.test.ts	Updates debug rendering mode and failure handling structure.
packages/tests/playwright/fixtures/auth.fixture.ts	Uses `APIRequestContext` instead of inferred request type and removes `any` call.
packages/tests/load-env.ts	Uses typed env write for `NODE_ENV`.
packages/tests/integration/x402-topup.test.ts	Refactors service patch typing and restores methods after suite.
packages/tests/integration/webhooks-e2e.test.ts	Wraps fetch to auto-sign webhooks and preserves `fetch.preconnect`.
packages/tests/integration/unified-oauth-api.test.ts	Tightens `it` wrapper signature.
packages/tests/integration/services/users.service.test.ts	Adds fixture guard and tightens nullable handling.
packages/tests/integration/services/users-join-regression.test.ts	Adds runtime guard for relation shape.
packages/tests/integration/services/organizations.service.test.ts	Simplifies membership assertion.
packages/tests/integration/server-wallets.test.ts	Updates mocked returning rows and loosens assertion to `toMatchObject`.
packages/tests/integration/mcp-registry.test.ts	Tightens test wrapper signature.
packages/tests/integration/financial/credits-budget-flow.test.ts	Removes `source` field from addCredits call in test.
packages/tests/integration/financial/concurrent-operations.test.ts	Removes `source` field from addCredits call in test.
packages/tests/helpers/mock-milady-pricing-for-route-tests.ts	Adds helper to safely override Milady MINIMUM_DEPOSIT without clobbering constants.
packages/tests/helpers/index.ts	Re-exports the new Milady pricing mock helper.
packages/tests/fixtures/mcp-test-character.ts	Updates `simpleTestCharacter` id to UUID-like value.
packages/tests/e2e/setup-server.ts	Wraps fetch to ensure server availability and preserves `fetch.preconnect`.
packages/services/gateway-discord/src/gateway-manager.ts	Tightens discord.js handler typing, adds characterId routing, guards sendTyping, and listener map casts.
packages/scripts/check-types-split.ts	Updates typecheck chunk selection and excludes stories from temp configs.
packages/lib/services/twitter-automation/app-automation.ts	Threads new Anthropic COT merge helper into gateway generateText calls.
packages/lib/services/seo.ts	Threads new Anthropic COT merge helper into SEO Claude calls.
packages/lib/services/app-promotion.ts	Threads new Anthropic COT merge helper into promo copy generation.
packages/lib/services/app-promotion-assets.ts	Uses merge helpers for google image modalities + Anthropic COT for copy generation.
packages/lib/services/app-builder-ai-sdk.ts	Threads new Anthropic COT merge helper into streamText calls.
packages/lib/providers/types.ts	Broadens `providerOptions` typing to shared JSON-serializable merged options.
packages/lib/providers/index.ts	Exports new anthropic-thinking helpers.
packages/lib/providers/cloud-provider-options.ts	Adds shared merged providerOptions type alias.
packages/lib/providers/anthropic-thinking.ts	Adds env parsing + merge helpers for Anthropic extended thinking.
packages/lib/middleware/rate-limit.ts	Makes preset numeric limits consistent across dev/prod; clarifies logging/docs.
packages/lib/eliza/runtime-factory.ts	Changes plugin-sql adapter import to default + casted factory signature.
packages/lib/config/env-validator.ts	Adds validation rule for `ANTHROPIC_COT_BUDGET`.
packages/lib/auth/jwks.ts	Updates jose KeyLike typings for cached keys.
packages/lib/api/a2a/skills.ts	Threads merge helpers into skill chat + image generation paths.
packages/content/changelog.mdx	Adds Mar 27, 2026 entry for Anthropic COT + Milady testing fixes.
package.json	Adds per-package typecheck scripts and updates unit test split scripts for renamed file.
docs/unit-testing-milady-mocks.md	Documents Bun `mock.module` pitfalls and Milady pricing mock strategy.
docs/anthropic-cot-budget.md	Documents `ANTHROPIC_COT_BUDGET` behavior and providerOptions merge rationale.
docs/ROADMAP.md	Records completed items for COT budget + Milady test stabilization.
app/api/v1/responses/route.ts	Uses gateway+Anthropic merge helper for providerOptions injection.
app/api/v1/messages/route.ts	Threads merge helper into non-stream and stream AI SDK calls.
app/api/v1/generate-image/route.ts	Uses merge helpers for google image modalities and optional Anthropic COT.
app/api/v1/chat/route.ts	Threads merge helper into chat streaming call.
app/api/v1/chat/completions/route.ts	Threads merge helper into streaming and non-streaming calls.
app/api/v1/admin/service-pricing/tests/route.integration.test.ts	Standardizes NextRequest construction via URL base.
app/api/mcp/tools/generation.ts	Threads merge helpers into MCP generation (text + image) tools.
app/api/agents/[id]/mcp/route.ts	Threads merge helper into tool-call streaming.
app/api/agents/[id]/a2a/route.ts	Threads merge helper into A2A streaming.
README.md	Adds links to new “WHY” engineering docs and clarifies `bun run test` staging.
CHANGELOG.md	Adds Keep-a-Changelog style engineering changelog entries for these changes.
.github/workflows/tests.yml	Adds `check-types:tests` to CI.
.env.example	Documents optional `ANTHROPIC_COT_BUDGET` env var.


Copilot · 2026-03-27T08:15:54Z

packages/lib/eliza/runtime-factory.ts

+const createDatabaseAdapter = createDatabaseAdapterDefault as (
+  config: { postgresUrl: string },
+  agentId: UUID,
+) => IDatabaseAdapter;


@elizaos/plugin-sql/node is being imported as a default export and then cast to a callable adapter factory. This bypasses type safety and can hide a runtime mismatch if the default export isn’t actually callable (or if its signature changes). Consider updating the local declare module "@elizaos/plugin-sql/node" typings to correctly model the default export (e.g., a callable function with any attached properties like schema) and then import it with the accurate type, rather than casting unknown here.

Suggested change

-const createDatabaseAdapter = createDatabaseAdapterDefault as (
-  config: { postgresUrl: string },
-  agentId: UUID,
-) => IDatabaseAdapter;
+type CreateDatabaseAdapterFn = (
+  config: { postgresUrl: string },
+  agentId: UUID,
+) => IDatabaseAdapter;
+
+function ensureCreateDatabaseAdapter(fn: unknown): CreateDatabaseAdapterFn {
+  if (typeof fn !== "function") {
+    throw new TypeError(
+      'Default export from "@elizaos/plugin-sql/node" is not a callable database adapter factory',
+    );
+  }
+  return fn as CreateDatabaseAdapterFn;
+}
+
+const createDatabaseAdapter = ensureCreateDatabaseAdapter(
+  createDatabaseAdapterDefault,
+);

Dismissed: Comment targets line 21 but identifier schema not found in file — code may have been removed or renamed

Fixed in 6455e14.

Copilot · 2026-03-27T08:15:54Z

packages/lib/config/env-validator.ts

+  ANTHROPIC_COT_BUDGET: {
+    required: false,
+    description:
+      "Default Anthropic extended-thinking token budget when a character omits settings.anthropicThinkingBudgetTokens. Unset or 0 disables unless the character sets a positive budget",
+    validate: (value: string) => {
+      const trimmed = value.trim();
+      if (trimmed === "") {
+        return false;
+      }
+      if (!/^\d+$/.test(trimmed)) {
+        return false;
+      }
+      const n = Number.parseInt(trimmed, 10);
+      return n >= 0 && n <= Number.MAX_SAFE_INTEGER;
+    },
+    errorMessage:
+      "Must be a non-negative integer string (0 = off; when enabling, Anthropic often expects at least ~1024)",
+  },


ANTHROPIC_COT_BUDGET parsing in anthropic-thinking.ts throws on invalid non-empty values, but the env validator treats invalid optional values as warnings (startup continues). That means a mis-set budget can turn into request-time exceptions across any route that spreads mergeAnthropicCotProviderOptions(...). Either make invalid values for this variable a hard validation error (fail fast at startup), or change the parsing/merge path to degrade safely (e.g., log once and treat as "off").

Fixed in fe10e6b.

Fixed in 18bfb11.
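
A minimal sketch of the "degrade safely" option mentioned above, assuming the parser may log once and treat a bad value as "off". The names parseCotBudgetSafe and warnOnce are hypothetical and are not taken from the PR.

```typescript
// Hypothetical helper (not the PR's code): parse ANTHROPIC_COT_BUDGET without throwing.
const warnedKeys = new Set<string>();

/** Logs a warning the first time a given key is seen, then stays silent. */
function warnOnce(key: string, message: string): void {
  if (warnedKeys.has(key)) return;
  warnedKeys.add(key);
  console.warn(message);
}

/** Returns a positive token budget, or null when thinking should stay off. */
function parseCotBudgetSafe(raw: string | undefined): number | null {
  if (raw === undefined) return null;
  const trimmed = raw.trim();
  if (trimmed === "") return null;
  if (!/^\d+$/.test(trimmed)) {
    warnOnce(
      "ANTHROPIC_COT_BUDGET",
      `Ignoring invalid ANTHROPIC_COT_BUDGET=${JSON.stringify(raw)}; thinking stays off`,
    );
    return null;
  }
  const n = Number.parseInt(trimmed, 10);
  // 0 and values beyond the safe integer range are both treated as "off" here.
  if (!Number.isSafeInteger(n) || n <= 0) return null;
  return n;
}
```

With this shape a mis-set variable costs one log line instead of a request-time exception; the fail-fast alternative keeps the throwing parser and rejects the value at startup.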

Copilot · 2026-03-27T08:15:55Z

packages/tests/unit/docker-ssh-cloud-deploy.test.ts

 // Importing via the file-system path with a cache-buster query param
 // guarantees we always get the real implementation regardless of mocks.
-import { redact } from "../../lib/utils/logger.ts?_real";
+import { redact } from "../../lib/utils/logger";


The comment above says this import uses a cache-buster query param to avoid mock.module("@/lib/utils/logger") poisoning, but the import no longer includes any cache-buster. If other unit files mock the logger module earlier in the same Bun process, this test may again receive the mocked module and lose the real redact export. Either restore the cache-buster (or another unmocked import strategy) or update the comment and ensure redact is always available in the mocked logger module(s).

Fixed in fe10e6b.

Dismissed: Line 16 still includes the cache-buster query param ?v=docker-ssh-test as required

odilitime · 2026-03-27T08:16:48Z

@coderabbitai review

coderabbitai · 2026-03-27T08:16:53Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


Changes: - rate-limit.ts: PR Review: fix(types): test project, split check-types, UI package, gateways ...

odilitime · 2026-03-27T08:25:33Z

@coderabbitai review

coderabbitai · 2026-03-27T08:25:38Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

claude · 2026-03-27T08:27:29Z

Code Review — PR #417: Anthropic CoT Controls

Overall the approach is sound: centralizing thinking-budget resolution in a single helper, keeping budgets off request bodies, and wiring env defaults through a type-safe merge layer. A few issues worth addressing before merge.

🐛 Potential Bugs

1. Temperature conflict in seo.ts and app-promotion.ts

Anthropic's API rejects requests with temperature ≠ 1 when extended thinking is enabled. The PR comment in chat/completions/route.ts says "@ai-sdk/anthropic strips temp/topP/topK when thinking is on", but seo.ts and app-promotion.ts use generateText (not streamText), and the two routes use hardcoded non-1 temperatures:

// seo.ts
const { text } = await generateText({
  model: gateway.languageModel(modelId),
  temperature: 0.3,           // ← will cause API error when ANTHROPIC_COT_BUDGET is set
  ...mergeAnthropicCotProviderOptions(modelId),
  ...

// app-promotion.ts
await generateText({
  model: gateway.languageModel(promoModel),
  temperature: 0.7,           // ← same risk
  ...mergeAnthropicCotProviderOptions(promoModel),

If the AI SDK Gateway path does not strip temperature for generateText the way the comment implies, this will surface as a 400 from the Anthropic API when ANTHROPIC_COT_BUDGET is non-zero. Recommend either (a) adding a comment at these call sites that names the SDK version guaranteeing the stripping, or (b) using temperature: 1 at these call sites when thinking is active.
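
One way to make these call sites independent of SDK stripping behavior is to drop the sampling overrides whenever a thinking budget is active. The sketch below is illustrative only: samplingForThinking and SamplingOptions are hypothetical names, and it assumes a positive budget means thinking is on.

```typescript
// Hypothetical call-site helper; the option names mirror the snippets above.
interface SamplingOptions {
  temperature?: number;
  topP?: number;
  topK?: number;
}

/**
 * Returns the sampling options to send. When thinking is active the overrides
 * are omitted, so the provider default applies instead of a custom temperature.
 */
function samplingForThinking(
  requested: SamplingOptions,
  thinkingBudgetTokens: number | null,
): SamplingOptions {
  const thinkingOn = thinkingBudgetTokens !== null && thinkingBudgetTokens > 0;
  return thinkingOn ? {} : { ...requested };
}
```

A call site would spread the result in place of the hardcoded temperature, for example ...samplingForThinking({ temperature: 0.3 }, budget).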

2. mergeGoogleImageModalitiesWithAnthropicCot now applied to ALL non-OpenAI image models

In generate-image/route.ts:

const cotOpts = isOpenAIModel
  ? mergeAnthropicCotProviderOptions(imageModel)    // returns {} for OpenAI — no-op
  : mergeGoogleImageModalitiesWithAnthropicCot(imageModel);  // applies google.responseModalities to ANY non-OpenAI model

Before this PR, google.responseModalities was only set in the two Google-specific streamConfig paths. After this change it's applied to every non-OpenAI model at the call site. If the image route ever serves an Anthropic or other non-Google image model, it will receive an unexpected google provider option. The AI SDK likely ignores it, but it changes observable behavior. The isOpenAIModel ? ... : ... branch would be more precise as isGoogleModel ? mergeGoogleImageModalitiesWithAnthropicCot : mergeAnthropicCotProviderOptions.

⚠️ Breaking Change for Developers

3. Rate-limit preset change removes dev relaxation

// Before
maxRequests: isDevelopment ? 10000 : 60,

// After
maxRequests: 60,

Local developers and CI integration tests that exercise HTTP API routes will now hit production-level rate limits (60 req/min STANDARD). Unit tests that spin up a real server or use fetch against local routes can fail intermittently. This is documented in the preset comments, but it's a behavioral change that will surprise contributors who run bun run test:integration locally. Worth calling out explicitly in the PR description or flagging in CONTRIBUTING.md.

🔍 Code Quality

4. mergeProviderOptions only deep-merges 3 known provider keys

if (a?.gateway && b?.gateway) { out.gateway = { ...a.gateway, ...b.gateway }; }
if (a?.anthropic && b?.anthropic) { out.anthropic = { ...a.anthropic, ...b.anthropic }; }
if (a?.google && b?.google) { out.google = { ...a.google, ...b.google }; }

Any other provider-specific keys (openai, vertex, bedrock, etc.) fall through to the { ...a, ...b } shallow spread, where b's top-level key silently overwrites a's. This is fine for current usage but worth a comment noting the limitation so future callers don't get surprised when adding new provider options.
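
For comparison, a merge that treats every provider key the same way could look like the sketch below. It is not the PR's mergeProviderOptions; the function name and the ProviderOptions alias are assumptions, and the merge is only one level deep per provider, like the three lines quoted above.

```typescript
// Hypothetical generic variant: merge each provider's options one level deep.
type ProviderOptions = Record<string, Record<string, unknown>>;

function mergeProviderOptionsGeneric(
  a: ProviderOptions | undefined,
  b: ProviderOptions | undefined,
): ProviderOptions {
  const out: ProviderOptions = {};
  for (const source of [a, b]) {
    if (!source) continue;
    for (const [provider, options] of Object.entries(source)) {
      // Later sources win per option key, but sibling keys from earlier sources survive.
      out[provider] = { ...(out[provider] ?? {}), ...options };
    }
  }
  return out;
}
```

Nested objects inside a provider entry (for example a thinking object under anthropic) are still replaced wholesale by the later source, so this only removes the special-casing of provider names.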

5. parsePositiveIntStrict accepts and returns 0

The function is named "strict positive" but returns 0 (since /^\d+$/ matches "0" and parseInt("0") = 0). Callers then do if (n <= 0) return null after the call. The naming and intent are slightly mismatched — a comment clarifying that "positive" here means "non-negative" would reduce confusion.

6. parseThinkingBudgetFromCharacterSettings silently truncates floats

const n = Math.trunc(raw);  // 1.9 → 1, 0.5 → 0

A stored value of 0.5 becomes 0 (thinking off), which may surprise an agent owner who intended a positive budget. Since the docs and interface say "integer ≥ 0", consider rejecting non-integers explicitly (!Number.isInteger(raw)) rather than silently truncating, or documenting that truncation occurs.
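
A sketch of the explicit-rejection option, assuming a non-integer or negative stored value should be treated as "not set" rather than truncated. The function name parseThinkingBudgetStrict is hypothetical; only the settings key comes from the PR description.

```typescript
// Hypothetical strict variant: reject non-integers instead of truncating them.
function parseThinkingBudgetStrict(
  settings: Record<string, unknown> | null | undefined,
): number | undefined {
  const raw = settings?.["anthropicThinkingBudgetTokens"];
  if (typeof raw !== "number") return undefined;
  // Number.isSafeInteger is false for NaN, Infinity, and fractional values.
  if (!Number.isSafeInteger(raw) || raw < 0) return undefined;
  return raw; // 0 is preserved so an agent can explicitly disable thinking
}
```

Under this variant 0.5 and 1.9 both come back as undefined, so the env default applies instead of a silent 0.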

✅ Strengths

The single resolution path in resolveAnthropicThinkingBudgetTokens is clean and easy to audit.
ANTHROPIC_COT_BUDGET_MAX as an operator ceiling over per-agent settings is a good safety valve.
failOnInvalid: true in env-validator.ts ensures bad config is caught at startup rather than at request time.
Test coverage for parse functions, merge helpers, and edge cases (0 disables, max clamps) is thorough.
Deep-merging providerOptions rather than shallow-spreading is the correct approach.
The CHANGELOG.md and docs/anthropic-cot-budget.md docs are well-written and explain the design rationale clearly.

cursor · 2026-03-27T17:15:14Z

app/api/agents/[id]/a2a/route.ts

+    process.env,
+    agentThinkingBudget ?? undefined,
+  );
+  const baseCost = await estimateRequestCost(model, fullMessages, effectiveThinkingBudget);


A2A route cost estimation inconsistent with MCP route

Medium Severity

The A2A route passes effectiveThinkingBudget (type number | null) directly to estimateRequestCost as maxOutputTokens. When the thinking budget is e.g. 5000, estimateRequestCost uses 5000 as the total output token estimate, replacing the default 500 base output tokens. The MCP route correctly adds the thinking budget on top of base output tokens (500 + effectiveThinkingBudget). This means the A2A route underestimates credit reservation when thinking is enabled. Additionally, when resolveAnthropicThinkingBudgetTokens returns null, it's passed as null to a parameter typed number | undefined.

Additional Locations (1)

app/api/agents/[id]/mcp/route.ts#L334-L341

Fixed in 67138cc.

cursor · 2026-03-27T17:15:14Z

app/api/agents/[id]/a2a/route.ts

+  // Note: estimateRequestCost adds thinking tokens to output estimate when budget is provided
+// Calculate estimated costs, including potential thinking budget
+  // Use resolveAnthropicThinkingBudgetTokens to get effective budget (same as MCP route)
+  // This ensures credit reservation accounts for thinking tokens


Duplicate comment block in A2A route handler

Low Severity

Lines 297-301 contain a duplicated comment — // Calculate estimated costs, including potential thinking budget appears twice in succession (once at line 297 with a // Note: continuation, and again at line 299 with different continuation text). This looks like an accidental merge artifact from editing the same block twice.

Fixed in 67138cc.


claude · 2026-03-27T17:15:47Z

PR Review: feat: Anthropic CoT controls

Overall this is well-structured work — the security rationale (budget lives on the character record, not request bodies) is sound, the deep-merge helpers are properly designed, and env validation with failOnInvalid is a good pattern. A few issues worth addressing before merge.

Bugs / Correctness

1. Duplicate comment block in app/api/agents/[id]/a2a/route.ts

There are two nearly-identical comment blocks left by accident around the cost estimation section:

// Calculate estimated costs, including potential thinking budget
// Note: estimateRequestCost adds thinking tokens to output estimate when budget is provided
// Calculate estimated costs, including potential thinking budget   ← duplicate
// Use resolveAnthropicThinkingBudgetTokens to get effective budget (same as MCP route)

One should be removed.

2. Inconsistent output-token estimation between MCP and A2A routes

MCP route (mcp/route.ts): manually adds effectiveThinkingBudget to a hardcoded baseOutputTokens = 500
A2A route (a2a/route.ts): passes effectiveThinkingBudget as a 3rd argument to estimateRequestCost

These do different things. If estimateRequestCost already accounts for the thinking budget, the MCP route's manual addition may double-count in some future refactor. The two routes should use the same approach.

3. model.includes("claude") is a weak check in mcp/route.ts

const estimatedOutputTokens = model.includes("claude") && effectiveThinkingBudget > 0

Every other check in this file uses getProviderFromModel(model) === "anthropic" or resolveAnthropicThinkingBudgetTokens (which checks provider internally). The .includes("claude") check could match a hypothetical non-Anthropic model with "claude" in the name. Use getProviderFromModel(model) === "anthropic" here for consistency.

4. getRateLimitMultiplier uses parseInt but validator allows decimals

env-validator.ts accepts RATE_LIMIT_MULTIPLIER values matching /^\d+(\.\d+)?$/ (decimals OK). But getRateLimitMultiplier() calls Number.parseInt, silently truncating decimal values like "1.5". Either align the validator to integers-only (/^\d+$/) or switch to Number.parseFloat in getRateLimitMultiplier.

5. No minimum budget enforcement in resolveAnthropicThinkingBudgetTokens

Anthropic requires budget_tokens ≥ 1024 when thinking is enabled. A character setting of, say, anthropicThinkingBudgetTokens: 50 would pass through resolveAnthropicThinkingBudgetTokens (it's > 0) and hit the API, which would reject it. Consider enforcing a minimum (e.g., 1024) on the resolved value, or at least documenting the Anthropic minimum in the settings key JSDoc.
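
A minimal sketch of enforcing a floor on the resolved value, assuming the 1024 minimum cited above and assuming that raising a too-small budget is preferable to rejecting it. Both the constant name and enforceMinimumBudget are hypothetical.

```typescript
// Assumption: 1024 is the provider minimum cited in the review comment above.
const ANTHROPIC_MIN_THINKING_BUDGET = 1024;

/** Raises small positive budgets to the floor; null or non-positive means thinking stays off. */
function enforceMinimumBudget(resolved: number | null): number | null {
  if (resolved === null || resolved <= 0) return null;
  return Math.max(resolved, ANTHROPIC_MIN_THINKING_BUDGET);
}
```

Raising the value increases the credit reservation for that agent, so rejecting the setting at save time is an equally defensible choice. An operator cap below 1024 would also need a defined outcome.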

Design / Quality

6. Root-level anthropic-thinking.ts and anthropic-thinking.test.ts are dead code

The routes import from @/lib/providers/anthropic-thinking, not from the repo root. The root-level files define a completely different API (CharacterThinkingSettings.anthropicThinking.enabled boolean vs the anthropicThinkingBudgetTokens integer in the packages/lib version) and have divergent constants (DEFAULT=10000, MIN=1000 vs no explicit minimums in packages/lib). These look like an early prototype that was superseded. They should be deleted — leaving them creates confusion about which implementation is authoritative.

7. RateLimitPresets lost as const

The change removes as const from RateLimitPresets, making the presets mutable objects. Any call site that spreads or destructures these can now accidentally mutate them. This was probably unintentional — add as const back, or use Object.freeze.
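
The two suggested guards differ in scope: as const is a compile-time check only, while freezing adds a runtime check. The sketch below shows both on a stand-in object. RateLimitPresetsSketch, its windowMs field, and deepFreeze are hypothetical; only maxRequests: 60 comes from the diff quoted earlier.

```typescript
// Stand-in for the real presets; windowMs is an assumed field for illustration.
const RateLimitPresetsSketch = {
  STANDARD: { windowMs: 60_000, maxRequests: 60 },
} as const; // compile-time: properties become readonly literal types

/** Recursively freezes an object so runtime writes are rejected as well. */
function deepFreeze<T extends object>(value: T): Readonly<T> {
  for (const child of Object.values(value)) {
    if (typeof child === "object" && child !== null) deepFreeze(child);
  }
  return Object.freeze(value);
}

const frozenPresets = deepFreeze(RateLimitPresetsSketch);
```

Object.freeze on its own is shallow, which is why the sketch recurses into nested preset objects.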

8. agentThinkingBudget ?? undefined in a2a/route.ts is a no-op

parseThinkingBudgetFromCharacterSettings returns number | undefined (never null), so agentThinkingBudget ?? undefined always equals agentThinkingBudget. This can be simplified to just agentThinkingBudget.

Missing Test Coverage

9. packages/lib/providers/anthropic-thinking.ts has no unit tests

The root anthropic-thinking.test.ts covers the orphaned root module. The production module at packages/lib/providers/anthropic-thinking.ts — which includes resolveAnthropicThinkingBudgetTokens, mergeProviderOptions, mergeAnthropicCotProviderOptions, mergeGoogleImageModalitiesWithAnthropicCot, and parseThinkingBudgetFromCharacterSettings — has no tests. These are the functions actually used by the routes and have edge cases (env cap wins, per-agent 0 disables despite env, maxCap < agentBudget clamps, model provider filter, etc.) that would benefit from test coverage.
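
The edge cases listed above can be pinned down against a small model of the resolution order. The sketch below is a stand-in, not the production resolveAnthropicThinkingBudgetTokens: the BudgetInputs shape and resolveBudgetSketch are hypothetical, and the precedence (agent setting, then env default, then cap) is inferred from the PR description.

```typescript
// Hypothetical model of the resolution order, for illustrating test cases only.
interface BudgetInputs {
  provider: string;          // e.g. "anthropic"; how it is derived from the model is out of scope
  agentBudget?: number;      // from character settings; 0 means "explicitly off"
  envDefault: number | null; // parsed ANTHROPIC_COT_BUDGET
  envMax: number | null;     // parsed ANTHROPIC_COT_BUDGET_MAX
}

function resolveBudgetSketch(input: BudgetInputs): number | null {
  if (input.provider !== "anthropic") return null;
  // ?? only falls through on undefined, so an agent value of 0 is respected.
  const requested = input.agentBudget ?? input.envDefault;
  if (requested === null || requested <= 0) return null;
  const capped = input.envMax !== null ? Math.min(requested, input.envMax) : requested;
  return capped > 0 ? capped : null;
}
```

Each bullet in the list above then maps to one assertion: a non-Anthropic provider yields null, an agent value of 0 yields null even with an env default, and a cap below the agent budget clamps the result.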

Minor

The mcp-test-character.ts fixture addition in packages/tests/fixtures/ — does this fixture now need to include a settings field to test the settings: Record<string, unknown> added to the character type? Verify the fixture is still sufficient for the MCP route tests.
parsePositiveIntStrict duplicates the validation logic already in env-validator.ts. Not blocking, but worth noting if the validation rules ever diverge.


claude · 2026-03-27T17:19:36Z

Code Review — PR #417: Anthropic CoT Controls

Overall the architecture is sound: character settings > env default > disabled, with proper security rationale (untrusted clients cannot set thinking budgets). The merge helpers are a good abstraction. A few issues worth addressing before merge:

Bugs / Correctness

1. Root-level anthropic-thinking.ts duplicates — different model list than production

anthropic-thinking.ts and anthropic-thinking.test.ts live at the repo root but the production implementation is in packages/lib/providers/anthropic-thinking.ts. The root-level version:

Is missing claude-3-7-sonnet, claude-sonnet-4, claude-opus-4 from EXTENDED_THINKING_MODEL_PATTERNS
Has a completely different API (validateBudgetTokens, getThinkingConfig, buildThinkingParam vs the production resolveAnthropicThinkingBudgetTokens)

The root-level test (anthropic-thinking.test.ts) imports from ./anthropic-thinking (root), so it's testing the wrong implementation. If these are dev artifacts they should be removed; if they're meant to be the canonical implementation they should replace or be reconciled with the packages/lib/providers/ version.

2. getRateLimitMultiplier uses parseInt but env validator accepts floats

In packages/lib/middleware/rate-limit.ts:

const parsed = Number.parseInt(multiplier, 10);

But env-validator.ts accepts floats for RATE_LIMIT_MULTIPLIER:

if (!/^\d+(\.\d+)?$/.test(trimmed)) {

Setting RATE_LIMIT_MULTIPLIER=1.5 passes validation but silently truncates to 1. Use Number.parseFloat in getRateLimitMultiplier, or restrict the validator to integers only.
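
A sketch of the parseFloat option, using the same pattern the validator applies. The function names are hypothetical, and falling back to 1 for missing, invalid, or non-positive values is an assumption rather than the PR's documented behavior.

```typescript
// Hypothetical parser aligned with the validator's /^\d+(\.\d+)?$/ pattern.
function getRateLimitMultiplierSketch(raw: string | undefined): number {
  if (raw === undefined) return 1;
  const trimmed = raw.trim();
  if (!/^\d+(\.\d+)?$/.test(trimmed)) return 1;
  const parsed = Number.parseFloat(trimmed);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : 1;
}

/** Applies the multiplier to a preset limit, never returning less than 1 request. */
function scaledLimit(baseLimit: number, multiplier: number): number {
  return Math.max(1, Math.floor(baseLimit * multiplier));
}
```

With this shape a multiplier of 1.5 turns the 60 req/min STANDARD preset into 90 instead of leaving it at 60.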

3. settings added as required field in character type narrowing

Both route handlers add:

settings: Record<string, unknown>;

to their local character type. If any existing user_characters DB row has a NULL settings column this will fail at runtime when parseThinkingBudgetFromCharacterSettings receives null. The function itself handles null gracefully, but the type annotation doesn't reflect that. It should be:

settings: Record<string, unknown> | null;

4. Thinking budget cost estimate inconsistency between MCP and A2A routes

MCP route:

const effectiveThinkingBudget = resolveAnthropicThinkingBudgetTokens(...) ?? 0;
const estimatedOutputTokens = model.includes("claude") && effectiveThinkingBudget > 0
  ? baseOutputTokens + effectiveThinkingBudget
  : baseOutputTokens;

A2A route:

const maxOutputTokens = effectiveThinkingBudget != null ? 500 + effectiveThinkingBudget : undefined;
const baseCost = await estimateRequestCost(model, fullMessages, maxOutputTokens);

The logic is different: MCP checks model.includes("claude") as a guard; A2A just checks != null. The resolveAnthropicThinkingBudgetTokens function already handles the provider/model check internally and returns null for non-Anthropic models, so the extra model.includes("claude") guard in MCP is redundant but harmless. Unifying these would reduce drift.
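
One way to remove the drift is for both routes to call a single helper that owns the base estimate and the thinking add-on. The sketch below is hypothetical: estimateOutputTokens is not in the PR, and the 500-token base is taken from the literals in the two snippets above.

```typescript
// Hypothetical shared helper; 500 matches the literal used by both routes above.
const BASE_OUTPUT_TOKENS = 500;

/**
 * Output-token estimate for credit reservation. A null, undefined, or
 * non-positive budget means thinking is off and only the base applies.
 */
function estimateOutputTokens(thinkingBudget: number | null | undefined): number {
  const extra =
    typeof thinkingBudget === "number" && thinkingBudget > 0 ? thinkingBudget : 0;
  return BASE_OUTPUT_TOKENS + extra;
}
```

Because the resolver is described as returning null for non-Anthropic models, neither route would need its own provider guard around this call.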

Design / Clarity

5. ANTHROPIC_COT_BUDGET=0 passes validation but is silently treated as disabled

The validator accepts 0 (n >= 0) but parseAnthropicCotBudgetFromEnv returns null when n <= 0. Both the env var description and the validator's errorMessage say "0 or unset = disabled", which is accurate, but accepting 0 at all may be less clear than documenting "use empty string or omit to disable". Not a bug, just potentially surprising to operators.

6. mergeProviderOptions note on non-deep-merged providers

The JSDoc notes that non-gateway/anthropic/google keys get clobbered. The current call sites don't pass conflicting providers, but this is a latent correctness risk as more providers are added. Consider either: (a) making the merge generic over all shared keys, or (b) adding a runtime assertion/warning if unknown keys collide.

7. a2a/skills.ts — no character context for CoT

The A2A skills route applies mergeAnthropicCotProviderOptions(model) (env-only, no character budget). This is documented as intentional (env-only for routes without a resolved character), but unlike api/agents/[id]/a2a it doesn't pass character settings even when a character is resolvable from the A2A context. Worth a comment explaining why skills don't get per-character budgets.

Minor

8. parsePositiveIntStrict throws on whitespace-only values — but callers trim before passing. The whitespace check in the function is unreachable for ANTHROPIC_COT_BUDGET since parseAnthropicCotBudgetFromEnv returns early for empty strings. Not harmful but slightly misleading.

9. Test file naming — packages/tests/unit/milady-billing-route.test.ts was renamed to z-milady-billing-route.test.ts (for ordering). This is a workaround for a Bun test runner issue. A comment explaining this naming convention (and ideally a note in the test file itself) would help future contributors avoid renaming it back.

What's Good

Security model is correct: budget on character record, not in request body. Clear rationale in docs.
The resolveAnthropicThinkingBudgetTokens resolution hierarchy (character > env default > disabled) is clean.
Explicit 0 passed to disable CoT in background services (SEO, promotion, Twitter) is the right approach where temperature matters.
failOnInvalid flag on env validator for startup-time failures is a useful addition.
Rate limit unification (same numbers prod/dev, multiplier knob) is cleaner than the old isDevelopment ? 10000 : 60 pattern.
RATE_LIMIT_DISABLED guarded to non-production only.

The main blocking items are #1 (root-level duplicate with wrong model list), #2 (parseInt/float mismatch), and #3 (nullable settings type).


claude · 2026-03-27T17:20:42Z

Eliza Cloud V2

Stack

Runtime: Bun
Framework: Next.js 15 (App Router)
Database: PostgreSQL + Drizzle ORM
Deployment: Vercel Serverless
UI: React + Tailwind CSS

Commands

bun install          # Install dependencies
bun run dev          # Start dev server
bun run build        # Production build
bun run db:migrate   # Apply database migrations
bun run db:generate  # Generate migration from schema
bun run db:studio    # Open Drizzle Studio

Database Migrations

Never use db:push - it's removed. All schema changes go through migrations.

Schema Change Workflow

1. Edit schema in db/schemas/
2. bun run db:generate
3. Review SQL in db/migrations/
4. bun run db:migrate
5. Commit both schema + migration

Custom Migrations

npx drizzle-kit generate --custom --name=descriptive_name

Rules

No CREATE INDEX CONCURRENTLY (runs in transaction)
Use IF NOT EXISTS / IF EXISTS for creating tables
Never edit applied migrations
NEVER use omnibus migrations that recreate the full schema or existing objects - they will fail in production by locking active tables. Instead:
1. Create small targeted migrations that ONLY add your new schema objects
2. Use separate migrations for data backfills
3. Put cleanup/drops in their own migration
4. Group related objects together but limit migrations to <100 lines
See docs/database-migrations.md for details

Type Checking

bun run check-types has many pre-existing errors across the codebase (db/, lib/services/, app/). Don't try to fix them all — only verify your changed files have no new errors. Filter output:

bun run check-types 2>&1 | grep -E "(your-file\.ts|your-other-file\.ts)"

If the grep returns empty, your changes are clean. bun run build also fails on unrelated env vars (ELIZA_APP_DISCORD_BOT_TOKEN). Use check-types filtered to your files instead.

Project Structure

app/           # Next.js App Router pages
lib/           # Business logic, services
db/
  schemas/     # Drizzle schema definitions
  migrations/  # SQL migration files
  repositories/# Data access layer
components/    # React components
scripts/       # CLI utilities

claude · 2026-03-27T17:20:51Z

Eliza Cloud

A comprehensive AI agent development platform built with Next.js 15, featuring multi-model AI generation (text, image, video), full elizaOS runtime integration, enterprise authentication, credit-based billing, and production-ready cloud infrastructure.

📋 Table of Contents

Overview
Key Features
Architecture
Tech Stack
Prerequisites
Quick Start
Development
Production Security
Platform Features
Database Architecture
API Reference
Deployment
Mobile App (iOS & Android)
AWS ECS Container Deployment
Troubleshooting
Additional Resources

🎯 Overview

Eliza Cloud V2 is a full-stack AI-as-a-Service platform that combines:

Multi-Modal AI Generation: Text chat, image creation, and video generation
elizaOS Integration: Full-featured autonomous agent runtime with memory, rooms, and plugins
SaaS Platform: User management, API keys, credit-based billing, usage tracking
Container Deployment: Deploy elizaOS projects via elizaos deploy CLI to AWS ECS
Enterprise Features: Privy authentication with multi-provider support, Stripe billing, ECR image storage, health monitoring

✨ Key Features

🤖 AI Generation Studio

Text & Chat:
- Multi-model support (GPT-4, Claude, Gemini, etc.) via AI SDK Gateway
- Real-time streaming responses
- Anthropic-compatible /api/v1/messages endpoint for Claude Code and Anthropic SDK clients
- Conversation persistence with full history
- Model selection and configuration
Image Creation:
- Google Gemini 2.5 Flash multimodal generation
- High-quality images (1024x1024)
- Automatic Vercel Blob storage
- Base64 preview + downloadable files
Video Generation:
- Multiple Fal.ai models: Veo3, Kling v2.1, MiniMax Hailuo
- Long-form video support (up to 5 minutes)
- Automatic Vercel Blob upload
- Fallback handling with error recovery

🧠 elizaOS Runtime Integration

Full Agent Runtime:
- AgentRuntime from @elizaos/core with PostgreSQL database
- Memory system with vector embeddings (384-3072 dimensions)
- Rooms, participants, relationships, and entities
- Plugin system with custom providers and actions
Character Creator:
- AI-assisted character definition builder
- Progressive JSON generation with live preview
- Import/export elizaOS-compatible character files
- Support for all character fields (bio, style, plugins, knowledge, etc.)
Agent Chat Interface:
- Chat with deployed elizaOS agents via rooms
- Message persistence and history
- Real-time WebSocket updates (future)
- Multi-agent conversations

💳 SaaS Platform Features

Billing System:
- Add funds via Stripe integration
- Automatic deduction for AI operations
- Usage tracking per organization/user
- Credit packs with volume pricing
API Key Management:
- Generate API keys for programmatic access
- Key rotation and regeneration
- Rate limiting per key
- Usage statistics and audit logs
Container Deployments:
- Deploy elizaOS projects via elizaos deploy CLI
- Docker-based deployments to AWS ECS (Elastic Container Service)
- ECR (Elastic Container Registry) for Docker image storage
- EC2-based ECS (t4g.small ARM instances, 1 per user)
- Health checks and monitoring via ECS

📊 Management & Analytics

Dashboard:
- Usage overview with charts (Recharts)
- Provider health monitoring
- Credit activity timeline
- Model usage breakdown
Gallery:
- View all generated images and videos
- Filter by type (image/video)
- Download or delete media
- Storage usage statistics
Analytics:
- Usage records by model, provider, type
- Cost breakdown and trends
- Error tracking and success rates

🔐 Security & Infrastructure

Enterprise Auth:
- Privy authentication with email, wallet, and social logins
- Organization and user management
- Webhook-based user synchronization
- Role-based access (admin, member)
Billing Integration:
- Stripe Checkout for credit purchases
- Webhook processing with idempotency
- Tax ID collection for businesses
- Invoice generation
Type Safety:
- Full TypeScript coverage
- Zod validation for API requests
- Drizzle ORM with type-safe queries

🏗 Architecture

Directory Structure

cloud/
├── app/                      # Next.js App Router
│   ├── api/                  # API routes
│   │   ├── v1/              # Versioned API
│   │   │   ├── chat/        # AI text generation
│   │   │   ├── generate-image/  # Image generation
│   │   │   ├── generate-video/  # Video generation
│   │   │   ├── gallery/     # Media gallery
│   │   │   ├── containers/  # Container management (AWS ECS/ECR)
│   │   │   ├── api-keys/    # API key CRUD
│   │   │   ├── character-assistant/  # Character creator AI
│   │   │   ├── user/        # User info
│   │   │   └── models/      # Available AI models
│   │   ├── eliza/           # elizaOS agent API
│   │   │   └── rooms/       # Agent rooms and messages
│   │   ├── stripe/          # Stripe webhooks and checkout
│   │   └── fal/             # Fal.ai proxy
│   ├── dashboard/           # Protected dashboard pages
│   │   ├── text/            # Text chat interface
│   │   ├── image/           # Image generation studio
│   │   ├── video/           # Video generation studio
│   │   ├── gallery/         # Generated media gallery
│   │   ├── containers/      # Container management UI
│   │   ├── api-keys/        # API key management
│   │   ├── billing/         # Credits and billing
│   │   ├── analytics/       # Usage analytics
│   │   ├── account/         # Account settings
│   │   ├── character-creator/  # Character builder
│   │   ├── eliza/           # elizaOS agent chat
│   │   └── storage/         # Storage management
│   ├── actions/             # Server actions
│   │   ├── auth.ts          # Auth actions
│   │   ├── gallery.ts       # Gallery actions
│   │   ├── characters.ts    # Character CRUD
│   │   ├── conversations.ts # Conversation management
│   │   └── users.ts         # User actions
│   ├── layout.tsx           # Root layout with analytics
│   ├── page.tsx             # Landing page
│   └── globals.css          # Global styles (Tailwind)
├── components/              # React components
│   ├── chat/                # Chat interfaces
│   ├── image/               # Image generation UI
│   ├── video/               # Video generation UI
│   ├── gallery/             # Gallery grid and display
│   ├── containers/          # Container tables
│   ├── api-keys/            # API key management UI
│   ├── billing/             # Credit packs and billing
│   ├── character-creator/   # Character builder UI
│   ├── dashboard/           # Dashboard metrics and cards
│   ├── layout/              # Header, sidebar, navigation
│   ├── theme/               # Theme provider and toggle
│   ├── ui/                  # Reusable UI components (45+ components)
│   └── ai-elements/         # AI-specific UI components
├── db/                      # Database layer
│   ├── sass/                # SaaS platform schema
│   │   └── schema.ts        # Organizations, users, API keys, credits, etc.
│   ├── eliza/               # elizaOS runtime schema
│   │   └── schema.ts        # Agents, memories, rooms, embeddings, etc.
│   ├── drizzle.ts           # Database client
│   └── migrations/          # Migration SQL files
├── lib/                     # Shared utilities
│   ├── queries/             # Database queries (12 files)
│   │   ├── api-keys.ts      # API key operations
│   │   ├── credits.ts       # Credit transactions
│   │   ├── containers.ts    # Container CRUD
│   │   ├── container-quota.ts  # Quota enforcement
│   │   ├── generations.ts   # Media generation records
│   │   ├── usage.ts         # Usage tracking
│   │   └── ...
│   ├── services/            # Business logic services
│   │   ├── ecr.ts           # AWS ECR integration
│   │   ├── ecs.ts           # AWS ECS deployment
│   │   ├── health-monitor.ts  # Provider health checks
│   │   └── containers.ts    # Container management
│   ├── eliza/               # elizaOS integration
│   │   ├── agent-runtime.ts # AgentRuntime wrapper
│   │   ├── agent.ts         # Agent management
│   │   └── plugin-assistant/  # Custom elizaOS plugin
│   ├── config/              # Configuration
│   │   ├── env-validator.ts # Environment validation
│   │   ├── env-consolidation.ts  # Config helpers
│   │   └── startup.ts       # Startup checks
│   ├── errors/              # Custom error classes
│   ├── middleware/          # Middleware utilities
│   ├── auth.ts              # Auth helpers
│   ├── blob.ts              # Vercel Blob utilities
│   ├── stripe.ts            # Stripe client
│   ├── pricing.ts           # Cost calculations
│   ├── rate-limiter.ts      # Rate limiting
│   ├── utils.ts             # General utilities
│   └── types.ts             # Shared TypeScript types
├── .env.example            # Environment template
├── docs/                    # Detailed documentation
│   ├── API_REFERENCE.md    # Complete API reference
│   ├── anthropic-cot-budget.md   # ANTHROPIC_COT_BUDGET + provider merge WHYs
│   ├── unit-testing-milady-mocks.md  # Bun mock.module + Milady pricing test WHYs
│   ├── ROADMAP.md          # Product direction and done items
│   ├── DEPLOYMENT.md       # Deployment guide
│   ├── DEPLOYMENT_TROUBLESHOOTING.md  # Troubleshooting
│   ├── STRIPE_SETUP.md     # Stripe integration
│   ├── ENV_VARIABLES.md    # Environment configuration
│   └── ...
├── scripts/                 # Utility scripts
│   ├── seed-credit-packs.ts
│   └── ...
├── middleware.ts            # Next.js middleware (auth)
├── drizzle.config.ts        # Drizzle Kit config
└── package.json             # Dependencies

Request Flow

graph TD
    A[Client Request] --> B[Next.js Middleware]
    B --> C{Auth Required?}
    C -->|Yes| D[Privy Auth]
    C -->|No| E[Route Handler]
    D -->|Authenticated| E
    D -->|Unauthenticated| F[Redirect to Login]
    E --> G{Request Type}
    G -->|AI Chat| H[AI SDK Gateway]
    G -->|Image/Video| I[Gemini/Fal.ai]
    G -->|Data| J[Drizzle ORM]
    G -->|Container| K[AWS ECS/ECR]
    G -->|elizaOS| L[AgentRuntime]
    H --> M[Response]
    I --> M
    J --> N[PostgreSQL]
    K --> M
    L --> N
    N --> M

Database Architecture

The platform uses a single database with integrated schemas:

Platform Schemas (db/schemas/*.ts): Platform infrastructure
- Organizations (db/schemas/organizations.ts), users (db/schemas/users.ts), authentication
- API keys (db/schemas/api-keys.ts), usage tracking (db/schemas/usage-records.ts)
- Credit system (db/schemas/credit-transactions.ts, db/schemas/credit-packs.ts), billing, Stripe integration
- Containers (db/schemas/containers.ts), ECS/ECR deployments
- Generations (db/schemas/generations.ts - image/video records)
- Conversations (db/schemas/conversations.ts - platform-level chat)
- elizaOS Tables (integrated via @elizaos/plugin-sql schema):
  - Agents (character definitions)
  - Memories with vector embeddings
  - Rooms and participants
  - Entities and relationships
  - Components and tasks
  - Message servers and channels

🛠 Tech Stack

Core Framework

Next.js 15.5.4: React framework with App Router, Turbopack, and Server Actions
React 19.2.0: Latest UI library with server components
TypeScript 5: Full type safety

Database & ORM

PostgreSQL: Single database with all tables
- Platform tables: SaaS tables (users, credits, containers, etc.)
- elizaOS tables: Agent runtime tables (agents, memories, rooms, etc.)
Drizzle ORM 0.44.6: Type-safe SQL ORM
Drizzle Kit 0.31.5: Migrations and schema management
pgvector: Vector similarity search for embeddings

Authentication & Billing

Privy Auth: Web3-native authentication with multi-provider support (email, wallet, social logins)
- @privy-io/react-auth for frontend
- @privy-io/server-auth for backend token verification
Stripe 19.1.0: Payment processing and credit purchases
@stripe/stripe-js 8.0.0: Client-side Stripe integration

AI & Machine Learning

AI SDK 5.0.60: Vercel AI SDK for streaming
@ai-sdk/gateway 1.0.33: Multi-provider AI routing
@ai-sdk/openai 2.0.43: OpenAI provider
@ai-sdk/react 2.0.60: React hooks for AI
@fal-ai/client 1.6.2: Fal.ai video generation
@elizaos/core 1.6.1: elizaOS agent runtime
@elizaos/plugin-openai 1.5.15: OpenAI plugin for elizaOS
@elizaos/plugin-sql 1.6.1: SQL database plugin for elizaOS

Storage & Infrastructure

Vercel Blob 2.0.0: Media storage (images/videos)
@aws-sdk/client-ecr 3.x: AWS Elastic Container Registry
@aws-sdk/client-ecs 3.x: AWS Elastic Container Service deployment

Styling & UI

Tailwind CSS 4.1.14: Utility-first CSS framework
Radix UI: 20+ accessible, unstyled UI primitives
Lucide React 0.545.0: Icon library (1000+ icons)
class-variance-authority 0.7.1: Component variants
next-themes 0.4.6: Dark/light mode support
motion 12.23.22: Animation library
Sonner 2.0.7: Toast notifications
Recharts 2.15.4: Charts for analytics

Development Tools

ESLint 9.37.0: Code linting
Prettier 3.6.2: Code formatting
tsx 4.19.2: TypeScript execution
Zod 4.1.11: Schema validation

📦 Prerequisites

Required Software

Node.js: v20 or higher
npm: v10 or higher
Git: For version control

Required Services

Neon Database (neon.tech)
- Create a new project
- Copy the connection string
Privy (privy.io)
- Create an application
- Configure webhook endpoint: http://localhost:3000/api/privy/webhook
- Enable desired login methods (email, wallet, social)
- Note your App ID and App Secret (used as NEXT_PUBLIC_PRIVY_APP_ID and PRIVY_APP_SECRET)
OpenAI or AI Gateway (at least one)
- OpenAI API key for direct access, OR
- AI Gateway API key for multi-provider access

Optional Services

Vercel Blob (vercel.com)
- Required for Gallery feature
- Create a Blob store and copy token
Fal.ai (fal.ai)
- Required for video generation
- Create account and get API key
AWS (aws.amazon.com)
- Required for container deployments
- AWS credentials (Access Key ID, Secret Access Key)
- ECS/ECR configuration, VPC, subnets, security groups
Stripe (stripe.com)
- Required for billing/credits
- Secret key and webhook secret

🚀 Quick Start

1. Clone and Install

cd cloud
bun install

2. Environment Setup

Copy the example environment file:

cp .env.example .env.local

Edit .env.local with your credentials (see .env.example for all options).

For local caching, the repo expects the Docker Redis container on localhost:6379:

REDIS_URL=redis://localhost:6379
KV_URL=redis://localhost:6379

Leave KV_REST_API_URL and KV_REST_API_TOKEN blank locally unless you are intentionally using Upstash.

Minimum required variables:

# Database
DATABASE_URL=postgresql://user:password@host:5432/database?sslmode=require

# Privy Authentication
NEXT_PUBLIC_PRIVY_APP_ID=your_privy_app_id_here
PRIVY_APP_SECRET=your_privy_app_secret_here
PRIVY_WEBHOOK_SECRET=replace_with_strong_random_secret

# AI (at least one)
OPENAI_API_KEY=sk-your_openai_key
# OR
AI_GATEWAY_API_KEY=your_gateway_key
# Optional for Groq-native models (groq/compound, groq/compound-mini)
GROQ_API_KEY=your_groq_key

Eliza App variables (for Telegram, iMessage, Discord, and WhatsApp integrations):

# JWT secret for Eliza App user sessions (required)
ELIZA_APP_JWT_SECRET=          # Generate: openssl rand -hex 32

# Telegram (optional)
ELIZA_APP_TELEGRAM_BOT_TOKEN=  # From @BotFather

# iMessage / Blooio (optional)
ELIZA_APP_BLOOIO_API_KEY=      # From Blooio dashboard

# Discord (optional)
ELIZA_APP_DISCORD_BOT_TOKEN=         # Developer Portal → Bot
ELIZA_APP_DISCORD_APPLICATION_ID=    # Developer Portal → General Information (also the OAuth2 Client ID)
ELIZA_APP_DISCORD_CLIENT_SECRET=     # Developer Portal → OAuth2 → Client Secret

# WhatsApp Business Cloud API (optional — for the public Eliza App bot)
ELIZA_APP_WHATSAPP_ACCESS_TOKEN=     # Meta Business Settings → System Users → Generate Token
ELIZA_APP_WHATSAPP_PHONE_NUMBER_ID=  # Meta App Dashboard → WhatsApp → API Setup
ELIZA_APP_WHATSAPP_APP_SECRET=       # Meta App Dashboard → Settings → Basic → App Secret
ELIZA_APP_WHATSAPP_VERIFY_TOKEN=     # Generate: openssl rand -hex 32
ELIZA_APP_WHATSAPP_PHONE_NUMBER=     # Display phone number in E.164 format (e.g. +14245074963)

Organization-level WhatsApp (Dashboard > Settings > Connections):

Per-organization WhatsApp credentials are stored in the database via the dashboard UI. Each organization connects their own WhatsApp Business account by entering their Access Token, Phone Number ID, and App Secret in the connections settings. The webhook URL and verify token are auto-generated and displayed after connecting.

See .env.example for the full list of Eliza App environment variables.

Generate secure secrets:

# Generate PRIVY_WEBHOOK_SECRET (min 32 chars)
openssl rand -base64 32

# Generate CRON_SECRET / ELIZA_APP_JWT_SECRET
openssl rand -hex 32

3. Database Setup

If you want the full local infrastructure stack, start Postgres and Redis first:

bun run db:local:start

Run migrations to create all tables:

npm run db:migrate

4. Seed Credit Packs (Optional)

If using Stripe billing:

npm run seed:credit-packs

This creates credit pack products in Stripe.

5. Start Development Server

npm run dev

Visit http://localhost:3000.

6. First Login

Click "Sign In" → Privy will create your user
You'll be redirected to the dashboard
Your organization starts with 10,000 credits

💻 Development

Available Scripts

# Development
npm run dev              # Start dev server with Turbopack
npm run build            # Production build with Turbopack
npm start                # Start production server

# Database
npm run db:generate      # Generate migrations from schema changes
npm run db:migrate       # Run pending migrations
npm run db:studio        # Open Drizzle Studio

# Code Quality
npm run lint             # Run ESLint
npm run lint:fix         # Auto-fix ESLint issues
npm run format           # Format with Prettier
npm run format:check     # Check formatting
npm run check-types      # TypeScript type checking

# Utilities
npm run seed:credit-packs   # Seed Stripe credit packs
npm run bootstrapper:build  # Build container bootstrapper

Testing (unit vs integration vs runtime)

Tests are split by kind; use the right script for what you want to run:

| Script | Directory | What it runs | Needs |
| --- | --- | --- | --- |
| `bun run test:unit` | `tests/unit/` | Unit tests (mocked deps, fast) | Env preload only; some skip without `DATABASE_URL` |
| `bun run test:integration` | `tests/integration/` | API/DB/E2E integration tests | `DATABASE_URL` (+ migrations); some need a running server |
| `bun run test:runtime` | `tests/runtime/` | Runtime/factory and perf tests | `DATABASE_URL` (+ migrations), heavier |
| `bun run test` | `test:repo-unit:bulk` + `special` | Two staged unit batches (see `package.json` for included/excluded files) | Env preload only (same family as `test:unit`) |
| `bun run test:playwright` | `tests/playwright/` | Playwright E2E (optional) | `@playwright/test` installed |

Env is loaded from .env, .env.local, and .env.test via preload.

Engineering docs (WHYs)

docs/unit-testing-milady-mocks.md — Why partial MILADY_PRICING mocks break other Milady modules under Bun, and how the billing cron tests isolate mock.module("@/db/client") contention.
docs/anthropic-cot-budget.md — Per-agent settings.anthropicThinkingBudgetTokens (MCP/A2A), env default (ANTHROPIC_COT_BUDGET) and cap (ANTHROPIC_COT_BUDGET_MAX), and why thinking budgets are not request parameters.
CHANGELOG.md — Engineering changelog (Keep a Changelog style).
docs/ROADMAP.md — Product direction and rationale; “Done” links to the above where relevant.

Development Workflow

Start dev server: npm run dev
Make changes: Edit files in app/, components/, lib/
Instant feedback: Turbopack provides sub-second HMR
Test features: Navigate to /dashboard routes
Check types: npm run check-types
Database changes: Edit db/schemas/*.ts → npm run db:generate → npm run db:migrate

Project Structure Guidelines

app/: Routes, API handlers, server actions
components/: Reusable React components
lib/: Business logic, database queries, services
db/: Database schemas and migrations
Server Components: Default for all components
Client Components: Only when needed ('use client')

🔐 Production Security

⚠️ IMPORTANT: Before deploying to production, you MUST complete the security configuration for Privy authentication.

Security Features Implemented

✅ Content Security Policy (CSP): Comprehensive CSP configured in next.config.ts that:

Protects against XSS attacks
Allows Privy authentication iframe
Allows WalletConnect and wallet integrations
Prevents clickjacking with frame-ancestors restrictions
Includes all required domains for Privy, Solana, and third-party services

✅ Security Headers: Multiple layers of protection:

X-Frame-Options: DENY (clickjacking protection)
X-Content-Type-Options: nosniff (MIME sniffing protection)
Referrer-Policy (referrer information control)
X-XSS-Protection (browser XSS protection)
Permissions-Policy (browser feature restrictions)

Pre-Production Checklist

Before deploying to production, complete these critical steps:

1. Configure Privy Dashboard

Visit https://dashboard.privy.io and configure:

✅ Add production domain to Allowed Domains
✅ Remove all test/development domains
✅ Enable HttpOnly cookies for enhanced security
✅ Complete domain ownership verification
✅ Configure MFA settings (disable SMS, enable authenticator apps)
✅ Review OAuth providers and session duration

2. Set Environment Variables

NEXT_PUBLIC_PRIVY_APP_ID=your_production_app_id
NEXT_PUBLIC_PRIVY_CLIENT_ID=your_production_client_id
PRIVY_APP_SECRET=your_production_app_secret
PRIVY_WEBHOOK_SECRET=strong_random_secret_here

3. Test Security Configuration

# Build and run in production mode
npm run build
npm run start

# Test security headers (in another terminal)
npm run test:security-headers

# Test all authentication flows
# - Login with all methods
# - Wallet connections
# - Transactions
# - Check browser console for CSP violations

Documentation

Complete security documentation is available:

Quick Start: docs/PRIVY_PRODUCTION_QUICKSTART.md - 5-minute setup guide
Complete Guide: docs/PRIVY_PRODUCTION_SECURITY.md - Comprehensive security documentation
CSP Testing: docs/CSP_TESTING_GUIDE.md - Testing and debugging CSP
Full Checklist: docs/PRODUCTION_CHECKLIST.md - Complete deployment checklist
Summary: SECURITY_UPDATES_SUMMARY.md - What was implemented

Testing Your Security Setup

Run the automated security headers test:

npm run test:security-headers

This will verify:

All security headers are present
CSP includes required Privy domains
Frame embedding protection is active
All security directives are configured

Common Security Issues

Issue: "Domain not allowed"

Solution: Add your domain to Privy Dashboard > Configuration > App settings > Allowed domains

Issue: Privy iframe not loading

Solution: Check browser console for CSP violations. Verify frame-src includes https://auth.privy.io

Issue: CSP violations

Solution: Review browser console, determine if legitimate, update next.config.ts if needed

See docs/CSP_TESTING_GUIDE.md for detailed troubleshooting.

🔧 Platform Features

1. AI Text Generation

Location: /dashboard/chat and /app/api/v1/chat/route.ts

Features:

Multi-model support (GPT-4, Claude, Gemini, etc.)
Real-time streaming responses with useChat hook
Conversation persistence with full history
Model selection dropdown
Token usage and cost tracking

Usage:

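import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";

// AI SDK 5: the endpoint and extra body fields are configured on the transport,
// and the component manages its own input state.
const { messages, sendMessage, status } = useChat({
  transport: new DefaultChatTransport({
    api: "/api/v1/chat",
    body: { model: "gpt-4o" },
  }),
});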

Cost: Token-based pricing from lib/pricing.ts

Anthropic Messages API (Claude Code): For tools that expect the Anthropic Messages API (e.g. Claude Code), use POST /api/v1/messages with the same request/response shape. Set ANTHROPIC_BASE_URL=https://cloud.milady.ai/api/v1 and ANTHROPIC_API_KEY to your Cloud API key so usage goes through Cloud credits instead of a direct Anthropic key. See API docs → Anthropic Messages. Why: single API key and billing for both OpenAI-style and Anthropic-style clients.

Public cloud agents (MCP / A2A) — Anthropic extended thinking: For POST /api/agents/{id}/mcp (chat tool) and POST /api/agents/{id}/a2a (chat), extended thinking uses the character’s settings.anthropicThinkingBudgetTokens when the model is Anthropic (0 = off; omitted = fall back to ANTHROPIC_COT_BUDGET). Optional ANTHROPIC_COT_BUDGET_MAX clamps any effective budget. Why: the agent owner controls cost/quality per agent; MCP/A2A clients cannot pass a thinking budget in the request (untrusted input). See docs/anthropic-cot-budget.md.
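The precedence described above (per-agent setting, then env default, then optional cap) can be sketched as a small pure function. This is an illustration only: resolveThinkingBudget and ThinkingEnv are hypothetical names, and the actual helper in packages/lib/providers/anthropic-thinking.ts may differ in shape and edge-case handling.

```typescript
// Hypothetical sketch of the budget resolution rules; not the real helper.
interface ThinkingEnv {
  defaultBudget?: number; // parsed ANTHROPIC_COT_BUDGET
  maxBudget?: number; // parsed ANTHROPIC_COT_BUDGET_MAX
}

// Returns the effective thinking budget in tokens, or null when thinking is off.
function resolveThinkingBudget(
  agentSetting: number | undefined,
  env: ThinkingEnv,
): number | null {
  // An explicit per-agent value wins (including 0); an omitted one falls back to the env default.
  const requested = agentSetting ?? env.defaultBudget ?? 0;
  if (!Number.isFinite(requested) || requested <= 0) return null; // 0 = off
  // The optional cap clamps whatever budget was chosen.
  const capped =
    env.maxBudget !== undefined ? Math.min(requested, env.maxBudget) : requested;
  return capped > 0 ? Math.floor(capped) : null;
}
```

Note that the function takes no request-level argument, which mirrors the rule that MCP/A2A clients cannot supply a budget themselves.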

2. AI Image Generation

Location: /dashboard/image and /app/api/v1/generate-image/route.ts

Features:

Google Gemini 2.5 Flash multimodal generation
High-quality 1024x1024 images
Automatic Vercel Blob upload
Base64 preview for instant display
Download functionality

API:

POST /api/v1/generate-image
Content-Type: application/json
Authorization: Bearer eliza_your_api_key

{
  "prompt": "A serene landscape with mountains and lake at sunset"
}

Cost: $0.01 per image

3. AI Video Generation

Location: /dashboard/video and /app/api/v1/generate-video/route.ts

Features:

Multiple Fal.ai models:
- fal-ai/veo3 (Google Veo 3)
- fal-ai/veo3/fast (faster version)
- fal-ai/kling-video/v2.1/pro/text-to-video (Kling Pro)
- fal-ai/minimax/hailuo-02/pro/text-to-video (MiniMax)
Automatic Vercel Blob upload
Progress tracking with queue updates
Fallback video on errors

API:

POST /api/v1/generate-video
Content-Type: application/json
Authorization: Bearer eliza_your_api_key

{
  "prompt": "A cinematic shot of a spaceship flying through stars",
  "model": "fal-ai/veo3"
}

Cost: $0.05 per video ($0.025 for fallback)

4. Gallery & Media Storage

Location: /dashboard/gallery

Features:

View all generated images and videos
Filter by type (image, video, all)
Grid layout with thumbnails
Full-size preview with details
Download media files
Delete from both DB and Vercel Blob
Storage usage statistics

Vercel Blob Benefits:

Global CDN delivery (19 edge regions)
Public access with unguessable URLs
Automatic caching
No upload fees (only downloads charged)
Hierarchical folder structure

Setup:

# 1. Create Blob store in Vercel Dashboard
# 2. Copy BLOB_READ_WRITE_TOKEN to .env.local
BLOB_READ_WRITE_TOKEN=vercel_blob_rw_your_token

5. Container Deployments

Location: /dashboard/containers and /app/api/v1/containers/route.ts

Features:

Deploy elizaOS projects via elizaos deploy CLI
Multi-project support: Deploy multiple different projects per user
Multi-architecture support: Auto-detects platform and deploys to matching AWS instance type
- ARM64: t4g.small (AWS Graviton2, $15.76/month) - Recommended for cost savings
- x86_64: t3.small (Intel/AMD, $18.68/month) - Universal compatibility
Smart update detection: Automatically detects and updates existing deployments
Docker-based deployments to AWS ECS (Elastic Container Service)
ECR (Elastic Container Registry) for Docker image storage with project-specific repositories
CloudFormation stack per project: elizaos-{userId}-{projectName}
Optimized health checks (15s interval, 5min grace period)
Health monitoring via CloudWatch and ECS
Quota enforcement (prevents race conditions)
Environment variable injection
Credit-based billing with automatic deduction
Container management CLI: elizaos containers list|delete|logs
Async deployment: API returns immediately; the CLI polls for status and displays progress

How It Works:

User gets API key from /dashboard/api-keys
User runs elizaos deploy --project-name my-project --api-key eliza_xxxxx from project directory
CLI auto-detects if project already deployed (checks project_name)
CLI requests ECR credentials from the cloud API
CLI builds Docker image locally using project's Dockerfile (or generates one)
CLI pushes Docker image to project-specific ECR repository
CLI creates/updates container deployment via cloud API:
- Fresh deployment: Creates new CloudFormation stack
- Update deployment: Updates existing CloudFormation stack (zero-downtime)
Cloud provisions/updates dedicated EC2 instance with ECS
Container accessible via AWS Load Balancer URL
Credits automatically deducted based on container resources (CPU/memory)
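The create-versus-update decision in the steps above can be sketched as follows. The function name and the list of deployed project names are assumptions for illustration; only the stack name format elizaos-{userId}-{projectName} comes from this README.

```typescript
type DeployAction = "create" | "update";

interface DeployPlan {
  action: DeployAction;
  stackName: string;
}

// Hypothetical helper: picks a fresh deployment or an update based on project_name.
function planDeployment(
  userId: string,
  projectName: string,
  deployedProjects: readonly string[],
): DeployPlan {
  return {
    // A project that already exists for this user is treated as an update.
    action: deployedProjects.includes(projectName) ? "update" : "create",
    // One CloudFormation stack per project.
    stackName: `elizaos-${userId}-${projectName}`,
  };
}
```

Keying the stack on both the user ID and the project name is what lets one user run several independent projects.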

Multi-Project Example:

# Deploy first project
cd ~/chatbot
elizaos deploy --project-name chatbot --api-key eliza_xxx
# URL: https://fc51b251-chatbot.containers.elizacloud.ai

# Deploy second project (same user, different project)
cd ~/assistant
elizaos deploy --project-name assistant --api-key eliza_xxx
# URL: https://fc51b251-assistant.containers.elizacloud.ai

# Update chatbot
cd ~/chatbot
# ... make changes ...
elizaos deploy --project-name chatbot  # Auto-detected as update
# URL unchanged: https://fc51b251-chatbot.containers.elizacloud.ai

Human-Readable URLs:

Format: https://{userId-prefix}-{project-name}.containers.elizacloud.ai
Example: https://fc51b251-chatbot.containers.elizacloud.ai
Uses first segment of UUID + project name for easy recognition
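A minimal sketch of how such a URL could be derived, assuming the project name is already a valid DNS label (lowercase letters, digits, hyphens). containerUrl is a hypothetical helper name, not a function from the codebase.

```typescript
// Hypothetical helper: builds the public container URL from a user UUID and project name.
function containerUrl(userId: string, projectName: string): string {
  const prefix = userId.split("-")[0]; // first segment of the UUID
  return `https://${prefix}-${projectName}.containers.elizacloud.ai`;
}
```

For a user ID beginning with fc51b251 and the project chatbot, this yields the example URL shown above.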

Instance Specs (Auto-Selected):

ARM64 (t4g.small - Recommended):

2 vCPUs (ARM Graviton2)
2 GiB RAM (2048 MB)
$15.76/month ($12.26 instance + $3.50 storage/monitoring)
Default container allocation: 1.75 vCPU (1792 units), 1.75 GiB RAM (1792 MB, 87.5% of instance)

x86_64 (t3.small - Universal):

2 vCPUs (Intel/AMD)
2 GiB RAM (2048 MB)
$18.68/month ($15.18 instance + $3.50 storage/monitoring)
Default container allocation: 1.75 vCPU (1792 units), 1.75 GiB RAM (1792 MB, 87.5% of instance)

Platform is automatically detected from your system. ARM64 provides better cost efficiency ($2.92/month savings) while x86_64 ensures universal compatibility.
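The monthly figures and the savings can be checked with a few lines of arithmetic. The sketch below works in integer cents to avoid floating-point rounding; the constant and function names are illustrative only.

```typescript
// All prices in cents, copied from the instance specs above.
const STORAGE_MONITORING_CENTS = 350;

const instances = {
  arm64: { type: "t4g.small", instanceCents: 1226 },
  x86_64: { type: "t3.small", instanceCents: 1518 },
} as const;

function monthlyCents(arch: keyof typeof instances): number {
  return instances[arch].instanceCents + STORAGE_MONITORING_CENTS;
}

const armMonthly = monthlyCents("arm64"); // 1576 cents = $15.76
const x86Monthly = monthlyCents("x86_64"); // 1868 cents = $18.68
const savingsCents = x86Monthly - armMonthly; // 292 cents = $2.92

// Default container allocation: 1792 MB of a 2048 MB instance.
const allocationShare = 1792 / 2048; // 0.875, i.e. 87.5%
```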

Container Management:

# List all containers (with project names)
elizaos containers list --api-key eliza_xxx

# View logs (auto-detects from current directory)
cd ~/chatbot
elizaos containers logs  # Finds chatbot project automatically

# Delete container (auto-detects from current directory)
cd ~/chatbot
elizaos containers delete  # Finds and deletes chatbot project

Deployment Architecture:

┌──────────────┐
│   CLI Tool   │
│  (elizaos)   │
└──────┬───────┘
       │ 1. Request ECR credentials
       ▼
┌──────────────┐
│  Cloud API   │
│   (Next.js)  │
└──────┬───────┘
       │ 2. Return ECR auth token + repository
       ▼
┌──────────────┐
│  Docker CLI  │
│ (local build)│
└──────┬───────┘
       │ 3. Push image to ECR
       ▼
┌──────────────┐     4. Deploy container     ┌──────────────┐
│     ECR      │ ─────────────────────────▶ │  EC2 + ECS   │
│  (Registry)  │                              │  (Runtime)   │
└──────────────┘                              └──────┬───────┘
                                                      │
                                                      ▼
                                              ┌──────────────┐
                                              │ Load Balancer│
                                              │   (Public)   │
                                              └──────────────┘

Docker Image Requirements:

Must expose a port (default: 3000)
Must include a /health endpoint for ECS health checks
Dockerfile can be auto-generated if not present
Environment variables passed from cloud API

API:

POST /api/v1/containers
Content-Type: application/json
Authorization: Bearer eliza_your_api_key

{
  "name": "my-agent",
  "port": 3000,
  "max_instances": 1,
  "environment_vars": {
    "NODE_ENV": "production"
  },
  "ecr_image_uri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/elizaos/my-project:latest"
}

Requirements:

AWS account with ECS/ECR/EC2 access and CloudFormation permissions
elizaOS Cloud account with API key
VPC with public subnets configured
IAM roles for ECS task execution
Environment variables set (see .env.example and docs/ENV_VARIABLES.md)

6. elizaOS Agent Integration

Location: /dashboard/chat and lib/eliza/

Features:

Full AgentRuntime from @elizaos/core
PostgreSQL-backed memory system
Vector embeddings (384-3072 dimensions)
Rooms for conversations
Participants and relationships
Custom plugins and providers

Database Schema:

agents: Character definitions
memories: Conversation history
embeddings: Vector similarity search
rooms: Conversation contexts
entities: Users and participants
relationships: Entity connections

API:

# Create room
POST /api/eliza/rooms
{
  "agentId": "uuid",
  "name": "Chat Room"
}

# Send message
POST /api/eliza/rooms/{roomId}/messages
{
  "content": "Hello, agent!",
  "authorId": "user-uuid"
}

7. Character Creator

Location: /dashboard/character-creator and /app/api/v1/character-assistant/route.ts

Features:

AI-assisted character building using GPT-4o-mini
Progressive JSON generation
Live preview of character definition
Import/export elizaOS-compatible JSON
Support for all character fields:
- name, username, bio, system prompt
- messageExamples, postExamples
- topics, adjectives, style
- plugins, knowledge, settings

Workflow:

User describes character in natural language
AI generates JSON incrementally
User sees live preview
AI suggests improvements
Export as elizaOS character file

Example:

{
  "name": "Alex",
  "bio": ["A friendly AI assistant", "Specializes in technical support"],
  "adjectives": ["helpful", "knowledgeable", "patient"],
  "system": "You are a helpful technical support agent...",
  "style": {
    "chat": ["Be concise", "Use bullet points"],
    "post": ["Be professional", "Include examples"]
  },
  "plugins": ["@elizaos/plugin-sql", "@elizaos/plugin-openai"]
}
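When importing a character file like the one above, a basic shape check can catch obvious mistakes early. The sketch below covers only a hypothetical subset of fields; CharacterDraft and checkCharacterDraft are illustrative names, not the @elizaos/core Character type or its validator.

```typescript
// Hypothetical subset of the character fields listed above.
interface CharacterDraft {
  name: string;
  bio: string[];
  adjectives?: string[];
  system?: string;
  style?: { chat?: string[]; post?: string[] };
  plugins?: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Returns a list of problems; an empty list means the draft passed these basic checks.
function checkCharacterDraft(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["character must be an object"];
  }
  const draft = value as Record<string, unknown>;
  const problems: string[] = [];
  const name = draft["name"];
  if (typeof name !== "string" || name.trim() === "") {
    problems.push("name is required");
  }
  if (!isStringArray(draft["bio"])) {
    problems.push("bio must be an array of strings");
  }
  if (draft["plugins"] !== undefined && !isStringArray(draft["plugins"])) {
    problems.push("plugins must be an array of strings");
  }
  return problems;
}
```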

8. Developer API & Programmatic Access

Location: Documented management endpoints that explicitly note API key support

API key authentication is available for the specific endpoints documented in this README (for example: /api/v1/chat, /api/v1/chat/completions, /api/v1/messages, /api/v1/generate-image, /api/v1/generate-video, /api/v1/containers, /api/v1/voice/*, /api/v1/billing/*, /api/v1/models, /api/v1/gallery). Not every /api/v1/ or /api/my-agents/ route supports API keys today, so rely on the documented list. The supported endpoints enable:

Programmatic Agent Management: Create, update, delete, and clone agents via API
Voice Integration: Text-to-speech, speech-to-text, and voice cloning for voice-enabled applications
Billing Automation: Monitor balance, configure auto-top-up, and manage credits programmatically
AI Agent Autonomy: Enable AI agents to manage their own resources and budgets

Session-based auth only (no API key support yet): /api/v1/api-keys, /api/v1/apps/[id]/deploy, /api/v1/dashboard, /api/my-agents/characters/[id]/track-interaction.

Why API Keys for Management Endpoints?

Traditional SaaS platforms only expose limited APIs. We've enabled API key authentication across these management endpoints because:

Developer Experience: Developers can build integrations without browser-based auth flows
Agent Autonomy: AI agents need to manage their own resources (credits, other agents, voices) autonomously
Automation: CI/CD pipelines, scripts, and external systems can interact with the platform programmatically
No Vendor Lock-in: Generic endpoint paths (/api/v1/voice/ instead of provider-specific paths) allow switching providers without breaking integrations

Generic Voice API: Voice endpoints use provider-agnostic paths (/api/v1/voice/tts instead of /api/elevenlabs/tts) so your code doesn't need to change if the underlying provider changes. Legacy paths are preserved for backwards compatibility.

Billing Management: Agents and developers can configure auto-top-up settings programmatically, so autonomous agents are less likely to stop working when credits run low.

9. API Key Management

Location: /dashboard/api-keys and /app/api/v1/api-keys/route.ts

Features:

Generate API keys for programmatic access
Key rotation and regeneration
Rate limiting per key (default 1000 req/day)
Usage tracking and statistics
expires_at support for time-limited keys
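To make the per-key daily limit concrete, here is a minimal in-memory sketch of a fixed-window counter. It assumes the window is a UTC calendar day and that state lives in process memory; the platform's real limiter (lib/rate-limiter.ts) is not shown in this README and may work differently, for example by using Redis.

```typescript
// Illustrative fixed-window limiter; not the platform's implementation.
class DailyRateLimiter {
  private readonly counts = new Map<string, number>();
  private readonly defaultLimit: number;

  constructor(defaultLimit = 1000) {
    this.defaultLimit = defaultLimit;
  }

  // Returns true and counts the request if the key is still under its limit for that day.
  allow(keyId: string, now: Date, limit?: number): boolean {
    const max = limit ?? this.defaultLimit;
    const bucket = `${keyId}:${now.toISOString().slice(0, 10)}`; // key ID + UTC date
    const used = this.counts.get(bucket) ?? 0;
    if (used >= max) return false;
    this.counts.set(bucket, used + 1);
    return true;
  }
}
```

A fixed window is the simplest scheme to reason about; it allows short bursts at day boundaries, which a sliding window would smooth out.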

Key Format: eliza_<random_32_chars>

API:

# Create API key
POST /api/v1/api-keys
{
  "name": "Production API Key",
  "description": "Main production key",
  "rate_limit": 10000
}

# Regenerate key
POST /api/v1/api-keys/{id}/regenerate

# Delete key
DELETE /api/v1/api-keys/{id}

Using API Keys:

curl https://your-app.com/api/v1/chat \
  -H "Authorization: Bearer eliza_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
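The same request can be built in TypeScript. The sketch below only constructs the URL and request options so it can be inspected without a network call; buildChatRequest is a hypothetical helper, and the base URL is the same placeholder used in the curl command.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Hypothetical helper mirroring the curl command: bearer token plus a JSON body.
function buildChatRequest(baseUrl: string, apiKey: string, messages: ChatMessage[]) {
  return {
    url: `${baseUrl}/api/v1/chat`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ messages }),
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildChatRequest("https://your-app.com", "eliza_your_key_here", [
//   { role: "user", content: "Hello" },
// ]);
// const response = await fetch(url, init);
```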

10. Credit System & Billing

Location: /dashboard/billing and lib/queries/credits.ts

Features:

Credit-based pricing model
Stripe integration for purchases
Credit packs with volume discounts
Automatic deduction on usage
Transaction history
Organization-level balance

Pricing:

Text Chat: Token-based (varies by model)
Image Generation: $0.01 per image
Video Generation: $0.05 per video
Container Running: $0.67/day (~$20/month) - billed daily
Container Deployment: $0.50 one-time per deployment
Voice Clone (Instant): 50 credits
Voice Clone (Professional): $2.00

Stripe Integration:

Credit pack products defined in Stripe
Checkout session for purchases
Webhook processing for fulfillment
Idempotency for duplicate webhooks

Credit Packs (example):

[
  { name: "Starter", credits: 10000, price: 9.99 },
  { name: "Pro", credits: 50000, price: 39.99 },
  { name: "Enterprise", credits: 200000, price: 129.99 },
];

Setup:

See docs/STRIPE_SETUP.md for detailed Stripe configuration.

Referrals & Affiliates

Referrals: Signup-based. When a user signs up with a referral code, we record the link; when they buy credits (Stripe or x402), we redistribute 100% of that purchase in a 50/40/10 split (ElizaCloud / app owner / creator). Signup and qualified bonuses ($1 + $0.50 + $0.50) are minted as marketing spend, not carved from revenue. Why: One predictable split model; no risk of over-paying (splits always sum to 100%).
Affiliates: Link-based. Users can be linked to an affiliate code; on auto top-up and MCP usage we add a markup (default 20%) to what the customer pays and pay that to the affiliate. Why: Affiliate cost is passed to the customer, so we never over-allocate.
No double-apply: Referral splits apply only to Stripe checkout and x402; affiliate markup only to auto top-up and MCP. No single transaction pays both.

See docs/referrals.md for flow, API, and revenue math; docs/affiliate-referral-comparison.md for comparison with the other cloud repo.
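
As a worked illustration of the two models above, here is a minimal sketch in integer cents. The function names and the rounding rule (remainder goes to the platform share) are assumptions for illustration; docs/referrals.md remains the reference for the actual revenue math.

```typescript
interface ReferralSplit {
  platformCents: number; // ElizaCloud share (50%)
  appOwnerCents: number; // app owner share (40%)
  creatorCents: number; // creator share (10%)
}

// Assumption: shares are floored and the platform absorbs the rounding
// remainder, so the three parts always sum to the purchase amount.
function splitReferralPurchase(purchaseCents: number): ReferralSplit {
  if (!Number.isInteger(purchaseCents) || purchaseCents < 0) {
    throw new RangeError("purchaseCents must be a non-negative integer");
  }
  const appOwnerCents = Math.floor((purchaseCents * 40) / 100);
  const creatorCents = Math.floor((purchaseCents * 10) / 100);
  const platformCents = purchaseCents - appOwnerCents - creatorCents;
  return { platformCents, appOwnerCents, creatorCents };
}

// Affiliate markup is added on top of the base price (default 20%),
// so the affiliate payout never comes out of platform revenue.
function applyAffiliateMarkup(baseCents: number, markupRate = 0.2) {
  const affiliateCents = Math.round(baseCents * markupRate);
  return { customerPaysCents: baseCents + affiliateCents, affiliateCents };
}

const split = splitReferralPurchase(999);
const marked = applyAffiliateMarkup(1000);
```

For a $9.99 purchase this yields 501 / 399 / 99 cents, which sums to 999; a $10.00 auto top-up with the default markup costs the customer $12.00, of which $2.00 goes to the affiliate.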

Signup codes

What: One-time bonus credits per organization, e.g. launch50 → $50. Codes are defined in the SIGNUP_CODES_JSON env var (JSON object); if unset, defaults to {} (no codes). Why env var: So each environment (staging, prod) can have its own codes without committing them; no config file in the repo.
Where: Redeem via POST /api/signup-code/redeem (session auth only) or during Discord/Telegram signup by passing signup_code in the auth body. Why one per org: Prevents abuse (one shared code = one bonus per org) and keeps "welcome bonus" semantics.
Distinct from referrals: Referrals split revenue on purchases (50/40/10). Signup codes are flat campaign bonuses; an org can use both. See docs/signup-codes.md for API, security (rate limit CRITICAL, no-cache, two-layer one-per-org), and WHYs.
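
A minimal sketch of reading SIGNUP_CODES_JSON, assuming the object maps each code to a bonus amount in dollars (the value shape is an assumption; docs/signup-codes.md is authoritative). The env value is passed in as a parameter, and parseSignupCodes is a hypothetical name.

```typescript
type SignupCodes = Record<string, number>;

// Unset or empty input yields {} (no codes), matching the documented default.
function parseSignupCodes(raw: string | undefined): SignupCodes {
  if (raw === undefined || raw.trim() === "") return {};
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("SIGNUP_CODES_JSON must be a JSON object");
  }
  const codes: SignupCodes = {};
  for (const [code, amount] of Object.entries(parsed as Record<string, unknown>)) {
    // Assumed rule: every bonus must be a positive, finite number.
    if (typeof amount !== "number" || !Number.isFinite(amount) || amount <= 0) {
      throw new Error(`Invalid bonus amount for signup code "${code}"`);
    }
    codes[code] = amount;
  }
  return codes;
}

const codes = parseSignupCodes('{"launch50": 50}');
```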

Wallet API (SIWE + wallet header auth)

SIWE (EIP-4361): GET /api/auth/siwe/nonce → sign message → POST /api/auth/siwe/verify → receive API key. New wallets get an account and initial free credits. Why: Agents and headless clients need a way to sign in and get an API key without a browser.
Wallet header signature: Send X-Wallet-Address, X-Timestamp, X-Wallet-Signature on each request to authenticate without storing a key. First valid request for an unknown wallet creates the account. Why: Some clients prefer not to store an API key; the wallet proves ownership per request.
x402 topup: Can credit the signer when wallet sig headers are present, or use body.walletAddress. All wallet signup (SIWE, wallet-header, topup) uses the same findOrCreateUserByWalletAddress path (slug, credits, race handling). See docs/wallet-api.md and Authentication / Wallet API for full reference and WHYs.

10. Analytics & Monitoring

Location: /dashboard/analytics and lib/queries/usage.ts

Features:

Usage records per request (tokens, cost, model)
Provider health monitoring
Model usage breakdown (Recharts)
Credit activity timeline
Error rate tracking
Response time monitoring

Metrics Tracked:

Input/output tokens
Cost per request
Duration (ms)
Success/failure status
IP address and user agent
Model and provider used

Provider Health:

Automatic health checks for AI providers
Status: healthy, degraded, unhealthy
Response time percentiles
Error rate calculation

11. MCP (Model Context Protocol) API

Location: /app/api/mcp/route.ts

Features:

Standards-based MCP server implementation
Platform tools exposed via MCP protocol
Bearer token authentication using API keys
Compatible with any MCP client (Claude Desktop, MCP Inspector, etc.)
Automatic billing for tool usage

Available MCP Tools:

check_credits: View organization balance and recent transactions
get_recent_usage: View recent API usage statistics
generate_text: Generate text using AI models
generate_image: Generate images using Google Gemini 2.5
save_memory: Save to long-term memory
retrieve_memories: Search and retrieve memories
chat_with_agent: Chat with deployed elizaOS agents
list_agents: List all available agents
list_containers: List deployed containers

Using MCP Inspector:

npm run mcp:inspector

Using with Claude Desktop:

Add to your Claude Desktop config:

{
  "mcpServers": {
    "eliza-cloud": {
      "url": "http://localhost:3000/api/mcp",
      "transport": {
        "type": "streamableHttp"
      },
      "headers": {
        "Authorization": "Bearer eliza_your_api_key_here"
      }
    }
  }
}

🗄 Database Architecture

SaaS Schema (`db/schemas/*.ts`)

Core Tables:

organizations: Multi-tenant organization data
- credit_balance, stripe_customer_id
- allowed_models, allowed_providers
- webhook_url for notifications
users: User accounts linked to organizations
- privy_user_id for authentication
- role: admin, member
- is_active for deactivation
api_keys: API authentication
- key_hash for secure storage
- rate_limit, usage_count
- permissions array
credit_transactions: Credit ledger
- amount (positive or negative)
- type: purchase, deduction, refund, adjustment
- stripe_payment_intent_id for reconciliation
credit_packs: Balance top-up packages
- stripe_price_id, stripe_product_id
- sort_order for display
usage_records: Per-request usage tracking
- input_tokens, output_tokens
- input_cost, output_cost
- model, provider, type
- is_successful, error_message
generations: Image/video generation records
- type: image, video
- status: pending, completed, failed
- storage_url (Vercel Blob)
- dimensions, file_size, mime_type
containers: AWS ECS container deployments
- ecr_repository_uri, ecr_image_tag (Docker image in ECR)
- ecs_cluster_arn, ecs_service_arn, ecs_task_definition_arn (ECS resources)
- load_balancer_url (ALB URL for accessing the container)
- status: pending, building, deploying, running, failed, stopped
- environment_vars, desired_count, cpu, memory, port
- Unique constraint on (organization_id, name)
alb_priorities: Application Load Balancer priority management
- Ensures each container gets a unique priority for ALB routing rules
- Prevents priority conflicts when multiple containers share an ALB
conversations: Platform-level chat history
- title, model, settings
- message_count, total_cost
conversation_messages: Messages in conversations
- role: user, assistant, system
- sequence_number for ordering
- tokens, cost, processing_time
user_characters: User-created elizaOS characters
- character_data (full JSON)
- is_template, is_public
- Stored separately from agents
model_pricing: Dynamic pricing per model
- input_cost_per_1k, output_cost_per_1k
- effective_from, effective_until
- is_active for versioning
provider_health: AI provider status
- status: healthy, degraded, unhealthy
- response_time, error_rate
- last_checked timestamp
jobs: Background job queue
- type, status: pending, in_progress, completed, failed
- attempts, max_attempts
- webhook_url for callbacks
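
To show how the model_pricing and usage_records columns relate, here is a hedged sketch of a per-request cost calculation. estimateRequestCost is a hypothetical name (not the platform's calculateCost), and the currency unit of the per-1k prices is not specified in this document.

```typescript
interface ModelPricing {
  input_cost_per_1k: number;
  output_cost_per_1k: number;
}

interface RequestCost {
  input_cost: number;
  output_cost: number;
  total_cost: number;
}

// Prices are quoted per 1,000 tokens, so scale the token counts by 1/1000.
function estimateRequestCost(
  pricing: ModelPricing,
  inputTokens: number,
  outputTokens: number,
): RequestCost {
  const input_cost = (inputTokens / 1000) * pricing.input_cost_per_1k;
  const output_cost = (outputTokens / 1000) * pricing.output_cost_per_1k;
  return { input_cost, output_cost, total_cost: input_cost + output_cost };
}

// Illustrative prices only; real values live in the model_pricing table.
const cost = estimateRequestCost(
  { input_cost_per_1k: 0.5, output_cost_per_1k: 1.5 },
  2000,
  500,
);
```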

elizaOS Schema (Integrated)

Integrated into the main database via @elizaos/plugin-sql schema. These tables are managed by elizaOS core:

Agent Runtime Tables:

agents: Character definitions
- name, username, bio, system
- messageExamples, postExamples
- topics, adjectives, style
- plugins, knowledge, settings
memories: Conversation history
- type (message, document, fragment)
- content (JSONB)
- unique flag for deduplication
- metadata with document references
embeddings: Vector similarity search
- Multiple dimension columns:
  - dim384, dim512, dim768 (small-large)
  - dim1024, dim1536, dim3072 (XL-XXXL)
- memory_id foreign key
rooms: Conversation contexts
- source (discord, telegram, web, etc.)
- type (DM, group, channel)
- world_id optional reference
- channel_id for platform mapping
participants: Room membership
- entity_id, room_id, agent_id
- room_state for custom data
entities: Users and participants
- names array for aliases
- metadata JSONB
relationships: Entity connections
- source_entity_id, target_entity_id
- agent_id scope
- tags array
- Unique constraint prevents duplicates
components: ECS-style data
- entity_id, room_id, world_id
- type, data JSONB
worlds: High-level grouping
- agent_id, name, server_id
tasks: Scheduled agent tasks
- name, description, tags
- metadata JSONB
cache: Key-value cache
- key, agent_id composite primary key
- expires_at for TTL
logs: Audit trail
- entity_id, room_id, type
- body JSONB
message_servers: Central messaging (future)
- source_type, source_id
- For multi-platform agents
channels: Message channels
- message_server_id
- type (text, voice, DM, etc.)
central_messages: Cross-platform messages
- channel_id, author_id
- in_reply_to_root_message_id for threads

Database Migrations

Generate migration:

bun run db:generate

This creates SQL migration files in db/migrations/.

Apply migration:

bun run db:migrate

Race Condition Prevention

The platform implements atomic operations to prevent quota bypass:

Example: Container quota enforcement

await db.transaction(async (tx) => {
  // 1. Lock organization row
  const org = await tx
    .select()
    .from(organizations)
    .where(eq(organizations.id, orgId))
    .for("update");

  // 2. Count containers while holding lock
  const existing = await tx
    .select()
    .from(containers)
    .where(eq(containers.organization_id, orgId));

  // 3. Check quota (compare the number of rows, not the result array itself)
  if (existing.length >= maxAllowed) throw new QuotaExceededError();

  // 4. Create container
  return await tx.insert(containers).values(data);
});

See lib/queries/container-quota.ts for full implementation.

🔌 API Reference

Authentication

Documented management endpoints support multiple authentication methods:

Session Cookie (Privy): Automatic for logged-in users
API Key Header: Authorization: Bearer eliza_your_key or X-API-Key: eliza_your_key
SIWE: Get nonce from GET /api/auth/siwe/nonce, sign EIP-4361 message, POST /api/auth/siwe/verify to receive an API key
Wallet header: X-Wallet-Address, X-Timestamp, X-Wallet-Signature (per-request signature; first request can create account)

Base URL

Development: http://localhost:3000
Production: https://your-domain.com

Endpoints

AI Generation

# Text Chat
POST /api/v1/chat
{
  "messages": [{"role": "user", "content": "Hello"}],
  "model": "gpt-4o"
}

# Image Generation
POST /api/v1/generate-image
{
  "prompt": "A beautiful sunset over mountains"
}

# Video Generation
POST /api/v1/generate-video
{
  "prompt": "Cinematic shot of spaceship",
  "model": "fal-ai/veo3"
}

# Available Models
GET /api/v1/models

Gallery

# List Media
GET /api/v1/gallery?type=image&limit=50&offset=0

# Response:
{
  "items": [...],
  "count": 10,
  "hasMore": false
}

Containers

# List Containers
GET /api/v1/containers

# Create Container
POST /api/v1/containers
{
  "name": "my-agent",
  "port": 3000,
  "ecr_image_uri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/elizaos/my-project:v1.0.0",
  "environment_vars": {...},
  "cpu": 256,
  "memory": 512,
  "desired_count": 1
}

# Get Container
GET /api/v1/containers/{id}

# Delete Container
DELETE /api/v1/containers/{id}

# Check Quota
GET /api/v1/containers/quota

ECR Credentials

# Get ECR credentials for pushing Docker images
POST /api/v1/containers/credentials
{
  "projectId": "my-project",
  "version": "1.0.0"
}

# Response includes:
# - ecrRepositoryUri: Where to push the image
# - authToken: Docker login credentials
# - ecrImageUri: Full image URI to use in deployment

API Keys

# Create Key
POST /api/v1/api-keys
{
  "name": "Production",
  "rate_limit": 10000
}

# List Keys
GET /api/v1/api-keys

# Regenerate Key
POST /api/v1/api-keys/{id}/regenerate

# Delete Key
DELETE /api/v1/api-keys/{id}

User Info

# Get Current User
GET /api/v1/user

# Response:
{
  "id": "uuid",
  "email": "user@example.com",
  "name": "John Doe",
  "organization": {...},
  "credit_balance": 5000
}

elizaOS Agents

# Create Room
POST /api/eliza/rooms
{
  "agentId": "uuid",
  "name": "Chat"
}

# Get Room Messages
GET /api/eliza/rooms/{roomId}/messages

# Send Message
POST /api/eliza/rooms/{roomId}/messages
{
  "content": "Hello!",
  "authorId": "user-uuid"
}

Rate Limiting

Default: 1000 requests/day per API key
Container Deployments: 5 per 5 minutes
Billing Endpoints: 100 per hour

Rate limits return:

{
  "error": "Rate limit exceeded",
  "retryAfter": 3600
}
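
A client might turn that response into a retry delay as sketched below. This assumes retryAfter is expressed in seconds, which the example value suggests but this document does not state; retryDelayMs is a hypothetical helper.

```typescript
interface RateLimitBody {
  error: string;
  retryAfter: number;
}

function isRateLimitBody(value: unknown): value is RateLimitBody {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate.error === "string" && typeof candidate.retryAfter === "number";
}

// Returns the delay in milliseconds before retrying, or null when the
// response is not a well-formed rate-limit response.
function retryDelayMs(status: number, body: unknown): number | null {
  if (status !== 429 || !isRateLimitBody(body)) return null;
  if (!Number.isFinite(body.retryAfter) || body.retryAfter < 0) return null;
  return body.retryAfter * 1000; // assumption: retryAfter is in seconds
}

const delay = retryDelayMs(429, { error: "Rate limit exceeded", retryAfter: 3600 });
```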

Error Responses

{
  "success": false,
  "error": "Error message",
  "details": {...}
}

HTTP Status Codes:

400: Bad Request (validation error)
401: Unauthorized (missing/invalid auth)
403: Forbidden (insufficient permissions)
404: Not Found
429: Too Many Requests (rate limited)
500: Internal Server Error
503: Service Unavailable (feature not configured)

🚢 Deployment

Deploying to Vercel (Recommended)

1. Push to GitHub:

git add .
git commit -m "Initial commit"
git push origin main

2. Import to Vercel:

Go to vercel.com/new
Import your repository
Vercel auto-detects Next.js

3. Configure Environment Variables:

Add all variables from .env.local in Vercel dashboard:

DATABASE_URL - Single database for platform and elizaOS tables
NEXT_PUBLIC_PRIVY_APP_ID, PRIVY_APP_SECRET, PRIVY_WEBHOOK_SECRET
OPENAI_API_KEY or AI_GATEWAY_API_KEY
BLOB_READ_WRITE_TOKEN (optional, for media gallery)
FAL_KEY (optional, for video generation)
STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET (optional, for payments)
AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY (optional, for container deployments)
ECS_CLUSTER_NAME, AWS_VPC_ID, AWS_SUBNET_IDS, AWS_SECURITY_GROUP_IDS (for containers)
CRON_SECRET (required for production)

Optional moderation config:

CONTENT_MODERATION_ENABLED=false - disables async content moderation checks (the app will still stream responses).

4. Update Privy Redirect URI:

Add production callback URL to Privy dashboard
Configure allowed origins: https://your-app.vercel.app

5. Deploy:

Click "Deploy"
Vercel automatically builds and deploys
Database migrations run on build

6. Configure Stripe Webhook:

Add webhook endpoint in Stripe dashboard
URL: https://your-app.vercel.app/api/stripe/webhook
Select events: checkout.session.completed, payment_intent.succeeded

Database Migrations in Production

Vercel runs migrations automatically via build script. For manual migration:

# Connect to production database
DATABASE_URL=postgres://prod-url npm run db:migrate

Monitoring

Vercel Analytics: Built-in (automatically enabled)
Logs: View in Vercel dashboard
Error Tracking: Console logs captured
Provider Health: Check /dashboard/analytics

🐛 Troubleshooting

Common Issues

1. Database Connection Errors

Error: Connection refused or SSL required

Solutions:

Verify DATABASE_URL includes ?sslmode=require
Check Neon dashboard for correct connection string
Ensure database is not paused (serverless auto-pause)

2. Authentication Issues

Error: Authentication errors or login failures

Solutions:

Verify NEXT_PUBLIC_PRIVY_APP_ID and PRIVY_APP_SECRET are correct
Check allowed origins in Privy dashboard match your domain (e.g., https://your-app.vercel.app)
Clear browser cookies and localStorage, then try again
Ensure Privy webhook is configured: https://your-domain.com/api/privy/webhook
Check webhook secret matches PRIVY_WEBHOOK_SECRET in your environment

3. Environment Variables Not Loading

Error: undefined values in runtime

Solutions:

Restart dev server after changing .env.local
Ensure file is named exactly .env.local (not .env)
Public variables must start with NEXT_PUBLIC_
In production, verify all variables set in Vercel dashboard

4. Container Deployment Fails

Error: "Container deployment failed" or "Deployment timeout"

Solutions:

Check AWS credentials are correct: AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
Verify ECS configuration:
- ECS_CLUSTER_NAME - cluster must exist or CloudFormation will create it
- AWS_VPC_ID - must be a valid VPC ID
- AWS_SUBNET_IDS - comma-separated subnet IDs in different AZs
- AWS_SECURITY_GROUP_IDS - security group must allow HTTP/HTTPS ingress
- ECS_EXECUTION_ROLE_ARN - IAM role for ECS task execution
Test AWS credentials: aws sts get-caller-identity
Check quota: GET /api/v1/containers/quota
View logs in AWS CloudWatch or ECS console
Ensure shared infrastructure is deployed: cd scripts/cloudformation && ./deploy-shared.sh

See docs/DEPLOYMENT_TROUBLESHOOTING.md for detailed troubleshooting.

5. Docker Image Push Fails

Error: "Failed to push image to ECR" or "Authentication failed"

Solutions:

Verify Docker is running: docker info
Check AWS ECR credentials are valid
Ensure image was built successfully: docker images
Verify network connectivity to ECR
Try re-authenticating: Request new credentials from /api/v1/containers/credentials

6. Image/Video Generation Fails

Error: "No image/video was generated" or timeout

Solutions:

Image: Verify Google Gemini access in AI Gateway or OpenAI API key
Video: Check FAL_KEY is set correctly
Try simpler prompts first
Check rate limits in provider dashboard
View error in /dashboard/analytics

7. Credits Not Deducting

Error: Usage not tracking or credits not deducted

Solutions:

Check credit_transactions table for records
Verify organization credit_balance column
Check for database transaction errors in logs
Ensure calculateCost() is being called

8. Stripe Webhook Not Working

Error: Credits not added after purchase

Solutions:

Verify STRIPE_WEBHOOK_SECRET matches Stripe dashboard
Check webhook endpoint URL is correct
View webhook events in Stripe dashboard → Developers → Webhooks

Test locally with Stripe CLI:

stripe listen --forward-to localhost:3000/api/stripe/webhook

🚀 AWS ECS Container Deployment

Deploy elizaOS agents to AWS ECS (Elastic Container Service) using Docker containers. Each user gets a dedicated EC2 instance (t4g.small ARM, Graviton2) managed via CloudFormation.

Quick Start

# 1. Get your API key from the dashboard
# Visit https://your-domain.com/dashboard/api-keys

# 2. Set your API key
export ELIZAOS_API_KEY="your-api-key-here"

# 3. Ensure Docker is running locally
docker --version
docker info

# 4. Deploy your elizaOS project
cd your-elizaos-project
elizaos deploy

How It Works

CLI requests ECR credentials from the cloud API
CLI builds Docker image locally
CLI pushes image to AWS ECR (Elastic Container Registry)
CLI creates container deployment via cloud API
Cloud deploys to dedicated EC2 instance (t4g.small ARM) with ECS
Agent runs on AWS with health checks and monitoring

AWS Infrastructure Setup (Platform Maintainers)

1. Deploy Shared Infrastructure

The platform uses CloudFormation to provision per-user infrastructure. First, deploy shared resources:

cd scripts/cloudformation
./deploy-shared.sh

This creates:

VPC with public subnets
Application Load Balancer (ALB) for routing
IAM roles for ECS tasks
Security groups

2. Configure Environment Variables

# AWS Credentials
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret

# Network Configuration (from CloudFormation outputs)
AWS_VPC_ID=vpc-xxxxx
AWS_SUBNET_IDS=subnet-xxxxx,subnet-yyyyy
AWS_SECURITY_GROUP_IDS=sg-xxxxx

# ECS Configuration
ECS_CLUSTER_NAME=elizaos-production
ECS_EXECUTION_ROLE_ARN=arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole
ECS_TASK_ROLE_ARN=arn:aws:iam::ACCOUNT:role/ecsTaskRole

# Optional: Shared ALB (recommended for cost savings)
ECS_SHARED_ALB_ARN=arn:aws:elasticloadbalancing:...
ECS_SHARED_LISTENER_ARN=arn:aws:elasticloadbalancing:...

# Environment (for stack naming)
ENVIRONMENT=production

3. Start the Platform

npm run dev  # Development
npm run build && npm start  # Production

Users can now deploy via: elizaos deploy

For Users: Deployment Options

# Basic deployment
elizaos deploy

# With custom name and resources
elizaos deploy \
  --name my-agent \
  --port 8080 \
  --desired-count 2 \
  --cpu 512 \
  --memory 1024

# With environment variables
elizaos deploy \
  --env "OPENAI_API_KEY=sk-..." \
  --env "DATABASE_URL=postgresql://..."

# Using existing Docker image
elizaos deploy \
  --skip-build \
  --image-uri 123456789.dkr.ecr.us-east-1.amazonaws.com/my-project:v1.0.0

Verification

# Check container status via API
curl https://elizacloud.ai/api/v1/containers \
  -H "Authorization: Bearer $ELIZAOS_API_KEY"

# View in dashboard
# https://elizacloud.ai/dashboard/containers

Cost & Billing

Container deployments are billed daily:

Deployment: $0.50 one-time per deployment
Running Costs: $0.67/day per container (~$20/month)
- Billed automatically at midnight UTC
- 48-hour warning email sent when credits are low
- Container shut down after 48 hours if no credits added

Infrastructure (managed by elizaOS Cloud):

t4g.small (1.75 vCPU + 1.75 GB RAM) default instance type
ECR image storage included
Load balancing included
Auto-scaling available for additional instances

Daily Billing Behavior:

CRON runs daily at midnight UTC
Charges $0.67 per running container
If insufficient credits: 48-hour shutdown warning email
If still insufficient after 48 hours: container stopped
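
The daily billing rules above can be sketched as a pure decision function. This is an illustration under stated assumptions, not the platform's cron implementation: the balance is expressed in cents, and the "wait" state (warned, grace period still running) is introduced here for completeness.

```typescript
const DAILY_COST_CENTS = 67; // $0.67 per running container per day
const GRACE_PERIOD_MS = 48 * 60 * 60 * 1000; // 48-hour shutdown warning window

type BillingDecision =
  | { action: "charge"; newBalanceCents: number }
  | { action: "warn" } // send the 48-hour shutdown warning email
  | { action: "wait" } // already warned, grace period not over yet
  | { action: "stop" }; // grace period elapsed without a top-up

function decideDailyBilling(
  balanceCents: number,
  warnedAt: Date | null,
  now: Date,
): BillingDecision {
  if (balanceCents >= DAILY_COST_CENTS) {
    return { action: "charge", newBalanceCents: balanceCents - DAILY_COST_CENTS };
  }
  if (warnedAt === null) return { action: "warn" };
  const elapsedMs = now.getTime() - warnedAt.getTime();
  return elapsedMs >= GRACE_PERIOD_MS ? { action: "stop" } : { action: "wait" };
}
```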

📄 License

See the LICENSE file in the repository root.

Built with ❤️ for the elizaOS ecosystem

cursor · 2026-03-27T17:25:13Z

packages/lib/api/a2a/skills.ts

+ * The calling agent's character is not available here — skills are invoked via
+ * the A2A protocol which only provides user/org context, not agent personality.
 */
+</search>


Cursor search/replace artifact left in production code

High Severity

A stray </search> tag from a Cursor search/replace operation was accidentally committed on line 12 of packages/lib/api/a2a/skills.ts. This is not valid TypeScript and will cause a syntax error, breaking the A2A skills module at parse time.

Fixed in 67138cc.

cursor · 2026-03-27T17:25:13Z

anthropic-thinking.test.ts

+  supportsExtendedThinking,
+  type ThinkingConfig,
+  type CharacterThinkingSettings,
+} from "./anthropic-thinking";


Root test imports from nonexistent module path

Medium Severity

The root-level anthropic-thinking.test.ts imports validateBudgetTokens, getThinkingConfig, buildThinkingParam, and CharacterThinkingSettings from ./anthropic-thinking, but no such file exists at the repository root. The actual implementation lives at packages/lib/providers/anthropic-thinking.ts and exports completely different functions. This test will fail at import time.

Dismissed: Comment targets line 9 but identifier validateBudgetTokens not found in file — code may have been removed or renamed

cursor · 2026-03-27T17:25:13Z

packages/lib/middleware/rate-limit.ts

+  const multiplier = process.env.RATE_LIMIT_MULTIPLIER;
+  if (!multiplier) return 1;
+  const parsed = Number.parseInt(multiplier, 10);
+  return Number.isNaN(parsed) || parsed < 1 ? 1 : parsed;


Validator accepts floats but consumer uses parseInt

Low Severity

getRateLimitMultiplier parses RATE_LIMIT_MULTIPLIER with Number.parseInt, silently truncating decimal values (e.g., "1.5" → 1). However, the env validator in env-validator.ts explicitly accepts floats via parseFloat and regex /^\d+(\.\d+)?$/. An operator who sets a fractional multiplier would pass validation but get unexpected truncation.

Additional Locations (1)

packages/lib/config/env-validator.ts#L127-L138

Could not auto-fix (wrong file or repeated failures); manual review recommended.
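
For manual review, the mismatch can be reproduced in isolation. The first function mirrors the diff above with the env value passed as a parameter; the second is one possible float-preserving variant (a sketch, not the fix that was merged) that reuses the validator's regex.

```typescript
// Mirrors the parseInt-based logic from the diff.
function multiplierWithParseInt(raw: string | undefined): number {
  if (!raw) return 1;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) || parsed < 1 ? 1 : parsed;
}

// Hypothetical alternative: accept the same shape the validator accepts
// and keep the fractional part instead of truncating it.
function multiplierWithParseFloat(raw: string | undefined): number {
  if (!raw || !/^\d+(\.\d+)?$/.test(raw)) return 1;
  const parsed = Number.parseFloat(raw);
  return Number.isNaN(parsed) || parsed < 1 ? 1 : parsed;
}

const truncated = multiplierWithParseInt("1.5"); // fractional part is dropped
const preserved = multiplierWithParseFloat("1.5"); // fractional part is kept
```

The alternative would be to tighten the validator to integers only; either way the two sides should agree.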

claude · 2026-03-27T17:26:08Z

Code Review

Overall: The core design is solid — thinking budgets are server-side controlled, the resolution hierarchy (character > env default, capped by env max) is clean, and the security rationale (untrusted MCP/A2A callers must not raise token budgets) is sound. The new anthropic-thinking.ts module is well-structured and the unit test coverage is thorough. A few issues worth addressing before merge:

Bug: Root-level `anthropic-thinking.test.ts` imports a missing file

anthropic-thinking.test.ts at the repo root imports from ./anthropic-thinking:

import {
  validateBudgetTokens,
  getThinkingConfig,
  buildThinkingParam,
  supportsExtendedThinking,
} from "./anthropic-thinking";

There is no anthropic-thinking.ts at the root level in this PR's diff. The functions validateBudgetTokens, getThinkingConfig, and buildThinkingParam are also not exported from packages/lib/providers/anthropic-thinking.ts. This test file will fail at import time. It looks like it was written against a different (earlier?) iteration of the API. Either remove it or wire it to the correct module.

Bug: `model.includes("claude")` in `mcp/route.ts` is inconsistent with the actual logic

In app/api/agents/[id]/mcp/route.ts:

const estimatedOutputTokens = model.includes("claude") && effectiveThinkingBudget > 0
  ? baseOutputTokens + effectiveThinkingBudget
  : baseOutputTokens;

resolveAnthropicThinkingBudgetTokens already returns null for non-Anthropic and non-thinking-capable models, so effectiveThinkingBudget will already be 0 for non-Claude models. The model.includes("claude") guard is therefore redundant — but it's also subtly wrong because it would match any hypothetical provider-prefixed model containing "claude" in the name that is not Anthropic. The a2a route handles this correctly using effectiveThinkingBudget != null instead. Suggest aligning both routes:

const estimatedOutputTokens = effectiveThinkingBudget > 0
  ? baseOutputTokens + effectiveThinkingBudget
  : baseOutputTokens;

Breaking change for local development: rate limits now match production by default

The PR changes RateLimitPresets from isDevelopment ? 10000 : N to flat production-level limits, with RATE_LIMIT_MULTIPLIER as the escape hatch. This will break any existing local dev or CI setup that relies on the previously relaxed limits. The new env var is also not in .env.example, making it hard to discover.

Suggest adding to .env.example:

# Set to a large number (e.g. 100) to relax rate limits during local development
# RATE_LIMIT_MULTIPLIER=1

Potential over-reservation of credits for thinking budget

In both MCP and A2A routes, the credit reservation adds the full effectiveThinkingBudget to the output token estimate:

const maxOutputTokens = effectiveThinkingBudget != null ? 500 + effectiveThinkingBudget : undefined;

effectiveThinkingBudget is the maximum budget (tokens available for thinking), not expected usage. For a 16K budget, this reserves credits for 16,500 output tokens even if the model only uses 2K thinking tokens. This could cause legitimate requests to fail with "insufficient credits" for users with smaller balances. Consider reserving a fraction of the thinking budget (e.g. 50%) or documenting this as an intentional conservative reservation.
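
The arithmetic behind that concern, as a small sketch. reservedOutputTokens and reserveFraction are hypothetical names; unlike the route code, this helper falls back to the 500-token base rather than undefined when no budget applies.

```typescript
const BASE_OUTPUT_TOKENS = 500;

// reserveFraction = 1 reproduces the current full-budget reservation;
// 0.5 would reserve half of the thinking budget.
function reservedOutputTokens(thinkingBudget: number | null, reserveFraction = 1): number {
  if (thinkingBudget === null || thinkingBudget <= 0) return BASE_OUTPUT_TOKENS;
  return BASE_OUTPUT_TOKENS + Math.ceil(thinkingBudget * reserveFraction);
}

const fullReservation = reservedOutputTokens(16000); // 500 + 16000
const halfReservation = reservedOutputTokens(16000, 0.5); // 500 + 8000
```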

Minor: `supportsExtendedThinking` dot-notation gap

The test in packages/tests/unit/anthropic-thinking.test.ts tests "anthropic/claude-sonnet-4" style IDs which all match. But the root test (noted above) expects "claude-3.5-sonnet" (dot notation) to return true. The pattern /claude-3-5-sonnet/ requires hyphens (3-5) and won't match the dot variant (3.5). If any callers or model registries use dot notation, thinking will silently be disabled. Worth adding /claude-3\.5-sonnet/ to the pattern list or confirming the dot variant is never used.

Nit: `RATE_LIMIT_DISABLED` bypass is production-unsafe

if (process.env.RATE_LIMIT_DISABLED === "true" && process.env.NODE_ENV !== "production") {
  return;
}

The NODE_ENV !== "production" guard is good, but RATE_LIMIT_DISABLED silently suppresses the Redis configuration check in non-prod. If someone accidentally sets this in a staging environment that has NODE_ENV=staging, they'd skip the warning about missing Redis. Consider limiting to NODE_ENV === "development" || NODE_ENV === "test" explicitly.
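
A minimal sketch of the suggested allowlist, with the environment passed in as a plain object so it can be tested without touching process.env. The function name is hypothetical.

```typescript
interface RateLimitEnv {
  RATE_LIMIT_DISABLED?: string;
  NODE_ENV?: string;
}

// Only development and test may bypass; staging, production, and an
// unset NODE_ENV all keep rate limiting (and the Redis check) active.
function isRateLimitBypassAllowed(env: RateLimitEnv): boolean {
  const allowed = env.NODE_ENV === "development" || env.NODE_ENV === "test";
  return env.RATE_LIMIT_DISABLED === "true" && allowed;
}
```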

What's good

Security design: budgets live in character settings, not caller-supplied params — well-documented and enforced consistently
mergeProviderOptions deep-merge prevents Google responseModalities from being dropped when Anthropic CoT is added
Explicit 0 disables CoT for background services (app-builder, SEO, promotion) preserving temperature control — and the inline comments explain why
New env vars are documented in .env.example with clear explanations
Unit tests for the resolution logic are comprehensive (env override, agent override, max cap, 0-disable)
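
For readers skimming the thread, the resolution order described in this review (character setting over env default, capped by env max, explicit 0 disables) could be sketched roughly as follows. This is an illustration only: resolveThinkingBudget and ThinkingEnv are hypothetical names, not the exports of packages/lib/providers/anthropic-thinking.ts, and model-capability checks are left out.

```typescript
interface ThinkingEnv {
  defaultBudget?: number; // stands in for ANTHROPIC_COT_BUDGET
  maxBudget?: number; // stands in for ANTHROPIC_COT_BUDGET_MAX
}

// Returns the effective budget in tokens, or null when thinking is off.
function resolveThinkingBudget(
  characterBudget: number | undefined,
  env: ThinkingEnv,
): number | null {
  // A character-level value (including an explicit 0) wins over the env default.
  const requested = characterBudget ?? env.defaultBudget;
  if (requested === undefined || requested <= 0) return null;
  const capped = env.maxBudget !== undefined ? Math.min(requested, env.maxBudget) : requested;
  return capped > 0 ? capped : null;
}
```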

claude · 2026-03-27T17:26:57Z

test connection

claude · 2026-03-27T17:27:36Z

PR Review: feat: Anthropic CoT controls

Good overall direction — gating thinking budgets on server-side character settings (not untrusted request bodies) is the right security model. The per-agent override + env default + env cap hierarchy is clean. A few issues to address before merge:

Bug: Root-level test file imports non-existent module

anthropic-thinking.test.ts at the repo root imports from "./anthropic-thinking":

import {
  validateBudgetTokens,
  getThinkingConfig,
  buildThinkingParam,
  supportsExtendedThinking,
} from "./anthropic-thinking";

No anthropic-thinking.ts exists at root, and validateBudgetTokens, getThinkingConfig, buildThinkingParam are not exported from packages/lib/providers/anthropic-thinking.ts. This test will fail at import — it looks like a leftover from an earlier API design. Either delete it or rewrite it to import the actual provider module. The real unit tests in packages/tests/unit/anthropic-thinking.test.ts cover the production code correctly.

Bug: Fragile model detection in MCP credit reservation

In app/api/agents/[id]/mcp/route.ts:

const estimatedOutputTokens = model.includes("claude") && effectiveThinkingBudget > 0
  ? baseOutputTokens + effectiveThinkingBudget
  : baseOutputTokens;

supportsExtendedThinking() is already imported and handles this precisely. model.includes("claude") would match custom gateway aliases or any model with "claude" in the name that may not actually support extended thinking. These two checks can diverge.

// Prefer:
const estimatedOutputTokens = supportsExtendedThinking(model) && effectiveThinkingBudget > 0
  ? baseOutputTokens + effectiveThinkingBudget
  : baseOutputTokens;

Credit underestimation on generic v1 routes

MCP and A2A routes correctly add the thinking budget to estimatedOutputTokens before credit reservation. But /api/v1/chat, /api/v1/chat/completions, /api/v1/messages, and /api/v1/responses all spread mergeAnthropicCotProviderOptions(model) without adjusting their credit estimates. If ANTHROPIC_COT_BUDGET is set, these routes will silently under-reserve, which is especially risky for /api/v1/messages and /api/v1/responses (direct Anthropic API shape, typically larger prompts).

Consider either: (a) adjusting the estimate the same way MCP/A2A do, or (b) documenting why the asymmetry is intentional.

Silent temperature stripping on shared endpoints

When ANTHROPIC_COT_BUDGET is set and an Anthropic model is used, @ai-sdk/anthropic silently strips temperature/topP/topK. The /api/v1/chat/completions and /api/v1/responses routes accept user-supplied temperature that will be silently dropped. OpenAI-compatible clients relying on temperature control will get unexpected behavior with no signal. A warning log or API-level note would help.

Minor: Redundant null-coalescing in A2A route

In app/api/agents/[id]/a2a/route.ts, agentThinkingBudget is already number | undefined so ?? undefined is a no-op. Not a bug but noisy — the MCP route passes the value directly.

Positive callouts

Blocking thinking budget from MCP/A2A request bodies is the correct trust model — only the agent owner can set it via character settings.
Background services (SEO, Twitter, promotion, App Builder) all explicitly pass 0 to disable CoT — good discipline to preserve temperature control.
RATE_LIMIT_DISABLED is properly guarded with NODE_ENV !== "production" — no prod exposure.
mergeProviderOptions deep-merge avoids clobbering existing gateway/google provider options.
Test coverage in packages/tests/unit/anthropic-thinking.test.ts is solid: env parsing edge cases, model detection, precedence, and cap behavior all covered.
Env validation fail-fast for invalid ANTHROPIC_COT_BUDGET* values is a good addition.

claude · 2026-03-27T17:36:36Z

Code Review: feat: Anthropic CoT controls

Overall this is a well-structured feature. The separation of concerns (character settings → env default → env cap) is sound, and the rationale for not allowing CoT budget in request bodies is clearly documented. A few issues need attention before merge.

Critical

1. Root-level test file imports a nonexistent module

anthropic-thinking.test.ts at the repo root imports from './anthropic-thinking', but there is no such file at the root. The actual implementation is at packages/lib/providers/anthropic-thinking.ts, which exports none of the names the test uses (validateBudgetTokens, getThinkingConfig, buildThinkingParam, ThinkingConfig, CharacterThinkingSettings). This test file will fail at import time, so none of its assertions ever run.

Additionally, even if the import path were fixed, supportsExtendedThinking("claude-3.5-sonnet") is expected to return true but the regex /claude-3-5-sonnet/ (dashes) does not match the dot-notation string claude-3.5-sonnet. The pattern list should cover both forms, e.g. /claude-3[.-]5-sonnet/.

Either move the test to packages/lib/providers/ with the correct API surface, or delete it.

Medium

2. getRateLimitMultiplier silently truncates float values

The env-validator for RATE_LIMIT_MULTIPLIER accepts floats (regex \d+(\.\d+)?), but the function calls Number.parseInt:

const parsed = Number.parseInt(multiplier, 10);  // 1.5 → 1

A value of 1.5 passes startup validation but silently becomes 1 at runtime. Either use parseFloat in the function, or restrict the validator to integers only.
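A minimal sketch of the parseFloat option follows. The env map is passed in as a parameter only to keep the example self-contained; the real function may read process.env directly.

```typescript
// Sketch of the parseFloat variant of getRateLimitMultiplier.
// Assumption: the multiplier comes from RATE_LIMIT_MULTIPLIER in the env map.
function getRateLimitMultiplier(
  env: Record<string, string | undefined>,
): number {
  const raw = env["RATE_LIMIT_MULTIPLIER"];
  if (!raw) return 1;
  const parsed = Number.parseFloat(raw);
  // Fall back to 1 for NaN, Infinity, or values below 1 (mirrors the original guard).
  return Number.isFinite(parsed) && parsed >= 1 ? parsed : 1;
}
```

With a fractional multiplier, the derived limits (for example 5 * 1.5) are no longer integers, so the presets would likely need a Math.floor or Math.round where they are computed.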

3. RateLimitPresets is no longer as const

Removing as const makes the exported object and its nested values mutable at the type level. Because the limits now depend on the runtime rateLimitMultiplier, as const can no longer narrow them to literal types, but Object.freeze can be applied to each preset to guarantee immutability at runtime and prevent accidental mutation across requests.
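The Object.freeze approach could be sketched as below. The preset names, base limits, and window length are illustrative stand-ins, not the real RateLimitPresets table.

```typescript
// Sketch: build presets from a runtime multiplier, then freeze both the
// container and each preset so nothing can mutate them later.
interface RateLimitPreset {
  windowMs: number;
  maxRequests: number;
}

function freezePreset(preset: RateLimitPreset): Readonly<RateLimitPreset> {
  return Object.freeze(preset);
}

function buildRateLimitPresets(multiplier: number) {
  return Object.freeze({
    STANDARD: freezePreset({ windowMs: 60_000, maxRequests: 60 * multiplier }),
    STRICT: freezePreset({ windowMs: 60_000, maxRequests: 10 * multiplier }),
  });
}
```

Object.freeze is shallow, which is why each nested preset is frozen separately rather than relying on the outer call.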

Minor / Observations

4. No unit tests for the actual exported API

The key functions in packages/lib/providers/anthropic-thinking.ts (resolveAnthropicThinkingBudgetTokens, parseAnthropicCotBudgetFromEnv, parseAnthropicCotBudgetMaxFromEnv, mergeAnthropicCotProviderOptions) have no tests in this PR. The root test file was apparently intended to cover this but targets the wrong module. Consider adding tests alongside the implementation or in packages/tests/unit/.

5. Credit over-reservation for thinking budget is undocumented

Both MCP and A2A routes compute estimatedOutputTokens = 500 + effectiveThinkingBudget. The budget is a ceiling, not expected usage — actual thinking consumption is usually lower. This is safe, but it can cause users with low balances to hit reservation failures for agents with large budgets. A comment clarifying this is an intentional worst-case estimate would help future maintainers.

6. mergeProviderOptions only deep-merges 3 provider keys

If a future call site passes both base and extra with overlapping openai or mistral keys, the top-level spread silently drops the base values. Current usage is safe, but the function name implies a full deep-merge. Consider renaming it (e.g. mergeAnthropicGatewayGoogleOptions) or adding a warning near the signature.
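As an alternative to renaming, a merge that treats every provider key the same way might look like the sketch below. The types are simplified stand-ins for the project's CloudMergedProviderOptions and JSONObject, and mergeAllProviderOptions is a hypothetical name.

```typescript
// Sketch: merge provider options one level deep for every provider key,
// rather than only for a fixed allow-list.
type JsonObject = { [key: string]: unknown };
type ProviderOptions = Record<string, JsonObject>;

function mergeAllProviderOptions(
  base: ProviderOptions,
  extra: ProviderOptions,
): ProviderOptions {
  const merged: ProviderOptions = { ...base };
  for (const [provider, options] of Object.entries(extra)) {
    // Keys in `extra` win, but sibling keys from `base` survive.
    merged[provider] = { ...(merged[provider] ?? {}), ...options };
  }
  return merged;
}
```

This only merges the first level under each provider; nested objects such as a thinking config would still be replaced wholesale by the value in extra.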

What looks good

Resolution precedence (character → env default → env cap) in resolveAnthropicThinkingBudgetTokens is correct and clearly documented.
All background services (SEO, promotion, tweet, app builder) explicitly pass 0 to disable CoT with clear comments about the temperature-interaction risk — good defensive practice.
parsePositiveIntStrict + fail-fast env validation prevents runtime surprises from misconfigured env vars.
RATE_LIMIT_DISABLED bypass is appropriately guarded behind NODE_ENV !== 'production'.
Inline WHY comments throughout are well-written and follow the project convention.
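For readers skimming the thread, the resolution precedence praised in the first item above can be illustrated with a simplified sketch. This is not the real resolveAnthropicThinkingBudgetTokens signature: the model-support check is omitted, and treating a non-positive budget as "disabled" is an assumption.

```typescript
// Hypothetical sketch of the precedence: character setting first, else the
// env default, then clamp the result to the env cap if one is set.
function resolveBudgetSketch(
  characterBudget: number | undefined,
  envDefault: number | null,
  envMax: number | null,
): number | null {
  const chosen = characterBudget ?? envDefault;
  if (chosen == null || chosen <= 0) return null; // thinking disabled
  return envMax != null ? Math.min(chosen, envMax) : chosen;
}
```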

claude · 2026-03-27T17:37:51Z

Code Review

Overall this is a well-designed feature with good security thinking — keeping the thinking budget server-controlled (character settings + env) rather than client-controlled is the right call. The layered env default + per-agent override + hard cap model is solid. A few issues worth addressing before merge:

Bugs

1. anthropic-thinking.test.ts at repo root tests a different API than the production file

The test imports validateBudgetTokens, getThinkingConfig, buildThinkingParam, ThinkingConfig, and CharacterThinkingSettings from ./anthropic-thinking:

import {
  validateBudgetTokens,
  getThinkingConfig,
  buildThinkingParam,
  supportsExtendedThinking,
  type ThinkingConfig,
  type CharacterThinkingSettings,
} from "./anthropic-thinking";

None of these names exist in packages/lib/providers/anthropic-thinking.ts. That module exports parseAnthropicCotBudgetFromEnv, resolveAnthropicThinkingBudgetTokens, mergeAnthropicCotProviderOptions, etc. Either:

There's a stale/draft anthropic-thinking.ts at the repo root being tested (not visible in this diff), meaning the tests don't cover production code at all.
Or the test file will fail to import, making the CI step meaningless.

The unit tests for the production module should be colocated under packages/lib/providers/ and test the actual exported functions (e.g. resolveAnthropicThinkingBudgetTokens, parseThinkingBudgetFromCharacterSettings, mergeAnthropicCotProviderOptions).

2. as const removed from RateLimitPresets widens TypeScript types

-} as const;
+};

This changes RateLimitPresets.STRICT.maxRequests from a literal type (e.g. 10) to number. Any code that was typed against the narrow literal types will now have wider types. More importantly, RateLimitPresets is no longer readonly — callers could accidentally mutate the presets. Consider keeping as const or switching to Object.freeze().

Performance / Correctness

3. Credit reservation handles the thinking budget differently in the MCP and A2A routes

MCP route (mcp/route.ts):

const effectiveThinkingBudget = resolveAnthropicThinkingBudgetTokens(...) ?? 0;
const estimatedOutputTokens = model.includes("claude") && effectiveThinkingBudget > 0
  ? baseOutputTokens + effectiveThinkingBudget
  : baseOutputTokens;

A2A route (a2a/route.ts):

const maxOutputTokens = effectiveThinkingBudget != null ? 500 + effectiveThinkingBudget : undefined;
const baseCost = await estimateRequestCost(model, fullMessages, maxOutputTokens);

The MCP route adds model.includes("claude") as an extra guard, but resolveAnthropicThinkingBudgetTokens has already checked that the model is Anthropic and supports extended thinking, so a non-null, positive result makes the substring check redundant. It works today because provider-prefixed IDs such as anthropic/claude-sonnet-4 still contain "claude", but it is an implicit coupling that could silently skip the thinking tokens if model naming changes. The A2A route's != null guard is cleaner. Suggest aligning both routes.

Breaking Change for Developers

4. Rate limits now default to production values in dev — requires action to restore old behavior

The change removes the isDevelopment branch that gave dev environments 10,000 req/window. The new default is production-level limits with an opt-in RATE_LIMIT_MULTIPLIER env var. This is a breaking change for anyone running locally or in CI integration tests without setting RATE_LIMIT_MULTIPLIER or RATE_LIMIT_DISABLED=true.

The PR description notes this ("Rate-limit behavior in development also changes to production-level defaults") but the .env.example doesn't add RATE_LIMIT_MULTIPLIER with a recommended dev value. Worth adding:

# Increase rate limits for local development (e.g. RATE_LIMIT_MULTIPLIER=100)
# RATE_LIMIT_MULTIPLIER=100

5. Minor: getRateLimitMultiplier uses parseInt but env-validator allows floats

In rate-limit.ts:

const parsed = Number.parseInt(multiplier, 10);

In env-validator.ts:

if (!/^\d+(\.\d+)?$/.test(trimmed)) return false;
const n = Number.parseFloat(trimmed);
return n > 0 && Number.isFinite(n);

The validator accepts "1.5" as valid but the consumer truncates it to 1. Either validate as integer-only or use parseFloat at consumption.

What's Good

Security model is correct: thinking budget lives on the character record and in env, never in client request bodies. The mergeAnthropicCotProviderOptions(model, process.env, 0) pattern for background services (SEO, tweet gen, app-builder, promo assets) that need temperature control is well-reasoned.
failOnInvalid: true in env-validator gives fail-fast startup semantics for misconfigured ANTHROPIC_COT_BUDGET / ANTHROPIC_COT_BUDGET_MAX — good.
mergeProviderOptions deep-merges gateway, anthropic, and google sub-keys correctly; the JSDoc caveat about other providers being clobbered is appropriate.
Discord sendTyping guard ("sendTyping" in channel) correctly handles DM vs guild channel type variance.
ANTHROPIC_COT_BUDGET_MAX capping logic is clean and the precedence (character > env default, max caps both) is documented.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

cursor · 2026-03-27T17:41:54Z

packages/lib/middleware/rate-limit.ts

+  if (!multiplier) return 1;
+  const parsed = Number.parseInt(multiplier, 10);
+  return Number.isNaN(parsed) || parsed < 1 ? 1 : parsed;
+}


Rate limit multiplier applies in production without guard

Medium Severity

RATE_LIMIT_MULTIPLIER is read and applied unconditionally in all environments, including production. Previously, production always used strict numeric limits (60, 10, 200, 5, 10, 100) while only development got relaxed limits (10000). Now, if RATE_LIMIT_MULTIPLIER is accidentally set in a production deployment (e.g., left over from staging), all rate limits are multiplied — RATE_LIMIT_MULTIPLIER=100 would allow 6000 requests/minute on STANDARD endpoints. getRateLimitMultiplier has no NODE_ENV === "production" guard to force the multiplier to 1.

Additional Locations (1)

packages/lib/middleware/rate-limit.ts#L324-L364
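A guard along the lines described could be sketched as follows. The function name is hypothetical and the env map is passed in for self-containment; whether production should ever honor the multiplier is a policy decision the thread does not settle.

```typescript
// Sketch: ignore RATE_LIMIT_MULTIPLIER in production so a value left over
// from staging cannot relax the limits.
function getGuardedRateLimitMultiplier(
  env: Record<string, string | undefined>,
): number {
  if (env["NODE_ENV"] === "production") return 1;
  const raw = env["RATE_LIMIT_MULTIPLIER"];
  if (!raw) return 1;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) || parsed < 1 ? 1 : parsed;
}
```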

cursor · 2026-03-27T17:41:54Z

packages/lib/config/env-validator.ts

+      const n = Number.parseFloat(trimmed);
+      return n > 0 && Number.isFinite(n);
+    },
+    errorMessage: "Must be a positive number (e.g., 1, 10, 100)",


Rate limit multiplier validator accepts floats, parser truncates

Low Severity

The RATE_LIMIT_MULTIPLIER env validator accepts float strings (regex ^\d+(\.\d+)?$ and Number.parseFloat), but getRateLimitMultiplier uses Number.parseInt(multiplier, 10) which silently truncates decimals. Setting RATE_LIMIT_MULTIPLIER=1.5 passes validation but is treated as 1 at runtime. The validator and the consumer disagree on the accepted format.

Additional Locations (1)

packages/lib/middleware/rate-limit.ts#L310-L316

claude · 2026-03-27T17:42:43Z

Code Review: feat: Anthropic CoT controls

Overall the design is solid — locking the thinking budget to character settings rather than request bodies is the right security call, and the deep-merge helpers prevent accidental provider-key clobbering. A few issues need fixing before merge.

Critical — will break at runtime/test time

1. anthropic-thinking.test.ts imports symbols that do not exist

The test file at the repo root imports validateBudgetTokens, getThinkingConfig, buildThinkingParam, ThinkingConfig, and CharacterThinkingSettings from "./anthropic-thinking". None of these are exported from packages/lib/providers/anthropic-thinking.ts. Additionally, the relative import "./anthropic-thinking" from the project root resolves to a file that does not exist in this PR, and unit tests belong in packages/tests/unit/, not the repo root.

This test file appears to have been generated against a stale or different API surface and will fail to compile. Either update it to test the actual exported symbols with correct import paths, or move it to the proper test directory with matching function names.

2. supportsExtendedThinking("claude-3.5-sonnet") will return false, not true

The pattern /claude-3-5-sonnet/ uses a literal hyphen at position 8. The string "claude-3.5-sonnet" has a period there, so the regex does NOT match. The test asserts this returns true — it is a broken assertion. Fix: use /claude-3[.-]5-sonnet/ or add a separate pattern for the dot-notation variant used in some Anthropic model IDs.
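The suggested character class can be checked in isolation. In this sketch the pattern is the one proposed above, while the constant and function names are illustrative and not taken from the PR.

```typescript
// Sketch: accept both "claude-3-5-sonnet" and "claude-3.5-sonnet".
const CLAUDE_35_SONNET = /claude-3[.-]5-sonnet/;

function matchesClaude35Sonnet(model: string): boolean {
  // No global flag on the regex, so test() keeps no state between calls.
  return CLAUDE_35_SONNET.test(model);
}
```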

Behavioral Change — developer experience regression

3. Rate-limit preset change silently breaks local dev

Removing the isDevelopment branch means local dev now defaults to production-level limits (60 req/min for STANDARD). The prior behavior was 10,000. Developers who do not add RATE_LIMIT_MULTIPLIER to .env.local will hit 429s during normal usage and integration tests. RATE_LIMIT_MULTIPLIER should be added as a commented example to .env.example.

Logic / Correctness

4. Dead code in A2A route: agentThinkingBudget ?? undefined

parseThinkingBudgetFromCharacterSettings already returns number | undefined. The ?? undefined is a no-op, inconsistent with the MCP route which passes agentThinkingBudget directly.

5. Inconsistent model-check heuristic for credit estimation

mcp/route.ts uses model.includes("claude") to gate adding thinking tokens to the output estimate; a2a/route.ts uses effectiveThinkingBudget != null. Since resolveAnthropicThinkingBudgetTokens already returns null for non-Anthropic models, model.includes("claude") is redundant. It still matches provider-prefixed IDs like "anthropic/claude-sonnet-4", but it ties the estimate to a substring heuristic that can drift from the resolver's own model detection. Prefer effectiveThinkingBudget > 0 consistently across both routes.

6. ANTHROPIC_COT_BUDGET_MAX=0 semantics are ambiguous

The env-validator accepts 0 and the error message says "0 = no cap", but an operator might expect 0 to mean "block all thinking." parseAnthropicCotBudgetMaxFromEnv also returns null for 0 (= no cap), which is functionally the same but semantically surprising. Consider rejecting 0 in the validator and requiring operators to unset the variable instead.
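A stricter validator of the kind suggested might look like this sketch. The function name isValidCotBudgetMax is hypothetical; the real env-validator entry has its own shape.

```typescript
// Sketch: strict positive-integer check that rejects "0", so "no cap" can
// only be expressed by leaving ANTHROPIC_COT_BUDGET_MAX unset.
function isValidCotBudgetMax(value: string | undefined): boolean {
  if (value === undefined) return true; // unset = no cap
  const trimmed = value.trim();
  // Digits only, no leading zero, so "0", "1.5", and "" are all rejected.
  if (!/^[1-9]\d*$/.test(trimmed)) return false;
  return Number.isSafeInteger(Number(trimmed));
}
```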

Minor / Nits

7. mergeProviderOptions silently clobbers unknown provider keys

Only gateway, anthropic, and google are deep-merged. Any other provider key present in both operands is overwritten by the top-level spread. The comment documents this, but CloudMergedProviderOptions is Record<string, JSONObject> so there is no type-level protection as more providers are added.

8. Credit reservation underestimates for thinking-heavy responses

500 + effectiveThinkingBudget covers worst-case thinking tokens plus a fixed 500-token allowance, so a long text response can still exceed the reservation. This is acknowledged in comments, and billUsage reconciles on finish. Worth noting in operator docs that reservations are estimates that can come in below actual usage.

What is well done

Thinking budget locked to character settings, not request bodies — correct threat model.
resolveAnthropicThinkingBudgetTokens as a single resolution path with env fallback and max-cap clamping is clean.
Explicit 0 passed to background services (SEO, promo, app-builder, twitter) to preserve temperature control, with clear comments explaining why.
failOnInvalid: true on new env vars catches misconfigurations at startup rather than silently at request time.
mergeGoogleImageModalitiesWithAnthropicCot correctly preserves both google and anthropic provider keys via deep merge.

vercel bot had a problem deploying to Preview March 27, 2026 04:15 Failure

odilitime added 3 commits March 27, 2026 08:02

odilitime marked this pull request as ready for review March 27, 2026 08:06

Copilot AI review requested due to automatic review settings March 27, 2026 08:06

odilitime changed the title ~~fix(types): test project, split check-types, UI package, gateways~~ feat: Anthropic CoT controls Mar 27, 2026

Copilot started reviewing on behalf of odilitime March 27, 2026 08:07

vercel bot had a problem deploying to Preview March 27, 2026 08:08 Failure

cursor bot reviewed Mar 27, 2026

packages/lib/api/a2a/skills.ts Outdated

odilitime and others added 2 commits March 27, 2026 08:13

Copilot AI reviewed Mar 27, 2026

vercel bot had a problem deploying to Preview March 27, 2026 08:18 Failure

odilitime added 3 commits March 27, 2026 08:23

packages: improve error handling (env-validator.ts)

9b2bed4

Iteration 1 prr-fix:prrc_kwdop6btdm6yyjtn

packages: add tests for docker-ssh-cloud-deploy.test

d1ba7cb

Iteration 1 prr-fix:prrc_kwdop6btdm6yyjt7

packages: add tests for rate-limit (rate-limit.ts)

fe10e6b

Changes: - rate-limit.ts: PR Review: fix(types): test project, split check-types, UI package, gateways ...

cursor bot reviewed Mar 27, 2026

app: consolidate duplicate logic (route.ts)

8b14b89

Iteration 7 prr-fix:prrc_kwdop6btdm6y8iwm prr-fix:prrc_kwdop6btdm6y8iwq

odilitime added 3 commits March 27, 2026 17:17

app: improve a guard (route.ts)

421fda1

Iteration 1 prr-fix:ic-4143706884-3 prr-fix:ic-4144015244-5

packages: 5. app-builder-ai-sdk.ts inherits CoT without a comment

718e962

Iteration 1 prr-fix:ic-4144023511-5

app: 6. MCP generation tools inherit ANTHROPIC_COT_BUDGET

ddd4e20

Iteration 1 prr-fix:ic-4143990242-4

misc: add tests for anthropic-thinking (anthropic-thinking.ts)

3d6bd66

Iteration 4 prr-fix:ic-4144080947-4

odilitime added 4 commits March 27, 2026 17:21

packages: 7. a2a/skills.ts — no character context for CoT (skills.ts)

d292214

Iteration 1 prr-fix:ic-4144102325-5

packages: 9. Import ordering in runtime-factory.ts

8174ca9

Iteration 2 prr-fix:ic-4143785130-8

app: fix issues in app (route.ts)

4807344

Iteration 2 prr-fix:ic-4140973365-1 prr-fix:ic-4143807945-15

packages: 3. App-builder does not explicitly disable CoT

19f22c5

Iteration 2 prr-fix:ic-4144061538-2

vercel bot had a problem deploying to Preview March 27, 2026 17:24 Failure

cursor bot reviewed Mar 27, 2026

odilitime added 5 commits March 27, 2026 17:33

app: improve a guard (route.ts)

3465972

Iteration 1 prr-fix:ic-4143706884-3 prr-fix:ic-4144015244-5

packages: 5. app-builder-ai-sdk.ts inherits CoT without a comment

e6bb1a8

Iteration 1 prr-fix:ic-4144023511-5

app: 6. MCP generation tools inherit ANTHROPIC_COT_BUDGET

e093709

Iteration 1 prr-fix:ic-4143990242-4

packages: Cursor search/replace artifact left in production code

8d71533

Iteration 1 prr-fix:prrc_kwdop6btdm6y8tp_

misc: improve code quality

67138cc

docs: add review dismissal comments

3b0b079

Explains reasoning for dismissed issues inline in code

vercel bot had a problem deploying to Preview March 27, 2026 17:40 Failure

cursor bot reviewed Mar 27, 2026

-const createDatabaseAdapter = createDatabaseAdapterDefault as (
-  config: { postgresUrl: string },
-  agentId: UUID,
-) => IDatabaseAdapter;
+type CreateDatabaseAdapterFn = (
+  config: { postgresUrl: string },
+  agentId: UUID,
+) => IDatabaseAdapter;
+function ensureCreateDatabaseAdapter(fn: unknown): CreateDatabaseAdapterFn {
+  if (typeof fn !== "function") {
+    throw new TypeError(
+      'Default export from "@elizaos/plugin-sql/node" is not a callable database adapter factory',
+    );
+  }
+  return fn as CreateDatabaseAdapterFn;
+}
+const createDatabaseAdapter = ensureCreateDatabaseAdapter(
+  createDatabaseAdapterDefault,
+);

Conversation

odilitime commented Mar 27, 2026 • edited by cursor bot

Summary by CodeRabbit

vercel bot commented Mar 27, 2026 • edited

coderabbitai bot commented Mar 27, 2026 • edited

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

claude bot commented Mar 27, 2026

claude bot commented Mar 27, 2026

claude bot commented Mar 27, 2026

cursor bot left a comment • edited

odilitime commented Mar 27, 2026

Copilot AI left a comment

Pull request overview

Reviewed changes

Copilot AI Mar 27, 2026

Copilot AI Mar 27, 2026

Copilot AI Mar 27, 2026

odilitime commented Mar 27, 2026

coderabbitai bot commented Mar 27, 2026

odilitime commented Mar 27, 2026

coderabbitai bot commented Mar 27, 2026

claude bot commented Mar 27, 2026

Code Review — PR #417: Anthropic CoT Controls

🐛 Potential Bugs

⚠️ Breaking Change for Developers

🔍 Code Quality

✅ Strengths

cursor bot Mar 27, 2026

A2A route cost estimation inconsistent with MCP route

cursor bot Mar 27, 2026

Duplicate comment block in A2A route handler
