feat: reduce agent token usage by 84% per deployment #10

Merged
raghavyuva merged 1 commit into main from feat/token-optimization
Apr 8, 2026
@raghavyuva

Summary

  • Token reduction: Simple deploy dropped from 724K → 113K input tokens (84%), complex deploy from 881K → 275K (69%)
  • Cost reduction: $0.89 → $0.13 per simple deploy, $0.93 → $0.29 per complex deploy
  • Agent architecture consolidation: Single delegateTool, instruction skills loaded on demand, trimmed workspace tools
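
The percentages quoted above follow directly from the before/after token counts. A minimal sketch of the arithmetic (the helper name `percentReduction` is illustrative, not part of this PR):

```typescript
// Percentage reduction between a "before" and "after" measurement,
// rounded to the nearest whole percent.
function percentReduction(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

// Input-token figures (in thousands) taken from the summary above.
const simpleDeploy = percentReduction(724, 113); // 84
const complexDeploy = percentReduction(881, 275); // 69
console.log({ simpleDeploy, complexDeploy });
```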

Changes

Context pruning (biggest win)

  • Added TokenLimiterProcessor (128K cap) to prevent unbounded context growth within a run
  • Built ToolResultPruner processor to strip transient search_tools/load_tool results from conversation history
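
For readers unfamiliar with the idea, the two processors can be sketched roughly as below. This is not the code from this PR: the `ChatMessage` shape, the function names, and the 4-characters-per-token estimate are all assumptions made for illustration, and the real processors plug into the agent framework's own message types.

```typescript
// Hypothetical message shape; the framework's real types will differ.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  toolName?: string; // set only on tool-result messages
}

// Tools whose results only matter for the step that requested them.
const TRANSIENT_TOOLS = new Set(["search_tools", "load_tool"]);

// Drop results of transient tools from the conversation history.
function pruneToolResults(history: ChatMessage[]): ChatMessage[] {
  return history.filter(
    (m) => !(m.role === "tool" && m.toolName !== undefined && TRANSIENT_TOOLS.has(m.toolName)),
  );
}

// Rough token estimate (assumption: about 4 characters per token).
function estimateTokens(m: ChatMessage): number {
  return Math.ceil(m.content.length / 4);
}

// Keep all system messages (moved to the front), then as many of the
// newest remaining messages as fit in the budget, in original order.
function limitTokens(history: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  let budget = maxTokens - system.reduce((sum, m) => sum + estimateTokens(m), 0);
  const kept: ChatMessage[] = [];
  for (const m of [...rest].reverse()) {
    const cost = estimateTokens(m);
    if (cost > budget) break; // stop at the first message that does not fit
    budget -= cost;
    kept.unshift(m);
  }
  return [...system, ...kept];
}
```

Pruning before limiting means the budget is spent on messages that still carry information, rather than on tool-discovery output the agent has already acted on.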

Tool routing fixes

  • Introduced queryKeys in tool-factory.ts to correctly route id as a query param instead of body, fixing add_application_domain 400 errors that caused retry spirals
  • Moved createProject and generateRandomSubdomain to core tools so they're always available without search/load overhead
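
The `queryKeys` routing can be pictured as a split of the tool arguments into query parameters and a request body. The sketch below assumes a hypothetical `ToolSpec` shape and `routeArgs` helper; only the `queryKeys` name, the `id` parameter, and the `add_application_domain` tool come from this PR, and the argument values are made up.

```typescript
// Hypothetical tool definition; only `queryKeys` is named in this PR.
interface ToolSpec {
  name: string;
  queryKeys?: string[]; // argument names sent as URL query parameters
}

interface RoutedRequest {
  query: Record<string, string>;
  body: Record<string, unknown>;
}

// Send arguments listed in `queryKeys` to the query string,
// and everything else to the request body.
function routeArgs(spec: ToolSpec, args: Record<string, unknown>): RoutedRequest {
  const queryKeys = new Set(spec.queryKeys ?? []);
  const routed: RoutedRequest = { query: {}, body: {} };
  for (const [key, value] of Object.entries(args)) {
    if (queryKeys.has(key)) {
      routed.query[key] = String(value);
    } else {
      routed.body[key] = value;
    }
  }
  return routed;
}

// Illustrative call: `id` goes to the query string, `domain` stays in the body.
const routed = routeArgs(
  { name: "add_application_domain", queryKeys: ["id"] },
  { id: "app_123", domain: "example.com" },
);
console.log(routed);
```

Without the split, `id` would be sent in the body, which is the kind of mismatch that produced the 400 errors described above.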

Skill loading fix

  • Fixed the SKILLS_SOURCE path in workspace-factory.ts by using import.meta.url instead of process.cwd(), so it resolves correctly in both dev (tsx) and production (Docker bundled .mjs)
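
The reason `import.meta.url` helps is that it anchors the path to the module file itself, whereas `process.cwd()` depends on where the process was started. A minimal sketch of that resolution, assuming a hypothetical helper `resolveFromModule` and a hypothetical bundle file name `index.mjs` (the actual code in workspace-factory.ts may differ):

```typescript
// Resolve a path relative to the module's own location rather than the
// process working directory. `moduleUrl` is expected to be import.meta.url.
function resolveFromModule(moduleUrl: string, relativePath: string): string {
  // new URL() resolves `relativePath` against the module file's URL.
  // Note: .pathname stays percent-encoded; in Node, fileURLToPath()
  // from "node:url" is the safer way to turn the URL into a file path.
  return new URL(relativePath, moduleUrl).pathname;
}

// Hypothetical bundled entry point inside the Docker layout.
const dockerModule = "file:///app/.mastra/output/index.mjs";
console.log(resolveFromModule(dockerModule, "../../skills")); // /app/skills
```

In real code the first argument would be `import.meta.url`, so the same relative path works regardless of the directory the process is launched from.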

Response compaction

  • Enhanced compact-output.ts with recursive nested object flattening (e.g. GitHub owner objects)
  • Embedded critical domain and monitoring rules directly in system prompt as fallback
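
As an illustration of recursive flattening, the sketch below collapses nested objects into dotted keys and drops null values. Both of those choices, along with the `Json` type and the `flatten` name, are assumptions for the example; compact-output.ts may use different rules.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Recursively flatten nested objects into a single level with dotted keys.
// Scalars and arrays are kept as-is; null values are dropped (assumption).
function flatten(
  value: Json,
  prefix = "",
  out: Record<string, Json> = {},
): Record<string, Json> {
  if (value !== null && typeof value === "object" && !Array.isArray(value)) {
    for (const [key, child] of Object.entries(value)) {
      flatten(child, prefix ? `${prefix}.${key}` : key, out);
    }
  } else if (value !== null && prefix) {
    out[prefix] = value;
  }
  return out;
}

// Example with a GitHub-style nested owner object (values are made up).
const compact = flatten({
  name: "repo",
  owner: { login: "octocat", id: 1, avatar: null },
});
console.log(compact);
```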

Production logging

  • Default log level to info in production, debug in development
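
One way such a default might be derived is from the environment name. The sketch assumes the switch is driven by `NODE_ENV`, which this PR does not state, and the function name is illustrative.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Pick the default log level from the environment name.
// Assumption: "production" is signalled via NODE_ENV.
function defaultLogLevel(nodeEnv: string | undefined): LogLevel {
  return nodeEnv === "production" ? "info" : "debug";
}

console.log(defaultLogLevel("production")); // info
console.log(defaultLogLevel(undefined)); // debug
```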

Test plan

  • All 468 existing tests pass
  • Build succeeds (mastra build)
  • Verified simple deploy (vanilla-js): 113K tokens, $0.13
  • Verified complex deploy (django-cms with Dockerfile gen): 275K tokens, $0.29
  • Verified import.meta.url resolves correctly in Docker layout (/app/.mastra/output/../../skills → /app/skills/)
  • Skills load successfully (hasSkills: true in workspace logs)
  • Domain attached at project creation time (no post-deploy add_application_domain needed)

- Add TokenLimiterProcessor (128K) and ToolResultPruner to strip
  search_tools/load_tool noise from conversation context
- Introduce queryKeys in tool-factory for correct param routing,
  fixing add_application_domain error→retry loops
- Move createProject and generateRandomSubdomain to core tools,
  embed domain/monitoring rules in system prompt
- Fix workspace-factory SKILLS_SOURCE path via import.meta.url
  so skills load correctly in both dev and Docker
- Add recursive compaction in compact-output for nested objects
- Consolidate agent architecture: single delegateTool, instruction
  skills, trimmed workspace tools
- Default log level to info in production

Reduces simple deploy from 724K to 113K input tokens ($0.89→$0.13).
@raghavyuva merged commit c35f4d7 into main on Apr 8, 2026
1 check passed
@raghavyuva deleted the feat/token-optimization branch on April 8, 2026, 19:18