Context
Follow-up to #140.
Issue #140 described a token blow-up caused by base64 video/preview image payloads entering model context. The exact implementation described there does not match the current MasterSelects checkout: the referenced workers/agent/src/* files, generateText/streamText, AI SDK toModelOutput, and scan-video handler are not present on staging or the current remote branches.
However, the underlying risk is real in the current app.
Verified Current-Code Risk
Current preview tools return base64 PNG data URLs:
- src/services/aiTools/handlers/preview.ts
  - captureFrame returns dataUrl
  - getCutPreviewQuad delegates to frame-grid capture
  - getFramesAtTimes delegates to frame-grid capture
- src/services/aiTools/utils.ts
  - captureFrameGrid() serializes a PNG grid via gridCanvas.toDataURL('image/png')
The immediate tool follow-up path is partially protected:
- src/components/panels/AIChatPanel.tsx
  - formatToolResultForApi() truncates tool results before sending them back to the model in the same loop
  - current caps: MAX_TOOL_RESULT_MESSAGE_CHARS = 12000, MAX_TOOL_RESULT_STRING_CHARS = 1200
But the full tool result is still stored in chat state for UI/history:
content: JSON.stringify(result, null, 2)
Later API-message rebuilding can include old tool messages as-is:
So a full base64 dataUrl from captureFrame, getCutPreviewQuad, or getFramesAtTimes can re-enter the model context on a later user turn through conversation history, even if the immediate tool-result follow-up was truncated.
Why This Matters
A single preview/grid image can be hundreds of KB to multiple MB as base64 text. If preserved in chat history and sent to a text model as a tool message, it can cause:
- excessive prompt tokens
- model/API context overflows
- avoidable hosted-AI cost
- degraded editor-agent reliability after visual tool usage
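To make the size concern concrete, here is a rough back-of-the-envelope sketch. The helper names and the 500 KB example payload are hypothetical, and the characters-per-token ratio is an assumed heuristic rather than a measured value for any particular tokenizer. The only firm fact used is that base64 encodes every 3 bytes as 4 characters.

```typescript
// Length of the padded base64 text for a payload of `byteLength` bytes.
// Base64 maps each group of 3 bytes to 4 characters.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Very rough token estimate. `charsPerToken` is an assumed heuristic;
// real tokenizers may do better or worse on base64 text.
function roughTokenEstimate(charCount: number, charsPerToken = 4): number {
  return Math.ceil(charCount / charsPerToken);
}

const pngBytes = 500 * 1024; // hypothetical 500 KB preview grid PNG
const base64Chars = base64Length(pngBytes);

console.log(`base64 characters: ${base64Chars}`);
console.log(`rough token estimate: ${roughTokenEstimate(base64Chars)}`);
```

Under these assumptions, a single 500 KB grid is more than 50 times larger than the 12000-character same-turn cap, which is why an untruncated copy in history matters even when the immediate follow-up is capped.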
Proposed Fix
Separate UI-visible tool results from model-visible tool results.
Suggested direction:
- Keep full image data only in UI/local state where needed for preview display.
- Store model-visible tool messages in sanitized form immediately, not just at send time.
- Replace base64 dataUrl fields with metadata/handles, for example:
  {
    "success": true,
    "data": {
      "width": 1280,
      "height": 360,
      "frameCount": 8,
      "gridSize": "4x2",
      "image": "[preview image omitted from text context]"
    }
  }
- If visual model input is needed later, add an explicit image-part path rather than serializing base64 as text.
- Add regression tests around formatToolResultForApi() / chat-history serialization so data:image/...;base64,... cannot appear in outgoing messages.
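The suggested direction could be sketched roughly as follows. This is a minimal illustration, not the existing implementation: sanitizeForModel, DATA_IMAGE_URL, and PLACEHOLDER are hypothetical names, and the result shape is assumed from the example above. The idea is to derive the model-visible content once, when the tool result is stored, while the UI keeps the original object.

```typescript
// Matches strings that are base64 image data URLs (assumed pattern).
const DATA_IMAGE_URL = /^data:image\/[a-z0-9.+-]+;base64,/i;
const PLACEHOLDER = "[preview image omitted from text context]";

// Returns a deep copy of `value` in which every base64 image data URL
// string is replaced by a short placeholder. The input is not mutated.
function sanitizeForModel(value: unknown): unknown {
  if (typeof value === "string") {
    return DATA_IMAGE_URL.test(value) ? PLACEHOLDER : value;
  }
  if (Array.isArray(value)) {
    return value.map(sanitizeForModel);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = sanitizeForModel(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}

// Hypothetical tool result as the UI would keep it.
const uiResult = {
  success: true,
  data: { width: 1280, height: 360, dataUrl: "data:image/png;base64,iVBORw0KGgo=" },
};

// What would be stored as the model-visible tool message content.
const modelContent = JSON.stringify(sanitizeForModel(uiResult), null, 2);
console.log(modelContent);
```

Matching on the string value rather than on a field name such as dataUrl means frame-grid results with differently named fields would be covered too, at the cost of walking the whole result object.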
Acceptance Criteria
- Outgoing OpenAI/hosted chat messages never contain raw data:image/...;base64, strings from tool history.
- Preview tools still show captured images in the UI where intended.
- Same-turn tool follow-up remains concise and useful.
- A regression test covers a prior captureFrame or getCutPreviewQuad result in chat history being rebuilt into API messages.
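One way the first and last criteria might be checked is with a small guard run over the rebuilt message list. The sketch below is an assumption-laden illustration: ApiMessage and messagesWithRawImageData are hypothetical names, the message shape is simplified, and the real API-message rebuilding code in AIChatPanel.tsx is not reproduced here, so the fixtures stand in for its output.

```typescript
// Simplified, assumed shape of an outgoing chat message.
interface ApiMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Matches a raw base64 image data URL anywhere inside a string.
const RAW_IMAGE_DATA_URL = /data:image\/[a-z0-9.+-]+;base64,/i;

// Returns the indexes of messages whose content still embeds raw image data.
function messagesWithRawImageData(messages: ApiMessage[]): number[] {
  const hits: number[] = [];
  messages.forEach((message, index) => {
    if (RAW_IMAGE_DATA_URL.test(message.content)) hits.push(index);
  });
  return hits;
}

// Fixture: history as it could look if the full captureFrame result is stored as-is.
const leakedHistory: ApiMessage[] = [
  { role: "user", content: "Capture the current frame." },
  {
    role: "tool",
    content: JSON.stringify(
      { success: true, data: { dataUrl: "data:image/png;base64,iVBORw0KGgo=" } },
      null,
      2,
    ),
  },
  { role: "user", content: "Now trim the clip." },
];

// Fixture: the same history with the tool message stored in sanitized form.
const sanitizedHistory: ApiMessage[] = leakedHistory.map((message) =>
  message.role === "tool"
    ? {
        ...message,
        content: JSON.stringify({
          success: true,
          data: { image: "[preview image omitted from text context]" },
        }),
      }
    : message,
);

console.log(messagesWithRawImageData(leakedHistory));
console.log(messagesWithRawImageData(sanitizedHistory));
```

In an actual regression test, the fixtures would be replaced by the output of the real rebuild path for a chat state containing a prior captureFrame or getCutPreviewQuad result, and the assertion would be that the returned list is empty.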
Notes
This is related to #140's base64-token issue, but the correct fix for the current codebase is not AI SDK toModelOutput; it is sanitized chat-history/model-context serialization for preview tool results.