[Fix] count compacted prompt tokens directly#883
Conversation
Walkthrough将 chat_history 中的批量消息令牌计数替换为逐条计数(使用 countMessageTokens),并调整 countMessageTokens 以在 AI 消息存在 tool_calls 时将其序列化负载计入令牌总数。 变更令牌计数方法统一重构
估计代码审查工作量🎯 3 (Moderate) | ⏱️ ~20 分钟 可能相关的 PR
诗歌
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors token counting in chat_history.ts by removing the countMessagesTokens helper and instead iterating over messages to count tokens individually using countMessageTokens. It also simplifies input token pre-accounting. The reviewer suggests optimizing performance by replacing sequential await operations inside loops with parallel execution using Promise.all when counting tokens for scratchpads, message rounds, and the last round.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/core/src/llm-core/prompt/chat_history.ts (1)
37-42: ⚡ Quick win可并行的逐条计数被串行
await,建议改为并发聚合。这三处消息计数彼此独立,可用
Promise.all(...map(...))并发后求和,减少延迟并符合仓库规则。请先确认countMessageTokens对共享runtime.tokenCounter没有可见副作用后并发化。♻️ 建议修改
if (Array.isArray(runtime.agentScratchpad)) { - for (const msg of runtime.agentScratchpad) { - runtime.usedTokens += await countMessageTokens( - msg, - runtime.tokenCounter - ) - } + runtime.usedTokens += ( + await Promise.all( + runtime.agentScratchpad.map((msg) => + countMessageTokens(msg, runtime.tokenCounter) + ) + ) + ).reduce((sum, n) => sum + n, 0) } else { runtime.usedTokens += await countMessageTokens( runtime.agentScratchpad as BaseMessage, @@ - let roundTokens = 0 - for (const msg of round) { - roundTokens += await countMessageTokens( - msg, - runtime.tokenCounter - ) - } + const roundTokens = ( + await Promise.all( + round.map((msg) => + countMessageTokens(msg, runtime.tokenCounter) + ) + ) + ).reduce((sum, n) => sum + n, 0) @@ - for (const msg of lastRound) { - usedTokens += await countMessageTokens( - msg, - runtime.tokenCounter - ) - } + usedTokens += ( + await Promise.all( + lastRound.map((msg) => + countMessageTokens(msg, runtime.tokenCounter) + ) + ) + ).reduce((sum, n) => sum + n, 0)As per coding guidelines "ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE".
Also applies to: 63-68, 90-95
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/llm-core/prompt/chat_history.ts` around lines 37 - 42, The loop that serially awaits countMessageTokens for each msg (iterating runtime.agentScratchpad and incrementing runtime.usedTokens) should be converted to a parallel aggregation using Promise.all over runtime.agentScratchpad.map(...) and then summing the results into runtime.usedTokens; first confirm countMessageTokens does not mutate or rely on shared state in runtime.tokenCounter (or otherwise make a thread-safe clone) before parallelizing, and apply the same change to the other two similar blocks (the occurrences mentioned at lines ~63-68 and ~90-95) so all independent message token counts run concurrently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/core/src/llm-core/prompt/chat_history.ts`:
- Around line 37-42: The loop that serially awaits countMessageTokens for each
msg (iterating runtime.agentScratchpad and incrementing runtime.usedTokens)
should be converted to a parallel aggregation using Promise.all over
runtime.agentScratchpad.map(...) and then summing the results into
runtime.usedTokens; first confirm countMessageTokens does not mutate or rely on
shared state in runtime.tokenCounter (or otherwise make a thread-safe clone)
before parallelizing, and apply the same change to the other two similar blocks
(the occurrences mentioned at lines ~63-68 and ~90-95) so all independent
message token counts run concurrently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a3f4e902-a63d-4113-92fa-6dfe671ff047
📒 Files selected for processing (1)
packages/core/src/llm-core/prompt/chat_history.ts
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 15195dfd6e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/core/src/llm-core/utils/count_tokens.ts (2)
230-233: ⚡ Quick win建议并行化令牌计数调用。
三个
tokenCounter调用是相互独立的,可以使用Promise.all并行执行以减少延迟。由于此函数在聊天历史截断循环中针对每条消息调用(参见chat_history.ts),并行化可以提升性能。编码准则要求:"ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE"。⚡ 建议的并行化重构
- let tokens = - (await tokenCounter(content)) + - (await tokenCounter(messageTypeToOpenAIRole(message.getType()))) + - (message.name ? await tokenCounter(message.name) : 0) + const [contentTokens, roleTokens, nameTokens] = await Promise.all([ + tokenCounter(content), + tokenCounter(messageTypeToOpenAIRole(message.getType())), + message.name ? tokenCounter(message.name) : Promise.resolve(0) + ]) + let tokens = contentTokens + roleTokens + nameTokens🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/llm-core/utils/count_tokens.ts` around lines 230 - 233, The three independent tokenCounter calls should be executed in parallel: replace the sequential awaits used to compute tokens (calls to tokenCounter(content), tokenCounter(messageTypeToOpenAIRole(message.getType())), and tokenCounter(message.name) when present) with a Promise.all that runs them concurrently and then sum the resolved counts into the tokens variable; preserve the conditional 0 for missing message.name and keep the same variable names (tokens, tokenCounter, messageTypeToOpenAIRole, message.getType(), message.name).
239-239: 💤 Low value可简化防御性 Array.isArray 检查。
Array.isArray检查属于防御性编程。根据代码库模式(参见output_parser.ts:161和model.ts:383),可直接信任类型并使用可选链:toolCalls?.length > 0。编码准则要求:"Do NOT add defensive/fallback checks; use the most probable type directly"。♻️ 建议的简化重构
- if (Array.isArray(toolCalls) && toolCalls.length > 0) { + if (toolCalls?.length > 0) { tokens += await tokenCounter(JSON.stringify(toolCalls)) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/llm-core/utils/count_tokens.ts` at line 239, Replace the defensive Array.isArray check with the project's expected direct usage of the value: change the condition that uses Array.isArray(toolCalls) && toolCalls.length > 0 to use optional chaining like toolCalls?.length > 0 so the code trusts the declared type; update the conditional at the place referencing toolCalls (in count_tokens.ts where toolCalls is used) and remove the redundant Array.isArray check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/core/src/llm-core/utils/count_tokens.ts`:
- Around line 230-233: The three independent tokenCounter calls should be
executed in parallel: replace the sequential awaits used to compute tokens
(calls to tokenCounter(content),
tokenCounter(messageTypeToOpenAIRole(message.getType())), and
tokenCounter(message.name) when present) with a Promise.all that runs them
concurrently and then sum the resolved counts into the tokens variable; preserve
the conditional 0 for missing message.name and keep the same variable names
(tokens, tokenCounter, messageTypeToOpenAIRole, message.getType(),
message.name).
- Line 239: Replace the defensive Array.isArray check with the project's
expected direct usage of the value: change the condition that uses
Array.isArray(toolCalls) && toolCalls.length > 0 to use optional chaining like
toolCalls?.length > 0 so the code trusts the declared type; update the
conditional at the place referencing toolCalls (in count_tokens.ts where
toolCalls is used) and remove the redundant Array.isArray check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d7a17628-abb8-4465-9afe-0ccf45f76e8d
📒 Files selected for processing (1)
packages/core/src/llm-core/utils/count_tokens.ts
This pr fixes token budgeting in the prompt compaction path.
New Features
None
Bug fixes
Other Changes
yarn lint-fixcompleted with no errors. Existing max-len warnings remain inread_chat_message.tsandextension-agent/src/sub-agent/builtin.ts.