[Fix] count compacted prompt tokens directly #883

Merged
dingyi222666 merged 2 commits into v1-dev from fix/token-compact
May 28, 2026

Conversation

@dingyi222666
Member

This PR fixes token budgeting in the prompt compaction path.

New Features

None

Bug fixes

  • Count runtime input messages directly instead of wrapping their content for token accounting.
  • Count scratchpad and selected history rounds message-by-message so prompt budgeting does not reuse list-level usage metadata baselines.
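
To make the message-by-message budgeting described above concrete, here is a minimal sketch. It is not the code from this PR: `SimpleMessage`, `wordCounter`, `countOne`, and `selectRounds` are hypothetical names, the word-based counter is a stand-in for a real tokenizer, and the "always keep the newest round" policy is an assumption based on the review summary.

```typescript
// Hypothetical shapes for illustration; the real project uses its own
// message and runtime types.
type TokenCounter = (text: string) => Promise<number>

interface SimpleMessage {
    role: string
    content: string
}

// Stand-in counter: one token per whitespace-separated word.
const wordCounter: TokenCounter = async (text) =>
    text.split(/\s+/).filter(Boolean).length

// Count a single message on its own: role plus content.
async function countOne(
    msg: SimpleMessage,
    counter: TokenCounter
): Promise<number> {
    return (await counter(msg.role)) + (await counter(msg.content))
}

// Keep the newest rounds that fit in the budget, counting every message
// individually instead of passing the whole list to a batch counter.
async function selectRounds(
    rounds: SimpleMessage[][],
    budget: number,
    counter: TokenCounter
): Promise<SimpleMessage[][]> {
    const kept: SimpleMessage[][] = []
    let used = 0
    // Walk from the newest round backwards.
    for (const round of [...rounds].reverse()) {
        let roundTokens = 0
        for (const msg of round) {
            roundTokens += await countOne(msg, counter)
        }
        if (used + roundTokens > budget) break
        used += roundTokens
        kept.unshift(round)
    }
    // Assumed policy: keep the newest round even when it exceeds the budget.
    const newest = rounds[rounds.length - 1]
    if (kept.length === 0 && newest !== undefined) kept.push(newest)
    return kept
}
```

Because each message is counted independently, the total for a round depends only on the messages in that round, not on any metadata attached to the list as a whole.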

Other Changes

  • Validation: yarn lint-fix completed with no errors. Existing max-len warnings remain in read_chat_message.ts and extension-agent/src/sub-agent/builtin.ts.

@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

Walkthrough

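Replaces batch message token counting in chat_history with per-message counting (using countMessageTokens), and adjusts countMessageTokens so that, when an AI message has tool_calls, their serialized payload is included in the token total.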

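Changes

Unified refactor of token counting

Layer / File | Summary
Token-counting import update (packages/core/src/llm-core/prompt/chat_history.ts) | Import declaration updated: countMessageTokens replaces the previous countMessagesTokens.
chat_history: per-message counting replaces batch counting (packages/core/src/llm-core/prompt/chat_history.ts) | In the input estimate, agentScratchpad, tail-first accumulation of rounds, and the path that guarantees at least one round, runtime.usedTokens/roundTokens are now accumulated message by message with for...of + await countMessageTokens instead of the previous batch count.
count_tokens: single-message count that includes tool_calls (packages/core/src/llm-core/utils/count_tokens.ts) | countMessageTokens now first accumulates the tokens for content/role/name; then, if the message is an AI message with tool_calls (taken from message.tool_calls or additional_kwargs?.tool_calls), it also counts their JSON.stringify output and returns the accumulated total.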
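
The behavior summarized in the last row can be sketched as follows. This is an illustration under stated assumptions, not the function from count_tokens.ts: `SketchMessage`, `ToolCall`, and `countMessageTokensSketch` are hypothetical, the real code works on the project's message classes and maps message types to roles, and the order in which the two tool_calls sources are consulted is assumed here.

```typescript
// Hypothetical types; the real function operates on the project's own
// message classes.
type TokenCounter = (text: string) => Promise<number>

interface ToolCall {
    name: string
    args: Record<string, unknown>
}

interface SketchMessage {
    role: 'user' | 'assistant' | 'system' | 'tool'
    content: string
    name?: string
    tool_calls?: ToolCall[]
    additional_kwargs?: { tool_calls?: ToolCall[] }
}

async function countMessageTokensSketch(
    message: SketchMessage,
    tokenCounter: TokenCounter
): Promise<number> {
    // Accumulate content, role, and optional name first.
    let tokens =
        (await tokenCounter(message.content)) +
        (await tokenCounter(message.role)) +
        (message.name ? await tokenCounter(message.name) : 0)

    // Assumption: prefer message.tool_calls when it is non-empty, otherwise
    // fall back to additional_kwargs.tool_calls.
    const direct = message.tool_calls
    const toolCalls =
        direct && direct.length > 0
            ? direct
            : message.additional_kwargs?.tool_calls

    // Only assistant (AI) messages contribute a serialized tool-call payload.
    if (
        message.role === 'assistant' &&
        Array.isArray(toolCalls) &&
        toolCalls.length > 0
    ) {
        tokens += await tokenCounter(JSON.stringify(toolCalls))
    }
    return tokens
}
```

Counting the JSON.stringify output is an approximation: providers may serialize tool calls differently, so the result is best read as an estimate for budgeting rather than an exact figure.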

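Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I'm a busy little rabbit hopping along,
Counting tokens one by one, no more piles waiting long,
The words of tool_calls I keep close too,
With loops and sums, the history falls in order anew,
Chats grow clearer, and the budget holds steady and true.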

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title '[Fix] count compacted prompt tokens directly' is directly related to the main change: switching from bulk token-counting to per-message token counting for prompt compaction budgeting.
Description check | ✅ Passed | The description clearly relates to the changeset, explaining the bug fixes for token budgeting including direct counting of runtime input and per-message token counting for scratchpad/history rounds.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors token counting in chat_history.ts by removing the countMessagesTokens helper and instead iterating over messages to count tokens individually using countMessageTokens. It also simplifies input token pre-accounting. The reviewer suggests optimizing performance by replacing sequential await operations inside loops with parallel execution using Promise.all when counting tokens for scratchpads, message rounds, and the last round.
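
The trade-off the reviewer describes can be sketched in isolation. The helpers below are hypothetical (`sumSequential` and `sumConcurrent` do not appear in the PR) and operate on plain strings for brevity. Both return the same total; the concurrent form only reduces latency when the counter does real asynchronous work, and it assumes the counter is safe to call concurrently.

```typescript
type TokenCounter = (text: string) => Promise<number>

// Sequential form: each count waits for the previous one to finish.
async function sumSequential(
    texts: string[],
    counter: TokenCounter
): Promise<number> {
    let total = 0
    for (const text of texts) {
        total += await counter(text)
    }
    return total
}

// Concurrent form: start all counts, then sum the results.
// Assumes the counter has no shared mutable state that concurrent
// calls could disturb.
async function sumConcurrent(
    texts: string[],
    counter: TokenCounter
): Promise<number> {
    const counts = await Promise.all(texts.map((text) => counter(text)))
    return counts.reduce((sum, n) => sum + n, 0)
}
```

If the counter is a CPU-bound tokenizer wrapped in a promise, the two forms will likely perform about the same, since the work still runs on one thread.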

Comment thread packages/core/src/llm-core/prompt/chat_history.ts
Comment thread packages/core/src/llm-core/prompt/chat_history.ts
Comment thread packages/core/src/llm-core/prompt/chat_history.ts
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
packages/core/src/llm-core/prompt/chat_history.ts (1)

37-42: ⚡ Quick win

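Per-message counts that could run in parallel are awaited serially; consider aggregating them concurrently.

These three message counts are independent of one another, so they can run concurrently with Promise.all(...map(...)) and then be summed, which reduces latency and follows the repository rules. Before parallelizing, first confirm that countMessageTokens has no visible side effects on the shared runtime.tokenCounter.

♻️ Suggested change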
             if (Array.isArray(runtime.agentScratchpad)) {
-                for (const msg of runtime.agentScratchpad) {
-                    runtime.usedTokens += await countMessageTokens(
-                        msg,
-                        runtime.tokenCounter
-                    )
-                }
+                runtime.usedTokens += (
+                    await Promise.all(
+                        runtime.agentScratchpad.map((msg) =>
+                            countMessageTokens(msg, runtime.tokenCounter)
+                        )
+                    )
+                ).reduce((sum, n) => sum + n, 0)
             } else {
                 runtime.usedTokens += await countMessageTokens(
                     runtime.agentScratchpad as BaseMessage,
@@
-            let roundTokens = 0
-            for (const msg of round) {
-                roundTokens += await countMessageTokens(
-                    msg,
-                    runtime.tokenCounter
-                )
-            }
+            const roundTokens = (
+                await Promise.all(
+                    round.map((msg) =>
+                        countMessageTokens(msg, runtime.tokenCounter)
+                    )
+                )
+            ).reduce((sum, n) => sum + n, 0)
@@
-            for (const msg of lastRound) {
-                usedTokens += await countMessageTokens(
-                    msg,
-                    runtime.tokenCounter
-                )
-            }
+            usedTokens += (
+                await Promise.all(
+                    lastRound.map((msg) =>
+                        countMessageTokens(msg, runtime.tokenCounter)
+                    )
+                )
+            ).reduce((sum, n) => sum + n, 0)

As per coding guidelines "ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE".

Also applies to: 63-68, 90-95

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/llm-core/prompt/chat_history.ts` around lines 37 - 42, The
loop that serially awaits countMessageTokens for each msg (iterating
runtime.agentScratchpad and incrementing runtime.usedTokens) should be converted
to a parallel aggregation using Promise.all over
runtime.agentScratchpad.map(...) and then summing the results into
runtime.usedTokens; first confirm countMessageTokens does not mutate or rely on
shared state in runtime.tokenCounter (or otherwise make a thread-safe clone)
before parallelizing, and apply the same change to the other two similar blocks
(the occurrences mentioned at lines ~63-68 and ~90-95) so all independent
message token counts run concurrently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a3f4e902-a63d-4113-92fa-6dfe671ff047

📥 Commits

Reviewing files that changed from the base of the PR and between b272a80 and 15195df.

📒 Files selected for processing (1)
  • packages/core/src/llm-core/prompt/chat_history.ts


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 15195dfd6e

Comment thread packages/core/src/llm-core/prompt/chat_history.ts
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
packages/core/src/llm-core/utils/count_tokens.ts (2)

230-233: ⚡ Quick win

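Consider parallelizing the token-counting calls.

The three tokenCounter calls are independent of one another and can run in parallel with Promise.all to reduce latency. Because this function is called for every message inside the chat-history truncation loop (see chat_history.ts), parallelizing can improve performance. The coding guidelines require: "ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE".

⚡ Suggested parallelization refactor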
-    let tokens =
-        (await tokenCounter(content)) +
-        (await tokenCounter(messageTypeToOpenAIRole(message.getType()))) +
-        (message.name ? await tokenCounter(message.name) : 0)
+    const [contentTokens, roleTokens, nameTokens] = await Promise.all([
+        tokenCounter(content),
+        tokenCounter(messageTypeToOpenAIRole(message.getType())),
+        message.name ? tokenCounter(message.name) : Promise.resolve(0)
+    ])
+    let tokens = contentTokens + roleTokens + nameTokens
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/llm-core/utils/count_tokens.ts` around lines 230 - 233, The
three independent tokenCounter calls should be executed in parallel: replace the
sequential awaits used to compute tokens (calls to tokenCounter(content),
tokenCounter(messageTypeToOpenAIRole(message.getType())), and
tokenCounter(message.name) when present) with a Promise.all that runs them
concurrently and then sum the resolved counts into the tokens variable; preserve
the conditional 0 for missing message.name and keep the same variable names
(tokens, tokenCounter, messageTypeToOpenAIRole, message.getType(),
message.name).

239-239: 💤 Low value

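The defensive Array.isArray check can be simplified.

The Array.isArray check is defensive programming. Following the codebase's patterns (see output_parser.ts:161 and model.ts:383), the type can be trusted directly with optional chaining: toolCalls?.length > 0. The coding guidelines require: "Do NOT add defensive/fallback checks; use the most probable type directly".

♻️ Suggested simplification refactor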
-    if (Array.isArray(toolCalls) && toolCalls.length > 0) {
+    if (toolCalls?.length > 0) {
         tokens += await tokenCounter(JSON.stringify(toolCalls))
     }
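
One point worth checking before adopting the optional-chaining form: `toolCalls?.length` has type `number | undefined` when `toolCalls` may be undefined, and with strictNullChecks enabled TypeScript typically rejects comparing that value with `>`. Whether this applies depends on the declared type of toolCalls and on the repository's compiler settings, neither of which is shown here. A minimal sketch of a variant that type-checks either way, using a hypothetical `hasToolCalls` helper and a stand-in `ToolCall` type:

```typescript
// Hypothetical helper; `ToolCall` stands in for the project's real type.
interface ToolCall {
    name: string
}

function hasToolCalls(toolCalls: ToolCall[] | undefined): boolean {
    // Defaulting the length to 0 keeps the comparison well-typed
    // under strictNullChecks.
    return (toolCalls?.length ?? 0) > 0
}
```

This keeps the intent of the suggestion (trust the declared type, drop Array.isArray) without relying on a comparison against a possibly undefined value.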
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/llm-core/utils/count_tokens.ts` at line 239, Replace the
defensive Array.isArray check with the project's expected direct usage of the
value: change the condition that uses Array.isArray(toolCalls) &&
toolCalls.length > 0 to use optional chaining like toolCalls?.length > 0 so the
code trusts the declared type; update the conditional at the place referencing
toolCalls (in count_tokens.ts where toolCalls is used) and remove the redundant
Array.isArray check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d7a17628-abb8-4465-9afe-0ccf45f76e8d

📥 Commits

Reviewing files that changed from the base of the PR and between 15195df and 2af5db1.

📒 Files selected for processing (1)
  • packages/core/src/llm-core/utils/count_tokens.ts

@dingyi222666 dingyi222666 merged commit 3bf7d6f into v1-dev May 28, 2026
5 checks passed
@dingyi222666 dingyi222666 deleted the fix/token-compact branch May 28, 2026 05:58