[Fix] count compacted prompt tokens directly by dingyi222666 · Pull Request #883 · ChatLunaLab/chatluna

dingyi222666 · 2026-05-27T10:51:33Z

This PR fixes token budgeting in the prompt compaction path.

New Features

None

Bug fixes

Count runtime input messages directly instead of wrapping their content for token accounting.
Count scratchpad and selected history rounds message-by-message so prompt budgeting does not reuse list-level usage metadata baselines.
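The two fixes above can be pictured as a budget loop that counts every message on its own instead of handing a whole list to a batch helper. The following is a minimal sketch under stated assumptions, not the code from this PR: `Msg`, `selectRounds`, and the simplified `countMessageTokens` are hypothetical stand-ins, and the rule that the newest round is always kept is inferred from the review walkthrough rather than copied from `chat_history.ts`.

```typescript
// Hypothetical, simplified stand-ins; the real types live in chatluna and LangChain.
type Msg = { role: string; content: string }
type TokenCounter = (text: string) => Promise<number>

// Simplified per-message count: content plus role (an assumption, not the real helper).
async function countMessageTokens(
    msg: Msg,
    counter: TokenCounter
): Promise<number> {
    return (await counter(msg.content)) + (await counter(msg.role))
}

// Walk rounds from newest to oldest and keep each round that still fits the budget.
// The newest round is always kept, even when it alone exceeds the budget.
async function selectRounds(
    rounds: Msg[][],
    budget: number,
    counter: TokenCounter
): Promise<{ kept: Msg[][]; usedTokens: number }> {
    const kept: Msg[][] = []
    let usedTokens = 0

    for (const round of [...rounds].reverse()) {
        // Count message by message; no list-level total is reused.
        let roundTokens = 0
        for (const msg of round) {
            roundTokens += await countMessageTokens(msg, counter)
        }

        if (kept.length > 0 && usedTokens + roundTokens > budget) break

        kept.unshift(round) // restore chronological order
        usedTokens += roundTokens
    }

    return { kept, usedTokens }
}
```

Because each round's total is rebuilt from its own messages, a cached count attached to one list cannot leak into the budget of another.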

Other Changes

Validation: yarn lint-fix completed with no errors. Existing max-len warnings remain in read_chat_message.ts and extension-agent/src/sub-agent/builtin.ts.

coderabbitai · 2026-05-27T10:51:47Z

Walkthrough

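Replaces the batch message token counting in chat_history with per-message counting (using countMessageTokens), and adjusts countMessageTokens so that, when an AI message carries tool_calls, their serialized payload is included in the token total.

Changes

Unified refactor of the token-counting approach

Layer / File	Summary
Token-counting import update `packages/core/src/llm-core/prompt/chat_history.ts`	Import declaration updated: `countMessageTokens` replaces the former `countMessagesTokens`.
chat_history: per-message counting replaces batch counting `packages/core/src/llm-core/prompt/chat_history.ts`	In the input estimate, the `agentScratchpad` path, the tail-first accumulation of rounds, and the path that guarantees at least one round, `for...of` with `await countMessageTokens` now accumulates `runtime.usedTokens`/`roundTokens` message by message, replacing the earlier batch count.
count_tokens: single-message count that includes tool_calls `packages/core/src/llm-core/utils/count_tokens.ts`	`countMessageTokens` now first accumulates the tokens for content/role/name; then, if the message is an AI message with `tool_calls` (taken from `message.tool_calls` or `additional_kwargs?.tool_calls`), the JSON.stringify output of those calls is also counted, and the accumulated total is returned.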
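The `count_tokens` row can be illustrated with a small sketch. It assumes a simplified message shape (`SketchMessage`) instead of LangChain's `BaseMessage`, uses the raw message type where the real code calls `messageTypeToOpenAIRole(message.getType())`, and treats an empty `tool_calls` array as absent; all of these are assumptions for illustration, not the repository's implementation.

```typescript
type TokenCounter = (text: string) => Promise<number>

// Hypothetical, reduced message shape used only for this sketch.
interface SketchMessage {
    type: 'human' | 'ai' | 'system' | 'tool'
    content: string
    name?: string
    tool_calls?: unknown[]
    additional_kwargs?: { tool_calls?: unknown[] }
}

async function countMessageTokensSketch(
    message: SketchMessage,
    tokenCounter: TokenCounter
): Promise<number> {
    // Base cost: content, role, and the optional name.
    let tokens =
        (await tokenCounter(message.content)) +
        (await tokenCounter(message.type)) +
        (message.name ? await tokenCounter(message.name) : 0)

    if (message.type === 'ai') {
        // Prefer the top-level field, fall back to additional_kwargs.
        const direct = message.tool_calls
        const toolCalls =
            direct && direct.length > 0
                ? direct
                : message.additional_kwargs?.tool_calls

        if (toolCalls && toolCalls.length > 0) {
            // The serialized payload is what the model actually receives.
            tokens += await tokenCounter(JSON.stringify(toolCalls))
        }
    }

    return tokens
}
```

Without the `tool_calls` term, an AI message whose content is empty but which requests several tools would be budgeted at close to zero tokens.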

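Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

ChatLunaLab/chatluna#664: Related to how tool-related messages are counted and filtered in the context budget.
ChatLunaLab/chatluna#602: Also involves the use of countMessageTokens and changes to the truncation/budget logic.
ChatLunaLab/chatluna#874: Directly related to context compaction and the logic that triggers the token budget; changes to the counting implementation will affect its behavior.

Poem

🐰 I'm a busy little rabbit, hopping into view,
Counting tokens one by one, no more piles to chew,
The words of tool_calls I remember and hold near,
With loops and running sums, history lines up clear,
Chats grow sharper, and the budget stays steady too.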

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[Fix] count compacted prompt tokens directly' is directly related to the main change: switching from bulk token-counting to per-message token counting for prompt compaction budgeting.
Description check	✅ Passed	The description clearly relates to the changeset, explaining the bug fixes for token budgeting including direct counting of runtime input and per-message token counting for scratchpad/history rounds.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

gemini-code-assist

Code Review

This pull request refactors token counting in chat_history.ts by removing the countMessagesTokens helper and instead iterating over messages to count tokens individually using countMessageTokens. It also simplifies input token pre-accounting. The reviewer suggests optimizing performance by replacing sequential await operations inside loops with parallel execution using Promise.all when counting tokens for scratchpads, message rounds, and the last round.

coderabbitai

🧹 Nitpick comments (1)

packages/core/src/llm-core/prompt/chat_history.ts (1)

37-42: ⚡ Quick win

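Per-message counts that could run in parallel are awaited serially; consider aggregating them concurrently.

These three message counts are independent of one another, so they can be run concurrently with Promise.all(...map(...)) and then summed, which reduces latency and follows the repository rules. Before parallelizing, first confirm that countMessageTokens has no visible side effects on the shared runtime.tokenCounter.

♻️ Suggested change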

             if (Array.isArray(runtime.agentScratchpad)) {
-                for (const msg of runtime.agentScratchpad) {
-                    runtime.usedTokens += await countMessageTokens(
-                        msg,
-                        runtime.tokenCounter
-                    )
-                }
+                runtime.usedTokens += (
+                    await Promise.all(
+                        runtime.agentScratchpad.map((msg) =>
+                            countMessageTokens(msg, runtime.tokenCounter)
+                        )
+                    )
+                ).reduce((sum, n) => sum + n, 0)
             } else {
                 runtime.usedTokens += await countMessageTokens(
                     runtime.agentScratchpad as BaseMessage,
@@
-            let roundTokens = 0
-            for (const msg of round) {
-                roundTokens += await countMessageTokens(
-                    msg,
-                    runtime.tokenCounter
-                )
-            }
+            const roundTokens = (
+                await Promise.all(
+                    round.map((msg) =>
+                        countMessageTokens(msg, runtime.tokenCounter)
+                    )
+                )
+            ).reduce((sum, n) => sum + n, 0)
@@
-            for (const msg of lastRound) {
-                usedTokens += await countMessageTokens(
-                    msg,
-                    runtime.tokenCounter
-                )
-            }
+            usedTokens += (
+                await Promise.all(
+                    lastRound.map((msg) =>
+                        countMessageTokens(msg, runtime.tokenCounter)
+                    )
+                )
+            ).reduce((sum, n) => sum + n, 0)

As per coding guidelines "ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE".

Also applies to: 63-68, 90-95
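Since the same pattern is suggested in three places, it could be factored into one helper. The sketch below is an illustration only: `sumTokensParallel`, `sumTokensSerial`, `countOne`, and the `Msg` type are hypothetical names, and it assumes the counter has no side effects on shared state, which is exactly the precondition the review asks to verify first.

```typescript
type TokenCounter = (text: string) => Promise<number>
type Msg = { content: string } // hypothetical, reduced message shape

// Stand-in for countMessageTokens: counts only the content.
async function countOne(msg: Msg, counter: TokenCounter): Promise<number> {
    return counter(msg.content)
}

// Serial form, as merged: each count starts after the previous one resolves.
async function sumTokensSerial(
    msgs: Msg[],
    counter: TokenCounter
): Promise<number> {
    let total = 0
    for (const msg of msgs) {
        total += await countOne(msg, counter)
    }
    return total
}

// Concurrent form, as suggested: start every count, then sum the results.
async function sumTokensParallel(
    msgs: Msg[],
    counter: TokenCounter
): Promise<number> {
    const counts = await Promise.all(msgs.map((msg) => countOne(msg, counter)))
    return counts.reduce((sum, n) => sum + n, 0)
}
```

Both forms return the same total for a side-effect-free counter. The concurrent form only saves wall-clock time when the counter does real asynchronous work, such as a remote call or a worker; a tokenizer that runs synchronously on the main thread is still executed one message at a time.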

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/llm-core/prompt/chat_history.ts` around lines 37 - 42, The
loop that serially awaits countMessageTokens for each msg (iterating
runtime.agentScratchpad and incrementing runtime.usedTokens) should be converted
to a parallel aggregation using Promise.all over
runtime.agentScratchpad.map(...) and then summing the results into
runtime.usedTokens; first confirm countMessageTokens does not mutate or rely on
shared state in runtime.tokenCounter (or otherwise make a thread-safe clone)
before parallelizing, and apply the same change to the other two similar blocks
(the occurrences mentioned at lines ~63-68 and ~90-95) so all independent
message token counts run concurrently.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a3f4e902-a63d-4113-92fa-6dfe671ff047

📥 Commits

Reviewing files that changed from the base of the PR and between b272a80 and 15195df.

📒 Files selected for processing (1)

packages/core/src/llm-core/prompt/chat_history.ts

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 15195dfd6e

coderabbitai

🧹 Nitpick comments (2)

packages/core/src/llm-core/utils/count_tokens.ts (2)

230-233: ⚡ Quick win

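Consider parallelizing the token-counting calls.

The three tokenCounter calls are independent of one another and can be run in parallel with Promise.all to reduce latency. Because this function is called for every message inside the chat history truncation loop (see chat_history.ts), parallelizing it can improve performance. The coding guidelines require: "ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE".

⚡ Suggested parallelization refactor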

-    let tokens =
-        (await tokenCounter(content)) +
-        (await tokenCounter(messageTypeToOpenAIRole(message.getType()))) +
-        (message.name ? await tokenCounter(message.name) : 0)
+    const [contentTokens, roleTokens, nameTokens] = await Promise.all([
+        tokenCounter(content),
+        tokenCounter(messageTypeToOpenAIRole(message.getType())),
+        message.name ? tokenCounter(message.name) : Promise.resolve(0)
+    ])
+    let tokens = contentTokens + roleTokens + nameTokens

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/llm-core/utils/count_tokens.ts` around lines 230 - 233, The
three independent tokenCounter calls should be executed in parallel: replace the
sequential awaits used to compute tokens (calls to tokenCounter(content),
tokenCounter(messageTypeToOpenAIRole(message.getType())), and
tokenCounter(message.name) when present) with a Promise.all that runs them
concurrently and then sum the resolved counts into the tokens variable; preserve
the conditional 0 for missing message.name and keep the same variable names
(tokens, tokenCounter, messageTypeToOpenAIRole, message.getType(),
message.name).

239-239: 💤 Low value

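The defensive Array.isArray check can be simplified.

The Array.isArray check is defensive programming. Following the codebase pattern (see output_parser.ts:161 and model.ts:383), the type can be trusted directly and optional chaining used instead: toolCalls?.length > 0. The coding guidelines require: "Do NOT add defensive/fallback checks; use the most probable type directly".

♻️ Suggested simplification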

-    if (Array.isArray(toolCalls) && toolCalls.length > 0) {
+    if (toolCalls?.length > 0) {
         tokens += await tokenCounter(JSON.stringify(toolCalls))
     }
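One detail to check before applying this suggestion: when TypeScript's `strictNullChecks` is enabled, `toolCalls?.length > 0` compares a value of type `number | undefined` with a number and is rejected by the compiler. Whether this repository enables that option is not stated in the thread. The sketch below compares the two forms using hypothetical helper names and a hypothetical `ToolCall` type, and defaults the length to 0 so that it compiles either way.

```typescript
// Hypothetical tool call shape, for illustration only.
type ToolCall = { name: string; args: Record<string, unknown> }

// Defensive form from the merged code: also safe when the runtime value is not an array.
function hasToolCallsDefensive(toolCalls: unknown): boolean {
    return Array.isArray(toolCalls) && toolCalls.length > 0
}

// Type-trusting form: relies on the declared type and only guards against undefined.
// `?? 0` turns a missing length into 0 so the comparison is always number vs number.
function hasToolCallsTrusting(toolCalls?: ToolCall[]): boolean {
    return (toolCalls?.length ?? 0) > 0
}
```

The two helpers agree for `undefined`, empty arrays, and non-empty arrays; they differ only when the runtime value does not match the declared type, which is the case the coding guideline chooses not to guard against.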

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/llm-core/utils/count_tokens.ts` at line 239, Replace the
defensive Array.isArray check with the project's expected direct usage of the
value: change the condition that uses Array.isArray(toolCalls) &&
toolCalls.length > 0 to use optional chaining like toolCalls?.length > 0 so the
code trusts the declared type; update the conditional at the place referencing
toolCalls (in count_tokens.ts where toolCalls is used) and remove the redundant
Array.isArray check.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d7a17628-abb8-4465-9afe-0ccf45f76e8d

📥 Commits

Reviewing files that changed from the base of the PR and between 15195df and 2af5db1.

📒 Files selected for processing (1)

packages/core/src/llm-core/utils/count_tokens.ts

[Fix] count compacted prompt tokens directly

15195df

gemini-code-assist Bot reviewed May 27, 2026

Comment thread packages/core/src/llm-core/prompt/chat_history.ts

Comment thread packages/core/src/llm-core/prompt/chat_history.ts

Comment thread packages/core/src/llm-core/prompt/chat_history.ts

coderabbitai Bot reviewed May 27, 2026

chatgpt-codex-connector Bot reviewed May 27, 2026

Comment thread packages/core/src/llm-core/prompt/chat_history.ts

fix(count_tokens): include tool_calls payload in token counting

2af5db1

coderabbitai Bot reviewed May 27, 2026

dingyi222666 merged commit 3bf7d6f into v1-dev May 28, 2026
5 checks passed

dingyi222666 deleted the fix/token-compact branch May 28, 2026 05:58
