diff --git a/API.en.md b/API.en.md
index 46601886..c5b8f9b7 100644
--- a/API.en.md
+++ b/API.en.md
@@ -40,9 +40,9 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
 
 - OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`.
 - Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths.
-- Tool-calling semantics are aligned between Go and Node runtime: models should output the fullwidth-separator DSML shell `<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`; DS2API also accepts the halfwidth DSML wrapper `<|DSML|tool_calls>`, DSML wrapper aliases such as `<dsml|tool_calls>`, `<|tool_calls>`, `<｜tool_calls>`, common DSML separator drift such as `<|DSML tool_calls>`, collapsed DSML local names such as `<DSMLtool_calls>`, control-separator drift such as `<DSML␂tool_calls>` / raw STX `\x02`, CJK angle bracket, fullwidth-bang / ideographic-comma separator drift, PascalCase local-name drift, and trailing attribute separator drift such as `<DSM｜parameter name="command"｜>...〈/DSM｜parameter〉`, `<！DSML！invoke name=“Bash”>`, `<、DSML、tool_calls>`, `<DSmartToolCalls>`, or `<DSMLtool_calls※>`, arbitrary protocol prefixes such as `<proto💥tool_calls>`, and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. The scanner normalizes fixed local names (`tool_calls` / `invoke` / `parameter`) with non-structural separators before or after them back to XML before parsing, and also tolerates CDATA opener drift such as `<！[CDATA[` / `<、[CDATA[`; only wrapped tool blocks or the narrow missing-opening-wrapper repair path enter the tool path, while bare `<invoke>` does not count as supported syntax. JSON literal parameter bodies are preserved as structured values, explicit empty or whitespace-only parameters are preserved as empty strings, malformed complete wrappers are released as plain text, and loose CDATA is narrowly repaired at final parse/flush when it can preserve a complete outer tool call.
+- Tool-calling semantics are aligned between the Go and Node runtimes: models should output the halfwidth-pipe DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; DS2API also accepts DSML wrapper aliases such as `<dsml|tool_calls>` and `<|tool_calls>`, common DSML separator drift such as `<|DSML tool_calls>`, collapsed DSML local names such as `<DSMLtool_calls>`, control-separator drift such as `<DSML␂tool_calls>` / raw STX `\x02`, CJK angle bracket, fullwidth-bang / ideographic-comma separator drift, PascalCase local-name drift, and trailing attribute separator drift such as `<DSM|parameter name="command"|>...〈/DSM|parameter〉`, `<！DSML！invoke name=“Bash”>`, `<、DSML、tool_calls>`, `<DSmartToolCalls>`, or `<DSMLtool_calls※>`; arbitrary protocol prefixes such as `<proto💥tool_calls>`; and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. Before parsing, the scanner normalizes any tag whose fixed local name is `tool_calls` / `invoke` / `parameter` back to XML by stripping non-structural separators before or after that name, and also tolerates CDATA opener drift such as `<！[CDATA[` / `<、[CDATA[`; only wrapped tool blocks or the narrow missing-opening-wrapper repair path enter the tool path, while bare `<invoke>` does not count as supported syntax. JSON literal parameter bodies are preserved as structured values, explicit empty or whitespace-only parameters are preserved as empty strings, complete but malformed wrappers are released as plain text, and loose CDATA is narrowly repaired at final parse/flush when it can preserve a complete outer tool call.
 - `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior.
-- When upstream returns a thinking-only response with no visible text, the Go main path for both streaming and non-streaming completions retries once in the same DeepSeek session: it appends the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and sets `parent_message_id`. If that same-account retry would still end as `429 upstream_empty_output`, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once before returning 429.
+- When upstream returns a thinking-only response with no visible text, the Go main path and the Vercel Node streaming path each retry once in the same DeepSeek session: the retry appends the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and sets `parent_message_id`. If that same-account retry would still end as `429 upstream_empty_output`, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once before returning 429.
 - Citation/reference marker boundary: streaming output hides upstream `[citation:N]` / `[reference:N]` placeholders by default; non-stream output converts DeepSeek search reference markers into Markdown links.
 
 ---
@@ -355,7 +355,7 @@ When `tools` is present, DS2API performs anti-leak handling:
 
 Additional notes:
 
-- The parser treats the recommended DSML shell tool blocks (`<｜DSML｜tool_calls>` / `<｜DSML｜invoke name="...">` / `<｜DSML｜parameter name="...">`), halfwidth DSML shell blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<｜tool_calls>`), common DSML separator drift (`<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), collapsed DSML local names (`<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`), control-separator drift (`<DSML␂tool_calls>` / raw STX `\x02`), CJK angle bracket, fullwidth-bang / ideographic-comma separator drift, PascalCase local-name drift, and trailing attribute separator drift (`<DSM｜parameter name="command"｜>...〈/DSM｜parameter〉` / `<！DSML！invoke name=“Bash”>` / `<、DSML、tool_calls>` / `<DSmartToolCalls>` / `<DSMLtool_calls※>`), arbitrary protocol prefixes (`<proto💥tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable tool calls. These shells normalize non-structural separators back to XML first, while internal parsing remains XML-based; CDATA opener drift such as `<！[CDATA[` / `<、[CDATA[` is also normalized for parameter bodies. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text; complete but malformed wrappers are also released as plain text.
+- The parser treats the recommended halfwidth-pipe DSML shell tool blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`), common DSML separator drift (`<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), collapsed DSML local names (`<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`), control-separator drift (`<DSML␂tool_calls>` / raw STX `\x02`), CJK angle bracket, fullwidth-bang / ideographic-comma separator drift, PascalCase local-name drift, and trailing attribute separator drift (`<DSM|parameter name="command"|>...〈/DSM|parameter〉` / `<！DSML！invoke name=“Bash”>` / `<、DSML、tool_calls>` / `<DSmartToolCalls>` / `<DSMLtool_calls※>`), arbitrary protocol prefixes (`<proto💥tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable tool calls. Non-structural separators in these shells are normalized back to XML first, and internal parsing remains XML-based; CDATA opener drift such as `<！[CDATA[` / `<、[CDATA[` is also normalized for parameter bodies. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text; complete but malformed wrappers are also released as plain text.
 - The parser no longer drops tool calls solely because parameter values are empty; explicit empty strings or whitespace-only parameters become empty strings in structured `tool_calls`. Prompting still tells the model not to emit blank parameters, and missing/empty argument rejection belongs in the tool executor or client schema validation.
 - If the final visible response text is empty but the reasoning stream contains an executable tool call, Chat / Responses emits a standard OpenAI `tool_calls` / `function_call` output during finalization. If thinking/reasoning was not enabled by the client, that reasoning text is used only for detection and is not exposed as visible text or `reasoning_content`.
 - `tool_calls` shown inside fenced markdown code blocks (for example, ```json ... ```) are treated as examples, not executable calls.
diff --git a/API.md b/API.md
index 9809ecac..63b4539d 100644
--- a/API.md
+++ b/API.md
@@ -40,9 +40,9 @@
 
 - OpenAI / Claude / Gemini 三套协议已统一挂在同一 `chi` 路由树上，由 `internal/server/router.go` 负责装配。
 - 适配器层职责收敛为：**请求归一化 → DeepSeek 调用 → 协议形态渲染**，减少历史版本中“同能力多处实现”的分叉。
-- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致：推荐模型输出全角分隔符 DSML 外壳 `<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`；兼容层也接受半角 DSML wrapper `<|DSML|tool_calls>`、DSML wrapper 别名 `<dsml|tool_calls>`、`<|tool_calls>`、`<｜tool_calls>`、常见 DSML 分隔符漏写形态（如 `<|DSML tool_calls>`）、`DSML` 与工具标签名黏连的常见 typo（如 `<DSMLtool_calls>`）、控制分隔符漂移（如 `<DSML␂tool_calls>` / 原始 STX `\x02`）、CJK 尖括号、全角感叹号、顿号、PascalCase 本地名、弯引号属性值与属性尾部分隔符漂移（如 `<DSM｜parameter name="command"｜>...〈/DSM｜parameter〉` / `<！DSML！invoke name=“Bash”>` / `<、DSML、tool_calls>` / `<DSmartToolCalls>` / `<DSMLtool_calls※>`）、任意协议前缀壳（如 `<proto💥tool_calls>`），以及旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`。实现上采用结构扫描：只要固定本地标签名是 `tool_calls` / `invoke` / `parameter`，标签名前或标签名后的非结构性分隔符会在解析入口归一化；CDATA 开头也会容错 `<！[CDATA[` / `<、[CDATA[` 这类分隔符漂移；只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 会进入工具路径，裸 `<invoke>` 不计为已支持语法；流式场景继续执行防泄漏筛分。若参数体本身是合法 JSON 字面量（如 `123`、`true`、`null`、数组或对象），会按结构化值输出，不再一律当作字符串；显式空字符串和纯空白参数会结构化保留为空字符串，是否拒绝缺参由工具执行侧决定；完整但 malformed 的 wrapper 会作为普通文本释放，不会吞掉或伪造成工具调用；若 CDATA 偶发漏闭合，则会在最终 parse / flush 恢复阶段做窄修复，尽量保住已完整包裹的外层工具调用。
+- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致：推荐模型输出半角管道符 DSML 外壳 `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`；兼容层也接受 DSML wrapper 别名 `<dsml|tool_calls>`、`<|tool_calls>`、常见 DSML 分隔符漏写形态（如 `<|DSML tool_calls>`）、`DSML` 与工具标签名黏连的常见 typo（如 `<DSMLtool_calls>`）、控制分隔符漂移（如 `<DSML␂tool_calls>` / 原始 STX `\x02`）、CJK 尖括号、全角感叹号、顿号、PascalCase 本地名、弯引号属性值与属性尾部分隔符漂移（如 `<DSM|parameter name="command"|>...〈/DSM|parameter〉` / `<！DSML！invoke name=“Bash”>` / `<、DSML、tool_calls>` / `<DSmartToolCalls>` / `<DSMLtool_calls※>`）、任意协议前缀壳（如 `<proto💥tool_calls>`），以及旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`。实现上采用结构扫描：只要固定本地标签名是 `tool_calls` / `invoke` / `parameter`，标签名前或标签名后的非结构性分隔符会在解析入口归一化；CDATA 开头也会容错 `<！[CDATA[` / `<、[CDATA[` 这类分隔符漂移；只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 会进入工具路径，裸 `<invoke>` 不计为已支持语法；流式场景继续执行防泄漏筛分。若参数体本身是合法 JSON 字面量（如 `123`、`true`、`null`、数组或对象），会按结构化值输出，不再一律当作字符串；显式空字符串和纯空白参数会结构化保留为空字符串，是否拒绝缺参由工具执行侧决定；完整但 malformed 的 wrapper 会作为普通文本释放，不会吞掉或伪造成工具调用；若 CDATA 偶发漏闭合，则会在最终 parse / flush 恢复阶段做窄修复，尽量保住已完整包裹的外层工具调用。
 - `Admin API` 将配置与运行时策略分开：`/admin/config*` 管静态配置，`/admin/settings*` 管运行时行为。
-- 当上游返回 thinking-only 响应（模型输出了推理链但无可见文本）时，Go 主路径的流式与非流式补全都会先自动重试一次：以多轮对话 follow-up 方式追加 prompt 后缀 `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` 并设置 `parent_message_id` 在同一 DeepSeek session 内让模型重新输出；同账号重试最大 1 次。若同账号重试后仍即将返回 `429 upstream_empty_output`，托管账号模式会在返回 429 前自动切换到下一个可用账号，新建 session，用原始 payload 再 fresh retry 一次。
+- 当上游返回 thinking-only 响应（模型输出了推理链但无可见文本）时，Go 主路径与 Vercel Node 流式路径都会先自动重试一次：以多轮对话 follow-up 方式追加 prompt 后缀 `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` 并设置 `parent_message_id` 在同一 DeepSeek session 内让模型重新输出；同账号重试最大 1 次。若同账号重试后仍即将返回 `429 upstream_empty_output`，托管账号模式会在返回 429 前自动切换到下一个可用账号，新建 session，用原始 payload 再 fresh retry 一次。
 - 引用标记处理边界：流式输出默认隐藏 `[citation:N]` / `[reference:N]` 这类上游内部占位符；非流式输出默认把 DeepSeek 搜索引用标记转换为 Markdown 引用链接。
 
 ---
@@ -357,7 +357,7 @@ data: [DONE]
 补充说明：
 
 - **非代码块上下文**下，工具负载即使与普通文本混合，也会按特征识别并产出可执行 tool call（前后普通文本仍可透传）。
-- 解析器当前把推荐 DSML 外壳（`<｜DSML｜tool_calls>` / `<｜DSML｜invoke name="...">` / `<｜DSML｜parameter name="...">`）、半角 DSML 外壳（`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`）、DSML wrapper 别名（`<dsml|tool_calls>`、`<|tool_calls>`、`<｜tool_calls>`）、常见 DSML 分隔符漏写形态（如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`）、`DSML` 与工具标签名黏连的常见 typo（如 `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`）、控制分隔符漂移（如 `<DSML␂tool_calls>` / 原始 STX `\x02`）、CJK 尖括号、全角感叹号、顿号、PascalCase 本地名、弯引号属性值与属性尾部分隔符漂移（如 `<DSM｜parameter name="command"｜>...〈/DSM｜parameter〉` / `<！DSML！invoke name=“Bash”>` / `<、DSML、tool_calls>` / `<DSmartToolCalls>` / `<DSMLtool_calls※>`）、任意协议前缀壳（如 `<proto💥tool_calls>`）和旧式 canonical XML 工具块（`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`）作为可执行调用解析；这些非结构性分隔符壳会先归一化回 XML，内部仍以 XML 解析语义为准，CDATA 开头也会容错 `<！[CDATA[` / `<、[CDATA[`。旧式 `<tools>`、`<tool_call>`、`<tool_name>`、`<param>`、`<function_call>`、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理；完整但 malformed 的 wrapper 同样会作为普通文本释放。
+- 解析器当前把推荐半角管道符 DSML 外壳（`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`）、DSML wrapper 别名（`<dsml|tool_calls>`、`<|tool_calls>`）、常见 DSML 分隔符漏写形态（如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`）、`DSML` 与工具标签名黏连的常见 typo（如 `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`）、控制分隔符漂移（如 `<DSML␂tool_calls>` / 原始 STX `\x02`）、CJK 尖括号、全角感叹号、顿号、PascalCase 本地名、弯引号属性值与属性尾部分隔符漂移（如 `<DSM|parameter name="command"|>...〈/DSM|parameter〉` / `<！DSML！invoke name=“Bash”>` / `<、DSML、tool_calls>` / `<DSmartToolCalls>` / `<DSMLtool_calls※>`）、任意协议前缀壳（如 `<proto💥tool_calls>`）和旧式 canonical XML 工具块（`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`）作为可执行调用解析；这些非结构性分隔符壳会先归一化回 XML，内部仍以 XML 解析语义为准，CDATA 开头也会容错 `<！[CDATA[` / `<、[CDATA[`。旧式 `<tools>`、`<tool_call>`、`<tool_name>`、`<param>`、`<function_call>`、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理；完整但 malformed 的 wrapper 同样会作为普通文本释放。
 - 解析层不会因为参数值为空而丢弃工具调用；显式空字符串或纯空白参数会按空字符串进入结构化 `tool_calls`。Prompt 会要求模型不要主动输出空参数，缺参/空命令的拒绝应由工具执行侧或客户端 schema 校验负责。
 - 当最终可见正文为空但思维链里包含可执行工具调用时，Chat / Responses 会在收尾阶段补发标准 OpenAI `tool_calls` / `function_call` 输出；如果客户端未开启 thinking / reasoning，该思维链只用于检测，不会作为可见正文或 `reasoning_content` 暴露。
 - Markdown fenced code block（例如 ```json ... ```）中的 `tool_calls` 仅视为示例文本，不会被执行。
diff --git a/README.MD b/README.MD
index ae5eafc5..c32c09c8 100644
--- a/README.MD
+++ b/README.MD
@@ -196,7 +196,7 @@ OpenAI `/v1/*` 仍是推荐的规范路径；同时支持 `/models`、`/chat/com
 - `ANTHROPIC_BASE_URL` 推荐直接指向 DS2API 根地址（例如 `http://127.0.0.1:5001`），Claude Code 会请求 `/v1/messages?beta=true`。
 - `ANTHROPIC_API_KEY` 需要与 `config.json` 中 `keys` 一致；建议同时保留常规 key 与 `sk-ant-*` 形态 key，兼容不同客户端校验习惯。
 - 若系统设置了代理，建议对 DS2API 地址配置 `NO_PROXY=127.0.0.1,localhost,<你的主机IP>`，避免本地回环请求被代理拦截。
-- 如遇“工具调用输出成文本、未执行”问题，请优先检查模型输出是否为推荐的全角分隔符 DSML 工具块：`<｜DSML｜tool_calls><｜DSML｜invoke name="..."><｜DSML｜parameter name="...">...`。兼容层也接受半角 DSML 与旧式 canonical XML：`<tool_calls><invoke name="..."><parameter name="...">...`；旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` 或纯 JSON `tool_calls` 片段不会执行，会作为普通文本处理。
+- 如遇“工具调用输出成文本、未执行”问题，请优先检查模型输出是否为推荐的半角管道符 DSML 工具块：`<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`。兼容层也接受旧式 canonical XML：`<tool_calls><invoke name="..."><parameter name="...">...`；旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` 或纯 JSON `tool_calls` 片段不会执行，会作为普通文本处理。
 
 ### Gemini 接口
 
@@ -373,7 +373,7 @@ Gemini 路由还可以使用 `x-goog-api-key`，或在没有认证头时使用 `
 当请求中带 `tools` 时，DS2API 会做防泄漏处理与结构化转译：
 
 1. 只在**非代码块上下文**启用执行型 toolcall 识别（代码块示例默认不触发）
-2. 解析层当前把全角分隔符 DSML 外壳视为推荐可执行调用：`<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`；兼容半角 DSML、旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`，以及若干 DSML 前缀/分隔符漂移。DSML 只是外壳别名，内部仍以 XML 解析语义为准；旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` / antml 变体与纯 JSON `tool_calls` 片段都会按普通文本处理，完整但 malformed 的 wrapper 也会作为普通文本释放
+2. 解析层当前把半角管道符 DSML 外壳视为推荐可执行调用：`<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`；兼容旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`，以及若干 DSML 前缀/分隔符漂移。DSML 只是外壳别名，内部仍以 XML 解析语义为准；旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` / antml 变体与纯 JSON `tool_calls` 片段都会按普通文本处理，完整但 malformed 的 wrapper 也会作为普通文本释放
 3. `responses` 流式严格使用官方 item 生命周期事件（`response.output_item.*`、`response.content_part.*`、`response.function_call_arguments.*`）
 4. `responses` 支持并执行 `tool_choice`（`auto`/`none`/`required`/强制函数）；`required` 违规时非流式返回 `422`，流式返回 `response.failed`
 5. 客户端请求哪种协议，就按该协议返回工具调用（OpenAI/Claude/Gemini 各自原生结构）；模型侧优先约束输出规范 XML，再由兼容层转译
diff --git a/README.en.md b/README.en.md
index 81ef3137..afb4c7dd 100644
--- a/README.en.md
+++ b/README.en.md
@@ -185,7 +185,7 @@ Besides the primary aliases above, `/anthropic/v1/models` also returns Claude 4.
 - Set `ANTHROPIC_BASE_URL` to the DS2API root URL (for example `http://127.0.0.1:5001`). Claude Code sends requests to `/v1/messages?beta=true`.
 - `ANTHROPIC_API_KEY` must match an entry in `keys` from `config.json`. Keeping both a regular key and an `sk-ant-*` style key improves client compatibility.
 - If your environment has proxy variables, set `NO_PROXY=127.0.0.1,localhost,<your_host_ip>` for DS2API to avoid proxy interception of local traffic.
-- If tool calls are rendered as plain text and not executed, first verify the model output uses the recommended fullwidth-separator DSML block: `<｜DSML｜tool_calls><｜DSML｜invoke name="..."><｜DSML｜parameter name="...">...`. DS2API also accepts halfwidth DSML and legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or standalone JSON `tool_calls` are not executed and stay plain text.
+- If tool calls are rendered as plain text and not executed, first verify the model output uses the recommended halfwidth-pipe DSML block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. DS2API also accepts legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or standalone JSON `tool_calls` are not executed and stay plain text.
 
 ### Gemini Endpoint
 
@@ -359,7 +359,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
 When `tools` is present in the request, DS2API performs anti-leak handling:
 
 1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored)
-2. The parser treats the fullwidth-separator DSML shell as the recommended executable tool-calling syntax: `<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`; it also accepts halfwidth DSML, legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`, plus common DSML prefix/separator drift. DSML is a shell alias and internal parsing remains XML-based; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text, and complete but malformed wrappers are released as plain text too
+2. The parser treats the halfwidth-pipe DSML shell as the recommended executable tool-calling syntax: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; it also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`, plus common DSML prefix/separator drift. DSML is a shell alias and internal parsing remains XML-based; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text, and complete but malformed wrappers are released as plain text too
 3. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
 4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream
 5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation
diff --git a/VERSION b/VERSION
index a84947d6..6016e8ad 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-4.5.0
+4.6.0
diff --git a/docs/DEPLOY.md b/docs/DEPLOY.md
index d0f23dee..d2050bd5 100644
--- a/docs/DEPLOY.md
+++ b/docs/DEPLOY.md
@@ -4,7 +4,7 @@
 
 本指南基于当前 Go 代码库，详细说明各种部署方式。
 
-本页导航：[文档总索引](./README.md)｜[架构说明](./ARCHITECTURE.md)｜[接口文档](../API.md)｜[测试指南](./TESTING.md)
+本页导航：[文档总索引](./README.md)|[架构说明](./ARCHITECTURE.md)|[接口文档](../API.md)|[测试指南](./TESTING.md)
 
 ---
 
diff --git a/docs/prompt-compatibility.md b/docs/prompt-compatibility.md
index fb030219..7b49865e 100644
--- a/docs/prompt-compatibility.md
+++ b/docs/prompt-compatibility.md
@@ -89,7 +89,7 @@ DS2API 当前的核心思路，不是把客户端传来的 `messages`、`tools`
   "chat_session_id": "session-id",
   "model_type": "default",
   "parent_message_id": null,
-  "prompt": "<｜begin▁of▁sentence｜>...",
+  "prompt": "<|begin▁of▁sentence|>...",
   "ref_file_ids": [
     "file-history",
     "file-systemprompt",
@@ -112,7 +112,7 @@ DS2API 当前的核心思路，不是把客户端传来的 `messages`、`tools`
 - Vercel Node 流式路径本轮不迁移，仍使用现有 Node bridge / stream-tool-sieve 实现；后续若变更 Node 流式语义，需要按 `assistantturn` 的 Go canonical 输出语义同步对齐。
 - 客户端传入的 thinking / reasoning 开关会被归一到下游 `thinking_enabled`。Gemini `generationConfig.thinkingConfig.thinkingBudget` 会翻译成同一套 thinking 开关；关闭时即使上游返回 `response/thinking_content`，兼容层也不会把它当作可见正文输出。若最终解析出的模型名带 `-nothinking` 后缀，则会无条件强制关闭 thinking，优先级高于请求体中的 `thinking` / `reasoning` / `reasoning_effort`。未显式关闭时，各 surface 会按解析后的 DeepSeek 模型默认能力开启 thinking，并用各自协议的原生形态暴露：OpenAI Chat 为 `reasoning_content`，OpenAI Responses 为 `response.reasoning.delta` / `reasoning` content，Claude 为 `thinking` block / `thinking_delta`，Gemini 为 `thought: true` part。
 - 对 OpenAI Chat / Responses 的非流式收尾，如果最终可见正文为空，兼容层会优先尝试把思维链中的独立 DSML / XML 工具块当作真实工具调用解析出来。流式链路也会在收尾阶段做同样的 fallback 检测，但不会因为思维链内容去中途拦截或改写流式输出；真正的工具识别始终基于原始上游文本，而不是基于“已经做过可见输出清洗”的版本。最终可见层会剥离已经成功解析成工具调用的完整 leaked DSML / XML `tool_calls` wrapper；如果遇到完整 wrapper 但内部形态不符合可执行工具调用语义（例如 `<param>` 这类 malformed XML 工具壳），流式 sieve 会把该块作为普通文本释放，而不是吞掉或伪造成工具调用。补发结果会作为本轮 assistant 的结构化 `tool_calls` / `function_call` 输出返回，而不是塞进 `content` 文本；如果客户端没有开启 thinking / reasoning，思维链只用于检测，不会作为 `reasoning_content` 或可见正文暴露。只有正文为空且思维链里也没有可执行工具调用时，才继续按空回复错误处理。
-- OpenAI Chat / Responses、Claude Messages、Gemini generateContent 的空回复错误处理之前会默认做一次内部补偿重试：第一次上游完整结束后，如果最终可见正文为空、没有解析到工具调用、也没有已经向客户端流式发出工具调用，并且终止原因不是 `content_filter`，兼容层会复用同一个 `chat_session_id`、账号、token 与工具策略，把原始 completion `prompt` 追加固定后缀 `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` 后重新提交一次。Go 主路径的非流式重试由 `completionruntime.ExecuteNonStreamWithRetry` 统一处理；流式重试由 `completionruntime.ExecuteStreamWithRetry` 统一处理，各协议 runtime 只负责消费/渲染本协议 SSE framing。重试遵循 DeepSeek 多轮对话协议：从第一次上游 SSE 流中提取 `response_message_id`，并在重试 payload 中设置 `parent_message_id` 为该值，使重试成为同一会话的后续轮次而非断裂的根消息；同时重新获取一次 PoW（若 PoW 获取失败则回退到原始 PoW）。该同账号重试不会重新标准化消息、不会新建 session，也不会向流式客户端插入重试标记；第二次 thinking / reasoning 会按正常增量直接接到第一次之后，并继续使用 overlap trim 去重。若同账号补偿重试后即将返回 429 `upstream_empty_output`，并且当前是托管账号模式，Go 主路径会在返回 429 前切换到下一个可用账号，新建 `chat_session_id`，使用原始 completion payload 再做一次 fresh retry；该切号重试不携带空回复 prompt 后缀，也不设置上一账号的 `parent_message_id`。如果没有可切换账号，或切号后的 fresh retry 仍没有可见正文或工具调用，则继续按原错误返回：无任何输出为 503 `upstream_unavailable`，有 reasoning 但没有可见正文或工具调用为 429 `upstream_empty_output`。若任一尝试触发空 `content_filter`，不做补偿重试并保持 `content_filter` 错误。JS Vercel 运行时同样设置 `parent_message_id`，但因无法直接调用 PoW API 而复用原始 PoW；切号 fresh retry 目前由 Go 主路径提供。
+- OpenAI Chat / Responses、Claude Messages、Gemini generateContent 的空回复错误处理之前会默认做一次内部补偿重试：第一次上游完整结束后，如果最终可见正文为空、没有解析到工具调用、也没有已经向客户端流式发出工具调用，并且终止原因不是 `content_filter`，兼容层会复用同一个 `chat_session_id`、账号、token 与工具策略，把原始 completion `prompt` 追加固定后缀 `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` 后重新提交一次。Go 主路径的非流式重试由 `completionruntime.ExecuteNonStreamWithRetry` 统一处理；流式重试由 `completionruntime.ExecuteStreamWithRetry` 统一处理，各协议 runtime 只负责消费/渲染本协议 SSE framing。重试遵循 DeepSeek 多轮对话协议：从第一次上游 SSE 流中提取 `response_message_id`，并在重试 payload 中设置 `parent_message_id` 为该值，使重试成为同一会话的后续轮次而非断裂的根消息；同时重新获取一次 PoW（若 PoW 获取失败则回退到原始 PoW）。该同账号重试不会重新标准化消息、不会新建 session，也不会向流式客户端插入重试标记；第二次 thinking / reasoning 会按正常增量直接接到第一次之后，并继续使用 overlap trim 去重。若同账号补偿重试后即将返回 429 `upstream_empty_output`，并且当前是托管账号模式，runtime 会在返回 429 前切换到下一个可用账号，新建 `chat_session_id`，使用原始 completion payload 再做一次 fresh retry；该切号重试不携带空回复 prompt 后缀，也不设置上一账号的 `parent_message_id`。如果 current input file 已触发，切号前会在新账号上重新上传同一份 `DS2API_HISTORY.txt`（以及需要时的 `DS2API_TOOLS.txt`），并用新账号可见的 file_id 替换自动生成的旧 file_id；客户端原本传入的其他文件引用保持不变。如果没有可切换账号，或切号后的 fresh retry 仍没有可见正文或工具调用，则继续按原错误返回：无任何输出为 503 `upstream_unavailable`，有 reasoning 但没有可见正文或工具调用为 429 `upstream_empty_output`。若任一尝试触发空 `content_filter`，不做补偿重试并保持 `content_filter` 错误。Vercel Node 流式路径通过 Go 内部 prepare / pow / switch 端点获取初始 payload、重试 PoW 和切号 fresh retry payload，因此同样会重新上传 current-input 自动文件并替换为新账号 file_id。
 
 - 非流式 OpenAI Chat / Responses、Claude Messages、Gemini generateContent 在最终可见正文渲染阶段，会把 DeepSeek 搜索返回中的 `[citation:N]` / `[reference:N]` 标记替换成对应 Markdown 链接。`citation` 标记按一基序号解析；`reference` 标记只有在同一段正文中出现 `[reference:0]`（允许冒号后有空格）时才按零基序号映射，并且不会影响同段正文里的 `citation` 标记。
 - 流式输出仍默认隐藏 `[citation:N]` / `[reference:N]` 这类上游内部标记，避免分片输出中泄漏尚未完成映射的引用占位符。
@@ -135,14 +135,14 @@ OpenAI Chat / Responses 在标准化后、current input file 之前，会默认
 
 最终 prompt 使用 DeepSeek 风格角色标记：
 
-- `<｜begin▁of▁sentence｜>`
-- `<｜System｜>`
-- `<｜User｜>`
-- `<｜Assistant｜>`
-- `<｜Tool｜>`
-- `<｜end▁of▁instructions｜>`
-- `<｜end▁of▁sentence｜>`
-- `<｜end▁of▁toolresults｜>`
+- `<|begin▁of▁sentence|>`
+- `<|System|>`
+- `<|User|>`
+- `<|Assistant|>`
+- `<|Tool|>`
+- `<|end▁of▁instructions|>`
+- `<|end▁of▁sentence|>`
+- `<|end▁of▁toolresults|>`
 
 实现位置：
 [internal/prompt/messages.go](../internal/prompt/messages.go)
@@ -165,10 +165,10 @@ OpenAI Chat / Responses 在标准化后、current input file 之前，会默认
 1. 把每个 tool 的名称、描述、参数 schema 序列化成文本。
 2. 拼成 `You have access to these tools:` 大段说明。
 3. 再附上统一的 DSML tool call 外壳格式约束。
-4. 把这整段内容并入 system prompt。
+4. 普通直传请求会把“工具描述 + 格式约束”一起并入 system prompt；如果 `current_input_file` 触发，则工具描述/schema 会单独上传成 `DS2API_TOOLS.txt`，live prompt 和 system tool 格式提示都会明确要求模型把 `DS2API_TOOLS.txt` 当作可调用工具和参数 schema 的权威来源。
 
-工具调用正例现在优先示范全角分隔符 DSML 风格：`<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`。
-兼容层仍接受旧式纯 `<tool_calls>` wrapper，并会容错若干 DSML 标签变体，包括短横线形式 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`、下划线形式 `<dsml_tool_calls>` / `<dsml_invoke>` / `<dsml_parameter>`，以及其他前缀分隔形态如 `<vendor|tool_calls>` / `<vendor_tool_calls>` / `<vendor - tool_calls>`；标签壳扫描还会把全角 ASCII 漂移归一化，例如 `<ｄＳＭＬ｜tool_calls>` 与全角 `＞` 结束符，也会容错 CJK 尖括号、全角感叹号或顿号分隔符、弯引号属性值、PascalCase 本地名和属性尾部分隔符漂移，例如 `<DSM｜parameter name="command"｜>...〈/DSM｜parameter〉`、`<！DSML！invoke name=“Bash”>`、`<、DSML、tool_calls>`、`<DSmartToolCalls>`、`<DSMLtool_calls※>`。更一般地，Go / Node tag 扫描以固定本地标签名 `tool_calls` / `invoke` / `parameter` 为准，标签名前或标签名后的非结构性协议分隔符都会在解析入口剥离，例如 `<DSML␂tool_calls>`、`<proto💥tool_calls>` 这类控制符或非 ASCII 分隔符漂移也会归一化回现有 XML 标签后继续走同一套 parser；结构性字符如 `<` / `>` / `/` / `=` / 引号、空白和 ASCII 字母数字不会被当作这类分隔符。CDATA 开头也使用同一类扫描式容错，`<![CDATA[` / `<！[CDATA[` / `<、[CDATA[` 都会作为参数原文容器处理。但提示词会优先要求模型输出官方 DSML 标签，并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意：这是“兼容 DSML 外壳，内部仍以 XML 解析语义为准”，不是原生 DSML 全链路实现。解析器会先截获非代码块中的疑似工具 wrapper，完整解析失败或工具语义无效时再按普通文本放行。
+工具调用正例现在优先示范半角管道符 DSML 风格：`<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`。
+兼容层仍接受旧式纯 `<tool_calls>` wrapper，并会容错若干 DSML 标签变体，包括短横线形式 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`、下划线形式 `<dsml_tool_calls>` / `<dsml_invoke>` / `<dsml_parameter>`，以及其他前缀分隔形态如 `<vendor|tool_calls>` / `<vendor_tool_calls>` / `<vendor - tool_calls>`；标签壳扫描还会把全角 ASCII 漂移归一化，例如 `<ｄＳＭＬ|tool_calls>` 与全角 `＞` 结束符，也会容错 CJK 尖括号、全角感叹号或顿号分隔符、弯引号属性值、PascalCase 本地名和属性尾部分隔符漂移，例如 `<DSM|parameter name="command"|>...〈/DSM|parameter〉`、`<！DSML！invoke name=“Bash”>`、`<、DSML、tool_calls>`、`<DSmartToolCalls>`、`<DSMLtool_calls※>`。更一般地，Go / Node tag 扫描以固定本地标签名 `tool_calls` / `invoke` / `parameter` 为准，标签名前或标签名后的非结构性协议分隔符都会在解析入口剥离，例如 `<DSML␂tool_calls>`、`<proto💥tool_calls>` 这类控制符或非 ASCII 分隔符漂移也会归一化回现有 XML 标签后继续走同一套 parser；结构性字符如 `<` / `>` / `/` / `=` / 引号、空白和 ASCII 字母数字不会被当作这类分隔符。进入现有 DSML rewrite / XML parse 之前，Go / Node 还会先对“已经识别成工具标签壳的 candidate span”做一次窄 canonicalization：只折叠 wrapper / `invoke` / `parameter` / `name` / `CDATA` / `DSML` 及其壳层分隔符里的 confusable 字符，清理零宽 / BOM / 控制类干扰，并把引号、空白、dash / underscore 变体等统一回可解析的工具语法。这个阶段不会广义改写普通正文、参数内容、CDATA 里的示例文本或其他非工具 XML。CDATA 开头也使用同一类扫描式容错，`<![CDATA[` / `<！[CDATA[` / `<、[CDATA[` 都会作为参数原文容器处理。但提示词会优先要求模型输出官方 DSML 标签，并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意：这是“兼容 DSML 外壳，内部仍以 XML 解析语义为准”，不是原生 DSML 全链路实现。解析器会先截获非代码块中的疑似工具 wrapper，完整解析失败或工具语义无效时再按普通文本放行。
 数组参数使用 `<item>...</item>` 子节点表示；当某个参数体只包含 item 子节点时，Go / Node 解析器会把它还原成数组，避免 `questions` / `options` 这类 schema 中要求 array 的参数被误解析成 `{ "item": ... }` 对象。除此之外，解析器还会回收一些更松散的列表写法，例如 JSON array 字面量或逗号分隔的 JSON 项序列，只要它们足够明确；但 `<item>` 仍然是首选形态。若模型把完整结构化 XML fragment 误包进 CDATA，兼容层会在保护 `content` / `command` 等原文字段的前提下，尝试把非原文字段中的 CDATA XML fragment 还原成 object / array。不过，如果 CDATA 只是单个平面的 XML/HTML 标签，例如 `<b>urgent</b>` 这种行内标记，兼容层会保留原始字符串，不会强行升成 object / array；只有明显表示结构的 CDATA 片段，例如多兄弟节点、嵌套子节点或 `item` 列表，才会触发结构化恢复。对 `command` / `content` 等长文本参数，CDATA 内部的 Markdown fenced DSML / XML 示例会作为原文保护；示例里的 `]]></parameter>` 或 `</tool_calls>` 不会截断外层工具调用，解析器会继续等待围栏外真正的参数 / wrapper 结束标签。
 Go 侧读取 DeepSeek SSE 时不再依赖 `bufio.Scanner` 的固定 2MiB 单行上限；当写文件类工具把很长的 `content` 放在单个 `data:` 行里返回时，非流式收集、流式解析和 auto-continue 透传都会保留完整行，再进入同一套工具解析与序列化流程。
 在 assistant 最终回包阶段，如果某个 tool 参数在声明 schema 中明确是 `string`，兼容层会在把解析后的 `tool_calls` / `function_call` 重新序列化成 OpenAI / Responses / Claude 可见参数前，递归把该路径上的 number / bool / object / array 统一转成字符串；其中 object / array 会压成紧凑 JSON 字符串。这个保护只对 schema 明确声明为 string 的路径生效，不会改写本来就是 `number` / `boolean` / `object` / `array` 的参数。这样可以兼容 DeepSeek 输出了结构化片段、但上游客户端工具 schema 又严格要求字符串参数的场景（例如 `content`、`prompt`、`path`、`taskId` 等）。
@@ -215,17 +215,17 @@ assistant 的 reasoning 会变成一个显式标签块：
 assistant 历史 `tool_calls` 不会保留成 OpenAI 原生 JSON，而会转成 prompt 可见的 DSML 外壳：
 
 ```xml
-<｜DSML｜tool_calls>
-  <｜DSML｜invoke name="read_file">
-    <｜DSML｜parameter name="path"><![CDATA[src/main.go]]></｜DSML｜parameter>
-  </｜DSML｜invoke>
-</｜DSML｜tool_calls>
+<|DSML|tool_calls>
+  <|DSML|invoke name="read_file">
+    <|DSML|parameter name="path"><![CDATA[src/main.go]]></|DSML|parameter>
+  </|DSML|invoke>
+</|DSML|tool_calls>
 ```
 
 如果客户端历史里没有结构化 `tool_calls` 字段、却把一个可独立解析的 assistant 工具块放进了普通 `content`，兼容层会在写入后续 prompt 前先按工具调用解析它，再重渲染为规范 DSML 历史外壳。这样可以避免一次 malformed 工具块未被结构化保存后，作为普通 assistant 文本回灌，继续污染后续模型的 few-shot 工具格式。
 
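 The parsing layer also accepts the legacy plain-XML form: `<tool_calls>` / `<invoke>` / `<parameter>`. Both forms are first normalized to the existing XML parsing semantics; any other legacy format is kept as ordinary text and is not treated as executable call syntax.
-The exception is that the parser repairs one very narrow model mistake: if the assistant outputs `<invoke ...>` ... `</tool_calls>` (or the corresponding DSML tags) but omits the leading opening wrapper, the parsing stage restores the wrapper and then attempts recognition.
+The exception is that the parser repairs one very narrow model mistake: if the assistant outputs `<invoke ...>` ... `</tool_calls>` (or the corresponding DSML tags) but omits the leading opening wrapper, the parsing stage restores the wrapper when wrapper-confidence is high enough and then attempts recognition. Here wrapper-confidence means the scanner has already recognized a whitelisted tool-shell structure and the remaining failure looks like shell-level structural drift only, not a near-miss tag name that is semantically close but not on the whitelist. When the repair succeeds, suffix prose after the wrapper stays in the visible text; when it fails, the block is still handled as ordinary text.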
 
 This matters because it determines:
 
@@ -237,7 +237,7 @@ Historical assistant `tool_calls` are not kept as native OpenAI JSON; they are converted into
 
 ### 7.3 How tool results are preserved
 
-Results from the tool / function role enter the prompt as `<｜Tool｜>...<｜end▁of▁toolresults｜>`.
+Results from the tool / function role enter the prompt as `<|Tool|>...<|end▁of▁toolresults|>`.
 
 If the tool content is empty, it is currently filled with the string `"null"` so that the whole tool turn is not lost.
 
@@ -278,7 +278,7 @@ OpenAI file upload is no longer a generic "upload only the file body" path,
 
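 The compatibility layer now keeps `current_input_file` as the only split mode; the old `history_split` config field has been removed, and when an old config is read the field is ignored and no longer written back.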
 
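-- `current_input_file` is enabled by default; it applies globally at the unified completion runtime entry and merges the "full context" into the `DS2API_HISTORY.txt` context file. When the plain-text length of the latest user turn reaches `current_input_file.min_chars` (default `0`), the runtime uploads a context file named `DS2API_HISTORY.txt`. The file content is first normalized by each protocol entry, then serialized into a turn-numbered `DS2API_HISTORY.txt`-style transcript with a `# DS2API_HISTORY.txt` title and `=== N. ROLE ===` sections; the live prompt then carries a continuation-toned user message that guides the model to continue from the latest state in `DS2API_HISTORY.txt` and answer the latest request directly, so the task is not pulled back to the start.
+- `current_input_file` is enabled by default; it applies globally at the unified completion runtime entry and merges the "full context" into the `DS2API_HISTORY.txt` context file. When the plain-text length of the latest user turn reaches `current_input_file.min_chars` (default `0`), the runtime uploads a context file named `DS2API_HISTORY.txt`. The file content is first normalized by each protocol entry, then serialized into a turn-numbered `DS2API_HISTORY.txt`-style transcript with a `# DS2API_HISTORY.txt` title and `=== N. ROLE ===` sections; if the current request declares available tools, the tool names, descriptions, and parameter schemas are also uploaded separately as `DS2API_TOOLS.txt`, with a `# DS2API_TOOLS.txt` title. The live prompt then carries a continuation-toned user message that guides the model to continue from the latest state in `DS2API_HISTORY.txt` and, when a tools file exists, states that the available tool schemas are in `DS2API_TOOLS.txt`; the system prompt also states, ahead of the unified DSML tool-format constraints, that `DS2API_TOOLS.txt` is the authoritative source for callable tools and schemas, while keeping this turn's tool-choice policy, so the task is not pulled back to the start.
 - If `current_input_file.enabled=false`, the request is passed through directly and no split context file is uploaded.
 - Even when the live prompt is shortened after `current_input_file` triggers, the context token count in the response to the client still follows the **full pre-split prompt semantics** rather than the shortened placeholder prompt; otherwise the real context would be significantly undercounted.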
 
@@ -291,7 +291,7 @@ OpenAI file upload is no longer a generic "upload only the file body" path,
 - Global completion runtime application point:
   [internal/completionruntime/nonstream.go](../internal/completionruntime/nonstream.go)
 
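-When current-input-to-file is enabled and triggered, the real filename of the uploaded file is `DS2API_HISTORY.txt` and its content is the full `messages` context; it uses the OpenAI-compatible message/transcript serialization rules and DeepSeek role markers, then is numbered by turn into a `DS2API_HISTORY.txt`-style transcript (file boundary tags are no longer injected):
+When current-input-to-file is enabled and triggered, the real filename of the uploaded history file is `DS2API_HISTORY.txt` and its content is the full `messages` context; it uses the OpenAI-compatible message/transcript serialization rules and DeepSeek role markers, then is numbered by turn into a `DS2API_HISTORY.txt`-style transcript (file boundary tags are no longer injected):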
 
 ```text
 [uploaded filename]: DS2API_HISTORY.txt
@@ -311,7 +311,21 @@ Prior conversation history and tool progress.
 ...
 ```
 
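-Once enabled, the request's live prompt no longer inlines the full context directly; it keeps a short user-role prompt telling the model to answer the latest request directly based on the provided context. The uploaded `file_id` goes into `ref_file_ids`.
+If the current request carries tools, the runtime also uploads `DS2API_TOOLS.txt`: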
+
+```text
+[uploaded filename]: DS2API_TOOLS.txt
+# DS2API_TOOLS.txt
+Available tool descriptions and parameter schemas for this request.
+
+You have access to these tools:
+
+Tool: ...
+Description: ...
+Parameters: ...
+```
+
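+Once enabled, the request's live prompt no longer inlines the full context directly, nor large tool schemas; it keeps a short user-role prompt telling the model to answer the latest request directly based on the provided context and, when tools exist, to refer to `DS2API_TOOLS.txt`. The uploaded `DS2API_HISTORY.txt` file_id is placed first in `ref_file_ids`; if `DS2API_TOOLS.txt` exists, its file_id follows immediately; other file_ids the client already supplied stay after them. Context token counting includes the uploaded history file, the tools file, and the live prompt. Auto-generated current-input file references are recorded as runtime state; if managed-account mode switches accounts for a fresh retry, the runtime re-uploads these automatic files instead of handing the previous account's file_id to the new account.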
 
 ## 10. Differences between protocol entries
 
@@ -321,7 +335,7 @@ Prior conversation history and tool progress.
 
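 - `developer` is mapped to `system`
 - Responses `instructions` is prepended as a system message
-- `tools` is injected into the system prompt
+- In plain passthrough, `tools` is injected into the system prompt; when `current_input_file` triggers, the tool descriptions/schemas are split out into `DS2API_TOOLS.txt`, and the system prompt keeps the format/policy rules and explicitly requires the model to obtain callable tools and schemas from `DS2API_TOOLS.txt`
 - `attachments` / `input_file` / inline files go into `ref_file_ids`
 - current input file applies globally at the unified completion runtime entry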
 
@@ -331,7 +345,7 @@ Prior conversation history and tool progress.
 
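 - top-level `system` takes priority as the system prompt
 - `tool_use` / `tool_result` are converted into the unified assistant/tool history semantics
-- `tools` is likewise merged into the system prompt
+- In plain passthrough, `tools` is likewise merged into the system prompt; when `current_input_file` triggers, the unified `DS2API_TOOLS.txt` split-upload path is used
 - Regular execution goes through `internal/httpapi/claude/handler_messages.go` to the OpenAI chat path, and the model alias is first resolved to a native DeepSeek model
 - The current code has no `ref_file_ids` attachment pipeline as complete as OpenAI's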
 
@@ -341,7 +355,7 @@ Prior conversation history and tool progress.
 
 - `systemInstruction`, `contents.parts`, `functionCall`, and `functionResponse` are normalized first
 - tools are converted to OpenAI-style function schemas
-- prompt building reuses OpenAI's `promptcompat.BuildOpenAIPromptForAdapter`
+- prompt building reuses OpenAI's `promptcompat.BuildOpenAIPromptForAdapter`, and when `current_input_file` triggers it also uses the unified `DS2API_TOOLS.txt` split-upload path
 - Unrecognized non-text parts are safely serialized into the prompt, with binary / suspected base64 content omitted or truncated
 
 In other words, Gemini stays as consistent with OpenAI as possible in its "final prompt semantics".
@@ -360,9 +374,10 @@ Prior conversation history and tool progress.
 
 ```json
 {
-  "prompt": "<｜begin▁of▁sentence｜><｜System｜>原 system / developer\n\nYou have access to these tools: ...<｜end▁of▁instructions｜><｜User｜>Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly.<｜Assistant｜>",
+  "prompt": "<|begin▁of▁sentence|><|System|>原 system / developer\n\nTOOL CALL FORMAT — FOLLOW EXACTLY: ...<|end▁of▁instructions|><|User|>Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly. Available tool descriptions and parameter schemas are attached in DS2API_TOOLS.txt; use only those tools and follow the tool-call format rules in this prompt.<|Assistant|>",
   "ref_file_ids": [
-    "file-current-input-ignore",
+    "file-ds2api-history",
+    "file-ds2api-tools",
     "file-systemprompt",
     "file-other-attachment"
   ],
diff --git a/docs/toolcall-semantics.md b/docs/toolcall-semantics.md
index 4deb80dd..7988d5a8 100644
--- a/docs/toolcall-semantics.md
+++ b/docs/toolcall-semantics.md
@@ -6,14 +6,14 @@
 
 ## 1) Current executable format
 
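-The current version recommends that models output the fullwidth-separator DSML shell:
+The current version recommends that models output the halfwidth-pipe DSML shell: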
 
 ```xml
-<｜DSML｜tool_calls>
-  <｜DSML｜invoke name="read_file">
-    <｜DSML｜parameter name="path"><![CDATA[README.MD]]></｜DSML｜parameter>
-  </｜DSML｜invoke>
-</｜DSML｜tool_calls>
+<|DSML|tool_calls>
+  <|DSML|invoke name="read_file">
+    <|DSML|parameter name="path"><![CDATA[README.MD]]></|DSML|parameter>
+  </|DSML|invoke>
+</|DSML|tool_calls>
 ```
 
 The compatibility layer still accepts legacy canonical XML:
@@ -30,17 +30,20 @@
 
 Constraints:
 
-- A `<｜DSML｜tool_calls>...</｜DSML｜tool_calls>` or `<tool_calls>...</tool_calls>` wrapper is required
-- Each call must be inside `<｜DSML｜invoke name="...">...</｜DSML｜invoke>` or `<invoke name="...">...</invoke>`
+- A `<|DSML|tool_calls>...</|DSML|tool_calls>` or `<tool_calls>...</tool_calls>` wrapper is required
+- Each call must be inside `<|DSML|invoke name="...">...</|DSML|invoke>` or `<invoke name="...">...</invoke>`
 - The tool name must be placed in the `name` attribute of `invoke`
-- Parameters must use `<｜DSML｜parameter name="...">...</｜DSML｜parameter>` or `<parameter name="...">...</parameter>`
+- Parameters must use `<|DSML|parameter name="...">...</|DSML|parameter>` or `<parameter name="...">...</parameter>`
 - Do not mix DSML tags and legacy XML tool tags within one tool block; a mix is treated as an invalid tool block
 
 Compatibility repairs:
 
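 - If the model omits the opening wrapper but still outputs one or more invokes and ends with a closing wrapper, the Go parsing pipeline restores the missing opening wrapper before parsing.
-- The Go / Node parsing layer no longer enumerates every DSML typo. It keys on the fixed local tag names `tool_calls` / `invoke` / `parameter`, treats any protocol-prefix shell before the tag name as tolerable noise, and continues to accept drift such as pipe characters `|` / `｜`, the fullwidth exclamation mark `！`, the ideographic comma `、`, whitespace, a repeated leading `<`, the visible control symbol `␂`, raw STX `\x02`, non-ASCII separators, CJK angle brackets `〈` / `〉`, curly-quoted attribute values, and PascalCase local names. For example, `<DSML|tool_calls>`, `<<|DSML|tool_calls>`, `<|DSML tool_calls>`, `<DSMLtool_calls>`, `<DSmartToolCalls>`, `<<DSML|DSML|tool_calls>`, `<DSML␂tool_calls>`, `<proto💥tool_calls>`, `<DSM｜tool_calls>...〈/DSM｜tool_calls〉`, `<！DSML！tool_calls>...<！/DSML！tool_calls>`, and `<、DSML、tool_calls>...<、/DSML、tool_calls>` are all normalized; similar but non-fixed tag names (such as `tool_calls_extra` / `ToolCallsExtra`) are still handled as ordinary text.
-- If the model outputs an extra non-structural separator after a fixed tool tag name, such as `<|DSML|tool_calls|` / `<|DSML|invoke|` / `<|DSML|parameter|` / `<DSMLtool_calls※>`, or an extra trailing separator before the terminator of a tag with attributes (such as `<DSM｜parameter name="command"｜>`), the compatibility layer treats that trailing separator as an abnormal tag terminator and completes or normalizes it; if a `>` / `〉` already follows, it also consumes the extra separator before normalizing. Structural characters such as `<` / `>` / `/` / `=` / quotes, whitespace, and ASCII alphanumerics are not treated as this kind of separator.
+- Before entering the existing DSML rewrite / XML parse, both Go and Node first perform a very narrow candidate-span canonicalization: it only handles wrapper / `invoke` / `parameter` / `name` / `CDATA` / `DSML` and their structural separators that the scanner has already recognized as a tool-tag shell; it removes zero-width / BOM / control-type interfering characters and folds tool-syntax shell symbols such as `<`, `>`, `/`, `|`, `=`, quotes, Unicode whitespace, and common dash / underscore variants back to their ASCII semantics.
+- The Go / Node parsing layer no longer enumerates every DSML typo. It keys on the fixed local tag names `tool_calls` / `invoke` / `parameter`, treats any protocol-prefix shell before the tag name as tolerable noise, and continues to accept drift such as halfwidth pipe characters, the fullwidth exclamation mark `！`, the ideographic comma `、`, whitespace, a repeated leading `<`, the visible control symbol `␂`, raw STX `\x02`, non-ASCII separators, CJK angle brackets `〈` / `〉`, curly-quoted attribute values, and PascalCase local names. For example, `<DSML|tool_calls>`, `<<|DSML|tool_calls>`, `<|DSML tool_calls>`, `<DSMLtool_calls>`, `<DSmartToolCalls>`, `<<DSML|DSML|tool_calls>`, `<DSML␂tool_calls>`, `<proto💥tool_calls>`, `<DSM|tool_calls>...〈/DSM|tool_calls〉`, `<！DSML！tool_calls>...<！/DSML！tool_calls>`, and `<、DSML、tool_calls>...<、/DSML、tool_calls>` are all normalized; similar but non-fixed tag names (such as `tool_calls_extra` / `ToolCallsExtra`) are still handled as ordinary text.
+- This candidate-span canonicalization does not apply broad Unicode normalization to ordinary prose, parameter bodies, CDATA content, or nested non-tool XML. That is, an example `<invοke>` inside a parameter, confusable words in ordinary chat text, or other non-tool-shell XML fragments stay as they are; only whitelist keywords and structural symbols that actually fall on a tool-tag shell are folded.
+- If the model outputs an extra non-structural separator after a fixed tool tag name, such as `<|DSML|tool_calls|` / `<|DSML|invoke|` / `<|DSML|parameter|` / `<DSMLtool_calls※>`, or an extra trailing separator before the terminator of a tag with attributes (such as `<DSM|parameter name="command"|>`), the compatibility layer treats that trailing separator as an abnormal tag terminator and completes or normalizes it; if a `>` / `〉` already follows, it also consumes the extra separator before normalizing. Structural characters such as `<` / `>` / `/` / `=` / quotes, whitespace, and ASCII alphanumerics are not treated as this kind of separator.
+- The "missing opening wrapper" repair triggers only when wrapper-confidence is high enough: the scanner must already have recognized a whitelisted tool-shell structure (wrapper / invoke / parameter / `name=` etc.), and the remaining failure must look like a shell-level structural problem only. Near-miss tag names that are similar but not on the whitelist, or malformed fragments lacking enough wrapper evidence, are still passed through as ordinary text.
 - This is a narrow repair for a common model mistake and does not change the recommended output format; the prompt still requires the model to output the complete DSML shell directly.
 - A bare `<invoke ...>` / `<parameter ...>` is not treated as "supported tool syntax"; only a `tool_calls` wrapper or a repairable missing opening wrapper enters the tool-call path.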
 
@@ -54,10 +57,11 @@
 
 In the streaming pipeline (Go / Node behave the same):
 
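-- The DSML `<｜DSML｜tool_calls>` wrapper, hyphenated forms (such as `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`), DSML noise-tolerant forms based on the fixed local tag names, trailing non-structural separator forms (such as `<|DSML|tool_calls|` / `<DSMLtool_calls※>`), and the canonical `<tool_calls>` wrapper all enter structured capture
+- The DSML `<|DSML|tool_calls>` wrapper, hyphenated forms (such as `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`), DSML noise-tolerant forms based on the fixed local tag names, trailing non-structural separator forms (such as `<|DSML|tool_calls|` / `<DSMLtool_calls※>`), and the canonical `<tool_calls>` wrapper all enter structured capture
 - If the stream starts directly from an invoke but a closing wrapper follows later, the Go streaming sieve also attempts recovery via the missing-opening-wrapper repair path
 - Tool calls that were recognized successfully do not flow back into ordinary text
 - Blocks that do not match the new format are not executed and continue to pass through as verbatim text
+- If a confusable / drifted tool shell still forms a valid tool call after candidate-span canonicalization + repair, suffix prose after the wrapper continues to be emitted as ordinary text; if it still fails wrapper-confidence or XML semantics after canonicalization, the whole block is released as ordinary text, never partially swallowed and partially leaked.
 - XML examples inside fenced code blocks (backticks `` ``` `` and tildes `~~~`) are always handled as ordinary text
 - Nested fences (such as 4 backticks wrapping 3 backticks) and fence protection inside CDATA are supported
 - For long-text parameters such as `command` / `content`, if the CDATA contains a Markdown fenced DSML / XML example, it continues to be kept as verbatim parameter text even when the example contains fragments that look like outer closing tags, such as `]]></parameter>` / `</tool_calls>`, until the real outer closing tag outside the fence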
@@ -101,9 +105,9 @@ go test -v -run 'TestParseToolCalls|TestProcessToolSieve' ./internal/toolcall ./
 
 Key coverage:
 
-- DSML `<｜DSML｜tool_calls>` wrapper parses correctly
+- DSML `<|DSML|tool_calls>` wrapper parses correctly
 - legacy canonical `<tool_calls>` wrapper parses correctly
-- DSML noise-tolerant forms of the fixed local tag names (such as `<DSML|tool_calls>`, `<<|DSML|tool_calls>`, `<|DSML tool_calls>`, `<DSMLtool_calls>`, `<DSmartToolCalls>`, `<<DSML|DSML|tool_calls>`, `<DSM｜tool_calls>...〈/DSM｜tool_calls〉`, `<！DSML！tool_calls>...<！/DSML！tool_calls>`) parse correctly
+- DSML noise-tolerant forms of the fixed local tag names (such as `<DSML|tool_calls>`, `<<|DSML|tool_calls>`, `<|DSML tool_calls>`, `<DSMLtool_calls>`, `<DSmartToolCalls>`, `<<DSML|DSML|tool_calls>`, `<DSM|tool_calls>...〈/DSM|tool_calls〉`, `<！DSML！tool_calls>...<！/DSML！tool_calls>`) parse correctly
 - Mixed tags (DSML wrapper + canonical inner) parse correctly after normalization
 - Examples inside tilde fences `~~~` are not executed
 - Examples inside nested fences (4 backticks wrapping 3 backticks) are not executed
diff --git a/internal/completionruntime/nonstream.go b/internal/completionruntime/nonstream.go
index 921d3b4d..bc589c61 100644
--- a/internal/completionruntime/nonstream.go
+++ b/internal/completionruntime/nonstream.go
@@ -114,7 +114,7 @@ func ExecuteNonStreamStartedWithRetry(ctx context.Context, ds DeepSeekCaller, a
 		turn, outErr := collectAttempt(currentResp, stdReq, usagePrompt, opts)
 		if outErr != nil {
 			if canRetryOnAlternateAccount(ctx, a, outErr, opts.RetryEnabled, &accountSwitchAttempted) {
-				switched, switchErr := startStandardCompletionOnAlternateAccount(ctx, ds, a, stdReq, maxAttempts)
+				switched, switchErr := startStandardCompletionOnAlternateAccount(ctx, ds, a, stdReq, opts, maxAttempts)
 				if switchErr != nil {
 					return NonStreamResult{SessionID: sessionID, Payload: payload, Attempts: attempts}, switchErr
 				}
@@ -154,7 +154,7 @@ func ExecuteNonStreamStartedWithRetry(ctx context.Context, ds DeepSeekCaller, a
 		}
 		if !opts.RetryEnabled || !assistantturn.ShouldRetryEmptyOutput(turn, attempts, retryMax) {
 			if canRetryOnAlternateAccount(ctx, a, turn.Error, opts.RetryEnabled, &accountSwitchAttempted) {
-				switched, switchErr := startStandardCompletionOnAlternateAccount(ctx, ds, a, stdReq, maxAttempts)
+				switched, switchErr := startStandardCompletionOnAlternateAccount(ctx, ds, a, stdReq, opts, maxAttempts)
 				if switchErr != nil {
 					return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, switchErr
 				}
@@ -205,7 +205,12 @@ func canRetryOnAlternateAccount(ctx context.Context, a *auth.RequestAuth, outErr
 	return a.SwitchAccount(ctx)
 }
 
-func startStandardCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, maxAttempts int) (StartResult, *assistantturn.OutputError) {
+func startStandardCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options, maxAttempts int) (StartResult, *assistantturn.OutputError) {
+	var prepErr *assistantturn.OutputError
+	stdReq, prepErr = reuploadCurrentInputFileForAccount(ctx, ds, a, stdReq, opts)
+	if prepErr != nil {
+		return StartResult{Request: stdReq}, prepErr
+	}
 	sessionID, err := ds.CreateSession(ctx, a, maxAttempts)
 	if err != nil {
 		return StartResult{}, authOutputError(a)
@@ -222,6 +227,18 @@ func startStandardCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekC
 	return StartResult{SessionID: sessionID, Payload: payload, Pow: pow, Response: resp, Request: stdReq}, nil
 }
 
+func reuploadCurrentInputFileForAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (promptcompat.StandardRequest, *assistantturn.OutputError) {
+	if opts.CurrentInputFile == nil || !stdReq.CurrentInputFileApplied {
+		return stdReq, nil
+	}
+	out, err := (history.Service{Store: opts.CurrentInputFile, DS: ds}).ReuploadAppliedCurrentInputFile(ctx, a, stdReq)
+	if err != nil {
+		status, message := history.MapError(err)
+		return out, &assistantturn.OutputError{Status: status, Message: message, Code: "error"}
+	}
+	return out, nil
+}
+
 func collectAttempt(resp *http.Response, stdReq promptcompat.StandardRequest, usagePrompt string, opts Options) (assistantturn.Turn, *assistantturn.OutputError) {
 	defer func() {
 		if err := resp.Body.Close(); err != nil {
diff --git a/internal/completionruntime/nonstream_test.go b/internal/completionruntime/nonstream_test.go
index 7c5959ad..12598ab3 100644
--- a/internal/completionruntime/nonstream_test.go
+++ b/internal/completionruntime/nonstream_test.go
@@ -38,8 +38,11 @@ func (f *fakeDeepSeekCaller) GetPow(context.Context, *auth.RequestAuth, int) (st
 	return "pow", nil
 }
 
-func (f *fakeDeepSeekCaller) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
+func (f *fakeDeepSeekCaller) UploadFile(_ context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
 	f.uploads = append(f.uploads, req)
+	if a != nil && a.AccountID != "" {
+		return &dsclient.UploadFileResult{ID: "file-runtime-" + a.AccountID}, nil
+	}
 	return &dsclient.UploadFileResult{ID: "file-runtime-1"}, nil
 }
 
@@ -162,6 +165,66 @@ func TestExecuteNonStreamWithRetrySwitchesManagedAccountBeforeFinal429(t *testin
 	}
 }
 
+func TestExecuteNonStreamWithRetryReuploadsCurrentInputFileAfterAccountSwitch(t *testing.T) {
+	t.Setenv("DS2API_CONFIG_JSON", `{
+		"keys":["managed-key"],
+		"accounts":[
+			{"email":"acc1@test.com","password":"pwd"},
+			{"email":"acc2@test.com","password":"pwd"}
+		]
+	}`)
+	store := config.LoadStore()
+	resolver := auth.NewResolver(store, account.NewPool(store), func(_ context.Context, acc config.Account) (string, error) {
+		return "token-" + acc.Identifier(), nil
+	})
+	req, _ := http.NewRequest(http.MethodPost, "/", nil)
+	req.Header.Set("Authorization", "Bearer managed-key")
+	a, err := resolver.Determine(req)
+	if err != nil {
+		t.Fatalf("determine failed: %v", err)
+	}
+	defer resolver.Release(a)
+
+	ds := &fakeDeepSeekCaller{
+		sessionByAccount: true,
+		responses: []*http.Response{
+			sseHTTPResponse(http.StatusOK, `data: {"response_message_id":11,"p":"response/thinking_content","v":"first empty"}`),
+			sseHTTPResponse(http.StatusOK, `data: {"response_message_id":12,"p":"response/thinking_content","v":"retry empty"}`),
+			sseHTTPResponse(http.StatusOK, `data: {"response_message_id":21,"p":"response/content","v":"ok from second account"}`),
+		},
+	}
+	stdReq := promptcompat.StandardRequest{
+		Surface:        "test",
+		RequestedModel: "deepseek-v4-flash",
+		ResolvedModel:  "deepseek-v4-flash",
+		ResponseModel:  "deepseek-v4-flash",
+		Messages: []any{
+			map[string]any{"role": "user", "content": "large current input"},
+		},
+		PromptTokenText: "large current input",
+		FinalPrompt:     "large current input",
+		Thinking:        true,
+	}
+
+	result, outErr := ExecuteNonStreamWithRetry(context.Background(), ds, a, stdReq, Options{
+		RetryEnabled:     true,
+		CurrentInputFile: currentInputRuntimeConfig{},
+	})
+	if outErr != nil {
+		t.Fatalf("unexpected output error after account switch retry: %#v", outErr)
+	}
+	if result.Turn.Text != "ok from second account" {
+		t.Fatalf("text mismatch after switch retry: %q", result.Turn.Text)
+	}
+	if len(ds.uploads) != 2 {
+		t.Fatalf("expected current input file uploaded once per account, got %d", len(ds.uploads))
+	}
+	refIDs, _ := ds.payloads[2]["ref_file_ids"].([]any)
+	if len(refIDs) != 1 || refIDs[0] != "file-runtime-acc2@test.com" {
+		t.Fatalf("expected switched account ref_file_ids to use reuploaded file, got %#v", ds.payloads[2]["ref_file_ids"])
+	}
+}
+
 func TestExecuteNonStreamWithRetryUsesParentMessageForEmptyRetry(t *testing.T) {
 	ds := &fakeDeepSeekCaller{responses: []*http.Response{
 		sseHTTPResponse(http.StatusOK, `data: {"response_message_id":77,"p":"response/thinking_content","v":"plan"}`),
diff --git a/internal/completionruntime/stream_retry.go b/internal/completionruntime/stream_retry.go
index 03c9dc75..6007ceab 100644
--- a/internal/completionruntime/stream_retry.go
+++ b/internal/completionruntime/stream_retry.go
@@ -9,7 +9,9 @@ import (
 	"ds2api/internal/assistantturn"
 	"ds2api/internal/auth"
 	"ds2api/internal/config"
+	"ds2api/internal/httpapi/openai/history"
 	"ds2api/internal/httpapi/openai/shared"
+	"ds2api/internal/promptcompat"
 )
 
 type StreamRetryOptions struct {
@@ -19,6 +21,8 @@ type StreamRetryOptions struct {
 	RetryMaxAttempts int
 	MaxAttempts      int
 	UsagePrompt      string
+	Request          promptcompat.StandardRequest
+	CurrentInputFile history.CurrentInputConfigReader
 }
 
 type StreamRetryHooks struct {
@@ -71,7 +75,7 @@ func ExecuteStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.Requ
 
 		if attempts >= retryMax {
 			if canRetryOnAlternateAccount(ctx, a, &assistantturn.OutputError{Status: http.StatusTooManyRequests}, opts.RetryEnabled, &accountSwitchAttempted) {
-				switched, switchErr := startPayloadCompletionOnAlternateAccount(ctx, ds, a, payload, maxAttempts)
+				switched, switchErr := startPayloadCompletionOnAlternateAccount(ctx, ds, a, payload, opts, maxAttempts)
 				if switchErr != nil {
 					if hooks.OnRetryFailure != nil {
 						hooks.OnRetryFailure(switchErr.Status, switchErr.Message, switchErr.Code)
@@ -142,7 +146,7 @@ func ExecuteStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.Requ
 	}
 }
 
-func startPayloadCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, payload map[string]any, maxAttempts int) (StartResult, *assistantturn.OutputError) {
+func startPayloadCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, payload map[string]any, opts StreamRetryOptions, maxAttempts int) (StartResult, *assistantturn.OutputError) {
 	sessionID, err := ds.CreateSession(ctx, a, maxAttempts)
 	if err != nil {
 		return StartResult{}, authOutputError(a)
@@ -152,6 +156,13 @@ func startPayloadCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCa
 		return StartResult{SessionID: sessionID}, &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Failed to get PoW (invalid token or unknown error).", Code: "error"}
 	}
 	nextPayload := clonePayload(payload)
+	if opts.CurrentInputFile != nil && opts.Request.CurrentInputFileApplied {
+		stdReq, prepErr := reuploadCurrentInputFileForAccount(ctx, ds, a, opts.Request, Options{CurrentInputFile: opts.CurrentInputFile})
+		if prepErr != nil {
+			return StartResult{SessionID: sessionID}, prepErr
+		}
+		nextPayload = stdReq.CompletionPayload(sessionID)
+	}
 	nextPayload["chat_session_id"] = sessionID
 	delete(nextPayload, "parent_message_id")
 	resp, err := ds.CallCompletion(ctx, a, nextPayload, pow, maxAttempts)
diff --git a/internal/deepseek/client/client_completion.go b/internal/deepseek/client/client_completion.go
index 1b91ce2f..0563d334 100644
--- a/internal/deepseek/client/client_completion.go
+++ b/internal/deepseek/client/client_completion.go
@@ -5,9 +5,7 @@ import (
 	"context"
 	dsprotocol "ds2api/internal/deepseek/protocol"
 	"encoding/json"
-	"errors"
 	"net/http"
-	"time"
 
 	"ds2api/internal/auth"
 	"ds2api/internal/config"
@@ -15,39 +13,33 @@ import (
 )
 
 func (c *Client) CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error) {
-	if maxAttempts <= 0 {
-		maxAttempts = c.maxRetries
-	}
+	_ = maxAttempts
 	clients := c.requestClientsForAuth(ctx, a)
 	headers := c.authHeaders(a.DeepSeekToken)
 	headers["x-ds-pow-response"] = powResp
 	captureSession := c.capture.Start("deepseek_completion", dsprotocol.DeepSeekCompletionURL, a.AccountID, payload)
-	attempts := 0
-	for attempts < maxAttempts {
-		resp, err := c.streamPost(ctx, clients.stream, dsprotocol.DeepSeekCompletionURL, headers, payload)
-		if err != nil {
-			attempts++
-			time.Sleep(time.Second)
-			continue
-		}
-		if resp.StatusCode == http.StatusOK {
-			if captureSession != nil {
-				resp.Body = captureSession.WrapBody(resp.Body, resp.StatusCode)
-			}
-			resp = c.wrapCompletionWithAutoContinue(ctx, a, payload, powResp, resp)
-			return resp, nil
-		}
-		if captureSession != nil {
-			resp.Body = captureSession.WrapBody(resp.Body, resp.StatusCode)
-		}
-		_ = resp.Body.Close()
-		attempts++
-		time.Sleep(time.Second)
+	resp, err := c.streamPostOnce(ctx, clients.stream, dsprotocol.DeepSeekCompletionURL, headers, payload)
+	if err != nil {
+		return nil, err
+	}
+	if captureSession != nil {
+		resp.Body = captureSession.WrapBody(resp.Body, resp.StatusCode)
+	}
+	if resp.StatusCode == http.StatusOK {
+		resp = c.wrapCompletionWithAutoContinue(ctx, a, payload, powResp, resp)
 	}
-	return nil, errors.New("completion failed")
+	return resp, nil
 }
 
 func (c *Client) streamPost(ctx context.Context, doer trans.Doer, url string, headers map[string]string, payload any) (*http.Response, error) {
+	return c.streamPostWithFallback(ctx, doer, url, headers, payload, true)
+}
+
+func (c *Client) streamPostOnce(ctx context.Context, doer trans.Doer, url string, headers map[string]string, payload any) (*http.Response, error) {
+	return c.streamPostWithFallback(ctx, doer, url, headers, payload, false)
+}
+
+func (c *Client) streamPostWithFallback(ctx context.Context, doer trans.Doer, url string, headers map[string]string, payload any, allowFallback bool) (*http.Response, error) {
 	b, err := json.Marshal(payload)
 	if err != nil {
 		return nil, err
@@ -63,15 +55,18 @@ func (c *Client) streamPost(ctx context.Context, doer trans.Doer, url string, he
 	}
 	resp, err := doer.Do(req)
 	if err != nil {
-		config.Logger.Warn("[deepseek] fingerprint stream request failed, fallback to std transport", "url", url, "error", err)
-		req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
-		if reqErr != nil {
-			return nil, reqErr
-		}
-		for k, v := range headers {
-			req2.Header.Set(k, v)
+		if allowFallback {
+			config.Logger.Warn("[deepseek] fingerprint stream request failed, fallback to std transport", "url", url, "error", err)
+			req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
+			if reqErr != nil {
+				return nil, reqErr
+			}
+			for k, v := range headers {
+				req2.Header.Set(k, v)
+			}
+			return clients.fallbackS.Do(req2)
 		}
-		return clients.fallbackS.Do(req2)
+		return nil, err
 	}
 	return resp, nil
 }
diff --git a/internal/deepseek/client/client_completion_test.go b/internal/deepseek/client/client_completion_test.go
new file mode 100644
index 00000000..5244c800
--- /dev/null
+++ b/internal/deepseek/client/client_completion_test.go
@@ -0,0 +1,36 @@
+package client
+
+import (
+	"context"
+	"errors"
+	"net/http"
+	"testing"
+
+	"ds2api/internal/auth"
+)
+
+func TestCallCompletionDoesNotFallbackForNonIdempotentCompletion(t *testing.T) {
+	var fallbackCalled bool
+	client := &Client{
+		stream: doerFunc(func(*http.Request) (*http.Response, error) {
+			return nil, errors.New("ambiguous completion write failure")
+		}),
+		fallbackS: &http.Client{Transport: roundTripperFunc(func(*http.Request) (*http.Response, error) {
+			fallbackCalled = true
+			return &http.Response{StatusCode: http.StatusOK}, nil
+		})},
+	}
+	_, err := client.CallCompletion(
+		context.Background(),
+		&auth.RequestAuth{DeepSeekToken: "token"},
+		map[string]any{"prompt": "hello"},
+		"pow",
+		3,
+	)
+	if err == nil {
+		t.Fatal("expected completion error")
+	}
+	if fallbackCalled {
+		t.Fatal("completion fallback should not be called for a non-idempotent request")
+	}
+}
diff --git a/internal/deepseek/client/client_upload.go b/internal/deepseek/client/client_upload.go
index c3334c35..3dc778dd 100644
--- a/internal/deepseek/client/client_upload.go
+++ b/internal/deepseek/client/client_upload.go
@@ -95,11 +95,7 @@ func (c *Client) UploadFile(ctx context.Context, a *auth.RequestAuth, req Upload
 		resp, err := c.doUpload(ctx, clients.regular, clients.fallback, dsprotocol.DeepSeekUploadFileURL, headers, body)
 		if err != nil {
 			config.Logger.Warn("[upload_file] request error", "error", err, "account", a.AccountID, "filename", filename)
-			powHeader = ""
-			lastFailureKind = FailureUnknown
-			lastFailureMessage = err.Error()
-			attempts++
-			continue
+			return nil, err
 		}
 		if captureSession != nil {
 			resp.Body = captureSession.WrapBody(resp.Body, resp.StatusCode)
@@ -201,7 +197,7 @@ func escapeMultipartFilename(filename string) string {
 	return filename
 }
 
-func (c *Client) doUpload(ctx context.Context, doer trans.Doer, fallback trans.Doer, url string, headers map[string]string, body []byte) (*http.Response, error) {
+func (c *Client) doUpload(ctx context.Context, doer trans.Doer, _ trans.Doer, url string, headers map[string]string, body []byte) (*http.Response, error) {
 	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
 	if err != nil {
 		return nil, err
@@ -213,15 +209,7 @@ func (c *Client) doUpload(ctx context.Context, doer trans.Doer, fallback trans.D
 	if err == nil {
 		return resp, nil
 	}
-	config.Logger.Warn("[deepseek] fingerprint upload request failed, fallback to std transport", "url", url, "error", err)
-	req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
-	if reqErr != nil {
-		return nil, reqErr
-	}
-	for k, v := range headers {
-		req2.Header.Set(k, v)
-	}
-	return fallback.Do(req2)
+	return nil, err
 }
 
 func extractUploadFileResult(resp map[string]any) *UploadFileResult {
diff --git a/internal/deepseek/client/client_upload_test.go b/internal/deepseek/client/client_upload_test.go
index e7d1cc02..ff547da3 100644
--- a/internal/deepseek/client/client_upload_test.go
+++ b/internal/deepseek/client/client_upload_test.go
@@ -6,6 +6,7 @@ import (
 	"encoding/base64"
 	"encoding/hex"
 	"encoding/json"
+	"errors"
 	"io"
 	"net/http"
 	"strings"
@@ -39,6 +40,31 @@ func TestBuildUploadMultipartBodyOmitsPurposeAndIncludesFilePart(t *testing.T) {
 	}
 }
 
+func TestDoUploadDoesNotFallbackForNonIdempotentUpload(t *testing.T) {
+	var fallbackCalled bool
+	client := &Client{}
+	_, err := client.doUpload(
+		context.Background(),
+		doerFunc(func(req *http.Request) (*http.Response, error) {
+			_, _ = io.ReadAll(req.Body)
+			return nil, errors.New("ambiguous upload write failure")
+		}),
+		doerFunc(func(*http.Request) (*http.Response, error) {
+			fallbackCalled = true
+			return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader("{}"))}, nil
+		}),
+		dsprotocol.DeepSeekUploadFileURL,
+		map[string]string{"Content-Type": "multipart/form-data"},
+		[]byte("body"),
+	)
+	if err == nil {
+		t.Fatal("expected upload error")
+	}
+	if fallbackCalled {
+		t.Fatal("upload fallback should not be called for a non-idempotent request")
+	}
+}
+
 func TestExtractUploadFileResultSupportsNestedShapes(t *testing.T) {
 	got := extractUploadFileResult(map[string]any{
 		"data": map[string]any{
diff --git a/internal/httpapi/claude/current_input_file_test.go b/internal/httpapi/claude/current_input_file_test.go
index fa6b34b0..d49646ef 100644
--- a/internal/httpapi/claude/current_input_file_test.go
+++ b/internal/httpapi/claude/current_input_file_test.go
@@ -93,7 +93,11 @@ func (d *claudeCurrentInputDS) GetPow(context.Context, *auth.RequestAuth, int) (
 
 func (d *claudeCurrentInputDS) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
 	d.uploads = append(d.uploads, req)
-	return &dsclient.UploadFileResult{ID: "file-claude-history"}, nil
+	id := "file-claude-history"
+	if len(d.uploads) > 1 {
+		id = "file-claude-tools"
+	}
+	return &dsclient.UploadFileResult{ID: id}, nil
 }
 
 func (d *claudeCurrentInputDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
@@ -156,3 +160,47 @@ func TestClaudeDirectAppliesCurrentInputFile(t *testing.T) {
 		t.Fatalf("expected persisted message to match upstream continuation prompt, got %#v", full.Messages)
 	}
 }
+
+func TestClaudeCurrentInputFileUploadsToolsSeparately(t *testing.T) {
+	ds := &claudeCurrentInputDS{}
+	h := &Handler{
+		Store: mockClaudeConfig{aliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"}},
+		Auth:  claudeCurrentInputAuth{},
+		DS:    ds,
+	}
+	reqBody := `{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"hello from claude"}],"tools":[{"name":"search","description":"Search docs","input_schema":{"type":"object"}}],"max_tokens":1024}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/messages", strings.NewReader(reqBody))
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	h.Messages(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploads) != 2 {
+		t.Fatalf("expected history and tools uploads, got %d", len(ds.uploads))
+	}
+	if ds.uploads[0].Filename != "DS2API_HISTORY.txt" || ds.uploads[1].Filename != "DS2API_TOOLS.txt" {
+		t.Fatalf("unexpected upload filenames: %#v", ds.uploads)
+	}
+	historyText := string(ds.uploads[0].Data)
+	if strings.Contains(historyText, "You have access to these tools") || strings.Contains(historyText, "Description: Search docs") {
+		t.Fatalf("history transcript should not embed tool descriptions, got %q", historyText)
+	}
+	toolsText := string(ds.uploads[1].Data)
+	if !strings.Contains(toolsText, "# DS2API_TOOLS.txt") || !strings.Contains(toolsText, "Tool: search") || !strings.Contains(toolsText, "Description: Search docs") {
+		t.Fatalf("expected tools transcript to include tool schema, got %q", toolsText)
+	}
+	refIDs, _ := ds.payload["ref_file_ids"].([]any)
+	if len(refIDs) < 2 || refIDs[0] != "file-claude-history" || refIDs[1] != "file-claude-tools" {
+		t.Fatalf("expected history and tools ref ids first, got %#v", ds.payload["ref_file_ids"])
+	}
+	prompt, _ := ds.payload["prompt"].(string)
+	if !strings.Contains(prompt, "DS2API_TOOLS.txt") || !strings.Contains(prompt, "TOOL CALL FORMAT") {
+		t.Fatalf("expected live prompt to reference tools file and retain format instructions, got %q", prompt)
+	}
+	if strings.Contains(prompt, "Description: Search docs") {
+		t.Fatalf("live prompt should not inline tool descriptions, got %q", prompt)
+	}
+}
diff --git a/internal/httpapi/claude/handler_messages.go b/internal/httpapi/claude/handler_messages.go
index e22a1edc..a89ed8da 100644
--- a/internal/httpapi/claude/handler_messages.go
+++ b/internal/httpapi/claude/handler_messages.go
@@ -145,7 +145,7 @@ func (h *Handler) handleClaudeDirectStream(w http.ResponseWriter, r *http.Reques
 		return
 	}
 	streamReq := start.Request
-	h.handleClaudeStreamRealtimeWithRetry(w, r, a, start.Response, start.Payload, start.Pow, streamReq.ResponseModel, streamReq.Messages, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.PromptTokenText, historySession)
+	h.handleClaudeStreamRealtimeWithRetry(w, r, a, start.Response, start.Payload, start.Pow, streamReq, streamReq.ResponseModel, streamReq.Messages, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.PromptTokenText, historySession)
 }
 
 func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store ConfigReader) bool {
@@ -361,7 +361,7 @@ func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Requ
 	})
 }
 
-func (h *Handler) handleClaudeStreamRealtimeWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, model string, messages []any, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, promptTokenText string, historySession *responsehistory.Session) {
+func (h *Handler) handleClaudeStreamRealtimeWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow string, stdReq promptcompat.StandardRequest, model string, messages []any, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, promptTokenText string, historySession *responsehistory.Session) {
 	if resp.StatusCode != http.StatusOK {
 		defer func() { _ = resp.Body.Close() }()
 		body, _ := io.ReadAll(resp.Body)
@@ -399,11 +399,13 @@ func (h *Handler) handleClaudeStreamRealtimeWithRetry(w http.ResponseWriter, r *
 	streamRuntime.sendMessageStart()
 
 	completionruntime.ExecuteStreamWithRetry(r.Context(), h.DS, a, resp, payload, pow, completionruntime.StreamRetryOptions{
-		Surface:      "claude.messages",
-		Stream:       true,
-		RetryEnabled: true,
-		MaxAttempts:  3,
-		UsagePrompt:  promptTokenText,
+		Surface:          "claude.messages",
+		Stream:           true,
+		RetryEnabled:     true,
+		MaxAttempts:      3,
+		UsagePrompt:      promptTokenText,
+		Request:          stdReq,
+		CurrentInputFile: h.Store,
 	}, completionruntime.StreamRetryHooks{
 		ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
 			return h.consumeClaudeStreamAttempt(r, currentResp, streamRuntime, thinkingEnabled, allowDeferEmpty)
diff --git a/internal/httpapi/claude/handler_util_test.go b/internal/httpapi/claude/handler_util_test.go
index a624b01f..7d229fbc 100644
--- a/internal/httpapi/claude/handler_util_test.go
+++ b/internal/httpapi/claude/handler_util_test.go
@@ -93,10 +93,10 @@ func TestNormalizeClaudeMessagesToolUseToAssistantToolCalls(t *testing.T) {
 		t.Fatalf("expected call id preserved, got %#v", call)
 	}
 	content, _ := m["content"].(string)
-	if !containsStr(content, "<｜DSML｜tool_calls>") || !containsStr(content, `<｜DSML｜invoke name="search_web">`) {
+	if !containsStr(content, "<|DSML|tool_calls>") || !containsStr(content, `<|DSML|invoke name="search_web">`) {
 		t.Fatalf("expected assistant content to include DSML tool call history, got %q", content)
 	}
-	if !containsStr(content, `<｜DSML｜parameter name="query"><![CDATA[latest]]></｜DSML｜parameter>`) {
+	if !containsStr(content, `<|DSML|parameter name="query"><![CDATA[latest]]></|DSML|parameter>`) {
 		t.Fatalf("expected assistant content to include serialized parameters, got %q", content)
 	}
 }
@@ -133,7 +133,7 @@ func TestNormalizeClaudeMessagesPreservesThinkingOnToolUseHistory(t *testing.T)
 	if !containsStr(prompt, "[reasoning_content]\nneed live search before answering\n[/reasoning_content]") {
 		t.Fatalf("expected thinking in prompt history, got %q", prompt)
 	}
-	if !containsStr(prompt, `<｜DSML｜invoke name="search_web">`) {
+	if !containsStr(prompt, `<|DSML|invoke name="search_web">`) {
 		t.Fatalf("expected tool call in prompt history, got %q", prompt)
 	}
 }
@@ -329,7 +329,7 @@ func TestBuildClaudeToolPromptSingleTool(t *testing.T) {
 	if !containsStr(prompt, "Search the web") {
 		t.Fatalf("expected description in prompt")
 	}
-	if !containsStr(prompt, "<｜DSML｜tool_calls>") {
+	if !containsStr(prompt, "<|DSML|tool_calls>") {
 		t.Fatalf("expected DSML tool_calls format in prompt")
 	}
 	if !containsStr(prompt, "TOOL CALL FORMAT") {
diff --git a/internal/httpapi/claude/standard_request.go b/internal/httpapi/claude/standard_request.go
index 49d9bffc..4998eb94 100644
--- a/internal/httpapi/claude/standard_request.go
+++ b/internal/httpapi/claude/standard_request.go
@@ -52,7 +52,7 @@ func normalizeClaudeRequest(store ConfigReader, req map[string]any) (claudeNorma
 			RequestedModel:  strings.TrimSpace(model),
 			ResolvedModel:   dsModel,
 			ResponseModel:   strings.TrimSpace(model),
-			Messages:        payload["messages"].([]any),
+			Messages:        normalizedMessages,
 			PromptTokenText: finalPrompt,
 			ToolsRaw:        toolsRequested,
 			FinalPrompt:     finalPrompt,
diff --git a/internal/httpapi/gemini/convert_messages_test.go b/internal/httpapi/gemini/convert_messages_test.go
index a4293254..6f0890f3 100644
--- a/internal/httpapi/gemini/convert_messages_test.go
+++ b/internal/httpapi/gemini/convert_messages_test.go
@@ -89,7 +89,7 @@ func TestGeminiMessagesFromRequestPreservesThoughtOnFunctionCallHistory(t *testi
 	if !strings.Contains(prompt, "[reasoning_content]\nneed current state before answering\n[/reasoning_content]") {
 		t.Fatalf("expected thought in prompt history, got %q", prompt)
 	}
-	if !strings.Contains(prompt, `<｜DSML｜invoke name="search_web">`) {
+	if !strings.Contains(prompt, `<|DSML|invoke name="search_web">`) {
 		t.Fatalf("expected tool call in prompt history, got %q", prompt)
 	}
 }
diff --git a/internal/httpapi/gemini/handler_generate.go b/internal/httpapi/gemini/handler_generate.go
index 784ff757..b9a648d4 100644
--- a/internal/httpapi/gemini/handler_generate.go
+++ b/internal/httpapi/gemini/handler_generate.go
@@ -137,7 +137,7 @@ func (h *Handler) handleGeminiDirectStream(w http.ResponseWriter, r *http.Reques
 		return
 	}
 	streamReq := start.Request
-	h.handleStreamGenerateContentWithRetry(w, r, a, start.Response, start.Payload, start.Pow, streamReq.ResponseModel, streamReq.PromptTokenText, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, historySession)
+	h.handleStreamGenerateContentWithRetry(w, r, a, start.Response, start.Payload, start.Pow, streamReq, streamReq.ResponseModel, streamReq.PromptTokenText, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, historySession)
 }
 
 func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, stream bool) bool {
diff --git a/internal/httpapi/gemini/handler_stream_runtime.go b/internal/httpapi/gemini/handler_stream_runtime.go
index a1244ad6..6a98a4e6 100644
--- a/internal/httpapi/gemini/handler_stream_runtime.go
+++ b/internal/httpapi/gemini/handler_stream_runtime.go
@@ -12,6 +12,7 @@ import (
 	"ds2api/internal/auth"
 	"ds2api/internal/completionruntime"
 	dsprotocol "ds2api/internal/deepseek/protocol"
+	"ds2api/internal/promptcompat"
 	"ds2api/internal/responsehistory"
 	"ds2api/internal/sse"
 	streamengine "ds2api/internal/stream"
@@ -87,7 +88,7 @@ type geminiStreamRuntime struct {
 	history           *responsehistory.Session
 }
 
-func (h *Handler) handleStreamGenerateContentWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *responsehistory.Session) {
+func (h *Handler) handleStreamGenerateContentWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow string, stdReq promptcompat.StandardRequest, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *responsehistory.Session) {
 	if resp.StatusCode != http.StatusOK {
 		defer func() { _ = resp.Body.Close() }()
 		body, _ := io.ReadAll(resp.Body)
@@ -108,11 +109,13 @@ func (h *Handler) handleStreamGenerateContentWithRetry(w http.ResponseWriter, r
 	runtime := newGeminiStreamRuntime(w, rc, canFlush, model, finalPrompt, thinkingEnabled, searchEnabled, stripReferenceMarkersEnabled(), toolNames, toolsRaw, historySession)
 
 	completionruntime.ExecuteStreamWithRetry(r.Context(), h.DS, a, resp, payload, pow, completionruntime.StreamRetryOptions{
-		Surface:      "gemini.generate_content",
-		Stream:       true,
-		RetryEnabled: true,
-		MaxAttempts:  3,
-		UsagePrompt:  finalPrompt,
+		Surface:          "gemini.generate_content",
+		Stream:           true,
+		RetryEnabled:     true,
+		MaxAttempts:      3,
+		UsagePrompt:      finalPrompt,
+		Request:          stdReq,
+		CurrentInputFile: h.Store,
 	}, completionruntime.StreamRetryHooks{
 		ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
 			return h.consumeGeminiStreamAttempt(r.Context(), currentResp, runtime, thinkingEnabled, allowDeferEmpty)
diff --git a/internal/httpapi/gemini/handler_test.go b/internal/httpapi/gemini/handler_test.go
index 90a1fe9a..9409b722 100644
--- a/internal/httpapi/gemini/handler_test.go
+++ b/internal/httpapi/gemini/handler_test.go
@@ -67,7 +67,11 @@ func (m *testGeminiDS) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (st
 //nolint:unused // reserved test double for native Gemini DS-call path coverage.
 func (m *testGeminiDS) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
 	m.uploadCalls = append(m.uploadCalls, req)
-	return &dsclient.UploadFileResult{ID: "file-gemini-history"}, nil
+	id := "file-gemini-history"
+	if len(m.uploadCalls) > 1 {
+		id = "file-gemini-tools"
+	}
+	return &dsclient.UploadFileResult{ID: id}, nil
 }
 
 //nolint:unused // reserved test double for native Gemini DS-call path coverage.
@@ -201,6 +205,57 @@ func TestGeminiDirectAppliesCurrentInputFile(t *testing.T) {
 	}
 }
 
+func TestGeminiCurrentInputFileUploadsToolsSeparately(t *testing.T) {
+	ds := &testGeminiDS{
+		resp: makeGeminiUpstreamResponse(`data: {"p":"response/content","v":"ok"}`),
+	}
+	h := &Handler{
+		Store: testGeminiConfig{},
+		Auth:  testGeminiAuth{},
+		DS:    ds,
+	}
+	reqBody := `{
+		"contents":[{"role":"user","parts":[{"text":"run code"}]}],
+		"tools":[{"functionDeclarations":[{"name":"eval_javascript","description":"eval","parameters":{"type":"object","properties":{"code":{"type":"string"}}}}]}]
+	}`
+	req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:generateContent", strings.NewReader(reqBody))
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 2 {
+		t.Fatalf("expected history and tools uploads, got %d", len(ds.uploadCalls))
+	}
+	if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" || ds.uploadCalls[1].Filename != "DS2API_TOOLS.txt" {
+		t.Fatalf("unexpected upload filenames: %#v", ds.uploadCalls)
+	}
+	historyText := string(ds.uploadCalls[0].Data)
+	if strings.Contains(historyText, "Description: eval") {
+		t.Fatalf("history transcript should not embed tool descriptions, got %q", historyText)
+	}
+	toolsText := string(ds.uploadCalls[1].Data)
+	if !strings.Contains(toolsText, "# DS2API_TOOLS.txt") || !strings.Contains(toolsText, "Tool: eval_javascript") || !strings.Contains(toolsText, "Description: eval") {
+		t.Fatalf("expected tools transcript to include Gemini tool schema, got %q", toolsText)
+	}
+	refIDs, _ := ds.payloads[0]["ref_file_ids"].([]any)
+	if len(refIDs) < 2 || refIDs[0] != "file-gemini-history" || refIDs[1] != "file-gemini-tools" {
+		t.Fatalf("expected history and tools ref ids first, got %#v", ds.payloads[0]["ref_file_ids"])
+	}
+	prompt, _ := ds.payloads[0]["prompt"].(string)
+	if !strings.Contains(prompt, "DS2API_TOOLS.txt") || !strings.Contains(prompt, "TOOL CALL FORMAT") {
+		t.Fatalf("expected live prompt to reference tools file and retain format instructions, got %q", prompt)
+	}
+	if strings.Contains(prompt, "Description: eval") {
+		t.Fatalf("live prompt should not inline tool descriptions, got %q", prompt)
+	}
+}
+
 func TestGeminiRoutesRegistered(t *testing.T) {
 	h := &Handler{
 		Store: testGeminiConfig{},
diff --git a/internal/httpapi/openai/chat/empty_retry_runtime.go b/internal/httpapi/openai/chat/empty_retry_runtime.go
index 1dc8ca94..3494b6de 100644
--- a/internal/httpapi/openai/chat/empty_retry_runtime.go
+++ b/internal/httpapi/openai/chat/empty_retry_runtime.go
@@ -66,7 +66,7 @@ func (h *Handler) handleNonStreamWithRetry(w http.ResponseWriter, ctx context.Co
 	writeJSON(w, http.StatusOK, respBody)
 }
 
-func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID string, sessionIDRef *string, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) {
+func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID string, sessionIDRef *string, stdReq promptcompat.StandardRequest, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) {
 	streamRuntime, initialType, ok := h.prepareChatStreamRuntime(w, resp, completionID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, historySession)
 	if !ok {
 		return
@@ -78,6 +78,8 @@ func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request,
 		RetryMaxAttempts: emptyOutputRetryMaxAttempts(),
 		MaxAttempts:      3,
 		UsagePrompt:      finalPrompt,
+		Request:          stdReq,
+		CurrentInputFile: h.Store,
 	}, completionruntime.StreamRetryHooks{
 		ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
 			return h.consumeChatStreamAttempt(r, currentResp, streamRuntime, initialType, thinkingEnabled, historySession, allowDeferEmpty)
diff --git a/internal/httpapi/openai/chat/handler.go b/internal/httpapi/openai/chat/handler.go
index da0ad4a2..d91091d2 100644
--- a/internal/httpapi/openai/chat/handler.go
+++ b/internal/httpapi/openai/chat/handler.go
@@ -33,6 +33,8 @@ type Handler struct {
 
 type streamLease struct {
 	Auth      *auth.RequestAuth
+	Standard  promptcompat.StandardRequest
+	SessionID string
 	ExpiresAt time.Time
 }
 
diff --git a/internal/httpapi/openai/chat/handler_chat.go b/internal/httpapi/openai/chat/handler_chat.go
index 9d86cf74..c46278bd 100644
--- a/internal/httpapi/openai/chat/handler_chat.go
+++ b/internal/httpapi/openai/chat/handler_chat.go
@@ -28,6 +28,10 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
 		h.handleVercelStreamPow(w, r)
 		return
 	}
+	if isVercelStreamSwitchRequest(r) {
+		h.handleVercelStreamSwitch(w, r)
+		return
+	}
 	if isVercelStreamPrepareRequest(r) {
 		h.handleVercelStreamPrepare(w, r)
 		return
@@ -114,7 +118,7 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
 	}
 	streamReq := start.Request
 	refFileTokens := streamReq.RefFileTokens
-	h.handleStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, sessionID, &sessionID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, historySession)
+	h.handleStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, sessionID, &sessionID, streamReq, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, historySession)
 }
 
 func (h *Handler) autoDeleteRemoteSession(ctx context.Context, a *auth.RequestAuth, sessionID string) {
diff --git a/internal/httpapi/openai/chat/test_helpers_test.go b/internal/httpapi/openai/chat/test_helpers_test.go
index d8284cd7..8a8baa9f 100644
--- a/internal/httpapi/openai/chat/test_helpers_test.go
+++ b/internal/httpapi/openai/chat/test_helpers_test.go
@@ -2,6 +2,7 @@ package chat
 
 import (
 	"context"
+	"fmt"
 	"io"
 	"net/http"
 	"strings"
@@ -148,8 +149,12 @@ func (m *inlineUploadDSStub) UploadFile(ctx context.Context, _ *auth.RequestAuth
 	if m.uploadErr != nil {
 		return nil, m.uploadErr
 	}
+	id := "file-inline-1"
+	if len(m.uploadCalls) > 1 {
+		id = "file-inline-" + fmt.Sprint(len(m.uploadCalls))
+	}
 	return &dsclient.UploadFileResult{
-		ID:       "file-inline-1",
+		ID:       id,
 		Filename: req.Filename,
 		Bytes:    int64(len(req.Data)),
 		Status:   "uploaded",
diff --git a/internal/httpapi/openai/chat/vercel_prepare_test.go b/internal/httpapi/openai/chat/vercel_prepare_test.go
index 38fccc2f..b8811807 100644
--- a/internal/httpapi/openai/chat/vercel_prepare_test.go
+++ b/internal/httpapi/openai/chat/vercel_prepare_test.go
@@ -1,6 +1,7 @@
 package chat
 
 import (
+	"context"
 	"encoding/json"
 	"net/http"
 	"net/http/httptest"
@@ -8,8 +9,11 @@ import (
 	"testing"
 	"time"
 
+	"ds2api/internal/account"
 	"ds2api/internal/auth"
+	"ds2api/internal/config"
 	dsclient "ds2api/internal/deepseek/client"
+	"ds2api/internal/promptcompat"
 )
 
 func TestIsVercelStreamPrepareRequest(t *testing.T) {
@@ -64,14 +68,16 @@ func TestVercelInternalSecret(t *testing.T) {
 
 func TestStreamLeaseLifecycle(t *testing.T) {
 	h := &Handler{}
-	leaseID := h.holdStreamLease(&auth.RequestAuth{UseConfigToken: false})
+	leaseID := h.holdStreamLease(&auth.RequestAuth{UseConfigToken: false}, promptcompat.StandardRequest{}, "test-session-id")
 	if leaseID == "" {
 		t.Fatalf("expected non-empty lease id")
 	}
-	if ok := h.releaseStreamLease(leaseID); !ok {
+	if lease, ok := h.releaseStreamLease(leaseID); !ok {
 		t.Fatalf("expected lease release success")
+	} else if lease.SessionID != "test-session-id" {
+		t.Fatalf("expected released session id, got %q", lease.SessionID)
 	}
-	if ok := h.releaseStreamLease(leaseID); ok {
+	if _, ok := h.releaseStreamLease(leaseID); ok {
 		t.Fatalf("expected duplicate release to fail")
 	}
 }
@@ -141,6 +147,243 @@ func TestHandleVercelStreamPrepareAppliesCurrentInputFile(t *testing.T) {
 	}
 }
 
+func TestHandleVercelStreamPrepareUsesHalfwidthDSMLToolPrompt(t *testing.T) {
+	t.Setenv("VERCEL", "1")
+	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
+
+	h := &Handler{
+		Store: mockOpenAIConfig{},
+		Auth:  streamStatusAuthStub{},
+		DS:    &inlineUploadDSStub{},
+	}
+
+	reqBody, _ := json.Marshal(map[string]any{
+		"model": "deepseek-v4-flash",
+		"messages": []any{
+			map[string]any{"role": "user", "content": "search docs"},
+		},
+		"tools": []any{
+			map[string]any{
+				"type": "function",
+				"function": map[string]any{
+					"name":        "search",
+					"description": "search docs",
+					"parameters": map[string]any{
+						"type": "object",
+						"properties": map[string]any{
+							"query": map[string]any{"type": "string"},
+						},
+						"required": []any{"query"},
+					},
+				},
+			},
+		},
+		"stream": true,
+	})
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions?__stream_prepare=1", strings.NewReader(string(reqBody)))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	req.Header.Set("X-Ds2-Internal-Token", "stream-secret")
+	rec := httptest.NewRecorder()
+
+	h.handleVercelStreamPrepare(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	var body map[string]any
+	if err := json.NewDecoder(rec.Body).Decode(&body); err != nil {
+		t.Fatalf("decode failed: %v", err)
+	}
+	finalPrompt, _ := body["final_prompt"].(string)
+	payload, _ := body["payload"].(map[string]any)
+	payloadPrompt, _ := payload["prompt"].(string)
+	for label, promptText := range map[string]string{"final_prompt": finalPrompt, "payload.prompt": payloadPrompt} {
+		if !strings.Contains(promptText, "<|DSML|tool_calls>") || !strings.Contains(promptText, "Tag punctuation alphabet: ASCII < > / = \" plus the halfwidth pipe |.") {
+			t.Fatalf("expected %s to contain halfwidth DSML tool instructions, got %q", label, promptText)
+		}
+		if strings.Contains(promptText, "\uff5c") || strings.Contains(promptText, "full"+"width vertical bar") {
+			t.Fatalf("expected %s not to contain legacy pipe guidance, got %q", label, promptText)
+		}
+	}
+	toolNames, _ := body["tool_names"].([]any)
+	if len(toolNames) != 1 || toolNames[0] != "search" {
+		t.Fatalf("expected prepared tool names to align with request tools, got %#v", body["tool_names"])
+	}
+}
+
+type vercelReleaseAutoDeleteDSStub struct {
+	resp             *http.Response
+	deleteCallCount  int
+	deletedSessionID string
+	deletedToken     string
+	deleteErr        error
+	events           *[]string
+}
+
+func (m *vercelReleaseAutoDeleteDSStub) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
+	return "session-id", nil
+}
+
+func (m *vercelReleaseAutoDeleteDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
+	return "pow", nil
+}
+
+func (m *vercelReleaseAutoDeleteDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, _ dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
+	return &dsclient.UploadFileResult{ID: "file-id", Filename: "file.txt", Bytes: 1, Status: "uploaded"}, nil
+}
+
+func (m *vercelReleaseAutoDeleteDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
+	return m.resp, nil
+}
+
+func (m *vercelReleaseAutoDeleteDSStub) DeleteSessionForToken(_ context.Context, token string, sessionID string) (*dsclient.DeleteSessionResult, error) {
+	if m.events != nil {
+		*m.events = append(*m.events, "delete")
+	}
+	m.deleteCallCount++
+	m.deletedSessionID = sessionID
+	m.deletedToken = token
+	if m.deleteErr != nil {
+		return nil, m.deleteErr
+	}
+	return &dsclient.DeleteSessionResult{SessionID: sessionID, Success: true}, nil
+}
+
+func (m *vercelReleaseAutoDeleteDSStub) DeleteAllSessionsForToken(_ context.Context, _ string) error {
+	return nil
+}
+
+type vercelReleaseAuthStub struct {
+	events *[]string
+}
+
+func (a *vercelReleaseAuthStub) Determine(_ *http.Request) (*auth.RequestAuth, error) {
+	return &auth.RequestAuth{DeepSeekToken: "test-token", AccountID: "test-account"}, nil
+}
+
+func (a *vercelReleaseAuthStub) DetermineCaller(_ *http.Request) (*auth.RequestAuth, error) {
+	return &auth.RequestAuth{DeepSeekToken: "test-token", AccountID: "test-account"}, nil
+}
+
+func (a *vercelReleaseAuthStub) Release(_ *auth.RequestAuth) {
+	if a.events != nil {
+		*a.events = append(*a.events, "release")
+	}
+}
+
+func TestHandleVercelStreamReleaseTriggersAutoDelete(t *testing.T) {
+	t.Setenv("VERCEL", "1")
+	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
+
+	events := []string{}
+	ds := &vercelReleaseAutoDeleteDSStub{events: &events}
+	h := &Handler{
+		Store: mockOpenAIConfig{
+			autoDeleteMode: "single",
+		},
+		Auth: &vercelReleaseAuthStub{events: &events},
+		DS:   ds,
+	}
+
+	leaseID := h.holdStreamLease(&auth.RequestAuth{DeepSeekToken: "test-token", AccountID: "test-account"}, promptcompat.StandardRequest{}, "session-to-delete")
+	if leaseID == "" {
+		t.Fatalf("expected non-empty lease id")
+	}
+
+	reqBody := map[string]any{"lease_id": leaseID}
+	reqJSON, _ := json.Marshal(reqBody)
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions?__stream_release=1", strings.NewReader(string(reqJSON)))
+	req.Header.Set("X-Ds2-Internal-Token", "stream-secret")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	h.handleVercelStreamRelease(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if ds.deleteCallCount != 1 {
+		t.Fatalf("expected auto delete call count=1, got %d", ds.deleteCallCount)
+	}
+	if ds.deletedSessionID != "session-to-delete" {
+		t.Fatalf("expected deleted session id=session-to-delete, got %q", ds.deletedSessionID)
+	}
+	if got, want := strings.Join(events, ","), "delete,release"; got != want {
+		t.Fatalf("expected auto-delete before auth release, got %s", got)
+	}
+}
+
+func TestHandleVercelStreamPrepareUploadsToolsSeparately(t *testing.T) {
+	t.Setenv("VERCEL", "1")
+	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
+
+	ds := &inlineUploadDSStub{}
+	h := &Handler{
+		Store: mockOpenAIConfig{currentInputEnabled: true},
+		Auth:  streamStatusAuthStub{},
+		DS:    ds,
+	}
+
+	reqBody, _ := json.Marshal(map[string]any{
+		"model": "deepseek-v4-flash",
+		"messages": []any{
+			map[string]any{"role": "user", "content": "search docs"},
+		},
+		"tools": []any{
+			map[string]any{
+				"type": "function",
+				"function": map[string]any{
+					"name":        "search",
+					"description": "search docs",
+					"parameters":  map[string]any{"type": "object"},
+				},
+			},
+		},
+		"stream": true,
+	})
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions?__stream_prepare=1", strings.NewReader(string(reqBody)))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	req.Header.Set("X-Ds2-Internal-Token", "stream-secret")
+	rec := httptest.NewRecorder()
+
+	h.handleVercelStreamPrepare(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 2 {
+		t.Fatalf("expected history and tools uploads, got %d", len(ds.uploadCalls))
+	}
+	if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" || ds.uploadCalls[1].Filename != "DS2API_TOOLS.txt" {
+		t.Fatalf("unexpected upload filenames: %#v", ds.uploadCalls)
+	}
+	if strings.Contains(string(ds.uploadCalls[0].Data), "Description: search docs") {
+		t.Fatalf("history transcript should not embed tool descriptions, got %q", string(ds.uploadCalls[0].Data))
+	}
+
+	var body map[string]any
+	if err := json.NewDecoder(rec.Body).Decode(&body); err != nil {
+		t.Fatalf("decode failed: %v", err)
+	}
+	finalPrompt, _ := body["final_prompt"].(string)
+	payload, _ := body["payload"].(map[string]any)
+	payloadPrompt, _ := payload["prompt"].(string)
+	for label, promptText := range map[string]string{"final_prompt": finalPrompt, "payload.prompt": payloadPrompt} {
+		if !strings.Contains(promptText, "DS2API_TOOLS.txt") || !strings.Contains(promptText, "TOOL CALL FORMAT") {
+			t.Fatalf("expected %s to reference tools file and retain tool instructions, got %q", label, promptText)
+		}
+		if strings.Contains(promptText, "Description: search docs") {
+			t.Fatalf("expected %s not to inline tool descriptions, got %q", label, promptText)
+		}
+	}
+	refIDs, _ := payload["ref_file_ids"].([]any)
+	if len(refIDs) < 2 || refIDs[0] != "file-inline-1" || refIDs[1] != "file-inline-2" {
+		t.Fatalf("expected history and tools ref ids first, got %#v", payload["ref_file_ids"])
+	}
+}
+
 func TestHandleVercelStreamPrepareMapsCurrentInputFileManagedAuthFailureTo401(t *testing.T) {
 	t.Setenv("VERCEL", "1")
 	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
@@ -176,3 +419,88 @@ func TestHandleVercelStreamPrepareMapsCurrentInputFileManagedAuthFailureTo401(t
 		t.Fatalf("expected managed auth error message, got %s", rec.Body.String())
 	}
 }
+
+func TestHandleVercelStreamSwitchReuploadsCurrentInputFile(t *testing.T) {
+	t.Setenv("VERCEL", "1")
+	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
+	t.Setenv("DS2API_CONFIG_JSON", `{
+		"keys":["managed-key"],
+		"accounts":[
+			{"email":"acc1@test.com","password":"pwd"},
+			{"email":"acc2@test.com","password":"pwd"}
+		]
+	}`)
+	store := config.LoadStore()
+	resolver := auth.NewResolver(store, account.NewPool(store), func(_ context.Context, acc config.Account) (string, error) {
+		return "token-" + acc.Identifier(), nil
+	})
+	authReq := httptest.NewRequest(http.MethodPost, "/", nil)
+	authReq.Header.Set("Authorization", "Bearer managed-key")
+	a, err := resolver.Determine(authReq)
+	if err != nil {
+		t.Fatalf("determine failed: %v", err)
+	}
+	defer resolver.Release(a)
+
+	ds := &inlineUploadDSStub{}
+	h := &Handler{
+		Store: mockOpenAIConfig{currentInputEnabled: true},
+		Auth:  resolver,
+		DS:    ds,
+	}
+	stdReq := promptcompat.StandardRequest{
+		RequestedModel:          "deepseek-v4-flash",
+		ResolvedModel:           "deepseek-v4-flash",
+		ResponseModel:           "deepseek-v4-flash",
+		FinalPrompt:             "Continue from the latest state in the attached DS2API_HISTORY.txt context. Available tool descriptions and parameter schemas are attached in DS2API_TOOLS.txt; use only those tools and follow the tool-call format rules in this prompt.",
+		PromptTokenText:         "# DS2API_HISTORY.txt\n\n=== 1. USER ===\nhello\n\n# DS2API_TOOLS.txt\nAvailable tool descriptions and parameter schemas for this request.\n\nYou have access to these tools:\n\nTool: search\nDescription: search docs\nParameters: {\"type\":\"object\"}\n",
+		HistoryText:             "# DS2API_HISTORY.txt\n\n=== 1. USER ===\nhello\n",
+		CurrentInputFileApplied: true,
+		CurrentInputFileID:      "file-old",
+		CurrentToolsFileID:      "file-old-tools",
+		ToolsRaw: []any{
+			map[string]any{
+				"type": "function",
+				"function": map[string]any{
+					"name":        "search",
+					"description": "search docs",
+					"parameters":  map[string]any{"type": "object"},
+				},
+			},
+		},
+		RefFileIDs: []string{"file-old", "file-old-tools", "client-file"},
+		Thinking:   true,
+	}
+	leaseID := h.holdStreamLease(a, stdReq, "")
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions?__stream_switch=1", strings.NewReader(`{"lease_id":"`+leaseID+`"}`))
+	req.Header.Set("X-Ds2-Internal-Token", "stream-secret")
+	rec := httptest.NewRecorder()
+
+	h.handleVercelStreamSwitch(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 2 {
+		t.Fatalf("expected current input and tools reupload on switched account, got %d", len(ds.uploadCalls))
+	}
+	if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" || ds.uploadCalls[1].Filename != "DS2API_TOOLS.txt" {
+		t.Fatalf("unexpected reupload filenames: %#v", ds.uploadCalls)
+	}
+	var body map[string]any
+	if err := json.NewDecoder(rec.Body).Decode(&body); err != nil {
+		t.Fatalf("decode failed: %v", err)
+	}
+	if body["deepseek_token"] != "token-acc2@test.com" {
+		t.Fatalf("expected switched account token, got %#v", body["deepseek_token"])
+	}
+	payload, _ := body["payload"].(map[string]any)
+	refIDs, _ := payload["ref_file_ids"].([]any)
+	if len(refIDs) != 3 || refIDs[0] != "file-inline-1" || refIDs[1] != "file-inline-2" || refIDs[2] != "client-file" {
+		t.Fatalf("expected reuploaded history and tools refs plus client ref, got %#v", payload["ref_file_ids"])
+	}
+	promptText, _ := payload["prompt"].(string)
+	if !strings.Contains(promptText, "DS2API_TOOLS.txt") {
+		t.Fatalf("expected switched payload prompt to retain tools file reference, got %q", promptText)
+	}
+}
diff --git a/internal/httpapi/openai/chat/vercel_stream.go b/internal/httpapi/openai/chat/vercel_stream.go
index b52cd9c6..77b216a3 100644
--- a/internal/httpapi/openai/chat/vercel_stream.go
+++ b/internal/httpapi/openai/chat/vercel_stream.go
@@ -11,6 +11,7 @@ import (
 
 	"ds2api/internal/auth"
 	"ds2api/internal/config"
+	"ds2api/internal/httpapi/openai/history"
 	"ds2api/internal/promptcompat"
 	"ds2api/internal/util"
 
@@ -96,7 +97,7 @@ func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Reque
 	}
 
 	payload := stdReq.CompletionPayload(sessionID)
-	leaseID := h.holdStreamLease(a)
+	leaseID := h.holdStreamLease(a, stdReq, sessionID)
 	if leaseID == "" {
 		writeOpenAIError(w, http.StatusInternalServerError, "failed to create stream lease")
 		return
@@ -140,10 +141,17 @@ func (h *Handler) handleVercelStreamRelease(w http.ResponseWriter, r *http.Reque
 		writeOpenAIError(w, http.StatusBadRequest, "lease_id is required")
 		return
 	}
-	if !h.releaseStreamLease(leaseID) {
+	lease, ok := h.releaseStreamLease(leaseID)
+	if !ok {
 		writeOpenAIError(w, http.StatusNotFound, "stream lease not found")
 		return
 	}
+	if h.Auth != nil && lease.Auth != nil {
+		defer h.Auth.Release(lease.Auth)
+	}
+	if lease.Auth != nil {
+		h.autoDeleteRemoteSession(r.Context(), lease.Auth, lease.SessionID)
+	}
 	writeJSON(w, http.StatusOK, map[string]any{"success": true})
 }
 
@@ -185,6 +193,80 @@ func (h *Handler) handleVercelStreamPow(w http.ResponseWriter, r *http.Request)
 	})
 }
 
+func (h *Handler) handleVercelStreamSwitch(w http.ResponseWriter, r *http.Request) {
+	if !config.IsVercel() {
+		http.NotFound(w, r)
+		return
+	}
+	h.sweepExpiredStreamLeases()
+	internalSecret := vercelInternalSecret()
+	internalToken := strings.TrimSpace(r.Header.Get("X-Ds2-Internal-Token"))
+	if internalSecret == "" || subtle.ConstantTimeCompare([]byte(internalToken), []byte(internalSecret)) != 1 {
+		writeOpenAIError(w, http.StatusUnauthorized, "unauthorized internal request")
+		return
+	}
+
+	var req map[string]any
+	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
+		return
+	}
+	leaseID, _ := req["lease_id"].(string)
+	leaseID = strings.TrimSpace(leaseID)
+	if leaseID == "" {
+		writeOpenAIError(w, http.StatusBadRequest, "lease_id is required")
+		return
+	}
+	lease, ok := h.lookupStreamLease(leaseID)
+	if !ok || lease.Auth == nil {
+		writeOpenAIError(w, http.StatusNotFound, "stream lease not found or expired")
+		return
+	}
+	a := lease.Auth
+	if !a.UseConfigToken || !a.SwitchAccount(r.Context()) {
+		writeOpenAIErrorWithCode(w, http.StatusTooManyRequests, "Upstream account hit a rate limit and returned reasoning without visible output.", "upstream_empty_output")
+		return
+	}
+
+	stdReq := lease.Standard
+	var err error
+	if stdReq.CurrentInputFileApplied {
+		stdReq, err = (history.Service{Store: h.Store, DS: h.DS}).ReuploadAppliedCurrentInputFile(r.Context(), a, stdReq)
+		if err != nil {
+			status, message := mapCurrentInputFileError(err)
+			writeOpenAIError(w, status, message)
+			return
+		}
+	}
+	sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
+	if err != nil {
+		writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
+		return
+	}
+	powHeader, err := h.DS.GetPow(r.Context(), a, 3)
+	if err != nil {
+		writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
+		return
+	}
+	if strings.TrimSpace(a.DeepSeekToken) == "" {
+		writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
+		return
+	}
+	h.updateStreamLeaseState(leaseID, stdReq, sessionID)
+	writeJSON(w, http.StatusOK, map[string]any{
+		"session_id":       sessionID,
+		"lease_id":         leaseID,
+		"model":            stdReq.ResponseModel,
+		"final_prompt":     stdReq.FinalPrompt,
+		"thinking_enabled": stdReq.Thinking,
+		"search_enabled":   stdReq.Search,
+		"tool_names":       stdReq.ToolNames,
+		"deepseek_token":   a.DeepSeekToken,
+		"pow_header":       powHeader,
+		"payload":          stdReq.CompletionPayload(sessionID),
+	})
+}
+
 func isVercelStreamPrepareRequest(r *http.Request) bool {
 	if r == nil {
 		return false
@@ -206,6 +288,13 @@ func isVercelStreamPowRequest(r *http.Request) bool {
 	return strings.TrimSpace(r.URL.Query().Get("__stream_pow")) == "1"
 }
 
+func isVercelStreamSwitchRequest(r *http.Request) bool {
+	if r == nil {
+		return false
+	}
+	return strings.TrimSpace(r.URL.Query().Get("__stream_switch")) == "1"
+}
+
 func vercelInternalSecret() string {
 	if v := strings.TrimSpace(os.Getenv("DS2API_VERCEL_INTERNAL_SECRET")); v != "" {
 		return v
@@ -216,7 +305,7 @@ func vercelInternalSecret() string {
 	return "admin"
 }
 
-func (h *Handler) holdStreamLease(a *auth.RequestAuth) string {
+func (h *Handler) holdStreamLease(a *auth.RequestAuth, stdReq promptcompat.StandardRequest, sessionID string) string {
 	if a == nil {
 		return ""
 	}
@@ -234,6 +323,8 @@ func (h *Handler) holdStreamLease(a *auth.RequestAuth) string {
 	leaseID := newLeaseID()
 	h.streamLeases[leaseID] = streamLease{
 		Auth:      a,
+		Standard:  stdReq,
+		SessionID: sessionID,
 		ExpiresAt: now.Add(ttl),
 	}
 	h.leaseMu.Unlock()
@@ -241,24 +332,48 @@ func (h *Handler) holdStreamLease(a *auth.RequestAuth) string {
 	return leaseID
 }
 
-func (h *Handler) lookupStreamLeaseAuth(leaseID string) *auth.RequestAuth {
+func (h *Handler) lookupStreamLease(leaseID string) (streamLease, bool) {
 	leaseID = strings.TrimSpace(leaseID)
 	if leaseID == "" {
-		return nil
+		return streamLease{}, false
 	}
 	h.leaseMu.Lock()
 	lease, ok := h.streamLeases[leaseID]
 	h.leaseMu.Unlock()
 	if !ok || time.Now().After(lease.ExpiresAt) {
+		return streamLease{}, false
+	}
+	return lease, true
+}
+
+func (h *Handler) lookupStreamLeaseAuth(leaseID string) *auth.RequestAuth {
+	lease, ok := h.lookupStreamLease(leaseID)
+	if !ok {
 		return nil
 	}
 	return lease.Auth
 }
 
-func (h *Handler) releaseStreamLease(leaseID string) bool {
+func (h *Handler) updateStreamLeaseState(leaseID string, stdReq promptcompat.StandardRequest, sessionID string) {
 	leaseID = strings.TrimSpace(leaseID)
 	if leaseID == "" {
-		return false
+		return
+	}
+	h.leaseMu.Lock()
+	defer h.leaseMu.Unlock()
+	lease, ok := h.streamLeases[leaseID]
+	if !ok {
+		return
+	}
+	lease.Standard = stdReq
+	lease.SessionID = sessionID
+	h.streamLeases[leaseID] = lease
+}
+
+func (h *Handler) releaseStreamLease(leaseID string) (streamLease, bool) {
+	leaseID = strings.TrimSpace(leaseID)
+	if leaseID == "" {
+		return streamLease{}, false
 	}
 
 	h.leaseMu.Lock()
@@ -271,12 +386,9 @@ func (h *Handler) releaseStreamLease(leaseID string) bool {
 	h.releaseExpiredAuths(expired)
 
 	if !ok {
-		return false
-	}
-	if h.Auth != nil {
-		h.Auth.Release(lease.Auth)
+		return streamLease{}, false
 	}
-	return true
+	return lease, true
 }
 
 func (h *Handler) popExpiredLeasesLocked(now time.Time) []*auth.RequestAuth {
diff --git a/internal/httpapi/openai/deps_injection_test.go b/internal/httpapi/openai/deps_injection_test.go
index 3082dab1..b3bdc1da 100644
--- a/internal/httpapi/openai/deps_injection_test.go
+++ b/internal/httpapi/openai/deps_injection_test.go
@@ -103,7 +103,7 @@ func TestNormalizeOpenAIResponsesRequestAlwaysAcceptsWideInput(t *testing.T) {
 	if out.Surface != "openai_responses" {
 		t.Fatalf("unexpected surface: %q", out.Surface)
 	}
-	if !strings.Contains(out.FinalPrompt, "<｜User｜>hi") {
+	if !strings.Contains(out.FinalPrompt, "<|User|>hi") {
 		t.Fatalf("unexpected final prompt: %q", out.FinalPrompt)
 	}
 }
diff --git a/internal/httpapi/openai/file_inline_upload_test.go b/internal/httpapi/openai/file_inline_upload_test.go
index abaf704c..88978e28 100644
--- a/internal/httpapi/openai/file_inline_upload_test.go
+++ b/internal/httpapi/openai/file_inline_upload_test.go
@@ -4,6 +4,7 @@ import (
 	"context"
 	"encoding/json"
 	"errors"
+	"fmt"
 	"net/http"
 	"net/http/httptest"
 	"strings"
@@ -41,8 +42,12 @@ func (m *inlineUploadDSStub) UploadFile(ctx context.Context, _ *auth.RequestAuth
 	if m.uploadErr != nil {
 		return nil, m.uploadErr
 	}
+	id := "file-inline-1"
+	if len(m.uploadCalls) > 1 {
+		id = "file-inline-" + fmt.Sprint(len(m.uploadCalls))
+	}
 	return &dsclient.UploadFileResult{
-		ID:       "file-inline-1",
+		ID:       id,
 		Filename: req.Filename,
 		Bytes:    int64(len(req.Data)),
 		Status:   "uploaded",
diff --git a/internal/httpapi/openai/history/current_input_file.go b/internal/httpapi/openai/history/current_input_file.go
index 9f5f8eeb..032927d1 100644
--- a/internal/httpapi/openai/history/current_input_file.go
+++ b/internal/httpapi/openai/history/current_input_file.go
@@ -15,6 +15,7 @@ import (
 
 const (
 	currentInputFilename    = promptcompat.CurrentInputContextFilename
+	currentToolsFilename    = promptcompat.CurrentToolsContextFilename
 	currentInputContentType = "text/plain; charset=utf-8"
 	currentInputPurpose     = "assistants"
 )
@@ -50,6 +51,7 @@ func (s Service) ApplyCurrentInputFile(ctx context.Context, a *auth.RequestAuth,
 	if strings.TrimSpace(fileText) == "" {
 		return stdReq, errors.New("current user input file produced empty transcript")
 	}
+	toolsText, _ := promptcompat.BuildOpenAIToolsContextTranscript(stdReq.ToolsRaw, stdReq.ToolChoice)
 	modelType := "default"
 	if resolvedType, ok := config.GetModelType(stdReq.ResolvedModel); ok {
 		modelType = resolvedType
@@ -69,21 +71,98 @@ func (s Service) ApplyCurrentInputFile(ctx context.Context, a *auth.RequestAuth,
 		return stdReq, errors.New("upload current user input file returned empty file id")
 	}
 
+	toolFileID := ""
+	if strings.TrimSpace(toolsText) != "" {
+		result, err := s.DS.UploadFile(ctx, a, dsclient.UploadFileRequest{
+			Filename:    currentToolsFilename,
+			ContentType: currentInputContentType,
+			Purpose:     currentInputPurpose,
+			ModelType:   modelType,
+			Data:        []byte(toolsText),
+		}, 3)
+		if err != nil {
+			return stdReq, fmt.Errorf("upload current tools file: %w", err)
+		}
+		toolFileID = strings.TrimSpace(result.ID)
+		if toolFileID == "" {
+			return stdReq, errors.New("upload current tools file returned empty file id")
+		}
+	}
+
 	messages := []any{
 		map[string]any{
 			"role":    "user",
-			"content": currentInputFilePrompt(),
+			"content": currentInputFilePrompt(toolFileID != ""),
 		},
 	}
 
 	stdReq.Messages = messages
 	stdReq.HistoryText = fileText
 	stdReq.CurrentInputFileApplied = true
-	stdReq.RefFileIDs = prependUniqueRefFileID(stdReq.RefFileIDs, fileID)
-	stdReq.FinalPrompt, stdReq.ToolNames = promptcompat.BuildOpenAIPrompt(messages, stdReq.ToolsRaw, "", stdReq.ToolChoice, stdReq.Thinking)
+	stdReq.CurrentInputFileID = fileID
+	stdReq.CurrentToolsFileID = toolFileID
+	stdReq.RefFileIDs = prependUniqueRefFileIDs(stdReq.RefFileIDs, fileID, toolFileID)
+	stdReq.FinalPrompt, stdReq.ToolNames = promptcompat.BuildOpenAIPromptWithToolInstructionsOnly(messages, stdReq.ToolsRaw, "", stdReq.ToolChoice, stdReq.Thinking)
 	// Token accounting must reflect the actual downstream context:
-	// the uploaded DS2API_HISTORY.txt file content + the continuation live prompt.
-	stdReq.PromptTokenText = fileText + "\n" + stdReq.FinalPrompt
+	// uploaded context files + the continuation live prompt.
+	tokenParts := []string{fileText}
+	if strings.TrimSpace(toolsText) != "" {
+		tokenParts = append(tokenParts, toolsText)
+	}
+	tokenParts = append(tokenParts, stdReq.FinalPrompt)
+	stdReq.PromptTokenText = strings.Join(tokenParts, "\n")
+	return stdReq, nil
+}
+
+func (s Service) ReuploadAppliedCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
+	if !stdReq.CurrentInputFileApplied || s.DS == nil || a == nil {
+		return stdReq, nil
+	}
+	fileText := strings.TrimSpace(stdReq.HistoryText)
+	if fileText == "" {
+		return stdReq, nil
+	}
+	modelType := "default"
+	if resolvedType, ok := config.GetModelType(stdReq.ResolvedModel); ok {
+		modelType = resolvedType
+	}
+	result, err := s.DS.UploadFile(ctx, a, dsclient.UploadFileRequest{
+		Filename:    currentInputFilename,
+		ContentType: currentInputContentType,
+		Purpose:     currentInputPurpose,
+		ModelType:   modelType,
+		Data:        []byte(stdReq.HistoryText),
+	}, 3)
+	if err != nil {
+		return stdReq, fmt.Errorf("upload current user input file: %w", err)
+	}
+	fileID := strings.TrimSpace(result.ID)
+	if fileID == "" {
+		return stdReq, errors.New("upload current user input file returned empty file id")
+	}
+
+	toolsText, _ := promptcompat.BuildOpenAIToolsContextTranscript(stdReq.ToolsRaw, stdReq.ToolChoice)
+	toolFileID := ""
+	if strings.TrimSpace(toolsText) != "" {
+		result, err := s.DS.UploadFile(ctx, a, dsclient.UploadFileRequest{
+			Filename:    currentToolsFilename,
+			ContentType: currentInputContentType,
+			Purpose:     currentInputPurpose,
+			ModelType:   modelType,
+			Data:        []byte(toolsText),
+		}, 3)
+		if err != nil {
+			return stdReq, fmt.Errorf("upload current tools file: %w", err)
+		}
+		toolFileID = strings.TrimSpace(result.ID)
+		if toolFileID == "" {
+			return stdReq, errors.New("upload current tools file returned empty file id")
+		}
+	}
+
+	stdReq.RefFileIDs = replaceGeneratedCurrentInputRefs(stdReq.RefFileIDs, stdReq.CurrentInputFileID, stdReq.CurrentToolsFileID, fileID, toolFileID)
+	stdReq.CurrentInputFileID = fileID
+	stdReq.CurrentToolsFileID = toolFileID
 	return stdReq, nil
 }
 
@@ -106,23 +185,62 @@ func latestUserInputForFile(messages []any) (int, string) {
 	return -1, ""
 }
 
-func currentInputFilePrompt() string {
-	return "Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly."
+func currentInputFilePrompt(hasToolsFile bool) string {
+	prompt := "Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly."
+	if hasToolsFile {
+		prompt += " Available tool descriptions and parameter schemas are attached in DS2API_TOOLS.txt; use only those tools and follow the tool-call format rules in this prompt."
+	}
+	return prompt
 }
 
-func prependUniqueRefFileID(existing []string, fileID string) []string {
-	fileID = strings.TrimSpace(fileID)
-	if fileID == "" {
-		return existing
+func prependUniqueRefFileIDs(existing []string, fileIDs ...string) []string {
+	out := make([]string, 0, len(existing)+len(fileIDs))
+	seen := map[string]struct{}{}
+	for _, fileID := range fileIDs {
+		trimmed := strings.TrimSpace(fileID)
+		if trimmed == "" {
+			continue
+		}
+		key := strings.ToLower(trimmed)
+		if _, ok := seen[key]; ok {
+			continue
+		}
+		out = append(out, trimmed)
+		seen[key] = struct{}{}
 	}
-	out := make([]string, 0, len(existing)+1)
-	out = append(out, fileID)
 	for _, id := range existing {
 		trimmed := strings.TrimSpace(id)
-		if trimmed == "" || strings.EqualFold(trimmed, fileID) {
+		if trimmed == "" {
+			continue
+		}
+		key := strings.ToLower(trimmed)
+		if _, ok := seen[key]; ok {
 			continue
 		}
 		out = append(out, trimmed)
+		seen[key] = struct{}{}
 	}
 	return out
 }
+
+func replaceGeneratedCurrentInputRefs(existing []string, oldHistoryID, oldToolsID, newHistoryID, newToolsID string) []string {
+	filtered := make([]string, 0, len(existing))
+	old := map[string]struct{}{}
+	for _, id := range []string{oldHistoryID, oldToolsID} {
+		trimmed := strings.ToLower(strings.TrimSpace(id))
+		if trimmed != "" {
+			old[trimmed] = struct{}{}
+		}
+	}
+	for _, id := range existing {
+		trimmed := strings.TrimSpace(id)
+		if trimmed == "" {
+			continue
+		}
+		if _, ok := old[strings.ToLower(trimmed)]; ok {
+			continue
+		}
+		filtered = append(filtered, trimmed)
+	}
+	return prependUniqueRefFileIDs(filtered, newHistoryID, newToolsID)
+}
diff --git a/internal/httpapi/openai/history_split_test.go b/internal/httpapi/openai/history_split_test.go
index 97100f41..14b86588 100644
--- a/internal/httpapi/openai/history_split_test.go
+++ b/internal/httpapi/openai/history_split_test.go
@@ -84,7 +84,7 @@ func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *
 		"latest user turn",
 		"[reasoning_content]",
 		"hidden reasoning",
-		"<｜DSML｜tool_calls>",
+		"<|DSML|tool_calls>",
 	} {
 		if !strings.Contains(transcript, want) {
 			t.Fatalf("expected transcript to contain %q, got %q", want, transcript)
@@ -380,6 +380,79 @@ func TestApplyCurrentInputFileUploadsFullContextFile(t *testing.T) {
 	}
 }
 
+func TestApplyCurrentInputFileUploadsToolsContextSeparately(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &openAITestSurface{
+		Store: mockOpenAIConfig{
+			currentInputEnabled: true,
+			currentInputMin:     0,
+		},
+		DS: ds,
+	}
+	req := map[string]any{
+		"model":    "deepseek-v4-flash",
+		"messages": historySplitTestMessages(),
+		"tools": []any{
+			map[string]any{
+				"type": "function",
+				"function": map[string]any{
+					"name":        "search",
+					"description": "search docs",
+					"parameters": map[string]any{
+						"type": "object",
+					},
+				},
+			},
+		},
+	}
+	stdReq, err := promptcompat.NormalizeOpenAIChatRequest(h.Store, req, "")
+	if err != nil {
+		t.Fatalf("normalize failed: %v", err)
+	}
+
+	out, err := h.applyCurrentInputFile(context.Background(), &auth.RequestAuth{DeepSeekToken: "token"}, stdReq)
+	if err != nil {
+		t.Fatalf("apply current input file failed: %v", err)
+	}
+	if len(ds.uploadCalls) != 2 {
+		t.Fatalf("expected history and tools uploads, got %d", len(ds.uploadCalls))
+	}
+	if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" {
+		t.Fatalf("expected first upload to be DS2API_HISTORY.txt, got %q", ds.uploadCalls[0].Filename)
+	}
+	if ds.uploadCalls[1].Filename != "DS2API_TOOLS.txt" {
+		t.Fatalf("expected second upload to be DS2API_TOOLS.txt, got %q", ds.uploadCalls[1].Filename)
+	}
+	historyText := string(ds.uploadCalls[0].Data)
+	if strings.Contains(historyText, "You have access to these tools") || strings.Contains(historyText, "Description: search docs") {
+		t.Fatalf("history transcript should not embed tool descriptions, got %q", historyText)
+	}
+	toolsText := string(ds.uploadCalls[1].Data)
+	for _, want := range []string{"# DS2API_TOOLS.txt", "Tool: search", "Description: search docs", `Parameters: {"type":"object"}`} {
+		if !strings.Contains(toolsText, want) {
+			t.Fatalf("expected tools transcript to contain %q, got %q", want, toolsText)
+		}
+	}
+	if strings.Contains(toolsText, "TOOL CALL FORMAT") {
+		t.Fatalf("tools transcript should not duplicate tool format instructions, got %q", toolsText)
+	}
+	if !strings.Contains(out.FinalPrompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") || !strings.Contains(out.FinalPrompt, "DS2API_TOOLS.txt") {
+		t.Fatalf("expected live prompt to reference both context files, got %q", out.FinalPrompt)
+	}
+	if !strings.Contains(out.FinalPrompt, "TOOL CALL FORMAT") || !strings.Contains(out.FinalPrompt, "Remember: The ONLY valid way to use tools") {
+		t.Fatalf("expected live prompt to retain tool format instructions, got %q", out.FinalPrompt)
+	}
+	if strings.Contains(out.FinalPrompt, "You have access to these tools") || strings.Contains(out.FinalPrompt, "Description: search docs") || strings.Contains(out.FinalPrompt, "Parameters:") {
+		t.Fatalf("expected live prompt to omit tool descriptions after tools upload, got %q", out.FinalPrompt)
+	}
+	if len(out.RefFileIDs) < 2 || out.RefFileIDs[0] != "file-inline-1" || out.RefFileIDs[1] != "file-inline-2" {
+		t.Fatalf("expected history and tools file ids first, got %#v", out.RefFileIDs)
+	}
+	if !strings.Contains(out.PromptTokenText, "# DS2API_HISTORY.txt") || !strings.Contains(out.PromptTokenText, "# DS2API_TOOLS.txt") || !strings.Contains(out.PromptTokenText, "Description: search docs") {
+		t.Fatalf("expected prompt token text to include uploaded history and tools content, got %q", out.PromptTokenText)
+	}
+}
+
 func TestApplyCurrentInputFileCarriesHistoryText(t *testing.T) {
 	ds := &inlineUploadDSStub{}
 	h := &openAITestSurface{
@@ -537,6 +610,69 @@ func TestResponsesCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *testing
 	}
 }
 
+func TestResponsesCurrentInputFileUploadsToolsSeparately(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &openAITestSurface{
+		Store: mockOpenAIConfig{
+			currentInputEnabled: true,
+		},
+		Auth: streamStatusAuthStub{},
+		DS:   ds,
+	}
+	r := chi.NewRouter()
+	registerOpenAITestRoutes(r, h)
+	reqBody, _ := json.Marshal(map[string]any{
+		"model":    "deepseek-v4-flash",
+		"messages": historySplitTestMessages(),
+		"tools": []any{
+			map[string]any{
+				"type": "function",
+				"function": map[string]any{
+					"name":        "search",
+					"description": "search docs",
+					"parameters":  map[string]any{"type": "object"},
+				},
+			},
+		},
+		"stream": false,
+	})
+	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(string(reqBody)))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 2 {
+		t.Fatalf("expected history and tools uploads, got %d", len(ds.uploadCalls))
+	}
+	if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" || ds.uploadCalls[1].Filename != "DS2API_TOOLS.txt" {
+		t.Fatalf("unexpected upload filenames: %#v", ds.uploadCalls)
+	}
+	historyText := string(ds.uploadCalls[0].Data)
+	if strings.Contains(historyText, "Description: search docs") {
+		t.Fatalf("history transcript should not embed tool descriptions, got %q", historyText)
+	}
+	toolsText := string(ds.uploadCalls[1].Data)
+	if !strings.Contains(toolsText, "# DS2API_TOOLS.txt") || !strings.Contains(toolsText, "Tool: search") || !strings.Contains(toolsText, "Description: search docs") {
+		t.Fatalf("expected tools transcript to include schema, got %q", toolsText)
+	}
+	promptText, _ := ds.completionReq["prompt"].(string)
+	if !strings.Contains(promptText, "DS2API_TOOLS.txt") || !strings.Contains(promptText, "TOOL CALL FORMAT") {
+		t.Fatalf("expected live prompt to reference tools file and retain format instructions, got %q", promptText)
+	}
+	if strings.Contains(promptText, "Description: search docs") {
+		t.Fatalf("live prompt should not inline tool descriptions, got %q", promptText)
+	}
+	refIDs, _ := ds.completionReq["ref_file_ids"].([]any)
+	if len(refIDs) < 2 || refIDs[0] != "file-inline-1" || refIDs[1] != "file-inline-2" {
+		t.Fatalf("expected history and tools ref ids first, got %#v", ds.completionReq["ref_file_ids"])
+	}
+}
+
 func TestChatCompletionsCurrentInputFileMapsManagedAuthFailureTo401(t *testing.T) {
 	ds := &inlineUploadDSStub{
 		uploadErr: &dsclient.RequestFailure{Op: "upload file", Kind: dsclient.FailureManagedUnauthorized, Message: "expired token"},
diff --git a/internal/httpapi/openai/leaked_output_sanitize_test.go b/internal/httpapi/openai/leaked_output_sanitize_test.go
index acaf7208..939f73fb 100644
--- a/internal/httpapi/openai/leaked_output_sanitize_test.go
+++ b/internal/httpapi/openai/leaked_output_sanitize_test.go
@@ -19,21 +19,47 @@ func TestSanitizeLeakedOutputRemovesLeakedWireToolCallAndResult(t *testing.T) {
 }
 
 func TestSanitizeLeakedOutputRemovesStandaloneMetaMarkers(t *testing.T) {
-	raw := "A<| end_of_sentence |><| Assistant |>B<| end_of_thinking |>C<｜end▁of▁thinking｜>D<｜end▁of▁sentence｜>E<| end_of_toolresults |>F<｜end▁of▁instructions｜>G"
+	raw := "A<| end_of_sentence |><| Assistant |>B<| end_of_thinking |>C<|end▁of▁thinking|>D<|end▁of▁sentence|>E<| end_of_toolresults |>F<|end▁of▁instructions|>G"
 	got := sanitizeLeakedOutput(raw)
 	if got != "ABCDEFG" {
 		t.Fatalf("unexpected sanitize result for meta markers: %q", got)
 	}
 }
 
+func TestSanitizeLeakedOutputRemovesFullwidthDelimitedMetaMarkers(t *testing.T) {
+	fw := "\uff5c"
+	raw := "A<" + fw + "end▁of▁sentence" + fw + ">B<" + fw + " Assistant " + fw + ">C<" + fw + "end_of_toolresults" + fw + ">D"
+	got := sanitizeLeakedOutput(raw)
+	if got != "ABCD" {
+		t.Fatalf("unexpected sanitize result for fullwidth-delimited meta markers: %q", got)
+	}
+}
+
 func TestSanitizeLeakedOutputRemovesThinkAndBosMarkers(t *testing.T) {
-	raw := "A<think>B</think>C<｜begin▁of▁sentence｜>D<| begin_of_sentence |>E<｜begin_of_sentence｜>F"
+	raw := "A<think>B</think>C<|begin▁of▁sentence|>D<| begin_of_sentence |>E<|begin_of_sentence|>F"
 	got := sanitizeLeakedOutput(raw)
 	if got != "ABCDEF" {
 		t.Fatalf("unexpected sanitize result for think/BOS markers: %q", got)
 	}
 }
 
+func TestSanitizeLeakedOutputRemovesThoughtMarkers(t *testing.T) {
+	raw := "A<|▁of▁thought|>B<| of_thought |>C<| begin_of_thought |>D<| end_of_thought |>E"
+	got := sanitizeLeakedOutput(raw)
+	if got != "ABCDE" {
+		t.Fatalf("unexpected sanitize result for leaked thought markers: %q", got)
+	}
+}
+
+func TestSanitizeLeakedOutputRemovesFullwidthDelimitedBosAndThoughtMarkers(t *testing.T) {
+	fw := "\uff5c"
+	raw := "A<" + fw + "begin▁of▁sentence" + fw + ">B<" + fw + "▁of▁thought" + fw + ">C<" + fw + " begin_of_thought " + fw + ">D"
+	got := sanitizeLeakedOutput(raw)
+	if got != "ABCD" {
+		t.Fatalf("unexpected sanitize result for fullwidth-delimited BOS/thought markers: %q", got)
+	}
+}
+
 func TestSanitizeLeakedOutputRemovesDanglingThinkBlock(t *testing.T) {
 	raw := "Answer prefix<think>internal reasoning that never closes"
 	got := sanitizeLeakedOutput(raw)
@@ -43,7 +69,7 @@ func TestSanitizeLeakedOutputRemovesDanglingThinkBlock(t *testing.T) {
 }
 
 func TestSanitizeLeakedOutputRemovesCompleteDSMLToolCallWrapper(t *testing.T) {
-	raw := "前置文本\n<｜DSML｜tool_calls>\n<｜DSML｜invoke name=\"Bash\">\n<｜DSML｜parameter name=\"command\"></｜DSML｜parameter>\n</｜DSML｜invoke>\n</｜DSML｜tool_calls>\n后置文本"
+	raw := "前置文本\n<|DSML|tool_calls>\n<|DSML|invoke name=\"Bash\">\n<|DSML|parameter name=\"command\"></|DSML|parameter>\n</|DSML|invoke>\n</|DSML|tool_calls>\n后置文本"
 	got := sanitizeLeakedOutput(raw)
 	if got != "前置文本\n\n后置文本" {
 		t.Fatalf("unexpected sanitize result for leaked dsml wrapper: %q", got)
diff --git a/internal/httpapi/openai/responses/empty_retry_runtime.go b/internal/httpapi/openai/responses/empty_retry_runtime.go
index 80422f5b..5166f9c7 100644
--- a/internal/httpapi/openai/responses/empty_retry_runtime.go
+++ b/internal/httpapi/openai/responses/empty_retry_runtime.go
@@ -15,7 +15,7 @@ import (
 	streamengine "ds2api/internal/stream"
 )
 
-func (h *Handler) handleResponsesStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string, historySession *responsehistory.Session) {
+func (h *Handler) handleResponsesStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, owner, responseID string, stdReq promptcompat.StandardRequest, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string, historySession *responsehistory.Session) {
 	streamRuntime, initialType, ok := h.prepareResponsesStreamRuntime(w, resp, owner, responseID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, traceID, historySession)
 	if !ok {
 		return
@@ -27,6 +27,8 @@ func (h *Handler) handleResponsesStreamWithRetry(w http.ResponseWriter, r *http.
 		RetryMaxAttempts: emptyOutputRetryMaxAttempts(),
 		MaxAttempts:      3,
 		UsagePrompt:      finalPrompt,
+		Request:          stdReq,
+		CurrentInputFile: h.Store,
 	}, completionruntime.StreamRetryHooks{
 		ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
 			return h.consumeResponsesStreamAttempt(r, currentResp, streamRuntime, initialType, thinkingEnabled, allowDeferEmpty)
diff --git a/internal/httpapi/openai/responses/responses_handler.go b/internal/httpapi/openai/responses/responses_handler.go
index 3a6680d6..f34daed8 100644
--- a/internal/httpapi/openai/responses/responses_handler.go
+++ b/internal/httpapi/openai/responses/responses_handler.go
@@ -138,7 +138,7 @@ func (h *Handler) Responses(w http.ResponseWriter, r *http.Request) {
 
 	streamReq := start.Request
 	refFileTokens := streamReq.RefFileTokens
-	h.handleResponsesStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, owner, responseID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, traceID, historySession)
+	h.handleResponsesStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, owner, responseID, streamReq, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, traceID, historySession)
 }
 
 func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Response, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
diff --git a/internal/httpapi/openai/shared/leaked_output_sanitize.go b/internal/httpapi/openai/shared/leaked_output_sanitize.go
index 5e54637e..9293e78f 100644
--- a/internal/httpapi/openai/shared/leaked_output_sanitize.go
+++ b/internal/httpapi/openai/shared/leaked_output_sanitize.go
@@ -13,15 +13,23 @@ var leakedToolResultBlobPattern = regexp.MustCompile(`(?is)<\s*\|\s*tool\s*\|\s*
 
 var leakedThinkTagPattern = regexp.MustCompile(`(?is)</?\s*think\s*>`)
 
-// leakedBOSMarkerPattern matches DeepSeek BOS markers in BOTH forms:
-//   - ASCII underscore: <｜begin_of_sentence｜>
-//   - U+2581 variant:   <｜begin▁of▁sentence｜>
-var leakedBOSMarkerPattern = regexp.MustCompile(`(?i)<[｜\|]\s*begin[_▁]of[_▁]sentence\s*[｜\|]>`)
+// leakedBOSMarkerPattern matches DeepSeek BOS markers with halfwidth or
+// legacy U+FF5C fullwidth delimiters:
+//   - ASCII underscore: <|begin_of_sentence|>
+//   - U+2581 variant:   <|begin▁of▁sentence|>
+var leakedBOSMarkerPattern = regexp.MustCompile(`(?i)<[\|\x{ff5c}]\s*begin[_▁]of[_▁]sentence\s*[\|\x{ff5c}]>`)
 
-// leakedMetaMarkerPattern matches the remaining DeepSeek special tokens in BOTH forms:
-//   - ASCII underscore: <｜end_of_sentence｜>, <｜end_of_toolresults｜>, <｜end_of_instructions｜>
-//   - U+2581 variant:   <｜end▁of▁sentence｜>, <｜end▁of▁toolresults｜>, <｜end▁of▁instructions｜>
-var leakedMetaMarkerPattern = regexp.MustCompile(`(?i)<[｜\|]\s*(?:assistant|tool|end[_▁]of[_▁]sentence|end[_▁]of[_▁]thinking|end[_▁]of[_▁]toolresults|end[_▁]of[_▁]instructions)\s*[｜\|]>`)
+// leakedThoughtMarkerPattern matches leaked thought control markers in both
+// explicit and compact forms:
+//   - ASCII underscore: <| of_thought |>, <| begin_of_thought |>
+//   - U+2581 variant:   <|▁of▁thought|>, <|begin▁of▁thought|>
+var leakedThoughtMarkerPattern = regexp.MustCompile(`(?i)<[\|\x{ff5c}]\s*(?:begin[_▁])?[_▁]*of[_▁]thought\s*[\|\x{ff5c}]>`)
+
+// leakedMetaMarkerPattern matches the remaining DeepSeek special tokens with
+// halfwidth or legacy U+FF5C fullwidth delimiters:
+//   - ASCII underscore: <|end_of_sentence|>, <|end_of_toolresults|>, <|end_of_instructions|>
+//   - U+2581 variant:   <|end▁of▁sentence|>, <|end▁of▁toolresults|>, <|end▁of▁instructions|>
+var leakedMetaMarkerPattern = regexp.MustCompile(`(?i)<[\|\x{ff5c}]\s*(?:assistant|tool|end[_▁]of[_▁]sentence|end[_▁]of[_▁]thinking|end[_▁]of[_▁]thought|end[_▁]of[_▁]toolresults|end[_▁]of[_▁]instructions)\s*[\|\x{ff5c}]>`)
 
 // leakedAgentXMLBlockPatterns catch agent-style XML blocks that leak through
 // when the sieve fails to capture them. These are applied only to complete
@@ -48,6 +56,7 @@ func sanitizeLeakedOutput(text string) string {
 	out = stripDanglingThinkSuffix(out)
 	out = leakedThinkTagPattern.ReplaceAllString(out, "")
 	out = leakedBOSMarkerPattern.ReplaceAllString(out, "")
+	out = leakedThoughtMarkerPattern.ReplaceAllString(out, "")
 	out = leakedMetaMarkerPattern.ReplaceAllString(out, "")
 	out = stripLeakedToolCallWrapperBlocks(out)
 	out = sanitizeLeakedAgentXMLBlocks(out)
diff --git a/internal/js/chat-stream/http_internal.js b/internal/js/chat-stream/http_internal.js
index 247e38c0..1c94ced5 100644
--- a/internal/js/chat-stream/http_internal.js
+++ b/internal/js/chat-stream/http_internal.js
@@ -85,6 +85,33 @@ async function fetchStreamPow(req, leaseID) {
   };
 }
 
+async function fetchStreamSwitch(req, leaseID) {
+  const url = buildInternalGoURL(req);
+  url.searchParams.set('__stream_switch', '1');
+
+  const upstream = await fetch(url.toString(), {
+    method: 'POST',
+    headers: buildInternalGoHeaders(req, { withInternalToken: true, withContentType: true }),
+    body: Buffer.from(JSON.stringify({ lease_id: leaseID })),
+  });
+
+  const text = await upstream.text();
+  let body = {};
+  try {
+    body = JSON.parse(text || '{}');
+  } catch (_err) {
+    body = {};
+  }
+
+  return {
+    ok: upstream.ok,
+    status: upstream.status,
+    contentType: upstream.headers.get('content-type') || 'application/json',
+    text,
+    body,
+  };
+}
+
 function relayPreparedFailure(res, prep) {
   if (prep.status === 401 && looksLikeVercelAuthPage(prep.text)) {
     writeOpenAIError(
@@ -223,6 +250,7 @@ module.exports = {
   readRawBody,
   fetchStreamPrepare,
   fetchStreamPow,
+  fetchStreamSwitch,
   relayPreparedFailure,
   safeReadText,
   buildInternalGoURL,
diff --git a/internal/js/chat-stream/sse_parse_impl.js b/internal/js/chat-stream/sse_parse_impl.js
index 6f5922ec..91074710 100644
--- a/internal/js/chat-stream/sse_parse_impl.js
+++ b/internal/js/chat-stream/sse_parse_impl.js
@@ -7,6 +7,10 @@ const {
   SKIP_EXACT_PATHS,
 } = require('../shared/deepseek-constants');
 
+const LEAKED_BOS_MARKER_PATTERN = /<[\|\uFF5C]\s*begin[_▁]of[_▁]sentence\s*[\|\uFF5C]>/gi;
+const LEAKED_THOUGHT_MARKER_PATTERN = /<[\|\uFF5C]\s*(?:begin[_▁])?[_▁]*of[_▁]thought\s*[\|\uFF5C]>/gi;
+const LEAKED_META_MARKER_PATTERN = /<[\|\uFF5C]\s*(?:assistant|tool|end[_▁]of[_▁]sentence|end[_▁]of[_▁]thinking|end[_▁]of[_▁]thought|end[_▁]of[_▁]toolresults|end[_▁]of[_▁]instructions)\s*[\|\uFF5C]>/gi;
+
 
 
 function stripThinkTags(text) {
@@ -621,7 +625,11 @@ function stripReferenceMarkersText(text) {
   if (!text) {
     return text;
   }
-  return text.replace(/\[(?:citation|reference):\s*\d+\]/gi, '');
+  return text
+    .replace(/\[(?:citation|reference):\s*\d+\]/gi, '')
+    .replace(LEAKED_BOS_MARKER_PATTERN, '')
+    .replace(LEAKED_THOUGHT_MARKER_PATTERN, '')
+    .replace(LEAKED_META_MARKER_PATTERN, '');
 }
 
 function asString(v) {
diff --git a/internal/js/chat-stream/vercel_stream_impl.js b/internal/js/chat-stream/vercel_stream_impl.js
index 9a9bb0b8..6e1d4a85 100644
--- a/internal/js/chat-stream/vercel_stream_impl.js
+++ b/internal/js/chat-stream/vercel_stream_impl.js
@@ -25,6 +25,7 @@ const {
   isAbortError,
   fetchStreamPrepare,
   fetchStreamPow,
+  fetchStreamSwitch,
   relayPreparedFailure,
   createLeaseReleaser,
 } = require('./http_internal');
@@ -46,11 +47,11 @@ async function handleVercelStream(req, res, rawBody, payload) {
   }
 
   const model = asString(prep.body.model) || asString(payload.model);
-  const sessionID = asString(prep.body.session_id) || `chatcmpl-${Date.now()}`;
+  const responseID = asString(prep.body.session_id) || `chatcmpl-${Date.now()}`;
   const leaseID = asString(prep.body.lease_id);
-  const deepseekToken = asString(prep.body.deepseek_token);
+  let deepseekToken = asString(prep.body.deepseek_token);
   const initialPowHeader = asString(prep.body.pow_header);
-  const completionPayload = prep.body.payload && typeof prep.body.payload === 'object' ? prep.body.payload : null;
+  let completionPayload = prep.body.payload && typeof prep.body.payload === 'object' ? prep.body.payload : null;
   const finalPrompt = asString(prep.body.final_prompt);
   const thinkingEnabled = toBool(prep.body.thinking_enabled);
   const searchEnabled = toBool(prep.body.search_enabled);
@@ -133,13 +134,14 @@ async function handleVercelStream(req, res, rawBody, payload) {
       }
     };
     const fetchCompletion = (bodyPayload) => fetchDeepSeekStream(DEEPSEEK_COMPLETION_URL, bodyPayload, currentPowHeader);
+    let activeDeepSeekSessionID = responseID;
     const fetchContinue = async (messageID) => {
       const powHeader = await refreshPowHeader('continue');
       if (!powHeader) {
         return null;
       }
       return fetchDeepSeekStream(DEEPSEEK_CONTINUE_URL, {
-        chat_session_id: sessionID,
+        chat_session_id: activeDeepSeekSessionID,
         message_id: messageID,
         fallback_to_resume: true,
       }, powHeader);
@@ -185,7 +187,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
     let ended = false;
     const { sendFrame, sendDeltaFrame } = createChatCompletionEmitter({
       res,
-      sessionID,
+      sessionID: responseID,
       created,
       model,
       isClosed: () => clientClosed,
@@ -242,7 +244,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
       }
       ended = true;
       sendFrame({
-        id: sessionID,
+        id: responseID,
         object: 'chat.completion.chunk',
         created,
         model,
@@ -261,7 +263,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
 
     const processStream = async (initialResponse, allowDeferEmpty) => {
       let currentResponse = initialResponse;
-      let continueState = createContinueState(sessionID);
+      let continueState = createContinueState(activeDeepSeekSessionID);
       let continueRounds = 0;
       // eslint-disable-next-line no-constant-condition
       while (true) {
@@ -412,13 +414,39 @@ async function handleVercelStream(req, res, rawBody, payload) {
     };
 
     let retryAttempts = 0;
+    let accountSwitchAttempted = false;
     // eslint-disable-next-line no-constant-condition
     while (true) {
-      const processed = await processStream(completionRes, retryAttempts < EMPTY_OUTPUT_RETRY_MAX_ATTEMPTS);
+      const allowDeferEmpty = retryAttempts < EMPTY_OUTPUT_RETRY_MAX_ATTEMPTS || !accountSwitchAttempted;
+      const processed = await processStream(completionRes, allowDeferEmpty);
       if (processed.terminal) {
         return;
       }
-      if (!processed.retryable || retryAttempts >= EMPTY_OUTPUT_RETRY_MAX_ATTEMPTS) {
+      if (!processed.retryable) {
+        await finish('stop');
+        return;
+      }
+      if (retryAttempts >= EMPTY_OUTPUT_RETRY_MAX_ATTEMPTS) {
+        if (!accountSwitchAttempted) {
+          accountSwitchAttempted = true;
+          const switched = await fetchStreamSwitch(req, leaseID);
+          if (switched.ok && switched.body && switched.body.payload && typeof switched.body.payload === 'object') {
+            completionPayload = switched.body.payload;
+            deepseekToken = asString(switched.body.deepseek_token) || deepseekToken;
+            currentPowHeader = asString(switched.body.pow_header) || currentPowHeader;
+            activeDeepSeekSessionID = asString(switched.body.session_id) || activeDeepSeekSessionID;
+            usagePrompt = finalPrompt;
+            completionRes = await fetchCompletion(completionPayload);
+            if (completionRes === null) {
+              return;
+            }
+            if (!completionRes.ok || !completionRes.body) {
+              await finish('stop');
+              return;
+            }
+            continue;
+          }
+        }
         await finish('stop');
         return;
       }
diff --git a/internal/js/helpers/stream-tool-sieve/parse.js b/internal/js/helpers/stream-tool-sieve/parse.js
index f2ba3dcb..7a707695 100644
--- a/internal/js/helpers/stream-tool-sieve/parse.js
+++ b/internal/js/helpers/stream-tool-sieve/parse.js
@@ -7,6 +7,9 @@ const {
   parseMarkupToolCalls,
   stripFencedCodeBlocks,
   containsToolCallWrapperSyntaxOutsideIgnored,
+  normalizeDSMLToolCallMarkup,
+  hasRepairableXMLToolCallsWrapper,
+  indexToolCDATAOpen,
   sanitizeLooseCDATA,
 } = require('./parse_payload');
 
@@ -37,19 +40,23 @@ function parseToolCalls(text, toolNames) {
 
 function parseToolCallsDetailed(text, toolNames) {
   const result = emptyParseResult();
-  const normalized = toStringSafe(text);
-  if (!normalized) {
+  const raw = toStringSafe(text);
+  if (!raw) {
     return result;
   }
-  result.sawToolCallSyntax = looksLikeToolCallSyntax(normalized);
-  if (shouldSkipToolCallParsingForCodeFenceExample(normalized)) {
+  if (shouldSkipToolCallParsingForCodeFenceExample(raw)) {
     return result;
   }
+  const normalized = normalizeDSMLToolCallMarkup(stripFencedCodeBlocks(raw).trim());
+  if (!normalized.ok || !normalized.text) {
+    return result;
+  }
+  result.sawToolCallSyntax = looksLikeToolCallSyntax(normalized.text) || hasRepairableXMLToolCallsWrapper(normalized.text);
   // XML markup parsing only.
-  let parsed = parseMarkupToolCalls(normalized);
-  if (parsed.length === 0 && normalized.toLowerCase().includes('<![cdata[')) {
-    const recovered = sanitizeLooseCDATA(normalized);
-    if (recovered !== normalized) {
+  let parsed = parseMarkupToolCalls(normalized.text);
+  if (parsed.length === 0 && indexToolCDATAOpen(normalized.text, 0) >= 0) {
+    const recovered = sanitizeLooseCDATA(normalized.text);
+    if (recovered !== normalized.text) {
       parsed = parseMarkupToolCalls(recovered);
     }
   }
@@ -70,19 +77,23 @@ function parseStandaloneToolCalls(text, toolNames) {
 
 function parseStandaloneToolCallsDetailed(text, toolNames) {
   const result = emptyParseResult();
-  const trimmed = toStringSafe(text);
-  if (!trimmed) {
+  const raw = toStringSafe(text);
+  if (!raw) {
+    return result;
+  }
+  if (shouldSkipToolCallParsingForCodeFenceExample(raw)) {
     return result;
   }
-  result.sawToolCallSyntax = looksLikeToolCallSyntax(trimmed);
-  if (shouldSkipToolCallParsingForCodeFenceExample(trimmed)) {
+  const normalized = normalizeDSMLToolCallMarkup(stripFencedCodeBlocks(raw).trim());
+  if (!normalized.ok || !normalized.text) {
     return result;
   }
+  result.sawToolCallSyntax = looksLikeToolCallSyntax(normalized.text) || hasRepairableXMLToolCallsWrapper(normalized.text);
   // XML markup parsing only.
-  let parsed = parseMarkupToolCalls(trimmed);
-  if (parsed.length === 0 && trimmed.toLowerCase().includes('<![cdata[')) {
-    const recovered = sanitizeLooseCDATA(trimmed);
-    if (recovered !== trimmed) {
+  let parsed = parseMarkupToolCalls(normalized.text);
+  if (parsed.length === 0 && indexToolCDATAOpen(normalized.text, 0) >= 0) {
+    const recovered = sanitizeLooseCDATA(normalized.text);
+    if (recovered !== normalized.text) {
       parsed = parseMarkupToolCalls(recovered);
     }
   }
diff --git a/internal/js/helpers/stream-tool-sieve/parse_payload.js b/internal/js/helpers/stream-tool-sieve/parse_payload.js
index a24bd624..e9fc02f2 100644
--- a/internal/js/helpers/stream-tool-sieve/parse_payload.js
+++ b/internal/js/helpers/stream-tool-sieve/parse_payload.js
@@ -2,6 +2,8 @@
 
 const CDATA_PATTERN = /^(?:<|〈)(?:!|！)\[CDATA\[([\s\S]*?)]](?:>|＞|〉)$/i;
 const XML_ATTR_PATTERN = /\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')/gi;
+const XML_TOOL_CALLS_CLOSE_PATTERN = /[<＜][\/／]tool_calls\s*[>＞]/gi;
+const XML_INVOKE_START_PATTERN = /[<＜]invoke\b[^>＞]*\bname\s*[=＝]\s*(?:"([^"]*)"|'([^']*)'|“([^”]*)”|‘([^’]*)’|＂([^＂]*)＂|＇([^＇]*)＇)/i;
 const TOOL_MARKUP_NAMES = [
   { raw: 'tool_calls', canonical: 'tool_calls' },
   { raw: 'tool-calls', canonical: 'tool_calls', dsmlOnly: true },
@@ -88,8 +90,7 @@ function isFenceCloseLine(trimmed, fenceChar, fenceLen) {
 }
 
 function cdataStartsBeforeFence(line) {
-  const cdataOpen = findNextCDATAOpen(line, 0);
-  const cdataIdx = cdataOpen.ok ? cdataOpen.start : -1;
+  const cdataIdx = indexToolCDATAOpen(line, 0);
   if (cdataIdx < 0) return false;
   const fenceIdx = Math.min(
     line.indexOf('```') >= 0 ? line.indexOf('```') : Infinity,
@@ -99,21 +100,28 @@ function cdataStartsBeforeFence(line) {
 }
 
 function updateCDATAStateLine(inCDATA, line) {
-  const lower = line.toLowerCase();
   let pos = 0;
   let state = inCDATA;
-  while (pos < lower.length) {
+  while (pos < line.length) {
     if (state) {
-      const cdataEnd = findCDATAEnd(lower, pos);
-      const end = cdataEnd.index;
+      let end = -1;
+      let closeLen = 0;
+      for (let i = pos; i < line.length; i += 1) {
+        const foundLen = toolCDATACloseLenAt(line, i);
+        if (foundLen > 0) {
+          end = i;
+          closeLen = foundLen;
+          break;
+        }
+      }
       if (end < 0) return true;
-      pos = end + cdataEnd.len;
+      pos = end + closeLen;
       state = false;
       continue;
     }
-    const start = findNextCDATAOpen(line, pos);
-    if (!start.ok) return false;
-    pos = start.bodyStart;
+    const start = indexToolCDATAOpen(line, pos);
+    if (start < 0) return false;
+    pos = start + toolCDATAOpenLenAt(line, start);
     state = true;
   }
   return state;
@@ -124,12 +132,20 @@ function parseMarkupToolCalls(text) {
   if (!normalized.ok) {
     return [];
   }
-  const raw = normalized.text.trim();
+  let raw = normalized.text.trim();
   if (!raw) {
     return [];
   }
+  let wrappers = findXmlElementBlocks(raw, 'tool_calls');
+  if (wrappers.length === 0 && hasRepairableXMLToolCallsWrapper(raw)) {
+    const repaired = repairMissingXMLToolCallsOpeningWrapper(raw);
+    if (repaired !== raw) {
+      raw = repaired;
+      wrappers = findXmlElementBlocks(raw, 'tool_calls');
+    }
+  }
   const out = [];
-  for (const wrapper of findXmlElementBlocks(raw, 'tool_calls')) {
+  for (const wrapper of wrappers) {
     const body = toStringSafe(wrapper.body);
     for (const block of findXmlElementBlocks(body, 'invoke')) {
       const parsed = parseMarkupSingleToolCall(block);
@@ -146,12 +162,13 @@ function normalizeDSMLToolCallMarkup(text) {
   if (!raw) {
     return { text: '', ok: true };
   }
-  const styles = containsToolMarkupSyntaxOutsideIgnored(raw);
-  if (!styles.dsml) {
-    return { text: raw, ok: true };
+  const canonicalized = canonicalizeToolCallCandidateSpans(raw);
+  const styles = containsToolMarkupSyntaxOutsideIgnored(canonicalized);
+  if (!styles.dsml && !styles.canonical) {
+    return { text: canonicalized, ok: true };
   }
   return {
-    text: replaceDSMLToolMarkupOutsideIgnored(raw),
+    text: replaceDSMLToolMarkupOutsideIgnored(canonicalized),
     ok: true,
   };
 }
@@ -170,9 +187,8 @@ function containsToolCallWrapperSyntaxOutsideIgnored(text) {
   if (!raw) {
     return styles;
   }
-  const lower = raw.toLowerCase();
   for (let i = 0; i < raw.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(raw, i);
     if (skipped.blocked) {
       return styles;
     }
@@ -208,7 +224,7 @@ function containsToolMarkupSyntaxOutsideIgnored(text) {
     return styles;
   }
   for (let i = 0; i < raw.length;) {
-    const skipped = skipXmlIgnoredSection(raw.toLowerCase(), i);
+    const skipped = skipXmlIgnoredSection(raw, i);
     if (skipped.blocked) {
       return styles;
     }
@@ -239,10 +255,9 @@ function replaceDSMLToolMarkupOutsideIgnored(text) {
   if (!raw) {
     return '';
   }
-  const lower = raw.toLowerCase();
   let out = '';
   for (let i = 0; i < raw.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(raw, i);
     if (skipped.blocked) {
       out += raw.slice(i);
       break;
@@ -254,15 +269,7 @@ function replaceDSMLToolMarkupOutsideIgnored(text) {
     }
     const tag = scanToolMarkupTagAt(raw, i);
     if (tag) {
-      if (tag.dsmlLike) {
-        const tail = normalizeToolMarkupTagTailForXML(raw.slice(tag.nameEnd, tag.end + 1));
-        out += `<${tag.closing ? '/' : ''}${tag.name}${tail}`;
-        if (!tail.endsWith('>')) {
-          out += '>';
-        }
-      } else {
-        out += raw.slice(tag.start, tag.end + 1);
-      }
+      out += `<${tag.closing ? '/' : ''}${tag.name}${raw.slice(tag.nameEnd, tag.end)}>`;
       i = tag.end + 1;
       continue;
     }
@@ -345,7 +352,7 @@ function findXmlStartTagOutsideCDATA(text, tag, from) {
   const lower = text.toLowerCase();
   const target = `<${tag}`;
   for (let i = Math.max(0, from || 0); i < text.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(text, i);
     if (skipped.blocked) {
       return null;
     }
@@ -375,7 +382,7 @@ function findMatchingXmlEndTagOutsideCDATA(text, tag, from) {
   const closeTarget = `</${tag}`;
   let depth = 1;
   for (let i = Math.max(0, from || 0); i < text.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(text, i);
     if (skipped.blocked) {
       return null;
     }
@@ -411,18 +418,18 @@ function findMatchingXmlEndTagOutsideCDATA(text, tag, from) {
   return null;
 }
 
-function skipXmlIgnoredSection(lower, i) {
-  const cdataOpen = matchCDATAOpenAt(lower, i);
-  if (cdataOpen.ok) {
-    const cdataEnd = findCDATAEnd(lower, cdataOpen.bodyStart);
-    const end = cdataEnd.index;
+function skipXmlIgnoredSection(text, i) {
+  const raw = toStringSafe(text);
+  const openLen = toolCDATAOpenLenAt(raw, i);
+  if (openLen > 0) {
+    const end = findToolCDATAEnd(raw, i + openLen);
     if (end < 0) {
       return { advanced: false, blocked: true, next: i };
     }
-    return { advanced: true, blocked: false, next: end + cdataEnd.len };
+    return { advanced: true, blocked: false, next: end + toolCDATACloseLenAt(raw, end) };
   }
-  if (lower.startsWith('<!--', i)) {
-    const end = lower.indexOf('-->', i + '<!--'.length);
+  if (raw.startsWith('<!--', i)) {
+    const end = raw.indexOf('-->', i + '<!--'.length);
     if (end < 0) {
       return { advanced: false, blocked: true, next: i };
     }
@@ -433,35 +440,17 @@ function skipXmlIgnoredSection(lower, i) {
 
 function findNextCDATAOpen(text, from) {
   const raw = toStringSafe(text);
-  for (let i = Math.max(0, from || 0); i < raw.length; i += 1) {
-    if (normalizeFullwidthASCIIChar(raw[i]) !== '<') {
-      continue;
-    }
-    const open = matchCDATAOpenAt(raw, i);
-    if (open.ok) {
-      return { ok: true, start: i, bodyStart: open.bodyStart };
-    }
+  const start = indexToolCDATAOpen(raw, from || 0);
+  if (start < 0) {
+    return { ok: false, start: -1, bodyStart: -1 };
   }
-  return { ok: false, start: -1, bodyStart: -1 };
+  return { ok: true, start, bodyStart: start + toolCDATAOpenLenAt(raw, start) };
 }
 
 function matchCDATAOpenAt(text, start) {
   const raw = toStringSafe(text);
-  if (start < 0 || start >= raw.length || normalizeFullwidthASCIIChar(raw[start]) !== '<') {
-    return { ok: false, bodyStart: start };
-  }
-  let i = start + 1;
-  for (let skipped = 0; skipped <= 4 && i < raw.length; skipped += 1) {
-    const matched = matchNormalizedASCII(raw, i, '[cdata[');
-    if (matched.ok) {
-      return { ok: true, bodyStart: i + matched.len };
-    }
-    if (!isCDATAOpenSeparator(raw[i])) {
-      break;
-    }
-    i += 1;
-  }
-  return { ok: false, bodyStart: start };
+  const openLen = toolCDATAOpenLenAt(raw, start);
+  return openLen > 0 ? { ok: true, bodyStart: start + openLen } : { ok: false, bodyStart: start };
 }
 
 function isCDATAOpenSeparator(ch) {
@@ -469,39 +458,30 @@ function isCDATAOpenSeparator(ch) {
 }
 
 function findCDATAEnd(text, from) {
-  const ascii = text.indexOf(']]>', from);
-  const fullwidth = text.indexOf(']]＞', from);
-  const cjk = text.indexOf(']]〉', from);
-  if (ascii < 0 && fullwidth < 0 && cjk < 0) {
-    return { index: -1, len: 0 };
-  }
-  let best = { index: -1, len: 0 };
-  for (const candidate of [
-    { index: ascii, len: ']]>'.length },
-    { index: fullwidth, len: ']]＞'.length },
-    { index: cjk, len: ']]〉'.length },
-  ]) {
-    if (candidate.index >= 0 && (best.index < 0 || candidate.index < best.index)) {
-      best = candidate;
-    }
-  }
-  return best;
+  const raw = toStringSafe(text);
+  const index = findToolCDATAEnd(raw, from);
+  return { index, len: index >= 0 ? toolCDATACloseLenAt(raw, index) : 0 };
 }
 
 function scanToolMarkupTagAt(text, start) {
   const raw = toStringSafe(text);
-  if (!raw || start < 0 || start >= raw.length || normalizeFullwidthASCIIChar(raw[start]) !== '<') {
+  const startDelimLen = xmlTagStartDelimiterLenAt(raw, start);
+  if (!raw || start < 0 || start >= raw.length || !startDelimLen) {
     return null;
   }
   const lower = raw.toLowerCase();
-  let i = start + 1;
-  while (i < raw.length && normalizeFullwidthASCIIChar(raw[i]) === '<') {
-    i += 1;
-  }
-  let closing = raw[i] === '/';
-  if (closing) {
-    i += 1;
+  let i = start + startDelimLen;
+  while (i < raw.length) {
+    i = skipToolMarkupIgnorables(raw, i);
+    const delimLen = xmlTagStartDelimiterLenAt(raw, i);
+    if (!delimLen) {
+      break;
+    }
+    i += delimLen;
   }
+  const slash = consumeToolMarkupClosingSlash(raw, i);
+  let closing = slash.closing;
+  i = slash.next;
   const prefix = consumeToolMarkupNamePrefix(raw, lower, i);
   const prefixStart = i;
   i = prefix.next;
@@ -522,8 +502,12 @@ function scanToolMarkupTagAt(text, start) {
   }
   const originalNameEnd = i + len;
   let nameEnd = originalNameEnd;
-  while (nameEnd < raw.length && isToolMarkupSeparator(raw[nameEnd])) {
-    nameEnd += 1;
+  while (true) {
+    const nextPipe = consumeToolMarkupSeparator(raw, nameEnd);
+    if (!nextPipe.ok) {
+      break;
+    }
+    nameEnd = nextPipe.next;
   }
   const hasTrailingSeparator = nameEnd > originalNameEnd;
   if (!hasXmlTagBoundary(raw, nameEnd)) {
@@ -552,7 +536,7 @@ function scanToolMarkupTagAt(text, start) {
     nameEnd,
     name,
     closing,
-    selfClosing: raw.slice(start, end + 1).trim().endsWith('/>'),
+    selfClosing: isSelfClosingXmlTag(raw.slice(start, end)),
     dsmlLike,
     canonical: !dsmlLike,
   };
@@ -560,9 +544,8 @@ function scanToolMarkupTagAt(text, start) {
 
 function findToolMarkupTagOutsideIgnored(text, from) {
   const raw = toStringSafe(text);
-  const lower = raw.toLowerCase();
   for (let i = Math.max(0, from || 0); i < raw.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(raw, i);
     if (skipped.blocked) {
       return null;
     }
@@ -609,13 +592,13 @@ function findMatchingToolMarkupClose(text, openTag) {
 
 function findPartialToolMarkupStart(text) {
   const raw = toStringSafe(text);
-  const lastLT = raw.lastIndexOf('<');
+  const lastLT = lastIndexOfToolMarkupStartDelimiter(raw);
   if (lastLT < 0) {
     return -1;
   }
   const start = includeDuplicateLeadingLessThan(raw, lastLT);
   const tail = raw.slice(start);
-  if (tail.includes('>') || tail.includes('＞')) {
+  if (containsXmlTagTerminator(tail)) {
     return -1;
   }
   return isPartialToolMarkupTagPrefix(tail) ? start : -1;
@@ -623,13 +606,20 @@ function findPartialToolMarkupStart(text) {
 
 function includeDuplicateLeadingLessThan(text, idx) {
   let out = idx;
-  while (out > 0 && text[out - 1] === '<') {
+  while (out > 0 && isXmlTagStartDelimiter(text[out - 1])) {
     out -= 1;
   }
   return out;
 }
 
+function isXmlTagStartDelimiter(ch) {
+  return ['<', '＜', '﹤', '〈'].includes(ch);
+}
+
 function isToolMarkupSeparator(ch) {
+  if (isToolMarkupWhitespaceLike(ch)) {
+    return false;
+  }
   const normalized = normalizeFullwidthASCIIChar(ch || '');
   if (!normalized || ['<', '>', '/', '=', '"', "'", '['].includes(normalized)) {
     return false;
@@ -640,21 +630,26 @@ function isToolMarkupSeparator(ch) {
   return !/^[A-Za-z0-9]$/.test(normalized);
 }
 
+function isToolMarkupWhitespaceLike(ch) {
+  return !!ch && (/\s/u.test(ch) || ch === '▁');
+}
+
 function isPartialToolMarkupTagPrefix(text) {
   const raw = toStringSafe(text);
-  if (!raw || raw[0] !== '<' || raw.includes('>')) {
+  if (!raw || !isXmlTagStartDelimiter(raw[0]) || containsXmlTagTerminator(raw)) {
     return false;
   }
   const lower = raw.toLowerCase();
   let i = 1;
-  while (i < raw.length && raw[i] === '<') {
+  while (i < raw.length && isXmlTagStartDelimiter(raw[i])) {
     i += 1;
   }
   if (i >= raw.length) {
     return true;
   }
-  if (raw[i] === '/') {
-    i += 1;
+  const slash = consumeToolMarkupClosingSlash(raw, i);
+  if (slash.closing) {
+    i = slash.next;
   }
   while (i <= raw.length) {
     if (i === raw.length) {
@@ -663,7 +658,7 @@ function isPartialToolMarkupTagPrefix(text) {
     if (hasToolMarkupNamePrefix(raw, i)) {
       return true;
     }
-    if (normalizedASCIITailAt(raw, i).startsWith('dsml') || 'dsml'.startsWith(normalizedASCIITailAt(raw, i))) {
+    if (hasDSMLNamePrefixOrPartial(raw, i)) {
       return true;
     }
     if (hasPartialToolMarkupNameAfterArbitraryPrefix(raw, i)) {
@@ -697,10 +692,14 @@ function matchToolMarkupNameAfterArbitraryPrefix(raw, start) {
       return { ok: false };
     }
     for (const name of TOOL_MARKUP_NAMES) {
-      const matched = matchNormalizedASCII(raw, idx, name.raw);
-      if (!matched.ok) continue;
-      if (!toolMarkupPrefixAllowsLocalNameAt(raw, start, idx)) continue;
-      return { ok: true, name: name.canonical, start: idx, len: matched.len };
+      const matched = consumeToolKeyword(raw, idx, name.raw);
+      if (!matched.ok) {
+        continue;
+      }
+      if (!toolMarkupPrefixAllowsLocalNameAt(raw, start, idx)) {
+        continue;
+      }
+      return { ok: true, name: name.canonical, start: idx, len: matched.next - idx };
     }
     idx += 1;
   }
@@ -725,7 +724,7 @@ function hasPartialToolMarkupNameAfterArbitraryPrefix(raw, start) {
 
 function hasDSMLNamePrefixOrPartial(raw, start) {
   const tail = normalizedASCIITailAt(raw, start);
-  return tail.startsWith('dsml') || 'dsml'.startsWith(tail);
+  return tail.startsWith('dsml') || 'dsml'.startsWith(tail) || hasConfusablePartialKeywordPrefix(raw, start, 'dsml');
 }
 
 function toolMarkupPrefixAllowsLocalName(prefix) {
@@ -735,7 +734,7 @@ function toolMarkupPrefixAllowsLocalName(prefix) {
   if (normalizedASCIITailAt(prefix, 0).includes('dsml')) {
     return true;
   }
-  if (/[="'"]/.test(prefix)) {
+  if (/[="']/u.test(prefix)) {
     return false;
   }
   const previous = normalizeFullwidthASCIIChar(prefix[prefix.length - 1] || '');
@@ -750,7 +749,7 @@ function toolMarkupPrefixAllowsLocalNameAt(raw, start, localStart) {
   if (toolMarkupPrefixAllowsLocalName(prefix)) {
     return true;
   }
-  if (/[="'"]/.test(prefix)) {
+  if (/[="']/u.test(prefix)) {
     return false;
   }
   const previous = normalizeFullwidthASCIIChar(prefix[prefix.length - 1] || '');
@@ -772,18 +771,24 @@ function isToolMarkupTagTerminator(raw, idx) {
 }
 
 function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
-  if (idx < raw.length && isToolMarkupSeparator(raw[idx])) {
-    return { next: idx + 1, ok: true };
+  idx = skipToolMarkupIgnorables(raw, idx);
+  const sep = consumeToolMarkupSeparator(raw, idx);
+  if (sep.ok) {
+    return sep;
   }
-  if (idx < raw.length && [' ', '\t', '\r', '\n'].includes(raw[idx])) {
-    return { next: idx + 1, ok: true };
+  const spacingLen = toolMarkupWhitespaceLikeLenAt(raw, idx);
+  if (spacingLen > 0) {
+    return { next: idx + spacingLen, ok: true };
   }
-  const dsml = matchNormalizedASCII(raw, idx, 'dsml');
+  const dsml = consumeToolKeyword(raw, idx, 'dsml');
   if (dsml.ok) {
-    let next = idx + dsml.len;
-    const sep = normalizeFullwidthASCIIChar(raw[next] || '');
-    if (next < raw.length && (sep === '-' || sep === '_')) {
-      next += 1;
+    let next = dsml.next;
+    const dashLen = toolMarkupDashLenAt(raw, next);
+    const underscoreLen = toolMarkupUnderscoreLenAt(raw, next);
+    if (dashLen) {
+      next += dashLen;
+    } else if (underscoreLen) {
+      next += underscoreLen;
     }
     return { next, ok: true };
   }
@@ -794,7 +799,7 @@ function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
   return { next: idx, ok: false };
 }
 
-function consumeArbitraryToolMarkupNamePrefix(raw, lower, idx) {
+function consumeArbitraryToolMarkupNamePrefix(raw, _lower, idx) {
   const first = consumeToolMarkupPrefixSegment(raw, idx);
   if (!first.ok) {
     return { next: idx, ok: false };
@@ -802,27 +807,45 @@ function consumeArbitraryToolMarkupNamePrefix(raw, lower, idx) {
   let j = first.next;
   while (j < raw.length) {
     const segment = consumeToolMarkupPrefixSegment(raw, j);
-    if (!segment.ok) break;
+    if (!segment.ok) {
+      break;
+    }
     j = segment.next;
   }
   let k = j;
-  while (k < raw.length && [' ', '\t', '\r', '\n'].includes(raw[k])) {
-    k += 1;
+  while (true) {
+    const spacingLen = toolMarkupWhitespaceLikeLenAt(raw, k);
+    if (!spacingLen) {
+      break;
+    }
+    k += spacingLen;
   }
   let next = k;
   let ok = false;
-  if (next < raw.length && isToolMarkupSeparator(raw[next])) {
-    next += 1;
-    ok = true;
-  } else if (next < raw.length && ['_', '-'].includes(normalizeFullwidthASCIIChar(raw[next]))) {
-    next += 1;
+  const sep = consumeToolMarkupSeparator(raw, next);
+  if (sep.ok) {
+    next = sep.next;
     ok = true;
+  } else {
+    const dashLen = toolMarkupDashLenAt(raw, next);
+    const underscoreLen = toolMarkupUnderscoreLenAt(raw, next);
+    if (dashLen) {
+      next += dashLen;
+      ok = true;
+    } else if (underscoreLen) {
+      next += underscoreLen;
+      ok = true;
+    }
   }
   if (!ok) {
     return { next: idx, ok: false };
   }
-  while (next < raw.length && [' ', '\t', '\r', '\n'].includes(raw[next])) {
-    next += 1;
+  while (true) {
+    const spacingLen = toolMarkupWhitespaceLikeLenAt(raw, next);
+    if (!spacingLen) {
+      break;
+    }
+    next += spacingLen;
   }
   if (!hasToolMarkupNamePrefix(raw, next)) {
     return { next: idx, ok: false };
@@ -834,68 +857,669 @@ function consumeToolMarkupPrefixSegment(raw, idx) {
   if (idx < 0 || idx >= raw.length) {
     return { next: idx, ok: false };
   }
-  const ch = normalizeFullwidthASCIIChar(raw[idx]);
-  if (/^[A-Za-z0-9]$/.test(ch)) {
+  const normalized = normalizeFullwidthASCIIChar(raw[idx]);
+  if (/^[A-Za-z0-9]$/.test(normalized)) {
     return { next: idx + 1, ok: true };
   }
   return { next: idx, ok: false };
 }
 
 function hasToolMarkupNamePrefix(raw, start) {
-  const tail = normalizedASCIITailAt(raw, start);
   for (const name of TOOL_MARKUP_NAMES) {
-    if (tail.startsWith(name.raw) || name.raw.startsWith(tail)) {
+    if (consumeToolKeyword(raw, start, name.raw).ok) {
+      return true;
+    }
+    if (hasConfusablePartialKeywordPrefix(raw, start, name.raw)) {
       return true;
     }
   }
   return false;
 }
 
+function hasConfusablePartialKeywordPrefix(raw, start, keyword) {
+  if (start < 0 || start >= raw.length) {
+    return false;
+  }
+  let idx = start;
+  let matched = 0;
+  while (matched < keyword.length && idx < raw.length) {
+    idx = skipToolMarkupIgnorables(raw, idx);
+    if (idx >= raw.length) {
+      break;
+    }
+    const expected = keyword[matched];
+    if (expected === '_') {
+      const underscoreLen = toolMarkupUnderscoreLenAt(raw, idx);
+      if (!underscoreLen) {
+        return false;
+      }
+      idx += underscoreLen;
+      matched += 1;
+      continue;
+    }
+    if (expected === '-') {
+      const dashLen = toolMarkupDashLenAt(raw, idx);
+      if (!dashLen) {
+        return false;
+      }
+      idx += dashLen;
+      matched += 1;
+      continue;
+    }
+    const cp = raw.codePointAt(idx);
+    const ch = String.fromCodePoint(cp);
+    const folded = foldToolKeywordRune(ch);
+    if (!folded || folded !== expected.toLowerCase()) {
+      return false;
+    }
+    idx += ch.length;
+    matched += 1;
+  }
+  return matched > 0 && matched < keyword.length && idx === raw.length;
+}
+
 function matchToolMarkupName(raw, start, dsmlLike) {
   for (const name of TOOL_MARKUP_NAMES) {
     if (name.dsmlOnly && !dsmlLike) {
       continue;
     }
-    const matched = matchNormalizedASCII(raw, start, name.raw);
+    const matched = consumeToolKeyword(raw, start, name.raw);
     if (matched.ok) {
-      return { name: name.canonical, len: matched.len };
+      return { name: name.canonical, len: matched.next - start };
     }
   }
   return { name: '', len: 0 };
 }
 
+function consumeToolMarkupSeparator(raw, idx) {
+  idx = skipToolMarkupIgnorables(raw, idx);
+  if (idx >= raw.length) {
+    return { next: idx, ok: false };
+  }
+  const cp = raw.codePointAt(idx);
+  const ch = String.fromCodePoint(cp);
+  if (!isToolMarkupSeparator(ch)) {
+    return { next: idx, ok: false };
+  }
+  return { next: idx + ch.length, ok: true };
+}
+
+function hasToolMarkupBoundary(text, idx) {
+  idx = skipToolMarkupIgnorables(text, idx);
+  if (idx >= text.length) {
+    return true;
+  }
+  if (toolMarkupWhitespaceLikeLenAt(text, idx) > 0) {
+    return true;
+  }
+  if (consumeToolMarkupClosingSlash(text, idx).closing) {
+    return true;
+  }
+  return xmlTagEndDelimiterLenAt(text, idx) > 0;
+}
+
+function consumeToolMarkupLessThan(raw, idx) {
+  idx = skipToolMarkupIgnorables(raw, idx);
+  if (idx < 0 || idx >= raw.length) {
+    return { next: idx, ok: false };
+  }
+  const delimLen = xmlTagStartDelimiterLenAt(raw, idx);
+  if (!delimLen) {
+    return { next: idx, ok: false };
+  }
+  return { next: idx + delimLen, ok: true };
+}
+
+function canonicalizeToolCallCandidateSpans(text) {
+  const raw = toStringSafe(text);
+  if (!raw) {
+    return '';
+  }
+  let out = '';
+  for (let i = 0; i < raw.length;) {
+    const skipped = skipXmlIgnoredSection(raw, i);
+    if (skipped.blocked) {
+      out += raw.slice(i);
+      break;
+    }
+    if (skipped.advanced) {
+      out += raw.slice(i, skipped.next);
+      i = skipped.next;
+      continue;
+    }
+    const tag = scanToolMarkupTagAt(raw, i);
+    if (!tag) {
+      out += raw[i];
+      i += 1;
+      continue;
+    }
+    out += canonicalizeRecognizedToolMarkupTag(raw.slice(tag.start, tag.end + 1), tag);
+    i = tag.end + 1;
+  }
+  return out;
+}
+
+function canonicalizeRecognizedToolMarkupTag(rawTag, tag) {
+  const raw = toStringSafe(rawTag);
+  if (!raw || !tag) {
+    return raw;
+  }
+  let idx = 0;
+  const startLen = xmlTagStartDelimiterLenAt(raw, idx);
+  if (startLen > 0) {
+    idx += startLen;
+  }
+  while (idx < raw.length) {
+    idx = skipToolMarkupIgnorables(raw, idx);
+    const delimLen = xmlTagStartDelimiterLenAt(raw, idx);
+    if (!delimLen) {
+      break;
+    }
+    idx += delimLen;
+  }
+  idx = skipToolMarkupIgnorables(raw, idx);
+  if (tag.closing) {
+    const slash = consumeToolMarkupClosingSlash(raw, idx);
+    if (slash.closing) {
+      idx = slash.next;
+    }
+  }
+  const prefix = consumeToolMarkupNamePrefix(raw, raw.toLowerCase(), idx);
+  idx = prefix.next;
+  const nameMatch = consumeToolKeyword(raw, idx, rawNameForTag(tag));
+  const afterName = nameMatch.ok ? nameMatch.next : idx;
+  const attrs = parseCanonicalToolMarkupAttrs(raw, afterName);
+
+  let out = '<';
+  if (tag.closing) {
+    out += '/';
+  }
+  if (tag.dsmlLike) {
+    out += '|DSML|';
+  }
+  out += tag.name;
+  for (const attr of attrs) {
+    if (!attr || !attr.key) {
+      continue;
+    }
+    out += ` ${attr.key}="${quoteCanonicalXMLAttrValue(attr.value)}"`;
+  }
+  if (tag.selfClosing) {
+    out += '/';
+  }
+  out += '>';
+  return out;
+}
+
+function parseCanonicalToolMarkupAttrs(rawTag, startIdx) {
+  const raw = toStringSafe(rawTag);
+  let idx = Math.max(0, startIdx || 0);
+  const out = [];
+  while (idx < raw.length) {
+    idx = skipToolMarkupIgnorables(raw, idx);
+    if (idx >= raw.length) {
+      break;
+    }
+    const spacingLen = toolMarkupWhitespaceLikeLenAt(raw, idx);
+    if (spacingLen > 0) {
+      idx += spacingLen;
+      continue;
+    }
+    if (xmlTagEndDelimiterLenAt(raw, idx) > 0) {
+      break;
+    }
+    if (consumeToolMarkupPipe(raw, idx).ok) {
+      idx = consumeToolMarkupPipe(raw, idx).next;
+      continue;
+    }
+    if (consumeToolMarkupClosingSlash(raw, idx).closing) {
+      idx = consumeToolMarkupClosingSlash(raw, idx).next;
+      continue;
+    }
+
+    const keyStart = idx;
+    while (idx < raw.length) {
+      idx = skipToolMarkupIgnorables(raw, idx);
+      if (idx >= raw.length) {
+        break;
+      }
+      if (toolMarkupWhitespaceLikeLenAt(raw, idx) > 0) {
+        break;
+      }
+      if (toolMarkupEqualsLenAt(raw, idx) > 0 || xmlTagEndDelimiterLenAt(raw, idx) > 0) {
+        break;
+      }
+      if (consumeToolMarkupPipe(raw, idx).ok || consumeToolMarkupClosingSlash(raw, idx).closing) {
+        break;
+      }
+      const cp = raw.codePointAt(idx);
+      idx += cp > 0xFFFF ? 2 : 1;
+    }
+    const key = normalizeCanonicalToolAttrKey(raw.slice(keyStart, idx));
+
+    idx = skipToolMarkupIgnorables(raw, idx);
+    while (idx < raw.length) {
+      const wsLen = toolMarkupWhitespaceLikeLenAt(raw, idx);
+      if (!wsLen) {
+        break;
+      }
+      idx += wsLen;
+      idx = skipToolMarkupIgnorables(raw, idx);
+    }
+    const equalsLen = toolMarkupEqualsLenAt(raw, idx);
+    if (!equalsLen) {
+      continue;
+    }
+    idx += equalsLen;
+    idx = skipToolMarkupIgnorables(raw, idx);
+    while (idx < raw.length) {
+      const wsLen = toolMarkupWhitespaceLikeLenAt(raw, idx);
+      if (!wsLen) {
+        break;
+      }
+      idx += wsLen;
+      idx = skipToolMarkupIgnorables(raw, idx);
+    }
+    if (!key) {
+      if (idx < raw.length) {
+        const cp = raw.codePointAt(idx);
+        idx += cp > 0xFFFF ? 2 : 1;
+      }
+      continue;
+    }
+
+    let value = '';
+    const quote = xmlQuotePairAt(raw, idx);
+    if (quote.len) {
+      const valueStart = idx + quote.len;
+      idx = valueStart;
+      while (idx < raw.length) {
+        const closeLen = xmlQuoteCloseDelimiterLenAt(raw, idx, quote.close);
+        if (closeLen) {
+          value = raw.slice(valueStart, idx);
+          idx += closeLen;
+          break;
+        }
+        const cp = raw.codePointAt(idx);
+        idx += cp > 0xFFFF ? 2 : 1;
+      }
+    } else {
+      const valueStart = idx;
+      while (idx < raw.length) {
+        if (toolMarkupWhitespaceLikeLenAt(raw, idx) > 0 || xmlTagEndDelimiterLenAt(raw, idx) > 0 || toolMarkupEqualsLenAt(raw, idx) > 0) {
+          break;
+        }
+        if (consumeToolMarkupPipe(raw, idx).ok || consumeToolMarkupClosingSlash(raw, idx).closing) {
+          break;
+        }
+        const cp = raw.codePointAt(idx);
+        idx += cp > 0xFFFF ? 2 : 1;
+      }
+      value = raw.slice(valueStart, idx);
+    }
+    out.push({ key, value });
+  }
+  return out;
+}
+
+function normalizeCanonicalToolAttrKey(rawKey) {
+  const trimmed = toStringSafe(removeToolMarkupIgnorables(rawKey)).trim();
+  if (!trimmed) {
+    return '';
+  }
+  const matched = consumeToolKeyword(trimmed, 0, 'name');
+  return matched.ok && skipToolMarkupIgnorables(trimmed, matched.next) === trimmed.length ? 'name' : '';
+}
+
+function quoteCanonicalXMLAttrValue(rawValue) {
+  return toStringSafe(rawValue).replace(/"/g, '&quot;');
+}
+
+function removeToolMarkupIgnorables(rawValue) {
+  const raw = toStringSafe(rawValue);
+  let out = '';
+  for (let i = 0; i < raw.length;) {
+    const ignorableLen = toolMarkupIgnorableLenAt(raw, i);
+    if (ignorableLen) {
+      i += ignorableLen;
+      continue;
+    }
+    const cp = raw.codePointAt(i);
+    const ch = String.fromCodePoint(cp);
+    out += ch;
+    i += ch.length;
+  }
+  return out;
+}
+
+function skipToolMarkupIgnorables(text, idx) {
+  const raw = toStringSafe(text);
+  let pos = Math.max(0, idx || 0);
+  while (pos < raw.length) {
+    const next = toolMarkupIgnorableLenAt(raw, pos);
+    if (!next) {
+      break;
+    }
+    pos += next;
+  }
+  return pos;
+}
+
+function toolMarkupIgnorableLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  if (idx < 0 || idx >= raw.length) {
+    return 0;
+  }
+  const cp = raw.codePointAt(idx);
+  if (cp === undefined) {
+    return 0;
+  }
+  const ch = String.fromCodePoint(cp);
+  const isFormat = /[\u00AD\u200B-\u200F\u202A-\u202E\u2060-\u206F\uFE00-\uFE0F\uFEFF]/u.test(ch);
+  const isControl = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F-\u009F]/u.test(ch);
+  return isFormat || isControl ? ch.length : 0;
+}
+
+function toolMarkupEqualsLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  for (const variant of ['=', '＝', '﹦', '꞊']) {
+    if (raw.startsWith(variant, pos)) {
+      return (pos + variant.length) - idx;
+    }
+  }
+  return 0;
+}
+
+function toolMarkupDashLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  for (const variant of ['-', '‐', '‑', '‒', '–', '—', '―', '−', '﹣', '－']) {
+    if (raw.startsWith(variant, pos)) {
+      return (pos + variant.length) - idx;
+    }
+  }
+  return 0;
+}
+
+function toolMarkupUnderscoreLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  for (const variant of ['_', '＿', '﹍', '﹎', '﹏']) {
+    if (raw.startsWith(variant, pos)) {
+      return (pos + variant.length) - idx;
+    }
+  }
+  return 0;
+}
+
+function consumeToolKeyword(text, idx, keyword) {
+  const raw = toStringSafe(text);
+  let next = idx;
+  for (const ch of keyword.toLowerCase()) {
+    next = skipToolMarkupIgnorables(raw, next);
+    if (next >= raw.length) {
+      return { next: idx, ok: false };
+    }
+    if (ch === '_') {
+      const len = toolMarkupUnderscoreLenAt(raw, next);
+      if (!len) {
+        return { next: idx, ok: false };
+      }
+      next += len;
+      continue;
+    }
+    if (ch === '-') {
+      const len = toolMarkupDashLenAt(raw, next);
+      if (!len) {
+        return { next: idx, ok: false };
+      }
+      next += len;
+      continue;
+    }
+    const cp = raw.codePointAt(next);
+    const folded = foldToolKeywordRune(String.fromCodePoint(cp));
+    if (!folded || folded !== ch) {
+      return { next: idx, ok: false };
+    }
+    next += cp > 0xFFFF ? 2 : 1;
+  }
+  return { next, ok: true };
+}
+
+function foldToolKeywordRune(ch) {
+  if (!ch) {
+    return '';
+  }
+  const cp = ch.codePointAt(0);
+  if (cp >= 0xFF21 && cp <= 0xFF3A) {
+    return String.fromCharCode(cp - 0xFEE0).toLowerCase();
+  }
+  if (cp >= 0xFF41 && cp <= 0xFF5A) {
+    return String.fromCharCode(cp - 0xFEE0);
+  }
+  const lower = ch.toLowerCase();
+  if ('acdeiklmnoprstv'.includes(lower)) {
+    return lower;
+  }
+  const mapped = {
+    'а': 'a',
+    'α': 'a',
+    'с': 'c',
+    'ϲ': 'c',
+    'ԁ': 'd',
+    'ⅾ': 'd',
+    'е': 'e',
+    'ε': 'e',
+    'і': 'i',
+    'ι': 'i',
+    'ı': 'i',
+    'к': 'k',
+    'κ': 'k',
+    'ⅼ': 'l',
+    'м': 'm',
+    'μ': 'm',
+    'ո': 'n',
+    'о': 'o',
+    'ο': 'o',
+    'р': 'p',
+    'ρ': 'p',
+    'ѕ': 's',
+    'т': 't',
+    'τ': 't',
+    'ν': 'v',
+    'ѵ': 'v',
+    'ⅴ': 'v',
+  };
+  return mapped[lower] || '';
+}
+
+function toolMarkupWhitespaceLikeLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  if (pos < 0 || pos >= raw.length) {
+    return 0;
+  }
+  if ([' ', '\t', '\n', '\r'].includes(raw[pos])) {
+    return (pos + 1) - idx;
+  }
+  if (raw.startsWith('▁', pos)) {
+    return (pos + '▁'.length) - idx;
+  }
+  const cp = raw.codePointAt(pos);
+  const ch = String.fromCodePoint(cp);
+  return /\s/u.test(ch) ? (pos + ch.length) - idx : 0;
+}
+
+function consumeToolMarkupPipe(raw, idx) {
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  if (pos >= raw.length) {
+    return { next: idx, ok: false };
+  }
+  for (const variant of ['|', '│', '∣', '❘', 'ǀ', '￨']) {
+    if (raw.startsWith(variant, pos)) {
+      return { next: pos + variant.length, ok: true };
+    }
+  }
+  return { next: idx, ok: false };
+}
+
+function consumeToolMarkupClosingSlash(raw, idx) {
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  if (pos >= raw.length) {
+    return { next: idx, closing: false };
+  }
+  for (const variant of ['/', '／', '∕', '⁄', '⧸']) {
+    if (raw.startsWith(variant, pos)) {
+      return { next: pos + variant.length, closing: true };
+    }
+  }
+  return { next: idx, closing: false };
+}
+
+function xmlTagStartDelimiterLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  if (pos < 0 || pos >= raw.length) {
+    return 0;
+  }
+  for (const variant of ['<', '＜', '﹤', '〈']) {
+    if (raw.startsWith(variant, pos)) {
+      return (pos + variant.length) - idx;
+    }
+  }
+  return 0;
+}
+
+function xmlTagEndDelimiterLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  if (pos < 0 || pos >= raw.length) {
+    return 0;
+  }
+  for (const variant of ['>', '＞', '﹥', '〉']) {
+    if (raw.startsWith(variant, pos)) {
+      return (pos + variant.length) - idx;
+    }
+  }
+  return 0;
+}
+
+function xmlTagEndDelimiterLenEndingAt(text, end) {
+  const raw = toStringSafe(text);
+  if (end < 0 || end >= raw.length) {
+    return 0;
+  }
+  for (const variant of ['>', '＞', '﹥', '〉']) {
+    if (end + 1 >= variant.length && raw.slice(end + 1 - variant.length, end + 1) === variant) {
+      return variant.length;
+    }
+  }
+  return 0;
+}
+
+function xmlQuotePairAt(text, idx) {
+  const raw = toStringSafe(text);
+  const pos = skipToolMarkupIgnorables(raw, idx);
+  if (pos < 0 || pos >= raw.length) {
+    return { close: '', len: 0 };
+  }
+  if (raw[pos] === '"') {
+    return { close: '"', len: (pos + 1) - idx };
+  }
+  if (raw[pos] === "'") {
+    return { close: "'", len: (pos + 1) - idx };
+  }
+  if (raw.startsWith('“', pos)) {
+    return { close: '”', len: (pos + '“'.length) - idx };
+  }
+  if (raw.startsWith('‘', pos)) {
+    return { close: '’', len: (pos + '‘'.length) - idx };
+  }
+  if (raw.startsWith('＂', pos)) {
+    return { close: '＂', len: (pos + '＂'.length) - idx };
+  }
+  if (raw.startsWith('＇', pos)) {
+    return { close: '＇', len: (pos + '＇'.length) - idx };
+  }
+  if (raw.startsWith('„', pos)) {
+    return { close: '”', len: (pos + '„'.length) - idx };
+  }
+  if (raw.startsWith('‟', pos)) {
+    return { close: '”', len: (pos + '‟'.length) - idx };
+  }
+  return { close: '', len: 0 };
+}
+
+function xmlQuoteCloseDelimiterLenAt(text, idx, close) {
+  const raw = toStringSafe(text);
+  if (!close) {
+    return 0;
+  }
+  return raw.startsWith(close, idx) ? close.length : 0;
+}
+
+function lastIndexOfToolMarkupStartDelimiter(raw) {
+  const text = toStringSafe(raw);
+  let best = -1;
+  for (const variant of ['<', '＜', '﹤', '〈']) {
+    const idx = text.lastIndexOf(variant);
+    if (idx > best) {
+      best = idx;
+    }
+  }
+  return best;
+}
+
+function containsXmlTagTerminator(raw) {
+  const text = toStringSafe(raw);
+  return text.includes('>') || text.includes('＞') || text.includes('﹥') || text.includes('〉');
+}
+
 function findXmlTagEnd(text, from) {
+  const raw = toStringSafe(text);
   let quote = '';
-  for (let i = Math.max(0, from || 0); i < text.length; i += 1) {
-    const ch = text[i];
-    const normalized = normalizeFullwidthASCIIChar(ch);
+  for (let i = Math.max(0, from || 0); i < raw.length;) {
     if (quote) {
-      if (normalized === quote) {
+      const closeLen = xmlQuoteCloseDelimiterLenAt(raw, i, quote);
+      if (closeLen) {
         quote = '';
+        i += closeLen;
+        continue;
       }
+      const cp = raw.codePointAt(i);
+      i += cp > 0xFFFF ? 2 : 1;
       continue;
     }
-    if (normalized === '"' || normalized === "'") {
-      quote = normalized;
+    const nextQuote = xmlQuotePairAt(raw, i);
+    if (nextQuote.len) {
+      quote = nextQuote.close;
+      i += nextQuote.len;
       continue;
     }
-    if (normalized === '>') {
-      return i;
+    const endLen = xmlTagEndDelimiterLenAt(raw, i);
+    if (endLen > 0) {
+      return i + endLen - 1;
     }
+    const cp = raw.codePointAt(i);
+    i += cp > 0xFFFF ? 2 : 1;
   }
   return -1;
 }
 
 function hasXmlTagBoundary(text, idx) {
-  if (idx >= text.length) {
+  const pos = skipToolMarkupIgnorables(text, idx);
+  if (pos >= text.length) {
     return true;
   }
-  return [' ', '\t', '\n', '\r', '>', '/'].includes(text[idx])
-    || normalizeFullwidthASCIIChar(text[idx]) === '>';
+  return toolMarkupWhitespaceLikeLenAt(text, pos) > 0
+    || consumeToolMarkupClosingSlash(text, pos).closing
+    || xmlTagEndDelimiterLenAt(text, pos) > 0;
 }
 
 function isSelfClosingXmlTag(startTag) {
-  return toStringSafe(startTag).trim().endsWith('/');
+  const trimmed = toStringSafe(startTag).trim();
+  return trimmed.endsWith('/') || trimmed.endsWith('／');
 }
 
 function normalizeFullwidthASCIIChar(ch) {
@@ -1070,7 +1694,7 @@ function findGenericXmlElementBlocks(text) {
 function findGenericXmlStartTagOutsideCDATA(text, from) {
   const lower = text.toLowerCase();
   for (let i = Math.max(0, from || 0); i < text.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(text, i);
     if (skipped.blocked) {
       return null;
     }
@@ -1120,7 +1744,7 @@ function findMatchingGenericXmlEndTagOutsideCDATA(text, name, from) {
   const closeTarget = `</${needle}`;
   let depth = 1;
   for (let i = Math.max(0, from || 0); i < text.length;) {
-    const skipped = skipXmlIgnoredSection(lower, i);
+    const skipped = skipXmlIgnoredSection(text, i);
     if (skipped.blocked) {
       return null;
     }
@@ -1320,28 +1944,33 @@ function unescapeHtml(safe) {
 
 function extractStandaloneCDATA(inner) {
   const s = toStringSafe(inner).trim();
-  const open = matchCDATAOpenAt(s, 0);
-  if (open.ok) {
-    const close = findStandaloneCDATAEnd(s, open.bodyStart);
-    if (close.index < 0) {
-      return { ok: true, value: s.slice(open.bodyStart) };
-    }
-    return { ok: true, value: s.slice(open.bodyStart, close.index) };
+  const openLen = toolCDATAOpenLenAt(s, 0);
+  if (!openLen) {
+    return { ok: false, value: '' };
+  }
+  const closeStart = findTrailingToolCDATACloseStart(s);
+  if (closeStart >= openLen) {
+    return { ok: true, value: s.slice(openLen, closeStart) };
+  }
+  const end = findToolCDATAEnd(s, openLen);
+  if (end >= 0) {
+    return { ok: true, value: s.slice(openLen, end) };
   }
-  return { ok: false, value: '' };
+  return { ok: true, value: s.slice(openLen) };
 }
 
 function findStandaloneCDATAEnd(text, from) {
   const raw = toStringSafe(text);
   let best = { index: -1, len: 0 };
   for (let searchFrom = Math.max(0, from || 0); searchFrom < raw.length;) {
-    const close = findCDATAEnd(raw, searchFrom);
-    if (close.index < 0) {
+    const index = findToolCDATAEnd(raw, searchFrom);
+    if (index < 0) {
       break;
     }
-    const closeEnd = close.index + close.len;
+    const len = toolCDATACloseLenAt(raw, index);
+    const closeEnd = index + len;
     if (!raw.slice(closeEnd).trim()) {
-      best = close;
+      best = { index, len };
     }
     searchFrom = closeEnd;
   }
@@ -1588,26 +2217,23 @@ function sanitizeLooseCDATA(text) {
   if (!raw) {
     return '';
   }
-  const lower = raw.toLowerCase();
-  const openMarker = '<![cdata[';
-  const closeMarker = ']]>';
 
   let out = '';
   let pos = 0;
   let changed = false;
   while (pos < raw.length) {
-    const startRel = lower.indexOf(openMarker, pos);
-    if (startRel < 0) {
+    const start = indexToolCDATAOpen(raw, pos);
+    if (start < 0) {
       out += raw.slice(pos);
       break;
     }
-    const start = startRel;
-    const contentStart = start + openMarker.length;
+    const openLen = toolCDATAOpenLenAt(raw, start);
+    const contentStart = start + openLen;
     out += raw.slice(pos, start);
 
-    const endRel = lower.indexOf(closeMarker, contentStart);
+    const endRel = findToolCDATAEnd(raw, contentStart);
     if (endRel >= 0) {
-      const end = endRel + closeMarker.length;
+      const end = endRel + toolCDATACloseLenAt(raw, endRel);
       out += raw.slice(start, end);
       pos = end;
       continue;
@@ -1621,6 +2247,181 @@ function sanitizeLooseCDATA(text) {
   return changed ? out : raw;
 }
 
+function hasRepairableXMLToolCallsWrapper(text) {
+  const raw = toStringSafe(text).trim();
+  if (!raw || raw.toLowerCase().includes('<tool_calls')) {
+    return false;
+  }
+  const closeMatches = [...raw.matchAll(XML_TOOL_CALLS_CLOSE_PATTERN)];
+  if (closeMatches.length === 0) {
+    return false;
+  }
+  const invoke = raw.match(XML_INVOKE_START_PATTERN);
+  if (!invoke || invoke.index === undefined) {
+    return false;
+  }
+  const close = closeMatches[closeMatches.length - 1];
+  return invoke.index < close.index;
+}
+
+function repairMissingXMLToolCallsOpeningWrapper(text) {
+  const raw = toStringSafe(text);
+  if (!hasRepairableXMLToolCallsWrapper(raw)) {
+    return raw;
+  }
+  const closeMatches = [...raw.matchAll(XML_TOOL_CALLS_CLOSE_PATTERN)];
+  const invoke = raw.match(XML_INVOKE_START_PATTERN);
+  const close = closeMatches[closeMatches.length - 1];
+  return `${raw.slice(0, invoke.index)}<tool_calls>${raw.slice(invoke.index, close.index)}</tool_calls>${raw.slice(close.index + close[0].length)}`;
+}
+
+function rawNameForTag(tag) {
+  for (const candidate of TOOL_MARKUP_NAMES) {
+    if (candidate.canonical === tag.name) {
+      return candidate.raw;
+    }
+  }
+  return tag.name || '';
+}
+
+function toolCDATAOpenLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const start = skipToolMarkupIgnorables(raw, idx);
+  const ltLen = xmlTagStartDelimiterLenAt(raw, start);
+  if (!ltLen) {
+    return 0;
+  }
+  let pos = start + ltLen;
+  for (let skipped = 0; skipped <= 4 && pos < raw.length; skipped += 1) {
+    pos = skipToolMarkupIgnorables(raw, pos);
+    if (raw[pos] === '[') {
+      pos += 1;
+      const keyword = consumeToolKeyword(raw, pos, 'cdata');
+      if (!keyword.ok) {
+        return 0;
+      }
+      pos = skipToolMarkupIgnorables(raw, keyword.next);
+      if (raw[pos] !== '[') {
+        return 0;
+      }
+      pos += 1;
+      return pos - idx;
+    }
+    const cp = raw.codePointAt(pos);
+    if (cp === undefined) {
+      return 0;
+    }
+    const ch = String.fromCodePoint(cp);
+    if (!isToolMarkupSeparator(ch)) {
+      return 0;
+    }
+    pos += ch.length;
+  }
+  return 0;
+}
+
+function toolCDATACloseLenAt(text, idx) {
+  const raw = toStringSafe(text);
+  const start = skipToolMarkupIgnorables(raw, idx);
+  if (raw[start] !== ']') {
+    return 0;
+  }
+  let pos = start + 1;
+  pos = skipToolMarkupIgnorables(raw, pos);
+  if (raw[pos] !== ']') {
+    return 0;
+  }
+  pos += 1;
+  const gtLen = xmlTagEndDelimiterLenAt(raw, pos);
+  return gtLen ? (pos + gtLen) - idx : 0;
+}
+
+function findToolCDATAEnd(text, from) {
+  const raw = toStringSafe(text);
+  if (from < 0 || from >= raw.length) {
+    return -1;
+  }
+  let firstNonFenceEnd = -1;
+  for (let i = from; i < raw.length; i += 1) {
+    const closeLen = toolCDATACloseLenAt(raw, i);
+    if (!closeLen) {
+      continue;
+    }
+    const end = i;
+    if (cdataOffsetIsInsideMarkdownFence(raw.slice(from, end))) {
+      continue;
+    }
+    if (cdataEndLooksStructural(raw, end + closeLen)) {
+      return end;
+    }
+    if (firstNonFenceEnd < 0) {
+      firstNonFenceEnd = end;
+    }
+    i = end + closeLen - 1;
+  }
+  return firstNonFenceEnd;
+}
+
+function indexToolCDATAOpen(text, from = 0) {
+  const raw = toStringSafe(text);
+  for (let i = Math.max(0, from || 0); i < raw.length; i += 1) {
+    if (toolCDATAOpenLenAt(raw, i)) {
+      return i;
+    }
+  }
+  return -1;
+}
+
+function findTrailingToolCDATACloseStart(text) {
+  const raw = toStringSafe(text);
+  for (let i = raw.length - 1; i >= 0; i -= 1) {
+    const closeLen = toolCDATACloseLenAt(raw, i);
+    if (closeLen && i + closeLen === raw.length) {
+      return i;
+    }
+  }
+  return -1;
+}
+
+function cdataOffsetIsInsideMarkdownFence(fragment) {
+  const lines = toStringSafe(fragment).split('\n');
+  let inFence = false;
+  let fenceChar = '';
+  let fenceLen = 0;
+  for (const line of lines) {
+    const trimmed = line.replace(/^[ \t]+/, '');
+    if (!inFence) {
+      const fence = parseFenceOpenLine(trimmed);
+      if (fence) {
+        inFence = true;
+        fenceChar = fence.ch;
+        fenceLen = fence.count;
+      }
+      continue;
+    }
+    if (isFenceCloseLine(trimmed, fenceChar, fenceLen)) {
+      inFence = false;
+      fenceChar = '';
+      fenceLen = 0;
+    }
+  }
+  return inFence;
+}
+
+function cdataEndLooksStructural(text, after) {
+  const raw = toStringSafe(text);
+  let pos = after;
+  while (pos < raw.length) {
+    const ch = raw[pos];
+    if ([' ', '\t', '\r', '\n'].includes(ch)) {
+      pos += 1;
+      continue;
+    }
+    return raw.startsWith('</', pos) || raw.startsWith('<／', pos) || raw.startsWith('＜/', pos) || raw.startsWith('＜／', pos);
+  }
+  return true;
+}
+
 function parseTagAttributes(raw) {
   const source = toStringSafe(raw);
   const out = {};
@@ -1632,7 +2433,7 @@ function parseTagAttributes(raw) {
     if (!key) {
       continue;
     }
-    out[key] = match[3] || match[4] || '';
+    out[key] = match.slice(3).find((value) => value !== undefined && value !== '') || '';
   }
   return out;
 }
@@ -1697,8 +2498,10 @@ module.exports = {
   normalizeDSMLToolCallMarkup,
   containsToolMarkupSyntaxOutsideIgnored,
   containsToolCallWrapperSyntaxOutsideIgnored,
+  hasRepairableXMLToolCallsWrapper,
   findToolMarkupTagOutsideIgnored,
   findMatchingToolMarkupClose,
   findPartialToolMarkupStart,
+  indexToolCDATAOpen,
   sanitizeLooseCDATA,
 };
diff --git a/internal/js/helpers/stream-tool-sieve/sieve-xml.js b/internal/js/helpers/stream-tool-sieve/sieve-xml.js
index 1503c3ec..6e2b1ed7 100644
--- a/internal/js/helpers/stream-tool-sieve/sieve-xml.js
+++ b/internal/js/helpers/stream-tool-sieve/sieve-xml.js
@@ -114,6 +114,39 @@ function hasOpenXMLToolTag(captured) {
   return false;
 }
 
+function shouldKeepBareInvokeCapture(captured) {
+  const invokeTag = findFirstToolTag(captured, 0, 'invoke', false);
+  if (!invokeTag) {
+    return false;
+  }
+  const wrapperOpen = findFirstToolTag(captured, 0, 'tool_calls', false);
+  if (wrapperOpen && wrapperOpen.start <= invokeTag.start) {
+    return false;
+  }
+  const closeTag = findFirstToolTag(captured, invokeTag.start + 1, 'tool_calls', true);
+  if (closeTag && closeTag.start > invokeTag.start) {
+    return true;
+  }
+  const startEnd = invokeTag.end;
+  if (startEnd < 0) {
+    return true;
+  }
+  const body = captured.slice(startEnd + 1);
+  const trimmedBody = body.replace(/^[ \t\r\n]+/, '');
+  if (!trimmedBody) {
+    return true;
+  }
+  const invokeCloseTag = findFirstToolTag(captured, startEnd + 1, 'invoke', true);
+  if (invokeCloseTag) {
+    return captured.slice(invokeCloseTag.end + 1).trim() === '';
+  }
+  const paramTag = findFirstToolTag(body, 0, 'parameter', false);
+  if (paramTag && body.slice(0, paramTag.start).trim() === '') {
+    return true;
+  }
+  return trimmedBody.startsWith('{') || trimmedBody.startsWith('[');
+}
+
 function findFirstToolTag(text, from, name, closing) {
   for (let pos = Math.max(0, from || 0); pos < text.length;) {
     const tag = findToolMarkupTagOutsideIgnored(text, pos);
@@ -131,5 +164,6 @@ function findFirstToolTag(text, from, name, closing) {
 module.exports = {
   consumeXMLToolCapture,
   hasOpenXMLToolTag,
+  shouldKeepBareInvokeCapture,
   findPartialXMLToolTagStart: findPartialToolMarkupStart,
 };
diff --git a/internal/js/helpers/stream-tool-sieve/sieve.js b/internal/js/helpers/stream-tool-sieve/sieve.js
index a90a6623..0e2d0aa7 100644
--- a/internal/js/helpers/stream-tool-sieve/sieve.js
+++ b/internal/js/helpers/stream-tool-sieve/sieve.js
@@ -12,6 +12,7 @@ const {
 const {
   consumeXMLToolCapture: consumeXMLToolCaptureImpl,
   hasOpenXMLToolTag,
+  shouldKeepBareInvokeCapture,
   findPartialXMLToolTagStart,
 } = require('./sieve-xml');
 function processToolSieveChunk(state, chunk, toolNames) {
@@ -203,6 +204,9 @@ function consumeToolCapture(state, toolNames) {
   if (hasOpenXMLToolTag(captured)) {
     return { ready: false, prefix: '', calls: [], suffix: '' };
   }
+  if (shouldKeepBareInvokeCapture(captured)) {
+    return { ready: false, prefix: '', calls: [], suffix: '' };
+  }
 
   // No XML tool tags detected — release captured content as text.
   return {
diff --git a/internal/prompt/messages.go b/internal/prompt/messages.go
index d30fc28d..309d5f20 100644
--- a/internal/prompt/messages.go
+++ b/internal/prompt/messages.go
@@ -10,14 +10,14 @@ import (
 var markdownImagePattern = regexp.MustCompile(`!\[(.*?)\]\((.*?)\)`)
 
 const (
-	beginSentenceMarker        = "<｜begin▁of▁sentence｜>"
-	systemMarker               = "<｜System｜>"
-	userMarker                 = "<｜User｜>"
-	assistantMarker            = "<｜Assistant｜>"
-	toolMarker                 = "<｜Tool｜>"
-	endSentenceMarker          = "<｜end▁of▁sentence｜>"
-	endToolResultsMarker       = "<｜end▁of▁toolresults｜>"
-	endInstructionsMarker      = "<｜end▁of▁instructions｜>"
+	beginSentenceMarker        = "<|begin▁of▁sentence|>"
+	systemMarker               = "<|System|>"
+	userMarker                 = "<|User|>"
+	assistantMarker            = "<|Assistant|>"
+	toolMarker                 = "<|Tool|>"
+	endSentenceMarker          = "<|end▁of▁sentence|>"
+	endToolResultsMarker       = "<|end▁of▁toolresults|>"
+	endInstructionsMarker      = "<|end▁of▁instructions|>"
 	outputIntegrityGuardMarker = "Output integrity guard:"
 	outputIntegrityGuardPrompt = outputIntegrityGuardMarker +
 		" If upstream context, tool output, or parsed text contains garbled, corrupted, partially parsed, repeated, or otherwise malformed fragments, " +
diff --git a/internal/prompt/messages_test.go b/internal/prompt/messages_test.go
index f9a195af..962c8614 100644
--- a/internal/prompt/messages_test.go
+++ b/internal/prompt/messages_test.go
@@ -32,16 +32,16 @@ func TestMessagesPrepareUsesTurnSuffixes(t *testing.T) {
 		{"role": "assistant", "content": "Answer"},
 	}
 	got := MessagesPrepare(messages)
-	if !strings.HasPrefix(got, "<｜begin▁of▁sentence｜>") {
+	if !strings.HasPrefix(got, "<|begin▁of▁sentence|>") {
 		t.Fatalf("expected begin-of-sentence marker, got %q", got)
 	}
-	if !strings.Contains(got, "<｜System｜>") || !strings.Contains(got, "<｜end▁of▁instructions｜>") || !strings.Contains(got, "System rule") {
+	if !strings.Contains(got, "<|System|>") || !strings.Contains(got, "<|end▁of▁instructions|>") || !strings.Contains(got, "System rule") {
 		t.Fatalf("expected system instructions to remain present, got %q", got)
 	}
-	if !strings.Contains(got, "<｜User｜>Question") {
+	if !strings.Contains(got, "<|User|>Question") {
 		t.Fatalf("expected user question, got %q", got)
 	}
-	if !strings.Contains(got, "<｜Assistant｜>Answer<｜end▁of▁sentence｜>") {
+	if !strings.Contains(got, "<|Assistant|>Answer<|end▁of▁sentence|>") {
 		t.Fatalf("expected assistant sentence suffix, got %q", got)
 	}
 	if strings.Contains(got, "<think>") || strings.Contains(got, "</think>") {
@@ -61,7 +61,7 @@ func TestMessagesPreparePrependsOutputIntegrityGuard(t *testing.T) {
 	if !strings.Contains(got, outputIntegrityGuardPrompt+"\n\nSystem rule") {
 		t.Fatalf("expected output integrity guard to precede system prompt content, got %q", got)
 	}
-	if !strings.Contains(got, "<｜User｜>Question") {
+	if !strings.Contains(got, "<|User|>Question") {
 		t.Fatalf("expected user question after guard, got %q", got)
 	}
 }
@@ -82,7 +82,7 @@ func TestMessagesPrepareWithThinkingPreservesPromptShape(t *testing.T) {
 	if gotThinking != gotPlain {
 		t.Fatalf("expected thinking flag not to add extra continuity instructions, got thinking=%q plain=%q", gotThinking, gotPlain)
 	}
-	if !strings.HasSuffix(gotThinking, "<｜Assistant｜>") {
+	if !strings.HasSuffix(gotThinking, "<|Assistant|>") {
 		t.Fatalf("expected assistant suffix, got %q", gotThinking)
 	}
 }
diff --git a/internal/prompt/tool_calls.go b/internal/prompt/tool_calls.go
index e191dcf7..050a2a91 100644
--- a/internal/prompt/tool_calls.go
+++ b/internal/prompt/tool_calls.go
@@ -17,12 +17,12 @@ var promptXMLTextEscaper = strings.NewReplacer(
 var promptXMLNamePattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_.:-]*$`)
 
 const (
-	promptDSMLToolCallsOpen  = "<｜DSML｜tool_calls>"
-	promptDSMLToolCallsClose = "</｜DSML｜tool_calls>"
-	promptDSMLInvokeOpen     = "<｜DSML｜invoke"
-	promptDSMLInvokeClose    = "</｜DSML｜invoke>"
-	promptDSMLParameterOpen  = "<｜DSML｜parameter"
-	promptDSMLParameterClose = "</｜DSML｜parameter>"
+	promptDSMLToolCallsOpen  = "<|DSML|tool_calls>"
+	promptDSMLToolCallsClose = "</|DSML|tool_calls>"
+	promptDSMLInvokeOpen     = "<|DSML|invoke"
+	promptDSMLInvokeClose    = "</|DSML|invoke>"
+	promptDSMLParameterOpen  = "<|DSML|parameter"
+	promptDSMLParameterClose = "</|DSML|parameter>"
 )
 
 // FormatToolCallsForPrompt renders a tool_calls slice into the prompt-visible
diff --git a/internal/prompt/tool_calls_test.go b/internal/prompt/tool_calls_test.go
index eef0a4a6..8a5a3693 100644
--- a/internal/prompt/tool_calls_test.go
+++ b/internal/prompt/tool_calls_test.go
@@ -22,7 +22,7 @@ func TestFormatToolCallsForPromptDSML(t *testing.T) {
 	if got == "" {
 		t.Fatal("expected non-empty formatted tool calls")
 	}
-	if got != "<｜DSML｜tool_calls>\n  <｜DSML｜invoke name=\"search_web\">\n    <｜DSML｜parameter name=\"query\"><![CDATA[latest]]></｜DSML｜parameter>\n  </｜DSML｜invoke>\n</｜DSML｜tool_calls>" {
+	if got != "<|DSML|tool_calls>\n  <|DSML|invoke name=\"search_web\">\n    <|DSML|parameter name=\"query\"><![CDATA[latest]]></|DSML|parameter>\n  </|DSML|invoke>\n</|DSML|tool_calls>" {
 		t.Fatalf("unexpected formatted tool call DSML: %q", got)
 	}
 }
@@ -34,7 +34,7 @@ func TestFormatToolCallsForPromptEscapesXMLEntities(t *testing.T) {
 			"arguments": `{"q":"a < b && c > d"}`,
 		},
 	})
-	want := "<｜DSML｜tool_calls>\n  <｜DSML｜invoke name=\"search&lt;&amp;&gt;\">\n    <｜DSML｜parameter name=\"q\"><![CDATA[a < b && c > d]]></｜DSML｜parameter>\n  </｜DSML｜invoke>\n</｜DSML｜tool_calls>"
+	want := "<|DSML|tool_calls>\n  <|DSML|invoke name=\"search&lt;&amp;&gt;\">\n    <|DSML|parameter name=\"q\"><![CDATA[a < b && c > d]]></|DSML|parameter>\n  </|DSML|invoke>\n</|DSML|tool_calls>"
 	if got != want {
 		t.Fatalf("unexpected escaped tool call XML: %q", got)
 	}
@@ -50,7 +50,7 @@ func TestFormatToolCallsForPromptUsesCDATAForMultilineContent(t *testing.T) {
 			},
 		},
 	})
-	want := "<｜DSML｜tool_calls>\n  <｜DSML｜invoke name=\"write_file\">\n    <｜DSML｜parameter name=\"content\"><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></｜DSML｜parameter>\n    <｜DSML｜parameter name=\"path\"><![CDATA[script.sh]]></｜DSML｜parameter>\n  </｜DSML｜invoke>\n</｜DSML｜tool_calls>"
+	want := "<|DSML|tool_calls>\n  <|DSML|invoke name=\"write_file\">\n    <|DSML|parameter name=\"content\"><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></|DSML|parameter>\n    <|DSML|parameter name=\"path\"><![CDATA[script.sh]]></|DSML|parameter>\n  </|DSML|invoke>\n</|DSML|tool_calls>"
 	if got != want {
 		t.Fatalf("unexpected multiline cdata tool call XML: %q", got)
 	}
diff --git a/internal/promptcompat/message_normalize_test.go b/internal/promptcompat/message_normalize_test.go
index e1172096..cd37f552 100644
--- a/internal/promptcompat/message_normalize_test.go
+++ b/internal/promptcompat/message_normalize_test.go
@@ -38,10 +38,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes
 		t.Fatalf("expected 4 normalized messages with assistant tool history preserved, got %d", len(normalized))
 	}
 	assistantContent, _ := normalized[2]["content"].(string)
-	if !strings.Contains(assistantContent, "<｜DSML｜tool_calls>") {
+	if !strings.Contains(assistantContent, "<|DSML|tool_calls>") {
 		t.Fatalf("assistant tool history should be preserved in DSML form, got %q", assistantContent)
 	}
-	if !strings.Contains(assistantContent, `<｜DSML｜invoke name="get_weather">`) {
+	if !strings.Contains(assistantContent, `<|DSML|invoke name="get_weather">`) {
 		t.Fatalf("expected tool name in preserved history, got %q", assistantContent)
 	}
 	if !strings.Contains(normalized[3]["content"].(string), `"temp":18`) {
@@ -49,7 +49,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes
 	}
 
 	prompt := util.MessagesPrepare(normalized)
-	if !strings.Contains(prompt, "<｜DSML｜tool_calls>") {
+	if !strings.Contains(prompt, "<|DSML|tool_calls>") {
 		t.Fatalf("expected preserved assistant tool history in prompt: %q", prompt)
 	}
 }
@@ -177,10 +177,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantMultipleToolCallsRemainSepara
 		t.Fatalf("expected assistant tool_call-only message preserved, got %#v", normalized)
 	}
 	content, _ := normalized[0]["content"].(string)
-	if strings.Count(content, "<｜DSML｜invoke name=") != 2 {
+	if strings.Count(content, "<|DSML|invoke name=") != 2 {
 		t.Fatalf("expected two preserved tool call blocks, got %q", content)
 	}
-	if !strings.Contains(content, `<｜DSML｜invoke name="search_web">`) || !strings.Contains(content, `<｜DSML｜invoke name="eval_javascript">`) {
+	if !strings.Contains(content, `<|DSML|invoke name="search_web">`) || !strings.Contains(content, `<|DSML|invoke name="eval_javascript">`) {
 		t.Fatalf("expected both tool names in preserved history, got %q", content)
 	}
 }
@@ -258,7 +258,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantNilContentDoesNotInjectNullLi
 	if strings.Contains(content, "null") {
 		t.Fatalf("expected no null literal injection, got %q", content)
 	}
-	if !strings.Contains(content, "<｜DSML｜tool_calls>") {
+	if !strings.Contains(content, "<|DSML|tool_calls>") {
 		t.Fatalf("expected assistant tool history in normalized content, got %q", content)
 	}
 }
@@ -282,11 +282,11 @@ func TestNormalizeOpenAIMessagesForPrompt_CanonicalizesStandaloneAssistantToolMa
 	}
 	content, _ := normalized[0]["content"].(string)
 	for _, want := range []string{
-		"<｜DSML｜tool_calls>",
-		`<｜DSML｜invoke name="Bash">`,
-		`<｜DSML｜parameter name="command"><![CDATA[lsof -i :4321 -t]]></｜DSML｜parameter>`,
-		`<｜DSML｜parameter name="description"><![CDATA[Verify port 4321 is free]]></｜DSML｜parameter>`,
-		"</｜DSML｜tool_calls>",
+		"<|DSML|tool_calls>",
+		`<|DSML|invoke name="Bash">`,
+		`<|DSML|parameter name="command"><![CDATA[lsof -i :4321 -t]]></|DSML|parameter>`,
+		`<|DSML|parameter name="description"><![CDATA[Verify port 4321 is free]]></|DSML|parameter>`,
+		"</|DSML|tool_calls>",
 	} {
 		if !strings.Contains(content, want) {
 			t.Fatalf("expected canonicalized assistant tool markup to contain %q, got %q", want, content)
diff --git a/internal/promptcompat/prompt_build.go b/internal/promptcompat/prompt_build.go
index 9d2ee4e8..6ba0f9ae 100644
--- a/internal/promptcompat/prompt_build.go
+++ b/internal/promptcompat/prompt_build.go
@@ -9,10 +9,24 @@ func buildOpenAIFinalPrompt(messagesRaw []any, toolsRaw any, traceID string, thi
 }
 
 func BuildOpenAIPrompt(messagesRaw []any, toolsRaw any, traceID string, toolPolicy ToolChoicePolicy, thinkingEnabled bool) (string, []string) {
+	return buildOpenAIPrompt(messagesRaw, toolsRaw, traceID, toolPolicy, thinkingEnabled, true)
+}
+
+// BuildOpenAIPromptWithToolInstructionsOnly builds the prompt like BuildOpenAIPrompt but injects only
+// the tool-call instructions; tool descriptions and schemas are expected in the attached tools file.
+func BuildOpenAIPromptWithToolInstructionsOnly(messagesRaw []any, toolsRaw any, traceID string, toolPolicy ToolChoicePolicy, thinkingEnabled bool) (string, []string) {
+	return buildOpenAIPrompt(messagesRaw, toolsRaw, traceID, toolPolicy, thinkingEnabled, false)
+}
+
+func buildOpenAIPrompt(messagesRaw []any, toolsRaw any, traceID string, toolPolicy ToolChoicePolicy, thinkingEnabled bool, includeToolDescriptions bool) (string, []string) {
 	messages := NormalizeOpenAIMessagesForPrompt(messagesRaw, traceID)
 	toolNames := []string{}
 	if tools, ok := toolsRaw.([]any); ok && len(tools) > 0 {
-		messages, toolNames = injectToolPrompt(messages, tools, toolPolicy)
+		if includeToolDescriptions {
+			messages, toolNames = injectToolPrompt(messages, tools, toolPolicy)
+		} else {
+			messages, toolNames = injectToolPromptInstructionsOnly(messages, tools, toolPolicy)
+		}
 	}
 	return prompt.MessagesPrepareWithThinking(messages, thinkingEnabled), toolNames
 }
diff --git a/internal/promptcompat/prompt_build_test.go b/internal/promptcompat/prompt_build_test.go
index 0a969739..043e1a83 100644
--- a/internal/promptcompat/prompt_build_test.go
+++ b/internal/promptcompat/prompt_build_test.go
@@ -47,10 +47,10 @@ func TestBuildOpenAIFinalPrompt_HandlerPathIncludesToolRoundtripSemantics(t *tes
 	if !strings.Contains(finalPrompt, `"condition":"sunny"`) {
 		t.Fatalf("handler finalPrompt should preserve tool output content: %q", finalPrompt)
 	}
-	if !strings.Contains(finalPrompt, "<｜DSML｜tool_calls>") {
+	if !strings.Contains(finalPrompt, "<|DSML|tool_calls>") {
 		t.Fatalf("handler finalPrompt should preserve assistant tool history: %q", finalPrompt)
 	}
-	if !strings.Contains(finalPrompt, `<｜DSML｜invoke name="get_weather">`) {
+	if !strings.Contains(finalPrompt, `<|DSML|invoke name="get_weather">`) {
 		t.Fatalf("handler finalPrompt should include tool name history: %q", finalPrompt)
 	}
 }
@@ -74,7 +74,7 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
 	}
 
 	finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false)
-	if !strings.Contains(finalPrompt, "Remember: The ONLY valid way to use tools is the <｜DSML｜tool_calls>...</｜DSML｜tool_calls> block at the end of your response.") {
+	if !strings.Contains(finalPrompt, "Remember: The ONLY valid way to use tools is the <|DSML|tool_calls>...</|DSML|tool_calls> block at the end of your response.") {
 		t.Fatalf("vercel prepare finalPrompt missing final tool-call anchor instruction: %q", finalPrompt)
 	}
 	if !strings.Contains(finalPrompt, "TOOL CALL FORMAT") {
@@ -88,6 +88,67 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
 	}
 }
 
+func TestBuildOpenAIPromptWithToolInstructionsOnlyOmitsSchemas(t *testing.T) {
+	messages := []any{
+		map[string]any{"role": "system", "content": "You are helpful"},
+		map[string]any{"role": "user", "content": "请调用工具"},
+	}
+	tools := []any{
+		map[string]any{
+			"type": "function",
+			"function": map[string]any{
+				"name":        "search",
+				"description": "search docs",
+				"parameters": map[string]any{
+					"type": "object",
+				},
+			},
+		},
+	}
+
+	finalPrompt, toolNames := BuildOpenAIPromptWithToolInstructionsOnly(messages, tools, "", DefaultToolChoicePolicy(), false)
+	if len(toolNames) != 1 || toolNames[0] != "search" {
+		t.Fatalf("unexpected tool names: %#v", toolNames)
+	}
+	if strings.Contains(finalPrompt, "You have access to these tools") || strings.Contains(finalPrompt, "Description: search docs") || strings.Contains(finalPrompt, "Parameters:") {
+		t.Fatalf("tool descriptions should be externalized, got: %q", finalPrompt)
+	}
+	if !strings.Contains(finalPrompt, "Treat DS2API_TOOLS.txt as the authoritative list of callable tools and schemas") {
+		t.Fatalf("expected instructions-only prompt to point model at tools file, got: %q", finalPrompt)
+	}
+	if !strings.Contains(finalPrompt, "TOOL CALL FORMAT") || !strings.Contains(finalPrompt, "Remember: The ONLY valid way to use tools") {
+		t.Fatalf("expected tool format instructions to remain in live prompt, got: %q", finalPrompt)
+	}
+}
+
+func TestBuildOpenAIToolsContextTranscriptContainsOnlyDescriptions(t *testing.T) {
+	tools := []any{
+		map[string]any{
+			"type": "function",
+			"function": map[string]any{
+				"name":        "search",
+				"description": "search docs",
+				"parameters": map[string]any{
+					"type": "object",
+				},
+			},
+		},
+	}
+
+	transcript, toolNames := BuildOpenAIToolsContextTranscript(tools, DefaultToolChoicePolicy())
+	if len(toolNames) != 1 || toolNames[0] != "search" {
+		t.Fatalf("unexpected tool names: %#v", toolNames)
+	}
+	for _, want := range []string{"# DS2API_TOOLS.txt", "You have access to these tools", "Tool: search", "Description: search docs", `Parameters: {"type":"object"}`} {
+		if !strings.Contains(transcript, want) {
+			t.Fatalf("expected tools transcript to contain %q, got: %q", want, transcript)
+		}
+	}
+	if strings.Contains(transcript, "TOOL CALL FORMAT") || strings.Contains(transcript, "<|DSML|tool_calls>") {
+		t.Fatalf("tools transcript should not duplicate format instructions, got: %q", transcript)
+	}
+}
+
 func TestBuildOpenAIFinalPromptPrependsOutputIntegrityGuard(t *testing.T) {
 	messages := []any{
 		map[string]any{"role": "system", "content": "You are helpful"},
diff --git a/internal/promptcompat/responses_input_items_test.go b/internal/promptcompat/responses_input_items_test.go
index dfc53712..81c21574 100644
--- a/internal/promptcompat/responses_input_items_test.go
+++ b/internal/promptcompat/responses_input_items_test.go
@@ -88,7 +88,7 @@ func TestNormalizeResponsesInputArrayMergesReasoningMessageIntoFunctionCallHisto
 	if !strings.Contains(history, "[reasoning_content]\nneed fresh docs before answering\n[/reasoning_content]") {
 		t.Fatalf("expected reasoning in history transcript, got %q", history)
 	}
-	if !strings.Contains(history, `<｜DSML｜invoke name="search_web">`) {
+	if !strings.Contains(history, `<|DSML|invoke name="search_web">`) {
 		t.Fatalf("expected tool call in history transcript, got %q", history)
 	}
 }
diff --git a/internal/promptcompat/standard_request.go b/internal/promptcompat/standard_request.go
index 76b812d0..b6098004 100644
--- a/internal/promptcompat/standard_request.go
+++ b/internal/promptcompat/standard_request.go
@@ -11,6 +11,8 @@ type StandardRequest struct {
 	HistoryText             string
 	PromptTokenText         string
 	CurrentInputFileApplied bool
+	CurrentInputFileID      string
+	CurrentToolsFileID      string
 	ToolsRaw                any
 	FinalPrompt             string
 	ToolNames               []string
diff --git a/internal/promptcompat/tool_prompt.go b/internal/promptcompat/tool_prompt.go
index 4e5d03f7..4d840761 100644
--- a/internal/promptcompat/tool_prompt.go
+++ b/internal/promptcompat/tool_prompt.go
@@ -9,10 +9,52 @@ import (
 	"ds2api/internal/toolcall"
 )
 
+const CurrentToolsContextFilename = "DS2API_TOOLS.txt"
+
+const toolsTranscriptTitle = "# " + CurrentToolsContextFilename
+const toolsTranscriptSummary = "Available tool descriptions and parameter schemas for this request."
+
+type toolPromptParts struct {
+	Descriptions string
+	Instructions string
+	Names        []string
+}
+
 func injectToolPrompt(messages []map[string]any, tools []any, policy ToolChoicePolicy) ([]map[string]any, []string) {
+	return injectToolPromptWithDescriptions(messages, tools, policy, true)
+}
+
+func injectToolPromptInstructionsOnly(messages []map[string]any, tools []any, policy ToolChoicePolicy) ([]map[string]any, []string) {
+	return injectToolPromptWithDescriptions(messages, tools, policy, false)
+}
+
+func injectToolPromptWithDescriptions(messages []map[string]any, tools []any, policy ToolChoicePolicy, includeDescriptions bool) ([]map[string]any, []string) {
 	if policy.IsNone() {
 		return messages, nil
 	}
+	parts := buildToolPromptParts(tools, policy)
+	if parts.Instructions == "" {
+		return messages, parts.Names
+	}
+	toolPrompt := parts.Instructions
+	if includeDescriptions && parts.Descriptions != "" {
+		toolPrompt = parts.Descriptions + "\n\n" + toolPrompt
+	} else if parts.Descriptions != "" {
+		toolPrompt = "Available tool descriptions and parameter schemas are attached in " + CurrentToolsContextFilename + ". Treat " + CurrentToolsContextFilename + " as the authoritative list of callable tools and schemas; use only tools and parameters listed there.\n\n" + toolPrompt
+	}
+
+	for i := range messages {
+		if messages[i]["role"] == "system" {
+			old, _ := messages[i]["content"].(string)
+			messages[i]["content"] = strings.TrimSpace(old + "\n\n" + toolPrompt)
+			return messages, parts.Names
+		}
+	}
+	messages = append([]map[string]any{{"role": "system", "content": toolPrompt}}, messages...)
+	return messages, parts.Names
+}
+
+func buildToolPromptParts(tools []any, policy ToolChoicePolicy) toolPromptParts {
 	toolSchemas := make([]string, 0, len(tools))
 	names := make([]string, 0, len(tools))
 	isAllowed := func(name string) bool {
@@ -44,29 +86,47 @@ func injectToolPrompt(messages []map[string]any, tools []any, policy ToolChoiceP
 		toolSchemas = append(toolSchemas, fmt.Sprintf("Tool: %s\nDescription: %s\nParameters: %s", name, desc, string(b)))
 	}
 	if len(toolSchemas) == 0 {
-		return messages, names
+		return toolPromptParts{Names: names}
 	}
-	toolPrompt := "You have access to these tools:\n\n" + strings.Join(toolSchemas, "\n\n") + "\n\n" + toolcall.BuildToolCallInstructions(names)
+	descriptions := "You have access to these tools:\n\n" + strings.Join(toolSchemas, "\n\n")
+	instructions := toolcall.BuildToolCallInstructions(names)
 	if hasReadLikeTool(names) {
-		toolPrompt += "\n\nRead-tool cache guard: If a Read/read_file-style tool result says the file is unchanged, already available in history, should be referenced from previous context, or otherwise provides no file body, treat that result as missing content. Do not repeatedly call the same read request for that missing body. Request a full-content read if the tool supports it, or tell the user that the file contents need to be provided again."
+		instructions += "\n\nRead-tool cache guard: If a Read/read_file-style tool result says the file is unchanged, already available in history, should be referenced from previous context, or otherwise provides no file body, treat that result as missing content. Do not repeatedly call the same read request for that missing body. Request a full-content read if the tool supports it, or tell the user that the file contents need to be provided again."
 	}
 	if policy.Mode == ToolChoiceRequired {
-		toolPrompt += "\n7) For this response, you MUST call at least one tool from the allowed list."
+		instructions += "\n7) For this response, you MUST call at least one tool from the allowed list."
 	}
 	if policy.Mode == ToolChoiceForced && strings.TrimSpace(policy.ForcedName) != "" {
-		toolPrompt += "\n7) For this response, you MUST call exactly this tool name: " + strings.TrimSpace(policy.ForcedName)
-		toolPrompt += "\n8) Do not call any other tool."
+		instructions += "\n7) For this response, you MUST call exactly this tool name: " + strings.TrimSpace(policy.ForcedName)
+		instructions += "\n8) Do not call any other tool."
 	}
+	return toolPromptParts{
+		Descriptions: descriptions,
+		Instructions: instructions,
+		Names:        names,
+	}
+}
 
-	for i := range messages {
-		if messages[i]["role"] == "system" {
-			old, _ := messages[i]["content"].(string)
-			messages[i]["content"] = strings.TrimSpace(old + "\n\n" + toolPrompt)
-			return messages, names
-		}
+func BuildOpenAIToolsContextTranscript(toolsRaw any, policy ToolChoicePolicy) (string, []string) {
+	if policy.IsNone() {
+		return "", nil
 	}
-	messages = append([]map[string]any{{"role": "system", "content": toolPrompt}}, messages...)
-	return messages, names
+	tools, ok := toolsRaw.([]any)
+	if !ok || len(tools) == 0 {
+		return "", nil
+	}
+	parts := buildToolPromptParts(tools, policy)
+	if strings.TrimSpace(parts.Descriptions) == "" {
+		return "", parts.Names
+	}
+	var b strings.Builder
+	b.WriteString(toolsTranscriptTitle)
+	b.WriteString("\n")
+	b.WriteString(toolsTranscriptSummary)
+	b.WriteString("\n\n")
+	b.WriteString(parts.Descriptions)
+	b.WriteString("\n")
+	return b.String(), parts.Names
 }
 
 func hasReadLikeTool(names []string) bool {
diff --git a/internal/toolcall/tool_prompt.go b/internal/toolcall/tool_prompt.go
index 9f278e5d..c9b4a519 100644
--- a/internal/toolcall/tool_prompt.go
+++ b/internal/toolcall/tool_prompt.go
@@ -11,19 +11,19 @@ import "strings"
 func BuildToolCallInstructions(toolNames []string) string {
 	return `TOOL CALL FORMAT — FOLLOW EXACTLY:
 
-<｜DSML｜tool_calls>
-  <｜DSML｜invoke name="TOOL_NAME_HERE">
-    <｜DSML｜parameter name="PARAMETER_NAME"><![CDATA[PARAMETER_VALUE]]></｜DSML｜parameter>
-  </｜DSML｜invoke>
-</｜DSML｜tool_calls>
+<|DSML|tool_calls>
+  <|DSML|invoke name="TOOL_NAME_HERE">
+    <|DSML|parameter name="PARAMETER_NAME"><![CDATA[PARAMETER_VALUE]]></|DSML|parameter>
+  </|DSML|invoke>
+</|DSML|tool_calls>
 
 RULES:
-1) Use the <｜DSML｜tool_calls> wrapper format.
-2) Put one or more <｜DSML｜invoke> entries under a single <｜DSML｜tool_calls> root.
-3) Put the tool name in the invoke name attribute: <｜DSML｜invoke name="TOOL_NAME">.
-3a) Tag punctuation alphabet: ASCII < > / = " plus the fullwidth vertical bar ｜.
+1) Use the <|DSML|tool_calls> wrapper format.
+2) Put one or more <|DSML|invoke> entries under a single <|DSML|tool_calls> root.
+3) Put the tool name in the invoke name attribute: <|DSML|invoke name="TOOL_NAME">.
+3a) Tag punctuation alphabet: ASCII < > / = " plus the halfwidth pipe |.
 4) All string values must use <![CDATA[...]]>, even short ones. This includes code, scripts, file contents, prompts, paths, names, and queries.
-5) Every top-level argument must be a <｜DSML｜parameter name="ARG_NAME">...</｜DSML｜parameter> node.
+5) Every top-level argument must be a <|DSML|parameter name="ARG_NAME">...</|DSML|parameter> node.
 6) Objects use nested XML elements inside the parameter body. Arrays may repeat <item> children.
 7) Numbers, booleans, and null stay plain text.
 8) Use only the parameter names in the tool schema. Do not invent fields.
@@ -31,35 +31,35 @@ RULES:
 10) If a required parameter value is unknown, ask the user or answer normally instead of outputting an empty tool call.
 11) For shell tools such as Bash / execute_command, the command/script must be inside the command parameter. Never call them with an empty command.
 12) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue.
-13) If you call a tool, the first non-whitespace characters of that tool block must be exactly <｜DSML｜tool_calls>.
-14) Never omit the opening <｜DSML｜tool_calls> tag, even if you already plan to close with </｜DSML｜tool_calls>.
+13) If you call a tool, the first non-whitespace characters of that tool block must be exactly <|DSML|tool_calls>.
+14) Never omit the opening <|DSML|tool_calls> tag, even if you already plan to close with </|DSML|tool_calls>.
 15) Compatibility note: the runtime also accepts the legacy XML tags <tool_calls> / <invoke> / <parameter>, but prefer the DSML-prefixed form above.
 
 PARAMETER SHAPES:
-- string => <｜DSML｜parameter name="x"><![CDATA[value]]></｜DSML｜parameter>
-- object => <｜DSML｜parameter name="x"><field>...</field></｜DSML｜parameter>
-- array => <｜DSML｜parameter name="x"><item>...</item><item>...</item></｜DSML｜parameter>
-- number/bool/null => <｜DSML｜parameter name="x">plain_text</｜DSML｜parameter>
+- string => <|DSML|parameter name="x"><![CDATA[value]]></|DSML|parameter>
+- object => <|DSML|parameter name="x"><field>...</field></|DSML|parameter>
+- array => <|DSML|parameter name="x"><item>...</item><item>...</item></|DSML|parameter>
+- number/bool/null => <|DSML|parameter name="x">plain_text</|DSML|parameter>
 
 【WRONG — Do NOT do these】:
 
 Wrong 1 — mixed text after XML:
-  <｜DSML｜tool_calls>...</｜DSML｜tool_calls> I hope this helps.
+  <|DSML|tool_calls>...</|DSML|tool_calls> I hope this helps.
 Wrong 2 — Markdown code fences:
   ` + "```xml" + `
-  <｜DSML｜tool_calls>...</｜DSML｜tool_calls>
+  <|DSML|tool_calls>...</|DSML|tool_calls>
   ` + "```" + `
 Wrong 3 — missing opening wrapper:
-  <｜DSML｜invoke name="TOOL_NAME">...</｜DSML｜invoke>
-  </｜DSML｜tool_calls>
+  <|DSML|invoke name="TOOL_NAME">...</|DSML|invoke>
+  </|DSML|tool_calls>
 Wrong 4 — empty parameters:
-  <｜DSML｜tool_calls>
-    <｜DSML｜invoke name="Bash">
-      <｜DSML｜parameter name="command"></｜DSML｜parameter>
-    </｜DSML｜invoke>
-  </｜DSML｜tool_calls>
+  <|DSML|tool_calls>
+    <|DSML|invoke name="Bash">
+      <|DSML|parameter name="command"></|DSML|parameter>
+    </|DSML|invoke>
+  </|DSML|tool_calls>
 
-Remember: The ONLY valid way to use tools is the <｜DSML｜tool_calls>...</｜DSML｜tool_calls> block at the end of your response.
+Remember: The ONLY valid way to use tools is the <|DSML|tool_calls>...</|DSML|tool_calls> block at the end of your response.
 ` + buildCorrectToolExamples(toolNames)
 }
 
@@ -150,21 +150,21 @@ func firstScriptExample(names []string) (promptToolExample, bool) {
 
 func renderToolExampleBlock(calls []promptToolExample) string {
 	var b strings.Builder
-	b.WriteString("<｜DSML｜tool_calls>\n")
+	b.WriteString("<|DSML|tool_calls>\n")
 	for _, call := range calls {
-		b.WriteString(`  <｜DSML｜invoke name="`)
+		b.WriteString(`  <|DSML|invoke name="`)
 		b.WriteString(call.name)
 		b.WriteString(`">` + "\n")
 		b.WriteString(indentPromptParameters(call.params, "    "))
-		b.WriteString("\n  </｜DSML｜invoke>\n")
+		b.WriteString("\n  </|DSML|invoke>\n")
 	}
-	b.WriteString("</｜DSML｜tool_calls>")
+	b.WriteString("</|DSML|tool_calls>")
 	return b.String()
 }
 
 func indentPromptParameters(body, indent string) string {
 	if strings.TrimSpace(body) == "" {
-		return indent + `<｜DSML｜parameter name="content"></｜DSML｜parameter>`
+		return indent + `<|DSML|parameter name="content"></|DSML|parameter>`
 	}
 	lines := strings.Split(body, "\n")
 	for i, line := range lines {
@@ -178,7 +178,7 @@ func indentPromptParameters(body, indent string) string {
 }
 
 func wrapParameter(name, inner string) string {
-	return `<｜DSML｜parameter name="` + name + `">` + inner + `</｜DSML｜parameter>`
+	return `<|DSML|parameter name="` + name + `">` + inner + `</|DSML|parameter>`
 }
 
 func exampleBasicParams(name string) (string, bool) {
@@ -204,7 +204,7 @@ func exampleBasicParams(name string) (string, bool) {
 	case "Edit":
 		return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + wrapParameter("old_string", promptCDATA("foo")) + "\n" + wrapParameter("new_string", promptCDATA("bar")), true
 	case "MultiEdit":
-		return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + `<｜DSML｜parameter name="edits"><item><old_string>` + promptCDATA("foo") + `</old_string><new_string>` + promptCDATA("bar") + `</new_string></item></｜DSML｜parameter>`, true
+		return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + `<|DSML|parameter name="edits"><item><old_string>` + promptCDATA("foo") + `</old_string><new_string>` + promptCDATA("bar") + `</new_string></item></|DSML|parameter>`, true
 	}
 	return "", false
 }
@@ -212,11 +212,11 @@ func exampleBasicParams(name string) (string, bool) {
 func exampleNestedParams(name string) (string, bool) {
 	switch strings.TrimSpace(name) {
 	case "MultiEdit":
-		return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + `<｜DSML｜parameter name="edits"><item><old_string>` + promptCDATA("foo") + `</old_string><new_string>` + promptCDATA("bar") + `</new_string></item></｜DSML｜parameter>`, true
+		return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + `<|DSML|parameter name="edits"><item><old_string>` + promptCDATA("foo") + `</old_string><new_string>` + promptCDATA("bar") + `</new_string></item></|DSML|parameter>`, true
 	case "Task":
 		return wrapParameter("description", promptCDATA("Investigate flaky tests")) + "\n" + wrapParameter("prompt", promptCDATA("Run targeted tests and summarize failures")), true
 	case "ask_followup_question":
-		return wrapParameter("question", promptCDATA("Which approach do you prefer?")) + "\n" + `<｜DSML｜parameter name="follow_up"><item><text>` + promptCDATA("Option A") + `</text></item><item><text>` + promptCDATA("Option B") + `</text></item></｜DSML｜parameter>`, true
+		return wrapParameter("question", promptCDATA("Which approach do you prefer?")) + "\n" + `<|DSML|parameter name="follow_up"><item><text>` + promptCDATA("Option A") + `</text></item><item><text>` + promptCDATA("Option B") + `</text></item></|DSML|parameter>`, true
 	}
 	return "", false
 }
diff --git a/internal/toolcall/tool_prompt_test.go b/internal/toolcall/tool_prompt_test.go
index 1c3757c1..66bbe7a2 100644
--- a/internal/toolcall/tool_prompt_test.go
+++ b/internal/toolcall/tool_prompt_test.go
@@ -7,20 +7,20 @@ import (
 
 func TestBuildToolCallInstructions_ExecCommandUsesCmdExample(t *testing.T) {
 	out := BuildToolCallInstructions([]string{"exec_command"})
-	if !strings.Contains(out, `<｜DSML｜invoke name="exec_command">`) {
+	if !strings.Contains(out, `<|DSML|invoke name="exec_command">`) {
 		t.Fatalf("expected exec_command in examples, got: %s", out)
 	}
-	if !strings.Contains(out, `<｜DSML｜parameter name="cmd"><![CDATA[pwd]]></｜DSML｜parameter>`) {
+	if !strings.Contains(out, `<|DSML|parameter name="cmd"><![CDATA[pwd]]></|DSML|parameter>`) {
 		t.Fatalf("expected cmd parameter example for exec_command, got: %s", out)
 	}
 }
 
 func TestBuildToolCallInstructions_ExecuteCommandUsesCommandExample(t *testing.T) {
 	out := BuildToolCallInstructions([]string{"execute_command"})
-	if !strings.Contains(out, `<｜DSML｜invoke name="execute_command">`) {
+	if !strings.Contains(out, `<|DSML|invoke name="execute_command">`) {
 		t.Fatalf("expected execute_command in examples, got: %s", out)
 	}
-	if !strings.Contains(out, `<｜DSML｜parameter name="command"><![CDATA[pwd]]></｜DSML｜parameter>`) {
+	if !strings.Contains(out, `<|DSML|parameter name="command"><![CDATA[pwd]]></|DSML|parameter>`) {
 		t.Fatalf("expected command parameter example for execute_command, got: %s", out)
 	}
 }
@@ -34,20 +34,20 @@ func TestBuildToolCallInstructions_BashUsesCommandAndDescriptionExamples(t *test
 
 	sawDescription := false
 	for _, block := range blocks {
-		if !strings.Contains(block, `<｜DSML｜parameter name="command">`) {
+		if !strings.Contains(block, `<|DSML|parameter name="command">`) {
 			t.Fatalf("expected every Bash example to use command parameter, got: %s", block)
 		}
-		if strings.Contains(block, `<｜DSML｜parameter name="path">`) || strings.Contains(block, `<｜DSML｜parameter name="content">`) {
+		if strings.Contains(block, `<|DSML|parameter name="path">`) || strings.Contains(block, `<|DSML|parameter name="content">`) {
 			t.Fatalf("expected Bash examples not to use file write parameters, got: %s", block)
 		}
-		if strings.Contains(block, `<｜DSML｜parameter name="description">`) {
+		if strings.Contains(block, `<|DSML|parameter name="description">`) {
 			sawDescription = true
 		}
 	}
 	if !sawDescription {
 		t.Fatalf("expected Bash long-script example to include description, got: %s", out)
 	}
-	if strings.Contains(out, `<｜DSML｜invoke name="Read">`) {
+	if strings.Contains(out, `<|DSML|invoke name="Read">`) {
 		t.Fatalf("expected examples to avoid unavailable hard-coded Read tool, got: %s", out)
 	}
 }
@@ -60,10 +60,10 @@ func TestBuildToolCallInstructions_ExecuteCommandLongScriptUsesCommand(t *testin
 	}
 
 	for _, block := range blocks {
-		if !strings.Contains(block, `<｜DSML｜parameter name="command">`) {
+		if !strings.Contains(block, `<|DSML|parameter name="command">`) {
 			t.Fatalf("expected execute_command examples to use command parameter, got: %s", block)
 		}
-		if strings.Contains(block, `<｜DSML｜parameter name="path">`) || strings.Contains(block, `<｜DSML｜parameter name="content">`) {
+		if strings.Contains(block, `<|DSML|parameter name="path">`) || strings.Contains(block, `<|DSML|parameter name="content">`) {
 			t.Fatalf("expected execute_command examples not to use file write parameters, got: %s", block)
 		}
 	}
@@ -80,10 +80,10 @@ func TestBuildToolCallInstructions_ExecCommandLongScriptUsesCmd(t *testing.T) {
 	}
 
 	for _, block := range blocks {
-		if !strings.Contains(block, `<｜DSML｜parameter name="cmd">`) {
+		if !strings.Contains(block, `<|DSML|parameter name="cmd">`) {
 			t.Fatalf("expected exec_command examples to use cmd parameter, got: %s", block)
 		}
-		if strings.Contains(block, `<｜DSML｜parameter name="command">`) || strings.Contains(block, `<｜DSML｜parameter name="path">`) || strings.Contains(block, `<｜DSML｜parameter name="content">`) {
+		if strings.Contains(block, `<|DSML|parameter name="command">`) || strings.Contains(block, `<|DSML|parameter name="path">`) || strings.Contains(block, `<|DSML|parameter name="content">`) {
 			t.Fatalf("expected exec_command examples not to use command or file write parameters, got: %s", block)
 		}
 	}
@@ -100,10 +100,10 @@ func TestBuildToolCallInstructions_WriteUsesFilePathAndContent(t *testing.T) {
 	}
 
 	for _, block := range blocks {
-		if !strings.Contains(block, `<｜DSML｜parameter name="file_path">`) || !strings.Contains(block, `<｜DSML｜parameter name="content">`) {
+		if !strings.Contains(block, `<|DSML|parameter name="file_path">`) || !strings.Contains(block, `<|DSML|parameter name="content">`) {
 			t.Fatalf("expected Write examples to use file_path and content, got: %s", block)
 		}
-		if strings.Contains(block, `<｜DSML｜parameter name="path">`) {
+		if strings.Contains(block, `<|DSML|parameter name="path">`) {
 			t.Fatalf("expected Write examples not to use path, got: %s", block)
 		}
 	}
@@ -111,7 +111,7 @@ func TestBuildToolCallInstructions_WriteUsesFilePathAndContent(t *testing.T) {
 
 func TestBuildToolCallInstructions_AnchorsMissingOpeningWrapperFailureMode(t *testing.T) {
 	out := BuildToolCallInstructions([]string{"read_file"})
-	if !strings.Contains(out, "Never omit the opening <｜DSML｜tool_calls> tag") {
+	if !strings.Contains(out, "Never omit the opening <|DSML|tool_calls> tag") {
 		t.Fatalf("expected explicit missing-opening-tag warning, got: %s", out)
 	}
 	if !strings.Contains(out, "Wrong 3 — missing opening wrapper") {
@@ -135,7 +135,7 @@ func TestBuildToolCallInstructions_RejectsEmptyParametersInPrompt(t *testing.T)
 
 func TestBuildToolCallInstructions_UsesPositiveTagPunctuationAlphabet(t *testing.T) {
 	out := BuildToolCallInstructions([]string{"Bash"})
-	want := `Tag punctuation alphabet: ASCII < > / = " plus the fullwidth vertical bar ｜.`
+	want := `Tag punctuation alphabet: ASCII < > / = " plus the halfwidth pipe |.`
 	if !strings.Contains(out, want) {
 		t.Fatalf("expected positive tag punctuation alphabet %q, got: %s", want, out)
 	}
@@ -147,7 +147,7 @@ func TestBuildToolCallInstructions_UsesPositiveTagPunctuationAlphabet(t *testing
 }
 
 func findInvokeBlocks(text, name string) []string {
-	open := `<｜DSML｜invoke name="` + name + `">`
+	open := `<|DSML|invoke name="` + name + `">`
 	remaining := text
 	blocks := []string{}
 	for {
@@ -156,11 +156,11 @@ func findInvokeBlocks(text, name string) []string {
 			return blocks
 		}
 		remaining = remaining[start:]
-		end := strings.Index(remaining, `</｜DSML｜invoke>`)
+		end := strings.Index(remaining, `</|DSML|invoke>`)
 		if end < 0 {
 			return blocks
 		}
-		end += len(`</｜DSML｜invoke>`)
+		end += len(`</|DSML|invoke>`)
 		blocks = append(blocks, remaining[:end])
 		remaining = remaining[end:]
 	}
diff --git a/internal/toolcall/toolcalls_candidates.go b/internal/toolcall/toolcalls_candidates.go
index 6fb5a8c7..187d61a9 100644
--- a/internal/toolcall/toolcalls_candidates.go
+++ b/internal/toolcall/toolcalls_candidates.go
@@ -1,4 +1,687 @@
 package toolcall
 
-// toolcalls_candidates.go is reserved for tool-call candidate helper logic.
-// It exists to satisfy the refactor line gate target list.
+import (
+	"strings"
+	"unicode"
+	"unicode/utf8"
+)
+
+type canonicalToolMarkupAttr struct {
+	Key   string
+	Value string
+}
+
+func canonicalizeToolCallCandidateSpans(text string) string {
+	if text == "" {
+		return ""
+	}
+	var b strings.Builder
+	b.Grow(len(text))
+	for i := 0; i < len(text); {
+		next, advanced, blocked := skipXMLIgnoredSection(text, i)
+		if blocked {
+			b.WriteString(text[i:])
+			break
+		}
+		if advanced {
+			b.WriteString(text[i:next])
+			i = next
+			continue
+		}
+		tag, ok := scanToolMarkupTagAt(text, i)
+		if !ok {
+			b.WriteByte(text[i])
+			i++
+			continue
+		}
+		b.WriteString(canonicalizeRecognizedToolMarkupTag(text[tag.Start:tag.End+1], tag))
+		i = tag.End + 1
+	}
+	return b.String()
+}
+
+func canonicalizeRecognizedToolMarkupTag(raw string, tag ToolMarkupTag) string {
+	if raw == "" {
+		return raw
+	}
+	idx := 0
+	if delimLen := xmlTagStartDelimiterLenAt(raw, idx); delimLen > 0 {
+		idx += delimLen
+	}
+	for {
+		idx = skipToolMarkupIgnorables(raw, idx)
+		if delimLen := xmlTagStartDelimiterLenAt(raw, idx); delimLen > 0 {
+			idx += delimLen
+			continue
+		}
+		break
+	}
+	idx = skipToolMarkupIgnorables(raw, idx)
+	if tag.Closing {
+		if next, ok := consumeToolMarkupClosingSlash(raw, idx); ok {
+			idx = next
+		}
+	}
+	idx, _ = consumeToolMarkupNamePrefix(raw, idx)
+	afterName, ok := consumeToolKeyword(raw, idx, rawNameForTag(tag))
+	if !ok {
+		afterName = idx
+	}
+
+	attrs := parseCanonicalToolMarkupAttrs(raw, afterName)
+
+	var b strings.Builder
+	b.Grow(len(raw) + 8)
+	b.WriteByte('<')
+	if tag.Closing {
+		b.WriteByte('/')
+	}
+	if tag.DSMLLike {
+		b.WriteString("|DSML|")
+	}
+	b.WriteString(tag.Name)
+	for _, attr := range attrs {
+		if attr.Key == "" {
+			continue
+		}
+		b.WriteByte(' ')
+		b.WriteString(attr.Key)
+		b.WriteString(`="`)
+		b.WriteString(quoteCanonicalXMLAttrValue(attr.Value))
+		b.WriteByte('"')
+	}
+	if tag.SelfClosing {
+		b.WriteByte('/')
+	}
+	b.WriteByte('>')
+	return b.String()
+}
+
+func rawNameForTag(tag ToolMarkupTag) string {
+	for _, name := range toolMarkupNames {
+		if name.canonical == tag.Name {
+			return name.raw
+		}
+	}
+	return tag.Name
+}
+
+func parseCanonicalToolMarkupAttrs(raw string, idx int) []canonicalToolMarkupAttr {
+	if raw == "" || idx >= len(raw) {
+		return nil
+	}
+	var out []canonicalToolMarkupAttr
+	for idx < len(raw) {
+		idx = skipToolMarkupIgnorables(raw, idx)
+		if idx >= len(raw) {
+			break
+		}
+		if spacingLen := toolMarkupWhitespaceLikeLenAt(raw, idx); spacingLen > 0 {
+			idx += spacingLen
+			continue
+		}
+		if xmlTagEndDelimiterLenAt(raw, idx) > 0 {
+			break
+		}
+		if next, ok := consumeToolMarkupPipe(raw, idx); ok {
+			idx = next
+			continue
+		}
+		if next, ok := consumeToolMarkupClosingSlash(raw, idx); ok {
+			idx = next
+			continue
+		}
+
+		keyStart := idx
+		for idx < len(raw) {
+			idx = skipToolMarkupIgnorables(raw, idx)
+			if idx >= len(raw) {
+				break
+			}
+			if spacingLen := toolMarkupWhitespaceLikeLenAt(raw, idx); spacingLen > 0 {
+				break
+			}
+			if toolMarkupEqualsLenAt(raw, idx) > 0 || xmlTagEndDelimiterLenAt(raw, idx) > 0 {
+				break
+			}
+			if _, ok := consumeToolMarkupPipe(raw, idx); ok {
+				break
+			}
+			if _, ok := consumeToolMarkupClosingSlash(raw, idx); ok {
+				break
+			}
+			_, size := utf8.DecodeRuneInString(raw[idx:])
+			if size <= 0 {
+				idx++
+			} else {
+				idx += size
+			}
+		}
+		keyEnd := idx
+		key := normalizeCanonicalToolAttrKey(raw[keyStart:keyEnd])
+		idx = skipToolMarkupIgnorables(raw, idx)
+		for {
+			spacingLen := toolMarkupWhitespaceLikeLenAt(raw, idx)
+			if spacingLen == 0 {
+				break
+			}
+			idx += spacingLen
+			idx = skipToolMarkupIgnorables(raw, idx)
+		}
+		if eqLen := toolMarkupEqualsLenAt(raw, idx); eqLen > 0 {
+			idx += eqLen
+		} else {
+			continue
+		}
+		idx = skipToolMarkupIgnorables(raw, idx)
+		for {
+			spacingLen := toolMarkupWhitespaceLikeLenAt(raw, idx)
+			if spacingLen == 0 {
+				break
+			}
+			idx += spacingLen
+			idx = skipToolMarkupIgnorables(raw, idx)
+		}
+		if key == "" {
+			_, size := utf8.DecodeRuneInString(raw[idx:])
+			if size <= 0 {
+				idx++
+			} else {
+				idx += size
+			}
+			continue
+		}
+
+		value := ""
+		if quote, quoteLen := xmlQuotePairAt(raw, idx); quoteLen > 0 {
+			valueStart := idx + quoteLen
+			idx = valueStart
+			for idx < len(raw) {
+				if closeLen := xmlQuoteCloseDelimiterLenAt(raw, idx, quote); closeLen > 0 {
+					value = raw[valueStart:idx]
+					idx += closeLen
+					break
+				}
+				_, size := utf8.DecodeRuneInString(raw[idx:])
+				if size <= 0 {
+					idx++
+				} else {
+					idx += size
+				}
+			}
+		} else {
+			valueStart := idx
+			for idx < len(raw) {
+				if spacingLen := toolMarkupWhitespaceLikeLenAt(raw, idx); spacingLen > 0 {
+					break
+				}
+				if xmlTagEndDelimiterLenAt(raw, idx) > 0 || toolMarkupEqualsLenAt(raw, idx) > 0 {
+					break
+				}
+				if _, ok := consumeToolMarkupPipe(raw, idx); ok {
+					break
+				}
+				if _, ok := consumeToolMarkupClosingSlash(raw, idx); ok {
+					break
+				}
+				_, size := utf8.DecodeRuneInString(raw[idx:])
+				if size <= 0 {
+					idx++
+				} else {
+					idx += size
+				}
+			}
+			value = raw[valueStart:idx]
+		}
+
+		out = append(out, canonicalToolMarkupAttr{
+			Key:   key,
+			Value: value,
+		})
+	}
+	return out
+}
+
+func normalizeCanonicalToolAttrKey(raw string) string {
+	trimmed := strings.TrimSpace(removeToolMarkupIgnorables(raw))
+	if trimmed == "" {
+		return ""
+	}
+	if next, ok := consumeToolKeyword(trimmed, 0, "name"); ok {
+		if skipToolMarkupIgnorables(trimmed, next) == len(trimmed) {
+			return "name"
+		}
+	}
+	return ""
+}
+
+func quoteCanonicalXMLAttrValue(raw string) string {
+	if raw == "" {
+		return ""
+	}
+	return strings.ReplaceAll(raw, `"`, "&quot;")
+}
+
+func removeToolMarkupIgnorables(raw string) string {
+	if raw == "" {
+		return ""
+	}
+	var b strings.Builder
+	b.Grow(len(raw))
+	for i := 0; i < len(raw); {
+		if ignorableLen := toolMarkupIgnorableLenAt(raw, i); ignorableLen > 0 {
+			i += ignorableLen
+			continue
+		}
+		r, size := utf8.DecodeRuneInString(raw[i:])
+		if size <= 0 {
+			b.WriteByte(raw[i])
+			i++
+			continue
+		}
+		b.WriteRune(r)
+		i += size
+	}
+	return b.String()
+}
+
+func skipToolMarkupIgnorables(text string, idx int) int {
+	for idx < len(text) {
+		if ignorableLen := toolMarkupIgnorableLenAt(text, idx); ignorableLen > 0 {
+			idx += ignorableLen
+			continue
+		}
+		break
+	}
+	return idx
+}
+
+func toolMarkupIgnorableLenAt(text string, idx int) int {
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	r, size := utf8.DecodeRuneInString(text[idx:])
+	if size <= 0 {
+		return 0
+	}
+	if unicode.Is(unicode.Cf, r) {
+		return size
+	}
+	if unicode.IsControl(r) && !unicode.IsSpace(r) {
+		return size
+	}
+	return 0
+}
+
+func toolMarkupEqualsLenAt(text string, idx int) int {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	switch {
+	case text[idx] == '=':
+		return 1
+	case strings.HasPrefix(text[idx:], "＝"):
+		return len("＝")
+	case strings.HasPrefix(text[idx:], "﹦"):
+		return len("﹦")
+	case strings.HasPrefix(text[idx:], "꞊"):
+		return len("꞊")
+	default:
+		return 0
+	}
+}
+
+func toolMarkupDashLenAt(text string, idx int) int {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	switch {
+	case text[idx] == '-':
+		return 1
+	case strings.HasPrefix(text[idx:], "‐"):
+		return len("‐")
+	case strings.HasPrefix(text[idx:], "‑"):
+		return len("‑")
+	case strings.HasPrefix(text[idx:], "‒"):
+		return len("‒")
+	case strings.HasPrefix(text[idx:], "–"):
+		return len("–")
+	case strings.HasPrefix(text[idx:], "—"):
+		return len("—")
+	case strings.HasPrefix(text[idx:], "―"):
+		return len("―")
+	case strings.HasPrefix(text[idx:], "−"):
+		return len("−")
+	case strings.HasPrefix(text[idx:], "﹣"):
+		return len("﹣")
+	case strings.HasPrefix(text[idx:], "－"):
+		return len("－")
+	default:
+		return 0
+	}
+}
+
+func toolMarkupUnderscoreLenAt(text string, idx int) int {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	switch {
+	case text[idx] == '_':
+		return 1
+	case strings.HasPrefix(text[idx:], "＿"):
+		return len("＿")
+	case strings.HasPrefix(text[idx:], "﹍"):
+		return len("﹍")
+	case strings.HasPrefix(text[idx:], "﹎"):
+		return len("﹎")
+	case strings.HasPrefix(text[idx:], "﹏"):
+		return len("﹏")
+	default:
+		return 0
+	}
+}
+
+func consumeToolKeyword(text string, idx int, keyword string) (int, bool) {
+	next := idx
+	for i := 0; i < len(keyword); i++ {
+		next = skipToolMarkupIgnorables(text, next)
+		if next >= len(text) {
+			return idx, false
+		}
+		target := asciiLower(keyword[i])
+		switch target {
+		case '_':
+			if underscoreLen := toolMarkupUnderscoreLenAt(text, next); underscoreLen > 0 {
+				next += underscoreLen
+				continue
+			}
+			return idx, false
+		case '-':
+			if dashLen := toolMarkupDashLenAt(text, next); dashLen > 0 {
+				next += dashLen
+				continue
+			}
+			return idx, false
+		default:
+			r, size := utf8.DecodeRuneInString(text[next:])
+			if size <= 0 {
+				return idx, false
+			}
+			folded, ok := foldToolKeywordRune(r)
+			if !ok || folded != target {
+				return idx, false
+			}
+			next += size
+		}
+	}
+	return next, true
+}
+
+func foldToolKeywordRune(r rune) (byte, bool) {
+	if r >= 'Ａ' && r <= 'Ｚ' {
+		r = r - 'Ａ' + 'A'
+	}
+	if r >= 'ａ' && r <= 'ｚ' {
+		r = r - 'ａ' + 'a'
+	}
+	r = unicode.ToLower(r)
+	switch r {
+	case 'a', 'c', 'd', 'e', 'i', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'v':
+		return byte(r), true
+	case 'а', 'Α', 'α':
+		return 'a', true
+	case 'с', 'С', 'ϲ', 'Ϲ':
+		return 'c', true
+	case 'ԁ', 'ⅾ':
+		return 'd', true
+	case 'е', 'Е', 'Ε', 'ε':
+		return 'e', true
+	case 'і', 'І', 'Ι', 'ι', 'ı':
+		return 'i', true
+	case 'к', 'К', 'Κ', 'κ':
+		return 'k', true
+	case 'ⅼ':
+		return 'l', true
+	case 'м', 'М', 'Μ', 'μ':
+		return 'm', true
+	case 'ո':
+		return 'n', true
+	case 'о', 'О', 'Ο', 'ο':
+		return 'o', true
+	case 'р', 'Р', 'Ρ', 'ρ':
+		return 'p', true
+	case 'ѕ', 'Ѕ':
+		return 's', true
+	case 'т', 'Т', 'Τ', 'τ':
+		return 't', true
+	case 'ν', 'Ν', 'ѵ', 'ⅴ':
+		return 'v', true
+	default:
+		return 0, false
+	}
+}
+
+func toolMarkupWhitespaceLikeLenAt(text string, idx int) int {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	switch text[idx] {
+	case ' ', '\t', '\n', '\r':
+		return 1
+	}
+	if strings.HasPrefix(text[idx:], "▁") {
+		return len("▁")
+	}
+	r, size := utf8.DecodeRuneInString(text[idx:])
+	if size > 0 && unicode.IsSpace(r) {
+		return size
+	}
+	return 0
+}
+
+func consumeToolMarkupPipe(text string, idx int) (int, bool) {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx >= len(text) {
+		return idx, false
+	}
+	switch {
+	case text[idx] == '|':
+		return idx + 1, true
+	case strings.HasPrefix(text[idx:], "│"):
+		return idx + len("│"), true
+	case strings.HasPrefix(text[idx:], "∣"):
+		return idx + len("∣"), true
+	case strings.HasPrefix(text[idx:], "❘"):
+		return idx + len("❘"), true
+	case strings.HasPrefix(text[idx:], "ǀ"):
+		return idx + len("ǀ"), true
+	case strings.HasPrefix(text[idx:], "￨"):
+		return idx + len("￨"), true
+	default:
+		return idx, false
+	}
+}
+
+func consumeToolMarkupClosingSlash(text string, idx int) (int, bool) {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx >= len(text) {
+		return idx, false
+	}
+	switch {
+	case text[idx] == '/':
+		return idx + 1, true
+	case strings.HasPrefix(text[idx:], "／"):
+		return idx + len("／"), true
+	case strings.HasPrefix(text[idx:], "∕"):
+		return idx + len("∕"), true
+	case strings.HasPrefix(text[idx:], "⁄"):
+		return idx + len("⁄"), true
+	case strings.HasPrefix(text[idx:], "⧸"):
+		return idx + len("⧸"), true
+	default:
+		return idx, false
+	}
+}
+
+func xmlTagStartDelimiterLenAt(text string, idx int) int {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	switch {
+	case text[idx] == '<':
+		return 1
+	case strings.HasPrefix(text[idx:], "＜"):
+		return len("＜")
+	case strings.HasPrefix(text[idx:], "﹤"):
+		return len("﹤")
+	case strings.HasPrefix(text[idx:], "〈"):
+		return len("〈")
+	default:
+		return 0
+	}
+}
+
+func xmlTagEndDelimiterLenAt(text string, idx int) int {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
+	switch {
+	case text[idx] == '>':
+		return 1
+	case strings.HasPrefix(text[idx:], "＞"):
+		return len("＞")
+	case strings.HasPrefix(text[idx:], "﹥"):
+		return len("﹥")
+	case strings.HasPrefix(text[idx:], "〉"):
+		return len("〉")
+	default:
+		return 0
+	}
+}
+
+func xmlTagEndDelimiterLenEndingAt(text string, end int) int {
+	if end < 0 || end >= len(text) {
+		return 0
+	}
+	if text[end] == '>' {
+		return 1
+	}
+	if end+1 >= len("＞") && text[end+1-len("＞"):end+1] == "＞" {
+		return len("＞")
+	}
+	return 0
+}
+
+func xmlQuotePairAt(text string, idx int) (string, int) {
+	idx = skipToolMarkupIgnorables(text, idx)
+	if idx < 0 || idx >= len(text) {
+		return "", 0
+	}
+	switch {
+	case text[idx] == '"':
+		return `"`, 1
+	case text[idx] == '\'':
+		return `'`, 1
+	case strings.HasPrefix(text[idx:], "“"):
+		return "”", len("“")
+	case strings.HasPrefix(text[idx:], "‘"):
+		return "’", len("‘")
+	case strings.HasPrefix(text[idx:], "＂"):
+		return "＂", len("＂")
+	case strings.HasPrefix(text[idx:], "＇"):
+		return "＇", len("＇")
+	case strings.HasPrefix(text[idx:], "„"):
+		return "”", len("„")
+	case strings.HasPrefix(text[idx:], "‟"):
+		return "”", len("‟")
+	default:
+		return "", 0
+	}
+}
+
+func xmlQuoteCloseDelimiterLenAt(text string, idx int, quote string) int {
+	if quote == "" || idx < 0 || idx >= len(text) {
+		return 0
+	}
+	if strings.HasPrefix(text[idx:], quote) {
+		return len(quote)
+	}
+	return 0
+}
+
+func hasRepairableXMLToolCallsWrapper(text string) bool {
+	if strings.TrimSpace(text) == "" {
+		return false
+	}
+	if strings.Contains(strings.ToLower(text), "<tool_calls") { // an opening wrapper already exists, so there is nothing to repair
+		return false
+	}
+	closeMatches := xmlToolCallsClosePattern.FindAllStringIndex(text, -1)
+	if len(closeMatches) == 0 {
+		return false
+	}
+	invokeLoc := xmlInvokeStartPattern.FindStringIndex(text)
+	if invokeLoc == nil {
+		return false
+	}
+	closeLoc := closeMatches[len(closeMatches)-1]
+	return invokeLoc[0] < closeLoc[0] // the first invoke must start before the last closing wrapper
+}
+
+func toolCDATAOpenLenAt(text string, idx int) int {
+	start := skipToolMarkupIgnorables(text, idx)
+	ltLen := xmlTagStartDelimiterLenAt(text, start)
+	if ltLen == 0 {
+		return 0
+	}
+	pos := start + ltLen
+	for skipped := 0; skipped <= 4 && pos < len(text); skipped++ { // at most 4 separator runes may sit between '<' and "[CDATA["
+		pos = skipToolMarkupIgnorables(text, pos)
+		if pos >= len(text) {
+			return 0
+		}
+		if text[pos] == '[' {
+			pos++
+			next, ok := consumeToolKeyword(text, pos, "cdata")
+			if !ok {
+				return 0
+			}
+			pos = skipToolMarkupIgnorables(text, next)
+			if pos >= len(text) || text[pos] != '[' {
+				return 0
+			}
+			pos++
+			return pos - idx
+		}
+		r, size := utf8.DecodeRuneInString(text[pos:])
+		if size <= 0 || !isToolMarkupSeparator(r) {
+			return 0
+		}
+		pos += size
+	}
+	return 0
+}
+
+func indexToolCDATAOpen(text string, start int) int {
+	for i := maxInt(start, 0); i < len(text); i++ {
+		if toolCDATAOpenLenAt(text, i) > 0 {
+			return i
+		}
+	}
+	return -1
+}
+
+func findTrailingToolCDATACloseStart(text string) int {
+	for i := len(text) - 1; i >= 0; i-- {
+		if closeLen := toolCDATACloseLenAt(text, i); closeLen > 0 && i+closeLen == len(text) {
+			return i
+		}
+	}
+	return -1
+}
diff --git a/internal/toolcall/toolcalls_dsml.go b/internal/toolcall/toolcalls_dsml.go
index a5d9c4ab..6cd595a0 100644
--- a/internal/toolcall/toolcalls_dsml.go
+++ b/internal/toolcall/toolcalls_dsml.go
@@ -2,18 +2,18 @@ package toolcall
 
 import (
 	"strings"
-	"unicode/utf8"
 )
 
 func normalizeDSMLToolCallMarkup(text string) (string, bool) {
 	if text == "" {
 		return "", true
 	}
-	hasAliasLikeMarkup, _ := ContainsToolMarkupSyntaxOutsideIgnored(text)
-	if !hasAliasLikeMarkup {
-		return text, true
+	canonicalized := canonicalizeToolCallCandidateSpans(text)
+	hasDSMLLikeMarkup, hasCanonicalMarkup := ContainsToolMarkupSyntaxOutsideIgnored(canonicalized)
+	if !hasDSMLLikeMarkup && !hasCanonicalMarkup {
+		return canonicalized, true
 	}
-	return rewriteDSMLToolMarkupOutsideIgnored(text), true
+	return rewriteDSMLToolMarkupOutsideIgnored(canonicalized), true
 }
 
 func rewriteDSMLToolMarkupOutsideIgnored(text string) string {
@@ -39,76 +39,19 @@ func rewriteDSMLToolMarkupOutsideIgnored(text string) string {
 			i++
 			continue
 		}
-		if tag.DSMLLike {
-			b.WriteByte('<')
-			if tag.Closing {
-				b.WriteByte('/')
-			}
-			b.WriteString(tag.Name)
-			tail := normalizeToolMarkupTagTailForXML(text[tag.NameEnd : tag.End+1])
-			b.WriteString(tail)
-			if !strings.HasSuffix(tail, ">") {
-				b.WriteByte('>')
-			}
-			i = tag.End + 1
-			continue
+		b.WriteByte('<')
+		if tag.Closing {
+			b.WriteByte('/')
 		}
-		b.WriteString(text[tag.Start : tag.End+1])
-		i = tag.End + 1
-	}
-	return b.String()
-}
-
-func normalizeToolMarkupTagTailForXML(tail string) string {
-	if tail == "" {
-		return ""
-	}
-	var b strings.Builder
-	b.Grow(len(tail))
-	quote := rune(0)
-	for i := 0; i < len(tail); {
-		r, size := utf8.DecodeRuneInString(tail[i:])
-		if r == utf8.RuneError && size == 1 {
-			b.WriteByte(tail[i])
-			i++
-			continue
+		b.WriteString(tag.Name)
+		if delimLen := xmlTagEndDelimiterLenEndingAt(text, tag.End); delimLen > 0 {
+			b.WriteString(text[tag.NameEnd : tag.End+1-delimLen])
+			b.WriteByte('>')
+		} else {
+			b.WriteString(text[tag.NameEnd : tag.End+1])
+			b.WriteByte('>')
 		}
-		ch := normalizeFullwidthASCII(r)
-		if quote != 0 {
-			b.WriteRune(ch)
-			if ch == quote {
-				quote = 0
-			}
-			i += size
-			continue
-		}
-		switch ch {
-		case '"', '\'':
-			quote = ch
-			b.WriteRune(ch)
-		case '|', '!':
-			j := i + size
-			for j < len(tail) {
-				next, nextSize := utf8.DecodeRuneInString(tail[j:])
-				if nextSize <= 0 {
-					break
-				}
-				if next == ' ' || next == '\t' || next == '\r' || next == '\n' {
-					j += nextSize
-					continue
-				}
-				break
-			}
-			next, _ := normalizedASCIIAt(tail, j)
-			if next != '>' {
-				b.WriteRune(ch)
-			}
-		case '>', '/', '=':
-			b.WriteRune(ch)
-		default:
-			b.WriteString(tail[i : i+size])
-		}
-		i += size
+		i = tag.End + 1
 	}
 	return b.String()
 }
diff --git a/internal/toolcall/toolcalls_markup.go b/internal/toolcall/toolcalls_markup.go
index 08cf07e9..fc457317 100644
--- a/internal/toolcall/toolcalls_markup.go
+++ b/internal/toolcall/toolcalls_markup.go
@@ -105,30 +105,16 @@ func extractRawTagValue(inner string) string {
 
 func extractStandaloneCDATA(inner string) (string, bool) {
 	trimmed := strings.TrimSpace(inner)
-	if bodyStart, ok := matchToolCDATAOpenAt(trimmed, 0); ok {
-		end := findStandaloneCDATAEnd(trimmed, bodyStart)
-		if end < 0 {
-			return trimmed[bodyStart:], true
+	if openLen := toolCDATAOpenLenAt(trimmed, 0); openLen > 0 {
+		if closeStart := findTrailingToolCDATACloseStart(trimmed); closeStart >= openLen {
+			return trimmed[openLen:closeStart], true
 		}
-		return trimmed[bodyStart:end], true
-	}
-	return "", false
-}
-
-func findStandaloneCDATAEnd(text string, from int) int {
-	end := -1
-	for searchFrom := from; searchFrom < len(text); {
-		next := indexToolCDATAClose(text, searchFrom)
-		if next < 0 {
-			break
-		}
-		closeEnd := next + toolCDATACloseLenAt(text, next)
-		if strings.TrimSpace(text[closeEnd:]) == "" {
-			end = next
+		if end := findToolCDATAEnd(trimmed, openLen); end >= 0 {
+			return trimmed[openLen:end], true
 		}
-		searchFrom = closeEnd
+		return trimmed[openLen:], true
 	}
-	return end
+	return "", false
 }
 
 func parseJSONLiteralValue(raw string) (any, bool) {
@@ -159,24 +145,22 @@ func SanitizeLooseCDATA(text string) string {
 		return ""
 	}
 
-	const openMarker = "<![cdata["
-	const closeMarker = "]]>"
-
 	var b strings.Builder
 	b.Grow(len(text))
 	changed := false
 	pos := 0
 	for pos < len(text) {
-		start := indexASCIIFold(text, pos, openMarker)
+		start := indexToolCDATAOpen(text, pos)
 		if start < 0 {
 			b.WriteString(text[pos:])
 			break
 		}
-		contentStart := start + len(openMarker)
+		openLen := toolCDATAOpenLenAt(text, start)
+		contentStart := start + openLen
 		b.WriteString(text[pos:start])
 
-		if endRel := indexASCIIFold(text, contentStart, closeMarker); endRel >= 0 {
-			end := endRel + len(closeMarker)
+		if endRel := findToolCDATAEnd(text, contentStart); endRel >= 0 {
+			end := endRel + toolCDATACloseLenAt(text, endRel)
 			b.WriteString(text[start:end])
 			pos = end
 			continue
diff --git a/internal/toolcall/toolcalls_parse.go b/internal/toolcall/toolcalls_parse.go
index 3880da99..4e4f7043 100644
--- a/internal/toolcall/toolcalls_parse.go
+++ b/internal/toolcall/toolcalls_parse.go
@@ -53,7 +53,6 @@ func parseToolCallsDetailedXMLOnly(text string) ToolCallParseResult {
 	if trimmed == "" {
 		return result
 	}
-	result.SawToolCallSyntax = looksLikeToolCallSyntax(trimmed)
 	trimmed = stripFencedCodeBlocks(trimmed)
 	trimmed = strings.TrimSpace(trimmed)
 	if trimmed == "" {
@@ -64,8 +63,9 @@ func parseToolCallsDetailedXMLOnly(text string) ToolCallParseResult {
 	if !ok {
 		return result
 	}
+	result.SawToolCallSyntax = looksLikeToolCallSyntax(normalized) || hasRepairableXMLToolCallsWrapper(normalized)
 	parsed := parseXMLToolCalls(normalized)
-	if len(parsed) == 0 && strings.Contains(strings.ToLower(normalized), "<![cdata[") {
+	if len(parsed) == 0 && indexToolCDATAOpen(normalized, 0) >= 0 {
 		recovered := SanitizeLooseCDATA(normalized)
 		if recovered != normalized {
 			parsed = parseXMLToolCalls(recovered)
@@ -154,7 +154,7 @@ func stripFencedCodeBlocks(text string) string {
 }
 
 func cdataStartsBeforeFence(line string) bool {
-	cdataIdx := strings.Index(strings.ToLower(line), "<![cdata[")
+	cdataIdx := indexToolCDATAOpen(line, 0)
 	if cdataIdx < 0 {
 		return false
 	}
@@ -183,11 +183,14 @@ func updateCDATAStateForStrip(inCDATA bool, cdataFenceMarker, line string) (bool
 	fenceMarker := cdataFenceMarker
 	lineForFence := line
 	if !state {
-		start := indexASCIIFold(line, pos, "<![cdata[")
+		start := indexToolCDATAOpen(line, pos)
 		if start < 0 {
 			return false, ""
 		}
-		pos = start + len("<![cdata[")
+		pos = start + toolCDATAOpenLenAt(line, start)
+		if pos > len(line) {
+			pos = len(line)
+		}
 		state = true
 		lineForFence = line[pos:]
 	}
@@ -205,22 +208,36 @@ func updateCDATAStateForStrip(inCDATA bool, cdataFenceMarker, line string) (bool
 	}
 
 	for pos < len(line) {
-		endPos := indexASCIIFold(line, pos, "]]>")
+		endPos := -1
+		closeLen := 0
+		for search := pos; search < len(line); search++ {
+			if foundLen := toolCDATACloseLenAt(line, search); foundLen > 0 {
+				endPos = search
+				closeLen = foundLen
+				break
+			}
+		}
 		if endPos < 0 {
 			return true, fenceMarker
 		}
-		pos = endPos + len("]]>")
+		pos = endPos + closeLen
+		if pos > len(line) {
+			pos = len(line)
+		}
 		if fenceMarker != "" {
 			continue
 		}
 		if cdataEndLooksStructural(line, pos) || strings.TrimSpace(line[pos:]) == "" {
 			state = false
 			for pos < len(line) {
-				start := indexASCIIFold(line, pos, "<![cdata[")
+				start := indexToolCDATAOpen(line, pos)
 				if start < 0 {
 					return false, ""
 				}
-				pos = start + len("<![cdata[")
+				pos = start + toolCDATAOpenLenAt(line, start)
+				if pos > len(line) {
+					pos = len(line)
+				}
 				state = true
 				trimmedTail := strings.TrimLeft(line[pos:], " \t")
 				if marker, ok := parseFenceOpen(trimmedTail); ok {
diff --git a/internal/toolcall/toolcalls_parse_markup.go b/internal/toolcall/toolcalls_parse_markup.go
index 4660c506..0d222bd1 100644
--- a/internal/toolcall/toolcalls_parse_markup.go
+++ b/internal/toolcall/toolcalls_parse_markup.go
@@ -229,27 +229,13 @@ func skipXMLIgnoredSection(text string, i int) (next int, advanced bool, blocked
 }
 
 func matchToolCDATAOpenAt(text string, start int) (int, bool) {
-	i, ok := consumeToolMarkupLessThan(text, start)
-	if !ok {
-		return start, false
-	}
-	for skipped := 0; skipped <= 4 && i < len(text); skipped++ {
-		if cdataLen, ok := matchASCIIPrefixFoldAt(text, i, "[cdata["); ok {
-			return i + cdataLen, true
-		}
-		r, size := utf8.DecodeRuneInString(text[i:])
-		if size <= 0 || !isToolCDATAOpenSeparator(r) {
-			break
-		}
-		i += size
+	openLen := toolCDATAOpenLenAt(text, start)
+	if openLen > 0 {
+		return start + openLen, true
 	}
 	return start, false
 }
 
-func isToolCDATAOpenSeparator(r rune) bool {
-	return isToolMarkupSeparator(r)
-}
-
 func hasASCIIPrefixFoldAt(text string, start int, prefix string) bool {
 	_, ok := matchASCIIPrefixFoldAt(text, start, prefix)
 	return ok
@@ -280,23 +266,6 @@ func asciiLower(b byte) byte {
 	return b
 }
 
-// indexASCIIFold returns the absolute byte position in s where substr (ASCII-only) is
-// found case-insensitively, scanning forward from start. Returns -1 if not found.
-// Unlike strings.Index on a lowercased copy, this does not allocate or risk byte-length
-// mismatch when non-ASCII runes change width under case folding.
-func indexASCIIFold(s string, start int, substr string) int {
-	if start < 0 || len(s)-start < len(substr) {
-		return -1
-	}
-	end := len(s) - len(substr) + 1
-	for i := start; i < end; i++ {
-		if hasASCIIPrefixFoldAt(s, i, substr) {
-			return i
-		}
-	}
-	return -1
-}
-
 func findToolCDATAEnd(text string, from int) int {
 	if from < 0 || from >= len(text) {
 		return -1
@@ -342,13 +311,19 @@ func indexToolCDATAClose(text string, from int) int {
 }
 
 func toolCDATACloseLenAt(text string, idx int) int {
+	if idx < 0 || idx >= len(text) {
+		return 0
+	}
 	if strings.HasPrefix(text[idx:], "]]〉") {
 		return len("]]〉")
 	}
 	if strings.HasPrefix(text[idx:], "]]＞") {
 		return len("]]＞")
 	}
-	return len("]]>")
+	if strings.HasPrefix(text[idx:], "]]>") {
+		return len("]]>")
+	}
+	return 0
 }
 
 func cdataEndLooksStructural(text string, after int) bool {
diff --git a/internal/toolcall/toolcalls_scan.go b/internal/toolcall/toolcalls_scan.go
index 39727d1d..f14ca15c 100644
--- a/internal/toolcall/toolcalls_scan.go
+++ b/internal/toolcall/toolcalls_scan.go
@@ -2,6 +2,7 @@ package toolcall
 
 import (
 	"strings"
+	"unicode"
 	"unicode/utf8"
 )
 
@@ -148,9 +149,9 @@ func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) {
 		i = next
 	}
 	closing := false
-	if i < len(text) && text[i] == '/' {
+	if next, ok := consumeToolMarkupClosingSlash(text, i); ok {
 		closing = true
-		i++
+		i = next
 	}
 	prefixStart := i
 	i, dsmlLike := consumeToolMarkupNamePrefix(text, i)
@@ -252,17 +253,18 @@ func consumeToolMarkupNamePrefix(text string, idx int) (int, bool) {
 }
 
 func consumeToolMarkupNamePrefixOnce(text string, idx int) (int, bool) {
+	idx = skipToolMarkupIgnorables(text, idx)
 	if next, ok := consumeToolMarkupSeparator(text, idx); ok {
 		return next, true
 	}
-	if idx < len(text) && (text[idx] == ' ' || text[idx] == '\t' || text[idx] == '\r' || text[idx] == '\n') {
-		return idx + 1, true
+	if spacingLen := toolMarkupWhitespaceLikeLenAt(text, idx); spacingLen > 0 {
+		return idx + spacingLen, true
 	}
-	if hasASCIIPrefixFoldAt(text, idx, "dsml") {
-		dsmlLen, _ := matchASCIIPrefixFoldAt(text, idx, "dsml")
-		next := idx + dsmlLen
-		if sep, size := normalizedASCIIAt(text, next); sep == '-' || sep == '_' {
-			next += size
+	if next, ok := consumeToolKeyword(text, idx, "dsml"); ok {
+		if dashLen := toolMarkupDashLenAt(text, next); dashLen > 0 {
+			next += dashLen
+		} else if underscoreLen := toolMarkupUnderscoreLenAt(text, next); underscoreLen > 0 {
+			next += underscoreLen
 		}
 		return next, true
 	}
@@ -353,8 +355,8 @@ func matchToolMarkupName(text string, start int, dsmlLike bool) (string, int) {
 		if name.dsmlOnly && !dsmlLike {
 			continue
 		}
-		if nameLen, ok := matchASCIIPrefixFoldAt(text, start, name.raw); ok {
-			return name.canonical, nameLen
+		if next, ok := consumeToolKeyword(text, start, name.raw); ok {
+			return name.canonical, next - start
 		}
 	}
 	return "", 0
@@ -366,14 +368,14 @@ func matchToolMarkupNameAfterArbitraryPrefix(text string, start int) (string, in
 			return "", 0, 0, false
 		}
 		for _, name := range toolMarkupNames {
-			nameLen, ok := matchASCIIPrefixFoldAt(text, idx, name.raw)
+			next, ok := consumeToolKeyword(text, idx, name.raw)
 			if !ok {
 				continue
 			}
 			if !toolMarkupPrefixAllowsLocalNameAt(text, start, idx) {
 				continue
 			}
-			return name.canonical, idx, nameLen, true
+			return name.canonical, idx, next - idx, true
 		}
 		_, size := utf8.DecodeRuneInString(text[idx:])
 		if size <= 0 {
@@ -477,6 +479,7 @@ func isToolMarkupTagTerminator(text string, idx int) bool {
 }
 
 func consumeToolMarkupSeparator(text string, idx int) (int, bool) {
+	idx = skipToolMarkupIgnorables(text, idx)
 	if idx >= len(text) {
 		return idx, false
 	}
@@ -495,6 +498,9 @@ func isToolMarkupSeparator(r rune) bool {
 	if ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r' {
 		return false
 	}
+	if r == '▁' || unicode.IsSpace(r) {
+		return false
+	}
 	if (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || (ch >= '0' && ch <= '9') {
 		return false
 	}
@@ -502,6 +508,7 @@ func isToolMarkupSeparator(r rune) bool {
 }
 
 func consumeToolMarkupLessThan(text string, idx int) (int, bool) {
+	idx = skipToolMarkupIgnorables(text, idx)
 	ch, size := normalizedASCIIAt(text, idx)
 	if size <= 0 || ch != '<' {
 		return idx, false
@@ -510,16 +517,17 @@ func consumeToolMarkupLessThan(text string, idx int) (int, bool) {
 }
 
 func hasToolMarkupBoundary(text string, idx int) bool {
+	idx = skipToolMarkupIgnorables(text, idx)
 	if idx >= len(text) {
 		return true
 	}
-	switch text[idx] {
-	case ' ', '\t', '\n', '\r', '>', '/':
+	if toolMarkupWhitespaceLikeLenAt(text, idx) > 0 {
+		return true
+	}
+	if _, ok := consumeToolMarkupClosingSlash(text, idx); ok {
 		return true
-	default:
-		r, _ := utf8.DecodeRuneInString(text[idx:])
-		return normalizeFullwidthASCII(r) == '>'
 	}
+	return xmlTagEndDelimiterLenAt(text, idx) > 0
 }
 
 func normalizedASCIIAt(text string, idx int) (byte, int) {
diff --git a/internal/toolcall/toolcalls_test.go b/internal/toolcall/toolcalls_test.go
index f706f9cc..28ff9108 100644
--- a/internal/toolcall/toolcalls_test.go
+++ b/internal/toolcall/toolcalls_test.go
@@ -131,14 +131,14 @@ func TestParseToolCallsRejectsCamelPrefixedToolMarkupLookalike(t *testing.T) {
 }
 
 func TestParseToolCallsSupportsFullwidthDSMLShell(t *testing.T) {
-	text := `<ｄＳＭＬ｜tool_calls>
-  <ｄＳＭＬ｜invoke name="Read">
-    <ｄＳＭＬ｜parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/README.md]]＞</ｄＳＭＬ｜parameter>
-  </ｄＳＭＬ｜invoke>
-  <ｄＳＭＬ｜invoke name="Read">
-    <ｄＳＭＬ｜parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/index.html]]＞</ｄＳＭＬ｜parameter>
-  </ｄＳＭＬ｜invoke>
-</ｄＳＭＬ｜tool_calls>`
+	text := `<ｄＳＭＬ|tool_calls>
+  <ｄＳＭＬ|invoke name="Read">
+    <ｄＳＭＬ|parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/README.md]]＞</ｄＳＭＬ|parameter>
+  </ｄＳＭＬ|invoke>
+  <ｄＳＭＬ|invoke name="Read">
+    <ｄＳＭＬ|parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/index.html]]＞</ｄＳＭＬ|parameter>
+  </ｄＳＭＬ|invoke>
+</ｄＳＭＬ|tool_calls>`
 	calls := ParseToolCalls(text, []string{"Read"})
 	if len(calls) != 2 {
 		t.Fatalf("expected two fullwidth DSML calls, got %#v", calls)
@@ -152,20 +152,20 @@ func TestParseToolCallsSupportsFullwidthDSMLShell(t *testing.T) {
 }
 
 func TestParseToolCallsSupportsCJKAngleDSMDrift(t *testing.T) {
-	text := `<DSM｜tool_calls>
-<DSM｜invoke name="Bash">
-<DSM｜parameter name="description"｜>〈![CDATA[Show commits on local dev not on origin/dev]]〉〈/DSM｜parameter〉
-<DSM｜parameter name="command"｜>〈![CDATA[git log --oneline origin/dev..dev]]〉〈/DSM｜parameter〉
-〈/DSM｜invoke〉
-<DSM｜invoke name="Bash">
-<DSM｜parameter name="description"｜>〈![CDATA[Show commits on origin/dev not on local dev]]〉〈/DSM｜parameter〉
-<DSM｜parameter name="command"｜>〈![CDATA[git log --oneline dev..origin/dev]]〉〈/DSM｜parameter〉
-〈/DSM｜invoke〉
-<DSM｜invoke name="Bash">
-<DSM｜parameter name="description"｜>〈![CDATA[Check tracking branch status]]〉〈/DSM｜parameter〉
-<DSM｜parameter name="command"｜>〈![CDATA[git status -b --short]]〉〈/DSM｜parameter〉
-〈/DSM｜invoke〉
-〈/DSM｜tool_calls〉`
+	text := `<DSM|tool_calls>
+<DSM|invoke name="Bash">
+<DSM|parameter name="description"|>〈![CDATA[Show commits on local dev not on origin/dev]]〉〈/DSM|parameter〉
+<DSM|parameter name="command"|>〈![CDATA[git log --oneline origin/dev..dev]]〉〈/DSM|parameter〉
+〈/DSM|invoke〉
+<DSM|invoke name="Bash">
+<DSM|parameter name="description"|>〈![CDATA[Show commits on origin/dev not on local dev]]〉〈/DSM|parameter〉
+<DSM|parameter name="command"|>〈![CDATA[git log --oneline dev..origin/dev]]〉〈/DSM|parameter〉
+〈/DSM|invoke〉
+<DSM|invoke name="Bash">
+<DSM|parameter name="description"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉
+<DSM|parameter name="command"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉
+〈/DSM|invoke〉
+〈/DSM|tool_calls〉`
 
 	calls := ParseToolCalls(text, []string{"Bash"})
 	if len(calls) != 3 {
@@ -1201,3 +1201,108 @@ func TestFindMatchingToolMarkupCloseBoundaryConditions(t *testing.T) {
 		})
 	}
 }
+
+func TestParseToolCallsSupportsDSMLShellWithFullwidthClosingSlash(t *testing.T) {
+	text := `<|DSML|tool_calls><|DSML|invoke name="execute_code"><|DSML|parameter name="code"><![CDATA[print("hi")]]></|DSML|parameter></|DSML|invoke><／DSML|tool_calls>`
+	calls := ParseToolCalls(text, []string{"execute_code"})
+	if len(calls) != 1 {
+		t.Fatalf("expected 1 DSML call with fullwidth closing slash, got %#v", calls)
+	}
+	if calls[0].Name != "execute_code" || calls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected fullwidth-closing-slash DSML parse result: %#v", calls[0])
+	}
+}
+
+func TestParseToolCallsSupportsDSMLShellWithSentencePieceSeparatorAndFullwidthGT(t *testing.T) {
+	text := `<|DSML▁tool_calls|><|DSML▁invoke▁name="execute_code"><|DSML▁parameter▁name="code"><![CDATA[print("hi")]]></|DSML▁parameter></|DSML▁invoke></|DSML▁tool_calls＞`
+	calls := ParseToolCalls(text, []string{"execute_code"})
+	if len(calls) != 1 {
+		t.Fatalf("expected 1 DSML call with sentencepiece separator and fullwidth terminator, got %#v", calls)
+	}
+	if calls[0].Name != "execute_code" || calls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected sentencepiece/fullwidth-terminator DSML parse result: %#v", calls[0])
+	}
+}
+
+func TestParseToolCallsSupportsDSMLShellWithFullwidthLTUnicodeSpaceAndFullwidthAttributes(t *testing.T) {
+	text := `＜|DSML　tool_calls＞＜|DSML　invoke　name＝“execute_code”＞＜|DSML　parameter　name＝“code”＞<![CDATA[print("hi")]]>＜／DSML|parameter＞＜／DSML|invoke＞＜／DSML|tool_calls＞`
+	calls := ParseToolCalls(text, []string{"execute_code"})
+	if len(calls) != 1 {
+		t.Fatalf("expected 1 DSML call with fullwidth opening delimiter and Unicode attribute confusables, got %#v", calls)
+	}
+	if calls[0].Name != "execute_code" || calls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected fullwidth-opening/Unicode-attr DSML parse result: %#v", calls[0])
+	}
+}
+
+func TestParseToolCallsCanonicalizesConfusableCandidateShellOnly(t *testing.T) {
+	text := "<|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls>" +
+		"<|\ufeffDSML|inv\u03bfk\u0435 n\u0430me\uff1d\u201cexecute_code\u201d>" +
+		"<|\u200bDSML|par\u0430meter n\u0430me\uff1d\u201ccode\u201d><![\ufeff\u0421D\u0410T\u0410[print(\"hi\")]]>" +
+		"</|\u200bDSML|par\u0430meter></|\u200bDSML|inv\u03bfk\u0435></|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls>"
+	calls := ParseToolCalls(text, []string{"execute_code"})
+	if len(calls) != 1 {
+		t.Fatalf("expected one confusable-shell call, got %#v", calls)
+	}
+	if calls[0].Name != "execute_code" || calls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected confusable-shell parse result: %#v", calls[0])
+	}
+}
+
+func TestParseToolCallsKeepsConfusableMarkupInsideCDATAAsText(t *testing.T) {
+	value := "<inv\u03bfke>literal</inv\u03bfke>"
+	text := "<tool_calls><invoke name=\"Write\"><parameter name=\"description\"><![\u200b\u0421D\u0410T\u0410[" + value + "]]></parameter></invoke></tool_calls>"
+	calls := ParseToolCalls(text, []string{"Write"})
+	if len(calls) != 1 {
+		t.Fatalf("expected one Write call, got %#v", calls)
+	}
+	if got, _ := calls[0].Input["description"].(string); got != value {
+		t.Fatalf("expected confusable markup example inside CDATA to stay raw, got %q", got)
+	}
+}
+
+func TestParseToolCallsRepairsMissingOpeningWrapperWithConfusableShell(t *testing.T) {
+	text := "Before tool call\n" +
+		"<inv\u03bfk\u0435 n\u0430me=\"read_file\"><par\u0430meter n\u0430me=\"path\"><![\u200b\u0421D\u0410T\u0410[README.md]]></par\u0430meter></inv\u03bfk\u0435>\n" +
+		"</to\u03bfl_calls>\n" +
+		"after"
+	res := ParseToolCallsDetailed(text, []string{"read_file"})
+	if len(res.Calls) != 1 {
+		t.Fatalf("expected repaired confusable wrapper to parse one call, got %#v", res)
+	}
+	if got, _ := res.Calls[0].Input["path"].(string); got != "README.md" {
+		t.Fatalf("expected repaired confusable wrapper to preserve args, got %#v", res.Calls[0].Input)
+	}
+	if !res.SawToolCallSyntax {
+		t.Fatalf("expected repaired confusable wrapper to mark tool syntax seen, got %#v", res)
+	}
+}
+
+func TestParseToolCallsDoesNotAcceptConfusableNearMissTagName(t *testing.T) {
+	text := "<tool_calls><inv\u03bfker name=\"execute_code\"><parameter name=\"code\">pwd</parameter></inv\u03bfker></tool_calls>"
+	calls := ParseToolCalls(text, []string{"execute_code"})
+	if len(calls) != 0 {
+		t.Fatalf("expected confusable near-miss tag name to remain non-executable, got %#v", calls)
+	}
+}
+
+func TestFindMatchingToolMarkupCloseBoundaryConditionsSupportsConfusableDelimiters(t *testing.T) {
+	tests := []struct {
+		name   string
+		text   string
+		open   ToolMarkupTag
+		wantOk bool
+	}{
+		{"valid_fullwidth_closing_slash", "<tool_calls><／tool_calls>", ToolMarkupTag{Name: "tool_calls", End: 11}, true},
+		{"valid_fullwidth_opening_delimiter", "＜tool_calls＞＜／tool_calls＞", ToolMarkupTag{Name: "tool_calls", End: len("＜tool_calls＞") - 1}, true},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			_, ok := FindMatchingToolMarkupClose(tt.text, tt.open)
+			if ok != tt.wantOk {
+				t.Errorf("FindMatchingToolMarkupClose(%q, %+v) ok = %v, want %v", tt.text, tt.open, ok, tt.wantOk)
+			}
+		})
+	}
+}
diff --git a/internal/toolstream/complex_edge_test.go b/internal/toolstream/complex_edge_test.go
index 33377732..5e400d00 100644
--- a/internal/toolstream/complex_edge_test.go
+++ b/internal/toolstream/complex_edge_test.go
@@ -316,11 +316,11 @@ func TestSieve_CharByCharToolCall(t *testing.T) {
 func TestSieve_FullwidthPipeWrapperDSMLInvoke(t *testing.T) {
 	var state State
 	chunks := []string{
-		"<｜tool_calls>\n",
+		"<|tool_calls>\n",
 		"<|DSML|invoke name=\"read_file\">\n",
 		"<|DSML|parameter name=\"path\">README.md</|DSML|parameter>\n",
 		"</|DSML|invoke>\n",
-		"</｜tool_calls>",
+		"</|tool_calls>",
 	}
 	var events []Event
 	for _, c := range chunks {
@@ -382,7 +382,7 @@ func TestSieve_TagMentionInTextThenRealToolCall(t *testing.T) {
 	chunks := []string{
 		"建议的 commit message：\n\nfeat: expand DSML alias support\n\n",
 		"Add support for <dsml|tool_calls>, ",
-		"<｜tool_calls> (fullwidth pipe),\n",
+		"<|tool_calls> (pipe alias),\n",
 		"and <|tool_calls> wrapper variants.\n\n",
 		"<|DSML|tool_calls>\n",
 		"<|DSML|invoke name=\"Bash\">\n",
@@ -466,14 +466,14 @@ func TestSieve_ReviewSampleWithAliasMentionsPreservesBodyAndToolCalls(t *testing
 	chunks := []string{
 		"Done reviewing the diff. Here's my analysis before we commit:\n\n",
 		"Summary of Changes\n",
-		"DSML wrapper variant support — recognize aliases (<dsml|tool_calls>, <|tool_calls>, <｜tool_calls>) alongside canonical <tool_calls> and <|DSML|tool_calls> wrappers.\n\n",
+		"DSML wrapper variant support — recognize aliases (<dsml|tool_calls>, <|tool_calls>) alongside canonical <tool_calls> and <|DSML|tool_calls> wrappers.\n\n",
 		"<|DSML|tool_calls>\n",
 		"<|DSML|invoke name=\"Bash\">\n",
 		"<|DSML|parameter name=\"command\"><![CDATA[git add docs/toolcall-semantics.md internal/toolstream/tool_sieve_xml.go]]></|DSML|parameter>\n",
 		"<|DSML|parameter name=\"description\"><![CDATA[Stage all relevant changed files]]></|DSML|parameter>\n",
 		"</|DSML|invoke>\n",
 		"<|DSML|invoke name=\"Bash\">\n",
-		"<|DSML|parameter name=\"command\"><![CDATA[git commit -m \"$(cat <<'EOF'\nfeat(toolstream): expand DSML wrapper detection\n\nSupport DSML wrapper aliases: <dsml|tool_calls>, <|tool_calls>, <｜tool_calls> alongside existing canonical wrappers.\nEOF\n)\"]]></|DSML|parameter>\n",
+		"<|DSML|parameter name=\"command\"><![CDATA[git commit -m \"$(cat <<'EOF'\nfeat(toolstream): expand DSML wrapper detection\n\nSupport DSML wrapper aliases: <dsml|tool_calls> and <|tool_calls> alongside existing canonical wrappers.\nEOF\n)\"]]></|DSML|parameter>\n",
 		"<|DSML|parameter name=\"description\"><![CDATA[Create commit with all staged changes]]></|DSML|parameter>\n",
 		"</|DSML|invoke>\n",
 		"</|DSML|tool_calls>",
diff --git a/internal/toolstream/tool_sieve_xml.go b/internal/toolstream/tool_sieve_xml.go
index 11294bb3..ccb09a60 100644
--- a/internal/toolstream/tool_sieve_xml.go
+++ b/internal/toolstream/tool_sieve_xml.go
@@ -141,6 +141,9 @@ func shouldKeepBareInvokeCapture(captured string) bool {
 	if invokeCloseTag, ok := findFirstToolMarkupTagByNameFrom(captured, startEnd+1, "invoke", true); ok {
 		return strings.TrimSpace(captured[invokeCloseTag.End+1:]) == ""
 	}
+	if paramTag, ok := findFirstToolMarkupTagByName(body, 0, "parameter"); ok && strings.TrimSpace(body[:paramTag.Start]) == "" {
+		return true
+	}
 
 	trimmedLower := strings.ToLower(trimmedBody)
 	return strings.HasPrefix(trimmedLower, "<parameter") ||
@@ -149,14 +152,14 @@ func shouldKeepBareInvokeCapture(captured string) bool {
 }
 
 func findPartialXMLToolTagStart(s string) int {
-	lastLT := strings.LastIndex(s, "<")
+	lastLT := lastToolMarkupStartDelimiterIndex(s)
 	if lastLT < 0 {
 		return -1
 	}
 	start := includeDuplicateLeadingLessThan(s, lastLT)
 	tail := s[start:]
-	// If there's a '>' in the tail, the tag is closed — not partial.
-	if strings.Contains(tail, ">") {
+	// If the tail contains a tag terminator (ASCII '>' or fullwidth '＞'), the tag is closed, not partial.
+	if strings.Contains(tail, ">") || strings.Contains(tail, "＞") {
 		return -1
 	}
 	if toolcall.IsPartialToolMarkupTagPrefix(tail) {
@@ -164,3 +167,12 @@ func findPartialXMLToolTagStart(s string) int {
 	}
 	return -1
 }
+
+func lastToolMarkupStartDelimiterIndex(s string) int {
+	asciiIdx := strings.LastIndex(s, "<")
+	fullwidthIdx := strings.LastIndex(s, "＜")
+	if asciiIdx > fullwidthIdx {
+		return asciiIdx
+	}
+	return fullwidthIdx
+}
diff --git a/internal/toolstream/tool_sieve_xml_test.go b/internal/toolstream/tool_sieve_xml_test.go
index 780dc1b1..418c8120 100644
--- a/internal/toolstream/tool_sieve_xml_test.go
+++ b/internal/toolstream/tool_sieve_xml_test.go
@@ -626,13 +626,13 @@ func TestProcessToolSieveEmitsAllEmptyDSMLToolBlock(t *testing.T) {
 
 func TestProcessToolSieveEmitsChunkedAllEmptyArbitraryPrefixedToolBlock(t *testing.T) {
 	chunk := strings.Join([]string{
-		`<T｜DSML｜tool_calls>`,
-		`  <T｜DSML｜invoke name="TaskOutput">`,
-		`  <T｜DSML｜parameter name="task_id"></T｜DSML｜parameter>`,
-		`  <T｜DSML｜parameter name="block"></T｜DSML｜parameter>`,
-		`  <T｜DSML｜parameter name="timeout"></T｜DSML｜parameter>`,
-		`  </T｜DSML｜invoke>`,
-		`  </T｜DSML｜tool_calls>`,
+		`<T|DSML|tool_calls>`,
+		`  <T|DSML|invoke name="TaskOutput">`,
+		`  <T|DSML|parameter name="task_id"></T|DSML|parameter>`,
+		`  <T|DSML|parameter name="block"></T|DSML|parameter>`,
+		`  <T|DSML|parameter name="timeout"></T|DSML|parameter>`,
+		`  </T|DSML|invoke>`,
+		`  </T|DSML|tool_calls>`,
 	}, "\n")
 	calls := collectToolCallsForChunks(t, splitEveryNRBytes(chunk, 8), []string{"TaskOutput"})
 	if len(calls) != 1 {
@@ -811,8 +811,8 @@ func TestFindPartialXMLToolTagStart(t *testing.T) {
 		{"partial_tool_calls", "Hello <tool_ca", 6},
 		{"partial_dsml_trailing_pipe", "Hello <|DSML|tool_calls|", 6},
 		{"partial_dsml_extra_leading_less_than", "Hello <<|DSML|tool_calls", 6},
-		{"partial_arbitrary_prefix_before_dsml", "Hello <T｜DS", 6},
-		{"partial_arbitrary_prefix_after_dsml_pipe", "Hello <T｜DSML｜", 6},
+		{"partial_arbitrary_prefix_before_dsml", "Hello <T|DS", 6},
+		{"partial_arbitrary_prefix_after_dsml_pipe", "Hello <T|DSML|", 6},
 		{"partial_invoke", "Hello <inv", 6},
 		{"bare_tool_call_not_held", "Hello <tool_name", -1},
 		{"partial_lt_only", "Text <", 5},
@@ -1091,7 +1091,7 @@ func TestProcessToolSieveRepairsMissingOpeningWrapperWithoutLeakingInvokeText(t
 	}
 }
 
-// Test fullwidth pipe variant: <｜tool_calls> (U+FF5C) should be buffered and parsed.
+// Test the fullwidth pipe variant (U+FF5C, written here as the escape \uff5c): <\uff5ctool_calls> should be buffered and parsed.
 func TestProcessToolSieveFullwidthPipeVariantDoesNotLeak(t *testing.T) {
 	var state State
 	chunks := []string{
@@ -1115,19 +1115,19 @@ func TestProcessToolSieveFullwidthPipeVariantDoesNotLeak(t *testing.T) {
 	}
 
 	if strings.Contains(textContent, "invoke") || strings.Contains(textContent, "execute_command") {
-		t.Fatalf("fullwidth pipe variant leaked to text: %q", textContent)
+		t.Fatalf("escaped U+FF5C pipe variant leaked to text: %q", textContent)
 	}
 	if toolCalls != 1 {
-		t.Fatalf("expected one tool call from fullwidth pipe variant, got %d events=%#v", toolCalls, events)
+		t.Fatalf("expected one tool call from escaped U+FF5C pipe variant, got %d events=%#v", toolCalls, events)
 	}
 }
 
-// Test <｜DSML|tool_calls> with DSML invoke/parameter tags should buffer the
+// Test that <|DSML|tool_calls> with DSML invoke/parameter tags buffers the
 // wrapper instead of leaking it before the block is complete.
 func TestProcessToolSieveFullwidthDSMLPrefixVariantDoesNotLeak(t *testing.T) {
 	var state State
 	chunks := []string{
-		"<｜DSML|tool",
+		"<|DSML|tool",
 		"_calls>\n",
 		"<|DSML|invoke name=\"Bash\">\n",
 		"<|DSML|parameter name=\"command\"><![CDATA[ls -la /Users/aq/Desktop/myproject/ds2api/]]></|DSML|parameter>\n",
@@ -1232,12 +1232,12 @@ func TestProcessToolSieveDSMLBarePrefixVariantDoesNotLeak(t *testing.T) {
 func TestProcessToolSieveCJKAngleDSMDriftDoesNotLeak(t *testing.T) {
 	var state State
 	chunks := []string{
-		"<DSM｜tool_calls>\n",
-		"<DSM｜invoke name=\"Bash\">\n",
-		"<DSM｜parameter name=\"description\"｜>〈![CDATA[Check tracking branch status]]〉〈/DSM｜parameter〉\n",
-		"<DSM｜parameter name=\"command\"｜>〈![CDATA[git status -b --short]]〉〈/DSM｜parameter〉\n",
-		"〈/DSM｜invoke〉\n",
-		"〈/DSM｜tool_calls〉",
+		"<DSM|tool_calls>\n",
+		"<DSM|invoke name=\"Bash\">\n",
+		"<DSM|parameter name=\"description\"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉\n",
+		"<DSM|parameter name=\"command\"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉\n",
+		"〈/DSM|invoke〉\n",
+		"〈/DSM|tool_calls〉",
 	}
 	var events []Event
 	for _, c := range chunks {
@@ -1335,3 +1335,166 @@ func TestProcessToolSieveIdeographicCommaDSMLDriftDoesNotLeak(t *testing.T) {
 		t.Fatalf("unexpected ideographic-comma DSML drift call: %#v", calls[0])
 	}
 }
+
+func TestProcessToolSieveParsesFullwidthClosingSlashAndKeepsSuffixText(t *testing.T) {
+	var state State
+	chunk := `<|DSML|tool_calls><|DSML|invoke name="execute_code"><|DSML|parameter name="code"><![CDATA[print("hi")]]></|DSML|parameter></|DSML|invoke><／DSML|tool_calls> sao cụm này lại đc trả là 1 message`
+	events := ProcessChunk(&state, chunk, []string{"execute_code"})
+	events = append(events, Flush(&state, []string{"execute_code"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	var parsed Event
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		if len(evt.ToolCalls) > 0 {
+			parsed = evt
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+	if toolCalls != 1 {
+		t.Fatalf("expected exactly one parsed tool call from fullwidth closing slash block, got %d events=%#v", toolCalls, events)
+	}
+	if parsed.ToolCalls[0].Name != "execute_code" || parsed.ToolCalls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected parsed call from fullwidth closing slash block: %#v", parsed.ToolCalls[0])
+	}
+	if got := textContent.String(); got != " sao cụm này lại đc trả là 1 message" {
+		t.Fatalf("expected suffix text to be preserved, got %q", got)
+	}
+}
+
+func TestProcessToolSieveParsesSentencePieceSeparatorAndFullwidthTerminator(t *testing.T) {
+	var state State
+	chunk := `<|DSML▁tool_calls|><|DSML▁invoke▁name="execute_code"><|DSML▁parameter▁name="code"><![CDATA[print("hi")]]></|DSML▁parameter></|DSML▁invoke></|DSML▁tool_calls＞ suffix`
+	events := ProcessChunk(&state, chunk, []string{"execute_code"})
+	events = append(events, Flush(&state, []string{"execute_code"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	var parsed Event
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		if len(evt.ToolCalls) > 0 {
+			parsed = evt
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+	if toolCalls != 1 {
+		t.Fatalf("expected exactly one parsed tool call from sentencepiece/fullwidth-terminator block, got %d events=%#v", toolCalls, events)
+	}
+	if parsed.ToolCalls[0].Name != "execute_code" || parsed.ToolCalls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected parsed call from sentencepiece/fullwidth-terminator block: %#v", parsed.ToolCalls[0])
+	}
+	if got := textContent.String(); got != " suffix" {
+		t.Fatalf("expected suffix text to be preserved, got %q", got)
+	}
+}
+
+func TestProcessToolSieveParsesFullwidthOpeningDelimiterAndUnicodeAttributes(t *testing.T) {
+	var state State
+	chunk := `＜|DSML　tool_calls＞＜|DSML　invoke　name＝“execute_code”＞＜|DSML　parameter　name＝“code”＞<![CDATA[print("hi")]]>＜／DSML|parameter＞＜／DSML|invoke＞＜／DSML|tool_calls＞ suffix`
+	events := ProcessChunk(&state, chunk, []string{"execute_code"})
+	events = append(events, Flush(&state, []string{"execute_code"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	var parsed Event
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		if len(evt.ToolCalls) > 0 {
+			parsed = evt
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+	if toolCalls != 1 {
+		t.Fatalf("expected exactly one parsed tool call from fullwidth-opening/Unicode-attr block, got %d events=%#v", toolCalls, events)
+	}
+	if parsed.ToolCalls[0].Name != "execute_code" || parsed.ToolCalls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected parsed call from fullwidth-opening/Unicode-attr block: %#v", parsed.ToolCalls[0])
+	}
+	if got := textContent.String(); got != " suffix" {
+		t.Fatalf("expected suffix text to be preserved, got %q", got)
+	}
+}
+
+func TestProcessToolSieveParsesConfusableCandidateShellAndKeepsSuffixText(t *testing.T) {
+	var state State
+	chunk := "<|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls><|\ufeffDSML|inv\u03bfk\u0435 n\u0430me\uff1d\u201cexecute_code\u201d><|\u200bDSML|par\u0430meter n\u0430me\uff1d\u201ccode\u201d><![\ufeff\u0421D\u0410T\u0410[print(\"hi\")]]></|\u200bDSML|par\u0430meter></|\u200bDSML|inv\u03bfk\u0435></|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls> suffix"
+	events := ProcessChunk(&state, chunk, []string{"execute_code"})
+	events = append(events, Flush(&state, []string{"execute_code"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	var parsed Event
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		if len(evt.ToolCalls) > 0 {
+			parsed = evt
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+	if toolCalls != 1 {
+		t.Fatalf("expected exactly one parsed tool call from confusable-shell block, got %d events=%#v", toolCalls, events)
+	}
+	if parsed.ToolCalls[0].Name != "execute_code" || parsed.ToolCalls[0].Input["code"] != `print("hi")` {
+		t.Fatalf("unexpected parsed call from confusable-shell block: %#v", parsed.ToolCalls[0])
+	}
+	if got := textContent.String(); got != " suffix" {
+		t.Fatalf("expected suffix text to be preserved, got %q", got)
+	}
+}
+
+func TestProcessToolSieveRepairsConfusableMissingWrapperAndKeepsSuffixText(t *testing.T) {
+	var state State
+	chunks := []string{
+		"<inv\u03bfk\u0435 n\u0430me=\"read_file\">\n",
+		"  <par\u0430meter n\u0430me=\"path\"><![\u200b\u0421D\u0410T\u0410[README.md]]></par\u0430meter>\n",
+		"</inv\u03bfk\u0435>\n",
+		"</to\u03bfl_calls> trailing prose",
+	}
+	var events []Event
+	for _, c := range chunks {
+		events = append(events, ProcessChunk(&state, c, []string{"read_file"})...)
+	}
+	events = append(events, Flush(&state, []string{"read_file"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	var parsed Event
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		if len(evt.ToolCalls) > 0 {
+			parsed = evt
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+	if toolCalls != 1 {
+		t.Fatalf("expected repaired confusable missing-wrapper stream to emit one tool call, got %d events=%#v", toolCalls, events)
+	}
+	if parsed.ToolCalls[0].Name != "read_file" || parsed.ToolCalls[0].Input["path"] != "README.md" {
+		t.Fatalf("unexpected parsed call from repaired confusable missing-wrapper block: %#v", parsed.ToolCalls[0])
+	}
+	if got := textContent.String(); got != " trailing prose" {
+		t.Fatalf("expected suffix prose to be preserved, got %q", got)
+	}
+}
+
+func TestProcessToolSieveKeepsConfusableNearMissWrapperAsText(t *testing.T) {
+	var state State
+	chunk := "<to\u03bfl_callz><inv\u03bfke name=\"read_file\"><parameter name=\"path\">README.md</parameter></inv\u03bfke></to\u03bfl_callz>"
+	events := ProcessChunk(&state, chunk, []string{"read_file"})
+	events = append(events, Flush(&state, []string{"read_file"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		toolCalls += len(evt.ToolCalls)
+	}
+	if toolCalls != 0 {
+		t.Fatalf("expected confusable near-miss wrapper to remain text, got %d events=%#v", toolCalls, events)
+	}
+	if got := textContent.String(); got != chunk {
+		t.Fatalf("expected confusable near-miss wrapper to pass through unchanged, got %q", got)
+	}
+}
diff --git a/internal/util/messages_test.go b/internal/util/messages_test.go
index 569e65d1..0b2b1f40 100644
--- a/internal/util/messages_test.go
+++ b/internal/util/messages_test.go
@@ -13,10 +13,10 @@ func TestMessagesPrepareBasic(t *testing.T) {
 	if got == "" {
 		t.Fatal("expected non-empty prompt")
 	}
-	if !strings.HasPrefix(got, "<｜begin▁of▁sentence｜><｜System｜>") {
+	if !strings.HasPrefix(got, "<|begin▁of▁sentence|><|System|>") {
 		t.Fatalf("expected output integrity guard at the start, got %q", got)
 	}
-	if !strings.Contains(got, "Hello") || !strings.HasSuffix(got, "<｜Assistant｜>") {
+	if !strings.Contains(got, "Hello") || !strings.HasSuffix(got, "<|Assistant|>") {
 		t.Fatalf("unexpected prompt: %q", got)
 	}
 }
@@ -33,31 +33,31 @@ func TestMessagesPrepareRoles(t *testing.T) {
 	if !contains(got, "Output integrity guard") {
 		t.Fatalf("expected output integrity guard in %q", got)
 	}
-	if !contains(got, "You are helper") || !contains(got, "<｜User｜>Hi") {
+	if !contains(got, "You are helper") || !contains(got, "<|User|>Hi") {
 		t.Fatalf("expected system/user content in %q", got)
 	}
-	if !contains(got, "<｜begin▁of▁sentence｜>") {
+	if !contains(got, "<|begin▁of▁sentence|>") {
 		t.Fatalf("expected begin marker in %q", got)
 	}
-	if !contains(got, "<｜User｜>Hi<｜Assistant｜>Hello<｜end▁of▁sentence｜>") {
+	if !contains(got, "<|User|>Hi<|Assistant|>Hello<|end▁of▁sentence|>") {
 		t.Fatalf("expected user/assistant separation in %q", got)
 	}
-	if !contains(got, "<｜Assistant｜>Hello<｜end▁of▁sentence｜><｜Tool｜>Search results<｜end▁of▁toolresults｜>") {
+	if !contains(got, "<|Assistant|>Hello<|end▁of▁sentence|><|Tool|>Search results<|end▁of▁toolresults|>") {
 		t.Fatalf("expected assistant/tool separation in %q", got)
 	}
-	if !contains(got, "<｜Tool｜>Search results<｜end▁of▁toolresults｜><｜User｜>How are you") {
+	if !contains(got, "<|Tool|>Search results<|end▁of▁toolresults|><|User|>How are you") {
 		t.Fatalf("expected tool/user separation in %q", got)
 	}
-	if !contains(got, "<｜Assistant｜>") {
+	if !contains(got, "<|Assistant|>") {
 		t.Fatalf("expected assistant marker in %q", got)
 	}
-	if !contains(got, "<｜System｜>") {
+	if !contains(got, "<|System|>") {
 		t.Fatalf("expected system marker in %q", got)
 	}
-	if !contains(got, "<｜User｜>") {
+	if !contains(got, "<|User|>") {
 		t.Fatalf("expected user marker in %q", got)
 	}
-	if !contains(got, "<｜Tool｜>") {
+	if !contains(got, "<|Tool|>") {
 		t.Fatalf("expected tool marker in %q", got)
 	}
 }
diff --git a/internal/util/util_edge_test.go b/internal/util/util_edge_test.go
index 463df1ad..1943d6c2 100644
--- a/internal/util/util_edge_test.go
+++ b/internal/util/util_edge_test.go
@@ -162,20 +162,20 @@ func TestMessagesPrepareMergesConsecutiveSameRole(t *testing.T) {
 		{"role": "user", "content": "World"},
 	}
 	got := MessagesPrepare(messages)
-	if !strings.HasPrefix(got, "<｜begin▁of▁sentence｜>") {
+	if !strings.HasPrefix(got, "<|begin▁of▁sentence|>") {
 		t.Fatalf("expected user marker at the start, got %q", got)
 	}
 	if !strings.Contains(got, "Hello") || !strings.Contains(got, "World") {
 		t.Fatalf("expected both messages, got %q", got)
 	}
 	// Should be merged into a single user turn with one marker at the start.
-	count := strings.Count(got, "<｜User｜>")
+	count := strings.Count(got, "<|User|>")
 	if count != 1 {
 		t.Fatalf("expected one User marker for the merged pair, got %d occurrences", count)
 	}
 	// User messages no longer have end_of_sentence markers in the official format.
 	// The merged pair should have zero end_of_sentence markers (user turn only).
-	if count := strings.Count(got, "<｜end▁of▁sentence｜>"); count != 0 {
+	if count := strings.Count(got, "<|end▁of▁sentence|>"); count != 0 {
 		t.Fatalf("expected zero sentence terminators for user-only merge, got %d occurrences", count)
 	}
 }
@@ -186,16 +186,16 @@ func TestMessagesPrepareAssistantMarkers(t *testing.T) {
 		{"role": "assistant", "content": "Hello!"},
 	}
 	got := MessagesPrepare(messages)
-	if !strings.Contains(got, "<｜Assistant｜>") {
+	if !strings.Contains(got, "<|Assistant|>") {
 		t.Fatalf("expected assistant marker, got %q", got)
 	}
-	if !strings.Contains(got, "<｜end▁of▁sentence｜>") {
+	if !strings.Contains(got, "<|end▁of▁sentence|>") {
 		t.Fatalf("expected end of sentence marker, got %q", got)
 	}
-	if strings.Count(got, "<｜end▁of▁sentence｜>") != 1 {
+	if strings.Count(got, "<|end▁of▁sentence|>") != 1 {
 		t.Fatalf("expected one end_of_sentence (assistant only), got %q", got)
 	}
-	if !strings.Contains(got, "<｜Assistant｜>Hello!<｜end▁of▁sentence｜>") {
+	if !strings.Contains(got, "<|Assistant|>Hello!<|end▁of▁sentence|>") {
 		t.Fatalf("expected assistant EOS suffix, got %q", got)
 	}
 	if strings.Contains(got, "<think>") || strings.Contains(got, "</think>") {
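Note on the marker layout asserted above: the tests expect one `<|begin▁of▁sentence|>` prefix, one role marker per merged turn, and an `<|end▁of▁sentence|>` suffix on assistant turns only. The sketch below illustrates that layout under stated assumptions. `renderHistory` is a hypothetical name, the `"\n\n"` join used when merging consecutive same-role messages is an assumption, and only the `user` and `assistant` roles are handled. It is not the project's `MessagesPrepare`.

```javascript
// Hypothetical sketch of the prompt layout the tests assert; not MessagesPrepare itself.
const BOS = '<|begin\u2581of\u2581sentence|>';
const EOS = '<|end\u2581of\u2581sentence|>';
const ROLE_MARKERS = new Map([
  ['user', '<|User|>'],
  ['assistant', '<|Assistant|>'],
]);

function renderHistory(messages) {
  const turns = [];
  for (const { role, content } of messages) {
    if (!ROLE_MARKERS.has(role)) {
      throw new Error(`unsupported role: ${role}`);
    }
    const last = turns[turns.length - 1];
    if (last && last.role === role) {
      // Merge consecutive same-role messages into one turn.
      // The separator is an assumption made for this sketch.
      last.content += '\n\n' + content;
    } else {
      turns.push({ role, content });
    }
  }
  // Only assistant turns receive the end-of-sentence terminator.
  return BOS + turns
    .map(({ role, content }) => ROLE_MARKERS.get(role) + content + (role === 'assistant' ? EOS : ''))
    .join('');
}
```

Under these assumptions, a `user` "hello" followed by an `assistant` "hi" renders to the same string as the `history_text` fixture in `tests/node/chat-history-utils.test.js`.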
diff --git a/tests/node/chat-history-utils.test.js b/tests/node/chat-history-utils.test.js
index 05bf2f09..d9db46a6 100644
--- a/tests/node/chat-history-utils.test.js
+++ b/tests/node/chat-history-utils.test.js
@@ -18,9 +18,9 @@ test('chat history strict parser merges current input file placeholder', async (
       content: 'Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly.',
     }],
     history_text: [
-      '<｜begin▁of▁sentence｜>',
-      '<｜User｜>hello',
-      '<｜Assistant｜>hi<｜end▁of▁sentence｜>',
+      '<|begin▁of▁sentence|>',
+      '<|User|>hello',
+      '<|Assistant|>hi<|end▁of▁sentence|>',
     ].join(''),
   };
 
@@ -43,9 +43,9 @@ test('chat history strict parser inserts history after system messages', async (
       { role: 'user', content: 'latest' },
     ],
     history_text: [
-      '<｜begin▁of▁sentence｜>',
-      '<｜User｜>old',
-      '<｜Assistant｜>done<｜end▁of▁sentence｜>',
+      '<|begin▁of▁sentence|>',
+      '<|User|>old',
+      '<|Assistant|>done<|end▁of▁sentence|>',
     ].join(''),
   };
 
diff --git a/tests/node/chat-stream.test.js b/tests/node/chat-stream.test.js
index 11461130..5c15d327 100644
--- a/tests/node/chat-stream.test.js
+++ b/tests/node/chat-stream.test.js
@@ -224,6 +224,80 @@ test('vercel stream retries thinking-only output once', async () => {
   assert.equal(parsed[2].choices[0].finish_reason, 'stop');
 });
 
+test('vercel stream switches managed account after empty retry exhaustion', async () => {
+  const originalFetch = global.fetch;
+  const fetchURLs = [];
+  const completionBodies = [];
+  const completionAuth = [];
+  let completionCalls = 0;
+  global.fetch = async (url, init = {}) => {
+    const textURL = String(url);
+    fetchURLs.push(textURL);
+    if (textURL.includes('__stream_prepare=1')) {
+      return jsonResponse({
+        session_id: 'chatcmpl-test',
+        lease_id: 'lease-test',
+        model: 'gpt-test',
+        final_prompt: 'hello',
+        thinking_enabled: true,
+        search_enabled: false,
+        tool_names: [],
+        deepseek_token: 'token-1',
+        pow_header: 'pow-1',
+        payload: { chat_session_id: 'session-1', prompt: 'hello', ref_file_ids: ['file-1'] },
+      });
+    }
+    if (textURL.includes('__stream_pow=1')) {
+      return jsonResponse({ pow_header: 'pow-retry' });
+    }
+    if (textURL.includes('__stream_switch=1')) {
+      return jsonResponse({
+        session_id: 'session-2',
+        lease_id: 'lease-test',
+        model: 'gpt-test',
+        final_prompt: 'hello',
+        thinking_enabled: true,
+        search_enabled: false,
+        tool_names: [],
+        deepseek_token: 'token-2',
+        pow_header: 'pow-2',
+        payload: { chat_session_id: 'session-2', prompt: 'hello', ref_file_ids: ['file-2'] },
+      });
+    }
+    if (textURL.includes('__stream_release=1')) {
+      return jsonResponse({ success: true });
+    }
+    if (textURL === 'https://chat.deepseek.com/api/v0/chat/completion') {
+      completionBodies.push(JSON.parse(String(init.body)));
+      completionAuth.push(init.headers.authorization);
+      completionCalls += 1;
+      if (completionCalls <= 2) {
+        return sseResponse([`data: {"response_message_id":${40 + completionCalls},"p":"response/thinking_content","v":"plan"}\n\n`, 'data: [DONE]\n\n']);
+      }
+      return sseResponse(['data: {"p":"response/content","v":"visible"}\n\n', 'data: [DONE]\n\n']);
+    }
+    throw new Error(`unexpected fetch url: ${textURL}`);
+  };
+  try {
+    const req = new MockStreamRequest();
+    const res = new MockStreamResponse();
+    const payload = { model: 'gpt-test', stream: true };
+    await handleVercelStream(req, res, Buffer.from(JSON.stringify(payload)), payload);
+    const frames = parseSSEDataFrames(res.bodyText());
+    const parsed = frames.filter((frame) => frame !== '[DONE]').map((frame) => JSON.parse(frame));
+    assert.equal(fetchURLs.filter((url) => url.includes('__stream_switch=1')).length, 1);
+    assert.equal(completionBodies.length, 3);
+    assert.match(completionBodies[1].prompt, /Previous reply had no visible output/);
+    assert.equal(completionBodies[1].parent_message_id, 41);
+    assert.equal(completionBodies[2].prompt, 'hello');
+    assert.deepEqual(completionBodies[2].ref_file_ids, ['file-2']);
+    assert.deepEqual(completionAuth, ['Bearer token-1', 'Bearer token-1', 'Bearer token-2']);
+    assert.equal(parsed.at(-1).choices[0].finish_reason, 'stop');
+  } finally {
+    global.fetch = originalFetch;
+  }
+});
+
 test('vercel stream coalesces many small content deltas while keeping one choice', async () => {
   const lines = Array.from({ length: 100 }, () => `data: ${JSON.stringify({ p: 'response/content', v: '字' })}\n\n`);
   lines.push('data: [DONE]\n\n');
@@ -643,6 +717,27 @@ test('parseChunkForContent strips citation and reference markers from fragment c
   assert.deepEqual(parsed.parts, [{ text: '广州天气   多云', type: 'text' }]);
 });
 
+test('parseChunkForContent strips leaked thought control markers from content', () => {
+  const chunk = {
+    p: 'response/content',
+    v: '<|▁of▁thought|>A<| of_thought |>B<| end_of_thought |>C',
+  };
+  const parsed = parseChunkForContent(chunk, false, 'text');
+  assert.equal(parsed.finished, false);
+  assert.deepEqual(parsed.parts, [{ text: 'ABC', type: 'text' }]);
+});
+
+test('parseChunkForContent strips fullwidth-delimited leaked control markers from content', () => {
+  const fw = '\uff5c';
+  const chunk = {
+    p: 'response/content',
+    v: `<${fw}begin▁of▁sentence${fw}>A<${fw}▁of▁thought${fw}>B<${fw} end_of_sentence ${fw}>C`,
+  };
+  const parsed = parseChunkForContent(chunk, false, 'text');
+  assert.equal(parsed.finished, false);
+  assert.deepEqual(parsed.parts, [{ text: 'ABC', type: 'text' }]);
+});
+
 test('parseChunkForContent detects content_filter status and ignores upstream output tokens', () => {
   const chunk = {
     p: 'response',
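Note on the two `parseChunkForContent` tests added above: both expect leaked control markers to be removed from content whether they are delimited by the halfwidth `|` or the fullwidth `｜` (U+FF5C), with optional padding spaces and either `_` or `▁` as the word separator. A minimal sketch of one way to do this follows. `stripLeakedControlMarkers` and the regular expression are hypothetical, cover only the marker spellings that appear in these tests, and may differ from the real implementation.

```javascript
// Hypothetical helper; covers only the marker spellings seen in the tests above.
// Delimiter: '|' or fullwidth '\uff5c'. Word separator: '_' or '\u2581'.
const LEAKED_MARKER =
  /<[|\uff5c]\s*(?:begin|end)?[_\u2581]?of[_\u2581](?:thought|sentence)\s*[|\uff5c]>/g;

function stripLeakedControlMarkers(text) {
  // Remove every matching marker and keep the surrounding content untouched.
  return text.replace(LEAKED_MARKER, '');
}
```

Matching an explicit list of marker names, rather than any `<|...|>` span, keeps unrelated pipe-delimited text such as DSML tool tags out of this filter.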
diff --git a/tests/node/stream-tool-sieve.test.js b/tests/node/stream-tool-sieve.test.js
index 1de053d4..b989fdba 100644
--- a/tests/node/stream-tool-sieve.test.js
+++ b/tests/node/stream-tool-sieve.test.js
@@ -57,6 +57,38 @@ test('parseToolCalls parses DSML shell as XML-compatible tool call', () => {
   assert.deepEqual(calls[0].input, { path: 'README.MD' });
 });
 
+test('parseToolCalls tolerates fullwidth closing slash in DSML wrapper', () => {
+  const payload = '<|DSML|tool_calls><|DSML|invoke name="execute_code"><|DSML|parameter name="code"><![CDATA[print("hi")]]></|DSML|parameter></|DSML|invoke><／DSML|tool_calls>';
+  const calls = parseToolCalls(payload, ['execute_code']);
+  assert.equal(calls.length, 1);
+  assert.equal(calls[0].name, 'execute_code');
+  assert.deepEqual(calls[0].input, { code: 'print("hi")' });
+});
+
+test('parseToolCalls tolerates sentencepiece separator and fullwidth terminator', () => {
+  const payload = '<|DSML▁tool_calls|><|DSML▁invoke▁name="execute_code"><|DSML▁parameter▁name="code"><![CDATA[print("hi")]]></|DSML▁parameter></|DSML▁invoke></|DSML▁tool_calls＞';
+  const calls = parseToolCalls(payload, ['execute_code']);
+  assert.equal(calls.length, 1);
+  assert.equal(calls[0].name, 'execute_code');
+  assert.deepEqual(calls[0].input, { code: 'print("hi")' });
+});
+
+test('parseToolCalls tolerates fullwidth opening delimiter and Unicode attribute confusables', () => {
+  const payload = '＜|DSML　tool_calls＞＜|DSML　invoke　name＝“execute_code”＞＜|DSML　parameter　name＝“code”＞<![CDATA[print("hi")]]>＜／DSML|parameter＞＜／DSML|invoke＞＜／DSML|tool_calls＞';
+  const calls = parseToolCalls(payload, ['execute_code']);
+  assert.equal(calls.length, 1);
+  assert.equal(calls[0].name, 'execute_code');
+  assert.deepEqual(calls[0].input, { code: 'print("hi")' });
+});
+
+test('parseToolCalls canonicalizes confusable candidate shell only', () => {
+  const payload = '<|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls><|\ufeffDSML|inv\u03bfk\u0435 n\u0430me\uff1d\u201cexecute_code\u201d><|\u200bDSML|par\u0430meter n\u0430me\uff1d\u201ccode\u201d><![\ufeff\u0421D\u0410T\u0410[print("hi")]]></|\u200bDSML|par\u0430meter></|\u200bDSML|inv\u03bfk\u0435></|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls>';
+  const calls = parseToolCalls(payload, ['execute_code']);
+  assert.equal(calls.length, 1);
+  assert.equal(calls[0].name, 'execute_code');
+  assert.deepEqual(calls[0].input, { code: 'print("hi")' });
+});
+
 test('parseToolCalls parses hyphenated DSML shell with here-doc CDATA', () => {
   const payload = `<dsml-tool-calls>
 <dsml-invoke name="Bash">
@@ -130,14 +162,14 @@ test('parseToolCalls ignores camel-prefixed tool markup lookalike', () => {
 });
 
 test('parseToolCalls parses fullwidth DSML shell drift', () => {
-  const payload = `<ｄＳＭＬ｜tool_calls>
-  <ｄＳＭＬ｜invoke name="Read">
-    <ｄＳＭＬ｜parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/README.md]]＞</ｄＳＭＬ｜parameter>
-  </ｄＳＭＬ｜invoke>
-  <ｄＳＭＬ｜invoke name="Read">
-    <ｄＳＭＬ｜parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/index.html]]＞</ｄＳＭＬ｜parameter>
-  </ｄＳＭＬ｜invoke>
-</ｄＳＭＬ｜tool_calls>`;
+  const payload = `<ｄＳＭＬ|tool_calls>
+  <ｄＳＭＬ|invoke name="Read">
+    <ｄＳＭＬ|parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/README.md]]＞</ｄＳＭＬ|parameter>
+  </ｄＳＭＬ|invoke>
+  <ｄＳＭＬ|invoke name="Read">
+    <ｄＳＭＬ|parameter name="file_path"＞<![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/index.html]]＞</ｄＳＭＬ|parameter>
+  </ｄＳＭＬ|invoke>
+</ｄＳＭＬ|tool_calls>`;
   const calls = parseToolCalls(payload, ['Read']);
   assert.equal(calls.length, 2);
   assert.equal(calls[0].name, 'Read');
@@ -147,20 +179,20 @@ test('parseToolCalls parses fullwidth DSML shell drift', () => {
 });
 
 test('parseToolCalls parses CJK-angle DSM drift', () => {
-  const payload = `<DSM｜tool_calls>
-<DSM｜invoke name="Bash">
-<DSM｜parameter name="description"｜>〈![CDATA[Show commits on local dev not on origin/dev]]〉〈/DSM｜parameter〉
-<DSM｜parameter name="command"｜>〈![CDATA[git log --oneline origin/dev..dev]]〉〈/DSM｜parameter〉
-〈/DSM｜invoke〉
-<DSM｜invoke name="Bash">
-<DSM｜parameter name="description"｜>〈![CDATA[Show commits on origin/dev not on local dev]]〉〈/DSM｜parameter〉
-<DSM｜parameter name="command"｜>〈![CDATA[git log --oneline dev..origin/dev]]〉〈/DSM｜parameter〉
-〈/DSM｜invoke〉
-<DSM｜invoke name="Bash">
-<DSM｜parameter name="description"｜>〈![CDATA[Check tracking branch status]]〉〈/DSM｜parameter〉
-<DSM｜parameter name="command"｜>〈![CDATA[git status -b --short]]〉〈/DSM｜parameter〉
-〈/DSM｜invoke〉
-〈/DSM｜tool_calls〉`;
+  const payload = `<DSM|tool_calls>
+<DSM|invoke name="Bash">
+<DSM|parameter name="description"|>〈![CDATA[Show commits on local dev not on origin/dev]]〉〈/DSM|parameter〉
+<DSM|parameter name="command"|>〈![CDATA[git log --oneline origin/dev..dev]]〉〈/DSM|parameter〉
+〈/DSM|invoke〉
+<DSM|invoke name="Bash">
+<DSM|parameter name="description"|>〈![CDATA[Show commits on origin/dev not on local dev]]〉〈/DSM|parameter〉
+<DSM|parameter name="command"|>〈![CDATA[git log --oneline dev..origin/dev]]〉〈/DSM|parameter〉
+〈/DSM|invoke〉
+<DSM|invoke name="Bash">
+<DSM|parameter name="description"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉
+<DSM|parameter name="command"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉
+〈/DSM|invoke〉
+〈/DSM|tool_calls〉`;
   const calls = parseToolCalls(payload, ['Bash']);
   assert.equal(calls.length, 3);
   assert.equal(calls[0].name, 'Bash');
@@ -230,13 +262,13 @@ test('parseToolCalls parses arbitrary-prefixed tool tags', () => {
 });
 
 test('parseToolCalls allows all-empty parameter payloads', () => {
-  const payload = `<T｜DSML｜tool_calls>
-  <T｜DSML｜invoke name="TaskOutput">
-    <T｜DSML｜parameter name="task_id"></T｜DSML｜parameter>
-    <T｜DSML｜parameter name="block"></T｜DSML｜parameter>
-    <T｜DSML｜parameter name="timeout"></T｜DSML｜parameter>
-  </T｜DSML｜invoke>
-</T｜DSML｜tool_calls>`;
+  const payload = `<T|DSML|tool_calls>
+  <T|DSML|invoke name="TaskOutput">
+    <T|DSML|parameter name="task_id"></T|DSML|parameter>
+    <T|DSML|parameter name="block"></T|DSML|parameter>
+    <T|DSML|parameter name="timeout"></T|DSML|parameter>
+  </T|DSML|invoke>
+</T|DSML|tool_calls>`;
   const calls = parseToolCalls(payload, ['TaskOutput']);
   assert.equal(calls.length, 1);
   assert.equal(calls[0].name, 'TaskOutput');
@@ -344,6 +376,12 @@ test('parseToolCalls ignores collapsed DSML lookalike tag names', () => {
   assert.equal(calls.length, 0);
 });
 
+test('parseToolCalls rejects confusable near-miss tag names', () => {
+  const payload = '<tool_calls><inv\u03bfker name="execute_code"><parameter name="code">pwd</parameter></inv\u03bfker></tool_calls>';
+  const calls = parseToolCalls(payload, ['execute_code']);
+  assert.equal(calls.length, 0);
+});
+
 test('parseToolCalls keeps canonical XML examples inside DSML CDATA', () => {
   const content = '<tool_calls><invoke name="demo"><parameter name="value">x</parameter></invoke></tool_calls>';
   const payload = `<|DSML|tool_calls><|DSML|invoke name="write_file"><|DSML|parameter name="path">notes.md</|DSML|parameter><|DSML|parameter name="content"><![CDATA[${content}]]></|DSML|parameter></|DSML|invoke></|DSML|tool_calls>`;
@@ -360,6 +398,14 @@ test('parseToolCalls preserves simple inline markup inside CDATA as text', () =>
   assert.equal(calls[0].input.description, '<b>urgent</b>');
 });
 
+test('parseToolCalls keeps confusable markup examples inside CDATA as text', () => {
+  const value = '<inv\u03bfke>literal</inv\u03bfke>';
+  const payload = `<tool_calls><invoke name="Write"><parameter name="description"><![\u200b\u0421D\u0410T\u0410[${value}]]></parameter></invoke></tool_calls>`;
+  const calls = parseToolCalls(payload, ['Write']);
+  assert.equal(calls.length, 1);
+  assert.equal(calls[0].input.description, value);
+});
+
 test('parseToolCalls recovers when CDATA never closes inside a valid wrapper', () => {
   const payload = '<tool_calls><invoke name="Write"><parameter name="content"><![CDATA[hello world</parameter></invoke></tool_calls>';
   const calls = parseToolCalls(payload, ['Write']);
@@ -556,6 +602,65 @@ test('sieve emits tool_calls for DSML space-separator typo', () => {
   assert.equal(text.includes('<|DSML invoke'), false);
 });
 
+test('sieve emits tool_calls for fullwidth closing slash and preserves suffix text', () => {
+  const input = '<|DSML|tool_calls><|DSML|invoke name="execute_code"><|DSML|parameter name="code"><![CDATA[print("hi")]]></|DSML|parameter></|DSML|invoke><／DSML|tool_calls> sao cụm này lại đc trả là 1 message';
+  const events = runSieve([input], ['execute_code']);
+  const text = collectText(events);
+  const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
+  assert.equal(finalCalls.length, 1);
+  assert.equal(finalCalls[0].name, 'execute_code');
+  assert.deepEqual(finalCalls[0].input, { code: 'print("hi")' });
+  assert.equal(text, ' sao cụm này lại đc trả là 1 message');
+});
+
+test('sieve emits tool_calls for sentencepiece separator and fullwidth terminator', () => {
+  const input = '<|DSML▁tool_calls|><|DSML▁invoke▁name="execute_code"><|DSML▁parameter▁name="code"><![CDATA[print("hi")]]></|DSML▁parameter></|DSML▁invoke></|DSML▁tool_calls＞ suffix';
+  const events = runSieve([input], ['execute_code']);
+  const text = collectText(events);
+  const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
+  assert.equal(finalCalls.length, 1);
+  assert.equal(finalCalls[0].name, 'execute_code');
+  assert.deepEqual(finalCalls[0].input, { code: 'print("hi")' });
+  assert.equal(text, ' suffix');
+});
+
+test('sieve emits tool_calls for fullwidth opening delimiter and Unicode attribute confusables', () => {
+  const input = '＜|DSML　tool_calls＞＜|DSML　invoke　name＝“execute_code”＞＜|DSML　parameter　name＝“code”＞<![CDATA[print("hi")]]>＜／DSML|parameter＞＜／DSML|invoke＞＜／DSML|tool_calls＞ suffix';
+  const events = runSieve([input], ['execute_code']);
+  const text = collectText(events);
+  const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
+  assert.equal(finalCalls.length, 1);
+  assert.equal(finalCalls[0].name, 'execute_code');
+  assert.deepEqual(finalCalls[0].input, { code: 'print("hi")' });
+  assert.equal(text, ' suffix');
+});
+
+test('sieve emits tool_calls for confusable candidate shell and preserves suffix text', () => {
+  const input = '<|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls><|\ufeffDSML|inv\u03bfk\u0435 n\u0430me\uff1d\u201cexecute_code\u201d><|\u200bDSML|par\u0430meter n\u0430me\uff1d\u201ccode\u201d><![\ufeff\u0421D\u0410T\u0410[print("hi")]]></|\u200bDSML|par\u0430meter></|\u200bDSML|inv\u03bfk\u0435></|\u200b\uff24\u0405\u039cL|to\u03bfl\uff3fcalls> suffix';
+  const events = runSieve([input], ['execute_code']);
+  const text = collectText(events);
+  const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
+  assert.equal(finalCalls.length, 1);
+  assert.equal(finalCalls[0].name, 'execute_code');
+  assert.deepEqual(finalCalls[0].input, { code: 'print("hi")' });
+  assert.equal(text, ' suffix');
+});
+
+test('sieve repairs confusable missing opening wrapper and preserves suffix text', () => {
+  const events = runSieve([
+    '<inv\u03bfk\u0435 n\u0430me="read_file">\n',
+    '  <par\u0430meter n\u0430me="path"><![\u200b\u0421D\u0410T\u0410[README.md]]></par\u0430meter>\n',
+    '</inv\u03bfk\u0435>\n',
+    '</to\u03bfl_calls> trailing prose',
+  ], ['read_file']);
+  const text = collectText(events);
+  const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
+  assert.equal(finalCalls.length, 1);
+  assert.equal(finalCalls[0].name, 'read_file');
+  assert.deepEqual(finalCalls[0].input, { path: 'README.md' });
+  assert.equal(text, ' trailing prose');
+});
+
 test('sieve emits tool_calls for DSML trailing pipe tag terminator', () => {
   const events = runSieve([
     '<|DSML|tool_calls| \n',
@@ -613,12 +718,12 @@ test('sieve emits tool_calls for arbitrary-prefixed tool tags', () => {
 
 test('sieve emits tool_calls for CJK-angle DSM drift', () => {
   const events = runSieve([
-    '<DSM｜tool_calls>\n',
-    '<DSM｜invoke name="Bash">\n',
-    '<DSM｜parameter name="description"｜>〈![CDATA[Check tracking branch status]]〉〈/DSM｜parameter〉\n',
-    '<DSM｜parameter name="command"｜>〈![CDATA[git status -b --short]]〉〈/DSM｜parameter〉\n',
-    '〈/DSM｜invoke〉\n',
-    '〈/DSM｜tool_calls〉',
+    '<DSM|tool_calls>\n',
+    '<DSM|invoke name="Bash">\n',
+    '<DSM|parameter name="description"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉\n',
+    '<DSM|parameter name="command"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉\n',
+    '〈/DSM|invoke〉\n',
+    '〈/DSM|tool_calls〉',
   ], ['Bash']);
   const finalCalls = events.flatMap((evt) => (evt.type === 'tool_calls' ? evt.calls : []));
   assert.equal(finalCalls.length, 1);
@@ -665,13 +770,13 @@ test('sieve emits tool_calls for ideographic-comma DSML drift', () => {
 
 test('sieve emits all-empty arbitrary-prefixed tool tags without leaking text', () => {
   const payload = [
-    '<T｜DSML｜tool_calls>\n',
-    '  <T｜DSML｜invoke name="TaskOutput">\n',
-    '    <T｜DSML｜parameter name="task_id"></T｜DSML｜parameter>\n',
-    '    <T｜DSML｜parameter name="block"></T｜DSML｜parameter>\n',
-    '    <T｜DSML｜parameter name="timeout"></T｜DSML｜parameter>\n',
-    '  </T｜DSML｜invoke>\n',
-    '</T｜DSML｜tool_calls>',
+    '<T|DSML|tool_calls>\n',
+    '  <T|DSML|invoke name="TaskOutput">\n',
+    '    <T|DSML|parameter name="task_id"></T|DSML|parameter>\n',
+    '    <T|DSML|parameter name="block"></T|DSML|parameter>\n',
+    '    <T|DSML|parameter name="timeout"></T|DSML|parameter>\n',
+    '  </T|DSML|invoke>\n',
+    '</T|DSML|tool_calls>',
   ].join('');
   for (const chunks of [[payload], payload.match(/.{1,8}/gs)]) {
     const events = runSieve(chunks, ['TaskOutput']);
@@ -742,18 +847,26 @@ test('sieve keeps collapsed DSML lookalike tag names as text', () => {
   assert.equal(collectText(events), input);
 });
 
+test('sieve keeps confusable near-miss wrappers as text', () => {
+  const input = '<to\u03bfl_callz><inv\u03bfke name="read_file"><parameter name="path">README.md</parameter></inv\u03bfke></to\u03bfl_callz>';
+  const events = runSieve([input], ['read_file']);
+  const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
+  assert.equal(finalCalls.length, 0);
+  assert.equal(collectText(events), input);
+});
+
 test('sieve preserves review body with alias mentions before real DSML tool calls', () => {
   const events = runSieve([
     "Done reviewing the diff. Here's my analysis before we commit:\n\n",
     'Summary of Changes\n',
-    'DSML wrapper variant support — recognize aliases (<dsml|tool_calls>, <|tool_calls>, <｜tool_calls>) alongside canonical <tool_calls> and <|DSML|tool_calls> wrappers.\n\n',
+    'DSML wrapper variant support — recognize aliases (<dsml|tool_calls>, <|tool_calls>) alongside canonical <tool_calls> and <|DSML|tool_calls> wrappers.\n\n',
     '<|DSML|tool_calls>\n',
     '<|DSML|invoke name="Bash">\n',
     '<|DSML|parameter name="command"><![CDATA[git add docs/toolcall-semantics.md internal/toolstream/tool_sieve_xml.go]]></|DSML|parameter>\n',
     '<|DSML|parameter name="description"><![CDATA[Stage all relevant changed files]]></|DSML|parameter>\n',
     '</|DSML|invoke>\n',
     '<|DSML|invoke name="Bash">\n',
-    '<|DSML|parameter name="command"><![CDATA[git commit -m "$(cat <<\'EOF\'\nfeat(toolstream): expand DSML wrapper detection\n\nSupport DSML wrapper aliases: <dsml|tool_calls>, <|tool_calls>, <｜tool_calls> alongside existing canonical wrappers.\nEOF\n)"]]></|DSML|parameter>\n',
+    '<|DSML|parameter name="command"><![CDATA[git commit -m "$(cat <<\'EOF\'\nfeat(toolstream): expand DSML wrapper detection\n\nSupport DSML wrapper aliases: <dsml|tool_calls> and <|tool_calls> alongside existing canonical wrappers.\nEOF\n)"]]></|DSML|parameter>\n',
     '<|DSML|parameter name="description"><![CDATA[Create commit with all staged changes]]></|DSML|parameter>\n',
     '</|DSML|invoke>\n',
     '</|DSML|tool_calls>',
@@ -880,7 +993,7 @@ test('sieve emits tool_calls when DSML tag spans multiple chunks', () => {
 test('sieve emits tool_calls when fullwidth DSML prefix variant spans multiple chunks', () => {
   const events = runSieve(
     [
-      '<｜DSML|tool',
+      '<|DSML|tool',
       '_calls>\n',
       '<|DSML|invoke name="Bash">\n',
       '<|DSML|parameter name="command"><![CDATA[ls -la /Users/aq/Desktop/myproject/ds2api/]]></|DSML|parameter>\n',
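Note on the delimiter-drift cases in `tests/node/stream-tool-sieve.test.js`: several payloads replace ASCII tag punctuation with fullwidth or typographic lookalikes (`＜`, `＞`, `／`, `＝`, curly quotes, ideographic space) or prepend zero-width characters. The sketch below shows one simple way to fold those characters back to ASCII for a single candidate tag. `normalizeTagDelimiters` and its mapping table are hypothetical and deliberately narrow. They do not handle the Cyrillic or Greek letter confusables or the `▁` separator exercised by the tests, and they are not the project's scanner.

```javascript
// Hypothetical mapping of delimiter lookalikes to ASCII; not the project's scanner.
const DELIMITER_MAP = new Map([
  ['\uff1c', '<'],  // fullwidth less-than
  ['\uff1e', '>'],  // fullwidth greater-than
  ['\uff0f', '/'],  // fullwidth solidus
  ['\uff5c', '|'],  // fullwidth vertical line
  ['\uff1d', '='],  // fullwidth equals
  ['\u201c', '"'],  // left curly quote
  ['\u201d', '"'],  // right curly quote
  ['\u3000', ' '],  // ideographic space
  ['\u200b', ''],   // zero-width space
  ['\ufeff', ''],   // zero-width no-break space (BOM)
]);

function normalizeTagDelimiters(tag) {
  let out = '';
  for (const ch of tag) {
    out += DELIMITER_MAP.has(ch) ? DELIMITER_MAP.get(ch) : ch;
  }
  return out;
}
```

The tests titled "canonicalizes confusable candidate shell only" and "keeps confusable markup examples inside CDATA as text" suggest this kind of folding should apply to the tag shell only, so a caller would pass individual candidate tags rather than the whole payload, leaving CDATA bodies as written.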
diff --git a/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json b/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json
index 02d9cd46..40f7abc4 100644
--- a/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json
+++ b/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json
@@ -5,7 +5,7 @@
   "request": {
     "chat_session_id": "0a3c904d-5761-4cf0-ae51-9b41c1c78f1e",
     "parent_message_id": null,
-    "prompt": "<｜System｜>\n**Memories**\nThese are memories stored via the memory_tool that you can reference in future conversations.\n[]\n\n\n**Recent Chats**\nThese are some of the user's recent conversations. You can use them to understand user preferences:\n[\n    {\n        \"title\": \"\",\n        \"last_chat\": \"2026年4月6日\"\n    },\n    {\n        \"title\": \"\",\n        \"last_chat\": \"2026年4月6日\"\n    },\n    {\n        \"title\": \"江青判刑原因\",\n        \"last_chat\": \"2026年4月5日\"\n    },\n    {\n        \"title\": \"GitHub個人檔案\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"DS2API架構圖\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"Markdown範例\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"廣州天氣概況\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"Xbox手把SVG\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"清除记忆\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"SVG與安卓XML示例\",\n        \"last_chat\": \"2026年4月4日\"\n    }\n]\n\n\n\n\n\n\n\n\n\n\nYou have access to these tools:\n\nTool: memory_tool\nDescription: The memory tool stores long-term information across conversations.\nUse `action` to control the operation: `create` (add), `edit` (update), `delete` (remove).\n- No relevant record: `create` + `content`\n- Existing relevant record: `edit` + `id` + `content`\n- Outdated/irrelevant record: `delete` + `id`\nMemories will automatically appear in the <memories> tag in later conversations.\nDo not store sensitive information (e.g., ethnicity, religion, sexual orientation, political views, sex life, criminal records).\nYou may store: preferred name, preferences, plans, work-related notes, chat style preferences, first chat time, etc.\nDo not show memory content directly in the conversation unless the user explicitly asks.\nToday is 2026年4月6日.\nSimilar memories should be merged; prefer 
updating existing records.\n\nExamples:\n{\"action\":\"create\",\"content\":\"User prefers brief replies and is more active on weekends.\"}\n{\"action\":\"edit\",\"id\":12,\"content\":\"User’s preferred name updated to “A-Xing”, prefers Chinese replies.\"}\n{\"action\":\"delete\",\"id\":7}\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: create, edit, or delete\",\"enum\":[\"create\",\"edit\",\"delete\"],\"type\":\"string\"},\"content\":{\"description\":\"The content of the memory record (required for create/edit)\",\"type\":\"string\"},\"id\":{\"description\":\"The id of the memory record (required for edit/delete)\",\"type\":\"integer\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: search_web\nDescription: Search the web for up-to-date or specific information.\nUse this when the user asks for the latest news, current facts, or needs verification.\nGenerate focused keywords and run multiple searches if needed.\nToday is 2026年4月6日.\n\nResponse format:\n- items[].id (short id), title, url, text\n\nCitations:\n- After using results, add `[citation,domain](id)` after the sentence.\n- Multiple citations are allowed.\n- If no results are cited, omit citations.\n\nExample:\nThe capital of France is Paris. [citation,example.com](abc123)\nThe population is about 2.1 million. 
[citation,example.com](abc123) [citation,example2.com](def456)\nParameters: {\"properties\":{\"query\":{\"description\":\"search keyword\",\"type\":\"string\"},\"topic\":{\"description\":\"search topic (one of `general`, `news`, `finance`)\",\"enum\":[\"general\",\"news\",\"finance\"],\"type\":\"string\"}},\"required\":[\"query\"],\"type\":\"object\"}\n\nTool: scrape_web\nDescription: Scrape a URL for detailed page content.\nUse this when the user requests content from a specific page or when search snippets are insufficient.\nAvoid using it for common questions unless the user asks.\nParameters: {\"properties\":{\"url\":{\"description\":\"url to scrape\",\"type\":\"string\"}},\"required\":[\"url\"],\"type\":\"object\"}\n\nTool: eval_javascript\nDescription: Execute JavaScript code using QuickJS engine (ES2020). The result is the value of the last expression in the code. For calculations with decimals, use toFixed() to control precision. Console output (log/info/warn/error) is captured and returned in 'logs' field. No DOM or Node.js APIs available. Example: '1 + 2' returns 3; 'const x = 5; x * 2' returns 10.\nParameters: {\"properties\":{\"code\":{\"description\":\"The JavaScript code to execute\",\"type\":\"string\"}},\"required\":[\"code\"],\"type\":\"object\"}\n\nTool: get_time_info\nDescription: Get the current local date and time info from the device. Returns year/month/day, weekday, ISO date/time strings, timezone, and timestamp.\nParameters: {\"properties\":{},\"type\":\"object\"}\n\nTool: clipboard_tool\nDescription: Read or write plain text from the device clipboard. Use action: read or write. For write, provide text. 
Do NOT write to the clipboard unless the user has explicitly requested it.\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: read or write\",\"enum\":[\"read\",\"write\"],\"type\":\"string\"},\"text\":{\"description\":\"Text to write to the clipboard (required for write)\",\"type\":\"string\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: text_to_speech\nDescription: Speak text aloud to the user using the device's text-to-speech engine. Use this when the user asks you to read something aloud, or when audio output is appropriate. The tool returns immediately; audio plays in the background on the device. Provide natural, readable text without markdown formatting.\nParameters: {\"properties\":{\"text\":{\"description\":\"The text to speak aloud\",\"type\":\"string\"}},\"required\":[\"text\"],\"type\":\"object\"}\n\nTool: ask_user\nDescription: Ask the user one or more questions when you need clarification, additional information, or confirmation. Each question can optionally provide a list of suggested options for the user to choose from. The user may select an option or provide their own free-text answer for each question. 
The answers will be returned as a JSON object mapping question IDs to the user's responses.\nParameters: {\"properties\":{\"questions\":{\"description\":\"List of questions to ask the user\",\"items\":{\"properties\":{\"id\":{\"description\":\"Unique identifier for this question\",\"type\":\"string\"},\"options\":{\"description\":\"Optional list of suggested options for the user to choose from\",\"items\":{\"type\":\"string\"},\"type\":\"array\"},\"question\":{\"description\":\"The question text to display to the user\",\"type\":\"string\"},\"selection_type\":{\"description\":\"Answer type: text (free text input, default), single (select exactly one option), multi (select one or more options)\",\"enum\":[\"text\",\"single\",\"multi\"],\"type\":\"string\"}},\"required\":[\"id\",\"question\"],\"type\":\"object\"},\"type\":\"array\"}},\"required\":[\"questions\"],\"type\":\"object\"}\n\nTOOL CALL FORMAT — FOLLOW EXACTLY:\n\n<tool_calls>\n  <invoke name=\"TOOL_NAME_HERE\">\n    <parameter name=\"PARAMETER_NAME\"><![CDATA[PARAMETER_VALUE]]></parameter>\n  </invoke>\n</tool_calls>\n\nRULES:\n1) Use the <tool_calls> XML wrapper format only.\n2) Put one or more <invoke> entries under a single <tool_calls> root.\n3) Use <invoke name=\"...\"> for the tool name and <parameter name=\"...\"> for each argument.\n4) All string values should use <![CDATA[...]]> when they may contain code, markup, JSON, paths, prompts, or other special characters.\n5) Objects use nested XML inside a <parameter>; arrays may repeat <item> children.\n6) Numbers, booleans, and null stay plain text.\n7) Use only the parameter names in the tool schema. Do not invent fields.\n8) Do NOT wrap XML in markdown fences. 
Do NOT output explanations, role markers, or internal monologue.\n\nPARAMETER SHAPES:\n- string => <parameter name=\"x\"><![CDATA[value]]></parameter>\n- object => <parameter name=\"x\"><field>...</field></parameter>\n- array => <parameter name=\"x\"><item>...</item></parameter>\n- number/bool/null => plain text\n\n【WRONG — Do NOT do these】:\n\nWrong 1 — mixed text after XML:\n  <tool_calls>...</tool_calls> I hope this helps.\nWrong 2 — old canonical tags or raw payloads:\n  <tools><tool_call><tool_name>read_file</tool_name><param>{\"path\":\"x\"}</param></tool_call></tools>\nWrong 3 — Markdown code fences:\n  ```xml\n  <tool_calls>...</tool_calls>\n  ```\n\nRemember: The ONLY valid way to use tools is the <tool_calls>...</tool_calls> XML block at the end of your response.\n\n【CORRECT EXAMPLES】:\n\nExample A — Single tool:\n<tool_calls>\n  <invoke name=\"read_file\">\n    <parameter name=\"path\"><![CDATA[src/main.go]]></parameter>\n  </invoke>\n</tool_calls>\n\nExample B — Two tools in parallel:\n<tool_calls>\n  <invoke name=\"read_file\">\n    <parameter name=\"path\"><![CDATA[src/main.go]]></parameter>\n  </invoke>\n  <invoke name=\"write_to_file\">\n    <parameter name=\"path\"><![CDATA[output.txt]]></parameter>\n    <parameter name=\"content\"><![CDATA[Hello world]]></parameter>\n  </invoke>\n</tool_calls>\n\nExample C — Tool with nested XML parameters:\n<tool_calls>\n  <invoke name=\"ask_followup_question\">\n    <parameter name=\"question\"><![CDATA[Which approach do you prefer?]]></parameter>\n    <parameter name=\"follow_up\"><item><text><![CDATA[Option A]]></text></item><item><text><![CDATA[Option B]]></text></item></parameter>\n  </invoke>\n</tool_calls>\n<｜end▁of▁instructions｜>\n\n<｜User｜>\n<｜User｜>\n在一个类似2022×2022的花园的每个方格中，最初都有一个高度为0的树，园丁和伐木工交替进行以下游戏，园丁首先开始：园丁选择花园中的一个方格，该方格上的每棵树以及周围至多八个方格中的所有树都会增长一单位，伐木工随后选择板上的四个不同方格，这些方格上正高的树都会减少一单位，称一棵树为雄伟的，如果其高度至少为10的六次方.确定园丁能够确保板上最终有K棵雄伟的树，无论伐木工如何操作，求最大的K<｜end▁of▁sentence｜><｜end▁of▁sentence｜>",
+    "prompt": "<|System|>\n**Memories**\nThese are memories stored via the memory_tool that you can reference in future conversations.\n[]\n\n\n**Recent Chats**\nThese are some of the user's recent conversations. You can use them to understand user preferences:\n[\n    {\n        \"title\": \"\",\n        \"last_chat\": \"2026年4月6日\"\n    },\n    {\n        \"title\": \"\",\n        \"last_chat\": \"2026年4月6日\"\n    },\n    {\n        \"title\": \"江青判刑原因\",\n        \"last_chat\": \"2026年4月5日\"\n    },\n    {\n        \"title\": \"GitHub個人檔案\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"DS2API架構圖\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"Markdown範例\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"廣州天氣概況\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"Xbox手把SVG\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"清除记忆\",\n        \"last_chat\": \"2026年4月4日\"\n    },\n    {\n        \"title\": \"SVG與安卓XML示例\",\n        \"last_chat\": \"2026年4月4日\"\n    }\n]\n\n\n\n\n\n\n\n\n\n\nYou have access to these tools:\n\nTool: memory_tool\nDescription: The memory tool stores long-term information across conversations.\nUse `action` to control the operation: `create` (add), `edit` (update), `delete` (remove).\n- No relevant record: `create` + `content`\n- Existing relevant record: `edit` + `id` + `content`\n- Outdated/irrelevant record: `delete` + `id`\nMemories will automatically appear in the <memories> tag in later conversations.\nDo not store sensitive information (e.g., ethnicity, religion, sexual orientation, political views, sex life, criminal records).\nYou may store: preferred name, preferences, plans, work-related notes, chat style preferences, first chat time, etc.\nDo not show memory content directly in the conversation unless the user explicitly asks.\nToday is 2026年4月6日.\nSimilar memories should be merged; prefer 
updating existing records.\n\nExamples:\n{\"action\":\"create\",\"content\":\"User prefers brief replies and is more active on weekends.\"}\n{\"action\":\"edit\",\"id\":12,\"content\":\"User’s preferred name updated to “A-Xing”, prefers Chinese replies.\"}\n{\"action\":\"delete\",\"id\":7}\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: create, edit, or delete\",\"enum\":[\"create\",\"edit\",\"delete\"],\"type\":\"string\"},\"content\":{\"description\":\"The content of the memory record (required for create/edit)\",\"type\":\"string\"},\"id\":{\"description\":\"The id of the memory record (required for edit/delete)\",\"type\":\"integer\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: search_web\nDescription: Search the web for up-to-date or specific information.\nUse this when the user asks for the latest news, current facts, or needs verification.\nGenerate focused keywords and run multiple searches if needed.\nToday is 2026年4月6日.\n\nResponse format:\n- items[].id (short id), title, url, text\n\nCitations:\n- After using results, add `[citation,domain](id)` after the sentence.\n- Multiple citations are allowed.\n- If no results are cited, omit citations.\n\nExample:\nThe capital of France is Paris. [citation,example.com](abc123)\nThe population is about 2.1 million. 
[citation,example.com](abc123) [citation,example2.com](def456)\nParameters: {\"properties\":{\"query\":{\"description\":\"search keyword\",\"type\":\"string\"},\"topic\":{\"description\":\"search topic (one of `general`, `news`, `finance`)\",\"enum\":[\"general\",\"news\",\"finance\"],\"type\":\"string\"}},\"required\":[\"query\"],\"type\":\"object\"}\n\nTool: scrape_web\nDescription: Scrape a URL for detailed page content.\nUse this when the user requests content from a specific page or when search snippets are insufficient.\nAvoid using it for common questions unless the user asks.\nParameters: {\"properties\":{\"url\":{\"description\":\"url to scrape\",\"type\":\"string\"}},\"required\":[\"url\"],\"type\":\"object\"}\n\nTool: eval_javascript\nDescription: Execute JavaScript code using QuickJS engine (ES2020). The result is the value of the last expression in the code. For calculations with decimals, use toFixed() to control precision. Console output (log/info/warn/error) is captured and returned in 'logs' field. No DOM or Node.js APIs available. Example: '1 + 2' returns 3; 'const x = 5; x * 2' returns 10.\nParameters: {\"properties\":{\"code\":{\"description\":\"The JavaScript code to execute\",\"type\":\"string\"}},\"required\":[\"code\"],\"type\":\"object\"}\n\nTool: get_time_info\nDescription: Get the current local date and time info from the device. Returns year/month/day, weekday, ISO date/time strings, timezone, and timestamp.\nParameters: {\"properties\":{},\"type\":\"object\"}\n\nTool: clipboard_tool\nDescription: Read or write plain text from the device clipboard. Use action: read or write. For write, provide text. 
Do NOT write to the clipboard unless the user has explicitly requested it.\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: read or write\",\"enum\":[\"read\",\"write\"],\"type\":\"string\"},\"text\":{\"description\":\"Text to write to the clipboard (required for write)\",\"type\":\"string\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: text_to_speech\nDescription: Speak text aloud to the user using the device's text-to-speech engine. Use this when the user asks you to read something aloud, or when audio output is appropriate. The tool returns immediately; audio plays in the background on the device. Provide natural, readable text without markdown formatting.\nParameters: {\"properties\":{\"text\":{\"description\":\"The text to speak aloud\",\"type\":\"string\"}},\"required\":[\"text\"],\"type\":\"object\"}\n\nTool: ask_user\nDescription: Ask the user one or more questions when you need clarification, additional information, or confirmation. Each question can optionally provide a list of suggested options for the user to choose from. The user may select an option or provide their own free-text answer for each question. 
The answers will be returned as a JSON object mapping question IDs to the user's responses.\nParameters: {\"properties\":{\"questions\":{\"description\":\"List of questions to ask the user\",\"items\":{\"properties\":{\"id\":{\"description\":\"Unique identifier for this question\",\"type\":\"string\"},\"options\":{\"description\":\"Optional list of suggested options for the user to choose from\",\"items\":{\"type\":\"string\"},\"type\":\"array\"},\"question\":{\"description\":\"The question text to display to the user\",\"type\":\"string\"},\"selection_type\":{\"description\":\"Answer type: text (free text input, default), single (select exactly one option), multi (select one or more options)\",\"enum\":[\"text\",\"single\",\"multi\"],\"type\":\"string\"}},\"required\":[\"id\",\"question\"],\"type\":\"object\"},\"type\":\"array\"}},\"required\":[\"questions\"],\"type\":\"object\"}\n\nTOOL CALL FORMAT — FOLLOW EXACTLY:\n\n<tool_calls>\n  <invoke name=\"TOOL_NAME_HERE\">\n    <parameter name=\"PARAMETER_NAME\"><![CDATA[PARAMETER_VALUE]]></parameter>\n  </invoke>\n</tool_calls>\n\nRULES:\n1) Use the <tool_calls> XML wrapper format only.\n2) Put one or more <invoke> entries under a single <tool_calls> root.\n3) Use <invoke name=\"...\"> for the tool name and <parameter name=\"...\"> for each argument.\n4) All string values should use <![CDATA[...]]> when they may contain code, markup, JSON, paths, prompts, or other special characters.\n5) Objects use nested XML inside a <parameter>; arrays may repeat <item> children.\n6) Numbers, booleans, and null stay plain text.\n7) Use only the parameter names in the tool schema. Do not invent fields.\n8) Do NOT wrap XML in markdown fences. 
Do NOT output explanations, role markers, or internal monologue.\n\nPARAMETER SHAPES:\n- string => <parameter name=\"x\"><![CDATA[value]]></parameter>\n- object => <parameter name=\"x\"><field>...</field></parameter>\n- array => <parameter name=\"x\"><item>...</item></parameter>\n- number/bool/null => plain text\n\n【WRONG — Do NOT do these】:\n\nWrong 1 — mixed text after XML:\n  <tool_calls>...</tool_calls> I hope this helps.\nWrong 2 — old canonical tags or raw payloads:\n  <tools><tool_call><tool_name>read_file</tool_name><param>{\"path\":\"x\"}</param></tool_call></tools>\nWrong 3 — Markdown code fences:\n  ```xml\n  <tool_calls>...</tool_calls>\n  ```\n\nRemember: The ONLY valid way to use tools is the <tool_calls>...</tool_calls> XML block at the end of your response.\n\n【CORRECT EXAMPLES】:\n\nExample A — Single tool:\n<tool_calls>\n  <invoke name=\"read_file\">\n    <parameter name=\"path\"><![CDATA[src/main.go]]></parameter>\n  </invoke>\n</tool_calls>\n\nExample B — Two tools in parallel:\n<tool_calls>\n  <invoke name=\"read_file\">\n    <parameter name=\"path\"><![CDATA[src/main.go]]></parameter>\n  </invoke>\n  <invoke name=\"write_to_file\">\n    <parameter name=\"path\"><![CDATA[output.txt]]></parameter>\n    <parameter name=\"content\"><![CDATA[Hello world]]></parameter>\n  </invoke>\n</tool_calls>\n\nExample C — Tool with nested XML parameters:\n<tool_calls>\n  <invoke name=\"ask_followup_question\">\n    <parameter name=\"question\"><![CDATA[Which approach do you prefer?]]></parameter>\n    <parameter name=\"follow_up\"><item><text><![CDATA[Option A]]></text></item><item><text><![CDATA[Option B]]></text></item></parameter>\n  </invoke>\n</tool_calls>\n<|end▁of▁instructions|>\n\n<|User|>\n<|User|>\n在一个类似2022×2022的花园的每个方格中，最初都有一个高度为0的树，园丁和伐木工交替进行以下游戏，园丁首先开始：园丁选择花园中的一个方格，该方格上的每棵树以及周围至多八个方格中的所有树都会增长一单位，伐木工随后选择板上的四个不同方格，这些方格上正高的树都会减少一单位，称一棵树为雄伟的，如果其高度至少为10的六次方.确定园丁能够确保板上最终有K棵雄伟的树，无论伐木工如何操作，求最大的K<|end▁of▁sentence|><|end▁of▁sentence|>",
     "ref_file_ids": [],
     "search_enabled": false,
     "thinking_enabled": true
diff --git a/webui/src/features/chatHistory/chatHistoryUtils.js b/webui/src/features/chatHistory/chatHistoryUtils.js
index 6359c39e..0c100601 100644
--- a/webui/src/features/chatHistory/chatHistoryUtils.js
+++ b/webui/src/features/chatHistory/chatHistoryUtils.js
@@ -3,14 +3,14 @@ export const DISABLED_LIMIT = 0
 export const MESSAGE_COLLAPSE_AT = 700
 export const VIEW_MODE_KEY = 'ds2api_chat_history_view_mode'
 
-const BEGIN_SENTENCE_MARKER = '<｜begin▁of▁sentence｜>'
-const SYSTEM_MARKER = '<｜System｜>'
-const USER_MARKER = '<｜User｜>'
-const ASSISTANT_MARKER = '<｜Assistant｜>'
-const TOOL_MARKER = '<｜Tool｜>'
-const END_INSTRUCTIONS_MARKER = '<｜end▁of▁instructions｜>'
-const END_SENTENCE_MARKER = '<｜end▁of▁sentence｜>'
-const END_TOOL_RESULTS_MARKER = '<｜end▁of▁toolresults｜>'
+const BEGIN_SENTENCE_MARKER = '<|begin▁of▁sentence|>'
+const SYSTEM_MARKER = '<|System|>'
+const USER_MARKER = '<|User|>'
+const ASSISTANT_MARKER = '<|Assistant|>'
+const TOOL_MARKER = '<|Tool|>'
+const END_INSTRUCTIONS_MARKER = '<|end▁of▁instructions|>'
+const END_SENTENCE_MARKER = '<|end▁of▁sentence|>'
+const END_TOOL_RESULTS_MARKER = '<|end▁of▁toolresults|>'
 const CURRENT_INPUT_FILE_PROMPT = 'Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly.'
 const LEGACY_CURRENT_INPUT_FILE_PROMPTS = new Set([
     'The current request and prior conversation context have already been provided. Answer the latest user request directly.',