Problem
When iterating on a long DeepSeek V4 Pro turn (long system prompt,
many cached repo files, tool definitions, attached @mentions,
multi-step thinking), the developer cannot see what is actually
about to be sent without sending it.
For V4 Pro users this has a concrete, recurring cost:
- Thinking tokens are billed, and a single bad turn can be expensive.
- The 1M context window means accidental prompt bloat is silent
  until /tokens shows it after the fact.
- Tool schemas, system prompt, skill packs, and reasoning_content
  replay (#939, feat(api): audit V4 reasoning_content replay policy)
  all silently inflate the request input. Today the only way to know
  how big the next request is, is to send it.
- Small request-shape regressions are hard to spot from the transcript.
Proposed solution
Add a new read-only slash command, /dryrun (alias /preview-request),
that renders exactly what the next chat completion request would
look like without sending it.
Two output modes:
- /dryrun: one-screen summary:
  - active provider + base URL (with /v1 vs /beta resolved),
  - active model + reasoning effort,
  - message count broken down by role (system / user / assistant /
    tool) and a note if reasoning_content replay would be
    attached on assistant tool-call messages,
  - system prompt size in chars / approx tokens,
  - tool count + total tool-schema approx tokens,
  - composer-draft size if non-empty,
  - estimated total input tokens via the existing
    crate::compaction::estimate_tokens helper, and a footer line
    explicitly labeling it as an approximation.
- /dryrun --full: the full JSON body that would be POSTed,
  pretty-printed, with:
  - Authorization / API-key headers redacted to sk-…<last4>,
  - tool-result message bodies truncated past N chars with a
    [truncated K chars] marker,
  - any field added later (e.g. strict-tool toggle) automatically
    included because the body is built from the same struct the
    engine sends.
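The redaction and truncation rules for /dryrun --full could be sketched roughly as below. This is only an illustration under stated assumptions: the function names redact_api_key and truncate_body are hypothetical, the proposal does not fix a value for N, and the choice to hide keys shorter than 8 characters entirely is an assumption made here, not something the proposal specifies.

```rust
/// Redact an API key for display as `sk-…<last4>`.
/// Assumption: keys shorter than 8 characters are hidden entirely, so a
/// short key never has most of its characters revealed.
pub fn redact_api_key(key: &str) -> String {
    let chars: Vec<char> = key.chars().collect();
    if chars.len() < 8 {
        return "sk-…".to_string();
    }
    let last4: String = chars[chars.len() - 4..].iter().collect();
    format!("sk-…{last4}")
}

/// Truncate a tool-result body past `max_chars` characters, appending a
/// `[truncated K chars]` marker where K is the number of characters dropped.
/// Counts `char`s rather than bytes so multi-byte text is never split.
pub fn truncate_body(body: &str, max_chars: usize) -> String {
    let total = body.chars().count();
    if total <= max_chars {
        return body.to_string();
    }
    let kept: String = body.chars().take(max_chars).collect();
    format!("{kept} [truncated {} chars]", total - max_chars)
}
```

For instance, truncate_body("hello world", 5) keeps the first five characters and reports the six that were dropped. Working on characters instead of byte offsets avoids panics on non-ASCII tool output.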
/dryrun is read-only: it does not mutate api_messages, session.*, telemetry, cache history, or the composer. It does
not call the network. It does not record a turn.
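One way to back the read-only guarantee at compile time is to have the handler take a shared reference to the application state. The sketch below assumes a pared-down, hypothetical AppState; the real app struct and its field names are not shown in this proposal.

```rust
/// Hypothetical, pared-down application state; the real app struct has
/// more fields and different names.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AppState {
    pub api_messages: Vec<String>,
    pub composer_input: String,
    pub turns_recorded: u32,
}

/// Taking `&AppState` (not `&mut AppState`) means the compiler rejects any
/// direct attempt to modify the state while rendering the preview.
pub fn render_dryrun(app: &AppState) -> String {
    let draft_note = if app.composer_input.is_empty() {
        "no draft"
    } else {
        "draft pending"
    };
    format!("{} message(s), {}", app.api_messages.len(), draft_note)
}

/// Snapshot check: the state compares equal before and after the call.
pub fn leaves_state_untouched(app: &AppState) -> bool {
    let before = app.clone();
    let _ = render_dryrun(app);
    *app == before
}
```

A shared borrow does not rule out interior mutability (for example a RefCell or Mutex field), so a before/after snapshot comparison is still worth keeping as a test.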
Why this is a good fit
- High frequency, V4-Pro-specific value: directly attacks the
  "I don't know what I'm about to spend" problem that the 1M
  context window + thinking pricing creates.
- Minimal new surface: 1 new command file plus three tiny wirings.
  It stays clear of the risky session-accounting surface that
  PR #967 (Expose DeepSeek V4 reasoning usage ledger) was closed for.
- Composes well with existing /tokens (post-hoc) and /cost
  (post-hoc) by giving the missing pre-hoc view.
Scope (single small PR, target 4 files)
Files touched:
- NEW crates/tui/src/commands/dryrun.rs (~200 LOC):
  - Builds a synthetic message list from app.api_messages plus
    a synthetic user message from app.composer.input if it is
    non-empty.
  - Calls crate::compaction::estimate_tokens for the input
    token estimate (no new estimator).
  - Formats the summary or the JSON body via serde_json against
    the same outbound message struct the engine already uses.
  - Redaction helpers + truncation helpers live in this file with
    unit tests.
- crates/tui/src/commands/mod.rs: register the command in pub mod,
  in COMMANDS, and in the dispatch match (mirrors /tokens).
- crates/tui/src/localization.rs: add 2 message IDs
  (CmdDryrunDescription, DryrunFooterApprox) with strings for
  every shipped locale, following the existing pattern.
- (Only if the existing engine request-build helper is pub(crate)
  and not reachable from commands/) a tiny visibility bump on that
  single function, with a comment.
No new public crate API.
No change to telemetry, cost, session accounting, cache history,
provider/model switch logic, /clear, or /load.
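The message-list and estimate steps described for dryrun.rs might look something like the following. This is a minimal sketch under assumptions: Role, Message, preview_messages, count_roles, and approx_tokens are hypothetical names, and the four-characters-per-token estimator is only a placeholder for crate::compaction::estimate_tokens, whose signature is not shown in this proposal.

```rust
/// Hypothetical stand-ins for the engine's role and message types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// Copy the existing history and, if the composer draft is non-empty,
/// append it as a synthetic user message. The inputs are only borrowed.
pub fn preview_messages(api_messages: &[Message], draft: &str) -> Vec<Message> {
    let mut out = api_messages.to_vec();
    if !draft.is_empty() {
        out.push(Message {
            role: Role::User,
            content: draft.to_string(),
        });
    }
    out
}

/// Count messages per role, in the order (system, user, assistant, tool).
pub fn count_roles(messages: &[Message]) -> (usize, usize, usize, usize) {
    let mut c = (0, 0, 0, 0);
    for m in messages {
        match m.role {
            Role::System => c.0 += 1,
            Role::User => c.1 += 1,
            Role::Assistant => c.2 += 1,
            Role::Tool => c.3 += 1,
        }
    }
    c
}

/// Placeholder estimator (about four characters per token, rounded up).
/// The real command would call `crate::compaction::estimate_tokens` instead.
pub fn approx_tokens(messages: &[Message]) -> usize {
    let chars: usize = messages.iter().map(|m| m.content.chars().count()).sum();
    (chars + 3) / 4
}
```

Building the preview from a copy keeps the stored history untouched, which matches the stated goal of not changing session state.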
Alternatives considered
- Extending /tokens with a "would-send" mode: rejected, because
  /tokens is post-hoc and overloading it muddies its contract.
- Logging the request to a file behind a debug flag: rejected,
  because it forces the user to leave the TUI and grep, and doesn't
  help during composition.
- A --dry-run flag on the engine: rejected, because the engine
  surface is the wrong layer for an interactive ergonomic and it
  would be a much larger PR.
Additional context
AI-use disclosure / human-owner statement
This proposal was drafted with AI assistance (GitHub Copilot /
Claude). The human owner is @peixl, who reviews each change
line-by-line, owns the resulting PR, and is responsible for the
test plan and merge.
Test plan for the eventual PR
Unit tests in crates/tui/src/commands/dryrun.rs:
- dryrun_summary_includes_model_provider_and_token_estimate
- dryrun_summary_appends_composer_draft_as_synthetic_user_turn
- dryrun_summary_does_not_mutate_app_state (snapshot api_messages,
  session.*, composer.input before and after).
- dryrun_full_redacts_api_key_to_last4
- dryrun_full_truncates_long_tool_result_bodies
- dryrun_handles_empty_session_without_panic
Workspace gates:
- cargo test -p deepseek-tui dryrun_ --locked
- cargo test -p deepseek-tui --locked
- cargo fmt --all -- --check
- cargo clippy -p deepseek-tui --all-targets --all-features --locked -- -D warnings
- git diff --check origin/main...HEAD
Manual:
- Start a session with a non-trivial system prompt and one tool
  call, type a draft, run /dryrun and /dryrun --full, confirm
  numbers and JSON match what the next send produces.
- Confirm /dryrun on a fresh session prints non-zero system
  prompt size and 0 for everything else, no panics, no network.
Awaiting maintainer accept
Per the contributor guide and PR #967's closing note, no
implementation will be pushed until this issue is accepted by a
maintainer. Filing as a proposal first.