BUG: messageStats incorrectly applies prime-token correction when output and reasoning tokens mix in prime chunks #14

## Summary

`messageStats()` subtracts a single aggregate `primeTokens` count from both the `output` and `generated` token numerators, without tracking whether the prime tokens were reasoning or output. When the first streamed chunk is reasoning (typical for reasoning models such as Claude Extended Thinking or Gemini Flash Thinking), `primeTokens` is still deducted from the **output** numerator, capped only by `Math.min(primeTokens, output)`. This can yield `decodeOutput = 0` even when every output token was decoded during the active window.

## Root Cause

**File:** `plugins/tps/tps.js:120-122`

```js
const onActive = decodeSource === "active";
const decodeGenerated = onActive ? Math.max(0, generated - primeTokens) : generated;
const decodeOutput = onActive ? Math.max(0, output - Math.min(primeTokens, output)) : output;
```

`primeTokens` is a single number — the sum of all "prime" chunk tokens regardless of type (reasoning or text). `decodeGenerated` correctly subtracts ALL prime tokens from the combined `generated` total. **But `decodeOutput` applies the same subtraction against `output` only, capped at output's value**, without knowing how many of the prime tokens were actually output vs. reasoning.

## Concrete Reproduction

Consider a reasoning model streaming this sequence:

| Chunk | Type | Chars | Tokens (ratio 4) | Prime? |
|-------|------|-------|-------------------|--------|
| 1 | reasoning | 80 | 20 | yes (first chunk) |
| 2 | text/output | 40 | 10 | no |
| 3 | text/output | 40 | 10 | no |

At completion: `output = 20`, `reasoning = 20`, `generated = 40`, `primeTokens = 20`

```js
const decodeOutput = Math.max(0, 20 - Math.min(20, 20)); // evaluates to 0, which is WRONG: all 20 output tokens were actively decoded
```

The correct `decodeOutput` should be **20** (all output tokens arrived during the active-decode window, not during prefill).
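
The arithmetic can be checked in isolation. The sketch below copies the three lines quoted under Root Cause into a standalone helper and feeds it the totals from the table. `currentDecode` is a hypothetical name used only for this illustration, not a function in the plugin.

```js
// Hypothetical standalone helper: mirrors the three lines quoted under Root Cause.
function currentDecode({ output, generated, primeTokens, decodeSource }) {
  const onActive = decodeSource === "active";
  const decodeGenerated = onActive ? Math.max(0, generated - primeTokens) : generated;
  const decodeOutput = onActive ? Math.max(0, output - Math.min(primeTokens, output)) : output;
  return { decodeGenerated, decodeOutput };
}

// Totals from the table: one 20-token reasoning prime chunk, then two 10-token output chunks.
const result = currentDecode({
  output: 20,
  generated: 40,
  primeTokens: 20,
  decodeSource: "active",
});

console.log(result); // { decodeGenerated: 20, decodeOutput: 0 }
```

`decodeGenerated` comes out as expected, while `decodeOutput` collapses to 0 even though no output token was part of the prime chunk.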

## Why This Is Wrong

The first chunk (reasoning, 20 tokens) was decoded during prefill — it is the "prime" chunk. Those 20 reasoning tokens should be excluded from the **generated** numerator (correct: `40 - 20 = 20` active-generated tokens). But they should NOT be excluded from the **output** numerator because they were reasoning, not output.

The current code has no way to distinguish the two. It treats all prime tokens as if they are of the same type as whatever metric is being computed.

## Impact

- **Output TPS metric (`metric: "output"`)** is systematically **undercounted** for reasoning models where the first chunk is reasoning.
- Any stream where reasoning and output tokens are interleaved with tool calls will produce incorrect `outputTps` values after a resume gap (the resume chunk is prime, and if it happens to be reasoning, it pollutes the output numerator).
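
To make the resume-gap case concrete, the sketch below tallies a made-up stream in which the first chunk is output and the first chunk after a tool-call gap is reasoning. The chunk shape (`type`, `tokens`, `prime`) and the helper name `tally` are assumptions for illustration only; the plugin's real data structures may differ.

```js
// Hypothetical chunk records. `prime` marks the first chunk of the stream
// and the first chunk after a resume gap.
const chunks = [
  { type: "text", tokens: 10, prime: true }, // first chunk of the stream
  { type: "text", tokens: 10, prime: false },
  // (tool call runs here; the stream pauses)
  { type: "reasoning", tokens: 20, prime: true }, // resume chunk
  { type: "text", tokens: 10, prime: false },
];

// Sum totals per type, and separately sum the prime tokens per type.
function tally(items) {
  const t = { output: 0, reasoning: 0, primeOutput: 0, primeReasoning: 0 };
  for (const c of items) {
    const isOutput = c.type === "text";
    if (isOutput) t.output += c.tokens;
    else t.reasoning += c.tokens;
    if (c.prime) {
      if (isOutput) t.primeOutput += c.tokens;
      else t.primeReasoning += c.tokens;
    }
  }
  return t;
}

const t = tally(chunks);
const primeTokens = t.primeOutput + t.primeReasoning; // aggregate count: 30

// Current formula: the aggregate is subtracted from the output numerator.
const aggregateDecodeOutput = Math.max(0, t.output - Math.min(primeTokens, t.output));
// Per-type accounting: only prime *output* tokens are subtracted.
const perTypeDecodeOutput = Math.max(0, t.output - t.primeOutput);

console.log({ aggregateDecodeOutput, perTypeDecodeOutput });
// { aggregateDecodeOutput: 0, perTypeDecodeOutput: 20 }
```

Here the reasoning resume chunk wipes out the 20 output tokens that were decoded outside any prime chunk.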

## Possible Fix

The `GenerationTimer` needs to track prime tokens per type. Minimally, `timingFor` in `tps-meter.tsx` could pass separate `primeOutputTokens` and `primeReasoningTokens`, and `messageStats` would subtract only the matching type (`primeGeneratedTokens` below would be the sum of the two, falling back to the existing aggregate `primeTokens`):

```js
const decodeOutput = onActive ? Math.max(0, output - (timing.primeOutputTokens ?? 0)) : output;
const decodeGenerated = onActive ? Math.max(0, generated - (timing.primeGeneratedTokens ?? primeTokens)) : generated;
```
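
A slightly fuller sketch of the same idea, written as a standalone function so it can be exercised outside the plugin. `decodeCounts` is a hypothetical name, and `primeOutputTokens` / `primeReasoningTokens` are the proposed fields, not ones that exist on `timing` today. As one possible design choice, when the per-type fields are absent it falls back to the current aggregate behaviour rather than to 0, so older timing records would produce the same numbers as before.

```js
// Hypothetical helper illustrating per-type prime-token correction.
function decodeCounts({ output, generated, decodeSource, timing }) {
  if (decodeSource !== "active") {
    return { decodeOutput: output, decodeGenerated: generated };
  }

  const primeTokens = timing.primeTokens ?? 0;
  // Proposed fields (assumed, not present in the current code).
  const hasSplit =
    timing.primeOutputTokens != null && timing.primeReasoningTokens != null;

  // With a split, subtract only the matching type; otherwise keep today's formula.
  const primeOutput = hasSplit
    ? timing.primeOutputTokens
    : Math.min(primeTokens, output);
  const primeGenerated = hasSplit
    ? timing.primeOutputTokens + timing.primeReasoningTokens
    : primeTokens;

  return {
    decodeOutput: Math.max(0, output - primeOutput),
    decodeGenerated: Math.max(0, generated - primeGenerated),
  };
}

// Numbers from the Concrete Reproduction section, with the prime chunk typed as reasoning.
const fixed = decodeCounts({
  output: 20,
  generated: 40,
  decodeSource: "active",
  timing: { primeTokens: 20, primeOutputTokens: 0, primeReasoningTokens: 20 },
});
console.log(fixed); // { decodeOutput: 20, decodeGenerated: 20 }

// Same totals without the per-type fields: falls back to the current result.
const legacy = decodeCounts({
  output: 20,
  generated: 40,
  decodeSource: "active",
  timing: { primeTokens: 20 },
});
console.log(legacy); // { decodeOutput: 0, decodeGenerated: 20 }
```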

## Steps to Reproduce

1. Configure a reasoning model (e.g., extended thinking mode) in OpenCode.
2. Send a prompt that generates both reasoning and output tokens.
3. Observe that `outputTps` is 0 (or "–") in the sidebar, even though output tokens were generated.

## Related

- Issue #5 (zero-token delta corrupts prime-token tracking) is tangential — it affects how prime chunks are detected, but this issue is about what happens with the prime counts AFTER they are correctly detected.

