BUG: aggregate() pools messages with mixed decodeSource windows, biasing session TPS #66

## Summary

`aggregate()` in `plugins/tps/tps.js` pools all completed messages by summing `decodeMs` and `decodeGenerated`/`decodeOutput` without reading the `decodeSource` field. When messages have different decode windows, the pooled session average mixes incompatible denominators and becomes an uninterpretable hybrid number.

## Root cause — `aggregate()` at `tps.js:166–227`

The function sums `decodeMs` (line 197) and `decodeGenerated`/`decodeOutput` (lines 191–192) from all completed messages indiscriminately, never examining the `decodeSource` field emitted by each per-message stat (`tps.js:147`).

## The three decodeSource windows have incompatible semantics

`messageStats()` can compute TPS using three different denominators:

| Source | `decodeMs` denominator | `decodeGenerated` numerator |
|---|---|---|
| `"active"` | excludes tool-wait/idle gaps | excludes primeTokens |
| `"first-token"` | includes tool-wait gaps | includes all generated |
| `"end-to-end"` | includes everything (prefill + tools) | includes all generated |

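Pooling across them produces `(Σ gen_i − Σ prime_i + Σ gen_j + Σ gen_k) / (Σ activeMs_i + Σ ftMs_j + Σ e2eMs_k)`, where `i`, `j`, and `k` range over the `"active"`, `"first-token"`, and `"end-to-end"` messages respectively. The result is a Frankenstein number that represents no single meaningful quantity.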

## When mixing actually occurs

`tps-meter.tsx:80–85` — `timingFor(m.id)` returns `undefined` for any message that **never had a GenerationTimer**. Timers are only created on stream events (`tps-meter.tsx:111`). Messages that **completed before the plugin was loaded** have no timer → they fall through to `"end-to-end"` source.

Non-instrumented messages have **inflated denominators** (tool waits and prefill time included) and **very slightly inflated numerators** (primeTokens not excluded). Because the denominator inflation dominates, the pooled session average is **biased downward** relative to the true active-generation TPS.

## Concrete example — ~15% underestimate

Three messages (one pre-plugin `"end-to-end"` message plus two `"active"` messages, each with 500 generated output tokens):

| | gen | primeTokens | decodeMs | per-message TPS |
|---|---|---|---|---|
| Msg A (e2e) | 500 | 0 | 8000 (incl. prefill+tools) | 62.5 |
| Msg B (active) | 500 | 20 | 5000 (active only) | 96.0 |
| Msg C (active) | 500 | 20 | 5000 (active only) | 96.0 |

**Pooled result:** `(500 + 480 + 480) tokens / (8000 + 5000 + 5000) ms` = 1460 tokens / 18 s ≈ **81.1 tok/s**, a ~15% underestimate relative to the true active-generation average of **96 tok/s**.
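The arithmetic above can be reproduced with a short script. This is a minimal sketch, not the code in `tps.js`: the stat shape is assumed from the field names in this issue, `pooledTps` is a hypothetical helper, and `decodeGenerated` for the two active messages is taken as already prime-corrected (500 − 20 = 480).

```javascript
// Hypothetical stat objects mirroring the table above. Field names follow the
// issue text; the exact shape returned by messageStats() is an assumption.
const stats = [
  { id: "A", decodeSource: "end-to-end", decodeGenerated: 500, decodeMs: 8000 },
  { id: "B", decodeSource: "active", decodeGenerated: 480, decodeMs: 5000 },
  { id: "C", decodeSource: "active", decodeGenerated: 480, decodeMs: 5000 },
];

// Pooled tokens per second: sum tokens, sum milliseconds, convert ms to seconds.
function pooledTps(items) {
  const tokens = items.reduce((sum, s) => sum + s.decodeGenerated, 0);
  const ms = items.reduce((sum, s) => sum + s.decodeMs, 0);
  return ms > 0 ? (tokens / ms) * 1000 : 0;
}

// Pooling everything, as the issue describes aggregate() doing today.
const mixedTps = pooledTps(stats);

// Pooling only the messages that share the "active" window.
const activeTps = pooledTps(stats.filter((s) => s.decodeSource === "active"));

const underestimate = 1 - mixedTps / activeTps;

console.log(mixedTps.toFixed(1)); // 81.1
console.log(activeTps.toFixed(1)); // 96.0
console.log((underestimate * 100).toFixed(1) + "%"); // 15.5%
```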

## When it matters

- **Zero impact** in the typical case: plugin loaded at OpenCode startup, captures every message stream → all have `"active"` source → no mixing.
- **Real impact** in "hot-load" scenarios: plugin enabled mid-session → pre-existing messages join the pool with `"end-to-end"` denominators (2–5× larger than active time when tool calls exist), dragging the session average down.

## The `decodeSource` field is emitted but dead

`tps.js:147` emits `decodeSource` on every stat object. `aggregate()` never reads it. The view layer never surfaces it. It is used only internally in `messageStats` to decide prime-correction at `tps.js:120–122`. This field exists specifically to enable the kind of window-separation that `aggregate()` does not perform.
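As an illustration of what reading the field could look like, the sketch below counts messages per decode window and flags a mixed set. `countBySource` and `hasMixedWindows` are hypothetical names that do not exist in the plugin, and the stat shape is assumed from this issue.

```javascript
// Hypothetical helper: the stat shape ({ decodeSource }) is assumed from the issue text.
function countBySource(stats) {
  const counts = {};
  for (const s of stats) {
    // Stats without the field are grouped under "unknown" rather than dropped.
    const source = s.decodeSource ?? "unknown";
    counts[source] = (counts[source] ?? 0) + 1;
  }
  return counts;
}

// True when the stats span more than one decode window.
function hasMixedWindows(stats) {
  return Object.keys(countBySource(stats)).length > 1;
}

const sample = [
  { decodeSource: "end-to-end" },
  { decodeSource: "active" },
  { decodeSource: "active" },
];

if (hasMixedWindows(sample)) {
  console.warn("TPS aggregate mixes decode windows:", countBySource(sample));
}
```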

## Suggested fix

Any one of the following would address this:
1. **Separate aggregates by window type**: return separate pooled values per `decodeSource`.
2. **Only pool `"active"` messages** when any exist, falling back to the current behavior only when no active timing is available.
3. **Warn/log** when mixing is detected.
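Options 1 and 2 could be combined roughly as follows. This is a sketch under stated assumptions rather than a patch against `tps.js`: `aggregateBySource` and `sessionTps` are hypothetical names, and the stat shape (`decodeSource`, `decodeGenerated`, `decodeMs`) is inferred from this issue, not copied from the real `aggregate()`.

```javascript
// Convert a token count and a duration in milliseconds to tokens per second.
function toTps(tokens, ms) {
  return ms > 0 ? (tokens / ms) * 1000 : 0;
}

// Option 1: pool tokens and time separately for each decodeSource value.
function aggregateBySource(stats) {
  const perSource = {};
  for (const s of stats) {
    const bucket = perSource[s.decodeSource] ?? { tokens: 0, ms: 0, count: 0 };
    bucket.tokens += s.decodeGenerated;
    bucket.ms += s.decodeMs;
    bucket.count += 1;
    perSource[s.decodeSource] = bucket;
  }
  for (const bucket of Object.values(perSource)) {
    bucket.tps = toTps(bucket.tokens, bucket.ms);
  }
  return perSource;
}

// Option 2: prefer the "active" pool; otherwise pool everything, which matches
// the behavior the issue describes today.
function sessionTps(stats) {
  const perSource = aggregateBySource(stats);
  if (perSource.active) {
    return { tps: perSource.active.tps, source: "active", perSource };
  }
  const all = Object.values(perSource);
  const tokens = all.reduce((sum, b) => sum + b.tokens, 0);
  const ms = all.reduce((sum, b) => sum + b.ms, 0);
  return { tps: toTps(tokens, ms), source: "pooled", perSource };
}

// Usage with illustrative numbers: one end-to-end message and two active ones.
const result = sessionTps([
  { decodeSource: "end-to-end", decodeGenerated: 500, decodeMs: 8000 },
  { decodeSource: "active", decodeGenerated: 480, decodeMs: 5000 },
  { decodeSource: "active", decodeGenerated: 480, decodeMs: 5000 },
]);
console.log(result.source, result.tps.toFixed(1)); // active 96.0
```

Returning `perSource` alongside the headline value keeps the non-active windows available to the view layer without letting them influence the session figure.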

## Related

- The GenerationTimer code (`gen.js`) deliberately excludes tool waits from TPS — this is the core value proposition. Having the session average silently mix them back in undermines that precision promise.
- No existing issue covers this.
