Summary
Transcript segments are assigned to the wrong version when the user switches versions while speaking. Long pending-to-confirmed latency on the Vexa side (10–60 s observed) means a segment finalized now often represents speech that started under a previous in_review. We currently tag the segment with whichever version is in review at confirmation time, not at speech time.
Downstream: LLM note generation for version A misses feedback that was spoken while A was the in-review version (because the segment landed on B); notes for B include feedback that was actually about A.
Current behaviour (the bug)
backend/src/dna/transcription_service.py::on_transcription_updated — line 256:
metadata = await self.storage_provider.get_playlist_metadata(playlist_id)
...
version_id = metadata.in_review  # ← read ONCE per tick
...
for seg in confirmed:
    ...
    await self.storage_provider.upsert_segment(
        playlist_id=playlist_id,
        version_id=version_id,  # ← same version_id for every seg in the tick
        segment_id=segment_id,
        data=segment_create,
    )
Every segment in a single Vexa tick gets the same version_id, chosen by reading metadata.in_review exactly once at the top of the handler. The segment's own absolute_start_time (when the speech happened) is ignored for routing.
Why this drifts
Vexa's pending-to-confirmed latency is not small. Sampled pair from the live Vexa Cloud feed (Google Meet, vexa_meeting_id=10459, speaker "Dmitriy Grankin"):
// final pending draft before confirmation
{ "absolute_start_time": "2026-04-20T19:35:51.013Z",
"absolute_end_time": "2026-04-20T19:35:54.373Z" } // 3.4 s window
// confirmed (same utterance)
{ "absolute_start_time": "2026-04-20T19:35:51.013Z",
"absolute_end_time": "2026-04-20T19:36:51.689Z" } // 60.7 s window
So a single confirmed segment can span 60+ seconds of speech, finalizing long after the speech began. If the user switched versions during that 60-second window, the whole segment lands on the new version.
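The two window lengths quoted above can be checked directly from the sampled timestamps. This is a minimal sketch; `parse_vexa_ts` and `window_seconds` are hypothetical helper names used only for illustration, not functions from the codebase.

```python
from datetime import datetime


def parse_vexa_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def window_seconds(start: str, end: str) -> float:
    """Length of the [start, end] window in seconds."""
    return (parse_vexa_ts(end) - parse_vexa_ts(start)).total_seconds()


# Timestamps copied from the sampled pending/confirmed pair above.
pending = window_seconds("2026-04-20T19:35:51.013Z", "2026-04-20T19:35:54.373Z")
confirmed = window_seconds("2026-04-20T19:35:51.013Z", "2026-04-20T19:36:51.689Z")
print(f"pending: {pending:.1f} s, confirmed: {confirmed:.1f} s")
# prints: pending: 3.4 s, confirmed: 60.7 s
```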
Reproduction (manual)
t=0   view version A                   → in_review = A
t=2   speak "mk020_0020 looks great"   [Vexa: pending draft]
t=15  click version B                  → in_review = B (PUT /playlists/<id>/metadata)
t=17  speak "but mk020_0250 is off"    [Vexa: new pending]
t=62  (no user action)                 [Vexa: confirmed segment,
                                        absolute_start_time=t+2s,
                                        absolute_end_time=t+62s,
                                        text spans both utterances]
                                       → DNA reads metadata.in_review = B → stored as version_id=B
Expected: feedback about version A is stored under A. Actual: all 60 s of speech attributed to B.
Expected behaviour
A confirmed segment should be stored against the version that was in review during the speech, not during the confirmation.
If the speech straddles a version switch (e.g., 10 s under A and 50 s under B), either:
- (a) assign to whichever version covered the majority of [absolute_start_time, absolute_end_time], OR
- (b) split the segment at the boundary (requires Vexa word-level timestamps; out of scope for a first pass).
(a) is sufficient for correct LLM attribution in practice; (b) is a future refinement.
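Option (a) amounts to an overlap computation between the segment window and the in_review spans. The sketch below is illustrative only: `Span` and `majority_version` are hypothetical names, spans are assumed not to overlap each other, and `ended_at=None` is assumed to mean "still active".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Span:
    """One in_review interval (hypothetical shape, for illustration only)."""
    version_id: int
    started_at: datetime
    ended_at: Optional[datetime]  # None = still active


def majority_version(spans: list[Span], start: datetime, end: datetime) -> Optional[int]:
    """Return the version whose spans overlap [start, end] for the longest time."""
    totals: dict[int, timedelta] = {}
    for span in spans:
        # An open span is treated as running until the end of the segment.
        span_end = span.ended_at if span.ended_at is not None else end
        overlap = min(span_end, end) - max(span.started_at, start)
        if overlap > timedelta(0):
            totals[span.version_id] = totals.get(span.version_id, timedelta(0)) + overlap
    if not totals:
        return None  # no history covers this segment
    return max(totals, key=lambda version_id: totals[version_id])


t0 = datetime(2026, 4, 20, 19, 35, 0, tzinfo=timezone.utc)
spans = [
    Span(version_id=1, started_at=t0, ended_at=t0 + timedelta(seconds=10)),
    Span(version_id=2, started_at=t0 + timedelta(seconds=10), ended_at=None),
]
# 10 s under version 1 and 50 s under version 2, so version 2 wins.
winner = majority_version(spans, t0, t0 + timedelta(seconds=60))
```

When a segment contains at most one switch, the version covering the midpoint is the same as the majority version, so a midpoint lookup gives the same answer with a single query.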
Proposed fix
Track in_review history and look up the historical value by segment timestamp at save time.
1. New collection playlist_metadata_history
Append-only log of in_review transitions per playlist.
{
  "_id": ObjectId(...),
  "playlist_id": 45,
  "version_id": 6991,                     // the in_review value during this span
  "started_at": "2026-04-20T19:35:00Z",
  "ended_at": "2026-04-20T19:50:12Z"      // null if still active
}
Compound index: {playlist_id: 1, started_at: 1} (also supports range queries).
2. Storage provider additions
async def append_in_review_history(self, playlist_id: int, version_id: int, at: datetime) -> None:
    """Close the open row for this playlist (ended_at=at) and insert a new
    row with started_at=at, version_id=new value."""

async def get_in_review_at(self, playlist_id: int, at: datetime) -> Optional[int]:
    """Return the version_id that was in_review at the given instant, or
    None if unknown (no history yet / pre-history segment)."""
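To make the intended semantics of the two methods concrete, here is an in-memory sketch. It is not the real storage provider: `InMemoryInReviewHistory` and `HistoryRow` are hypothetical stand-ins, intervals are assumed to be half-open ([started_at, ended_at)), transitions are assumed to arrive in chronological order, and version 6992 is a made-up second version.

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class HistoryRow:
    playlist_id: int
    version_id: int
    started_at: datetime
    ended_at: Optional[datetime] = None  # None = still active


class InMemoryInReviewHistory:
    """Hypothetical in-memory stand-in for the history collection."""

    def __init__(self) -> None:
        self._rows: list[HistoryRow] = []

    async def append_in_review_history(
        self, playlist_id: int, version_id: int, at: datetime
    ) -> None:
        # Close the open row for this playlist, if any, then open a new one.
        for row in self._rows:
            if row.playlist_id == playlist_id and row.ended_at is None:
                row.ended_at = at
        self._rows.append(HistoryRow(playlist_id, version_id, started_at=at))

    async def get_in_review_at(self, playlist_id: int, at: datetime) -> Optional[int]:
        for row in self._rows:
            if row.playlist_id != playlist_id:
                continue
            # Assumption: half-open interval [started_at, ended_at).
            if row.started_at <= at and (row.ended_at is None or at < row.ended_at):
                return row.version_id
        return None  # before the first row, or no history at all


async def demo() -> tuple:
    history = InMemoryInReviewHistory()
    t0 = datetime(2026, 4, 20, 19, 35, 0, tzinfo=timezone.utc)
    await history.append_in_review_history(45, 6991, t0)
    await history.append_in_review_history(45, 6992, t0 + timedelta(seconds=15))
    return (
        await history.get_in_review_at(45, t0 - timedelta(seconds=1)),
        await history.get_in_review_at(45, t0 + timedelta(seconds=2)),
        await history.get_in_review_at(45, t0 + timedelta(seconds=32)),
    )


before, during_first, during_second = asyncio.run(demo())
```

In a real collection the lookup would presumably be a single range query on the compound index rather than a linear scan.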
3. Wire-up on in_review change
In upsert_playlist_metadata (or at the endpoint level), when the new in_review differs from the existing one, call append_in_review_history(playlist_id, new_in_review, now).
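The "differs from the existing one" check matters because a metadata PUT can repeat the current value. A minimal sketch of that guard follows, assuming a hypothetical `record_in_review_change` helper that is handed the previous and new values; the real call site and metadata model may look different.

```python
import asyncio
from datetime import datetime, timezone
from typing import Awaitable, Callable, Optional

# Signature of append_in_review_history as proposed above.
AppendFn = Callable[[int, int, datetime], Awaitable[None]]


async def record_in_review_change(
    append_history: AppendFn,
    playlist_id: int,
    previous_in_review: Optional[int],
    new_in_review: int,
) -> bool:
    """Append a history row only when in_review actually changes.

    Returns True if a row was written.
    """
    if new_in_review == previous_in_review:
        return False  # idempotent PUT: nothing to log
    await append_history(playlist_id, new_in_review, datetime.now(timezone.utc))
    return True


calls = []


async def fake_append(playlist_id: int, version_id: int, at: datetime) -> None:
    """Test double that records calls instead of touching storage."""
    calls.append((playlist_id, version_id))


async def demo() -> tuple:
    first = await record_in_review_change(fake_append, 45, None, 6991)
    repeat = await record_in_review_change(fake_append, 45, 6991, 6991)
    return first, repeat


wrote_first, wrote_repeat = asyncio.run(demo())
```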
4. Route segments at save time by speech timestamp
In on_transcription_updated:
for seg in confirmed:
    ...
    # Midpoint of the utterance: for a segment that straddles a single
    # switch, the version covering the midpoint is also the majority version
    # (option (a) above). Falls back to current metadata.in_review if
    # history lookup returns nothing (first run, clock skew, pre-history).
    start = datetime.fromisoformat(absolute_start_time.replace("Z", "+00:00"))
    end = datetime.fromisoformat(absolute_end_time.replace("Z", "+00:00"))
    midpoint = start + (end - start) / 2
    seg_version = await self.storage_provider.get_in_review_at(playlist_id, midpoint)
    if seg_version is None:
        seg_version = metadata.in_review  # fallback
    await self.storage_provider.upsert_segment(
        playlist_id=playlist_id,
        version_id=seg_version,
        segment_id=segment_id,
        data=segment_create,
    )
5. Migration
No data migration required — existing segments keep their current version_id. History is only consulted for future saves. On first bot dispatch after deploy, seed an initial history row from current metadata.in_review (or let the fallback handle it).
Acceptance
- […] version_id = A because the utterance midpoint falls in A's span.
- playlist_metadata_history row written on each actual in_review change (not on idempotent PUTs that don't change the value).
Out of scope
- Splitting segments at a boundary (requires Vexa word-level timestamps; separate issue).
- Retroactive re-attribution of existing segments (cost/benefit unclear; probably a one-off script later if users want it).
Related
- backend/docs/TRANSCRIPT_MESSAGE_FLOW.md: describes the current flow this issue modifies.
Labels
bug · Backend