
fix: embed real timestamp in toV7-derived UUIDs #18

Merged
jverre merged 1 commit into main from jacques/uuidv7-embed-timestamp
Jun 9, 2026

@jverre jverre commented Jun 9, 2026

Collaborator

Problem

toV7() produced fake UUIDv7s. It MD5-hashed the input key, used the first 48 bits of the hash as the timestamp field, and only forced the version nibble to 7. So every toV7-derived ID (trace IDs + all transcript-derived span IDs) carried an effectively arbitrary timestamp: hash bits, not a time.

A real trace observed via the Opik MCP server:

| Field | Value |
| --- | --- |
| ID | 5de5a0f2-b1f4-7dfb-aef2-9365c5cc4594 |
| start_time | 2026-06-09T08:25:29.754Z |
| time decoded from ID | 5241-07-28T00:05:53Z ❌ (~3,200 years off) |

The MD5 hashing itself isn't gratuitous — it makes IDs deterministic, which is what the duplicate-fire dedup guard in onPrompt and idempotent upserts rely on. The fix preserves that.

Fix

toV7(key, tsMillis) now embeds a real millisecond timestamp in the first 48 bits and derives only the entropy/node bytes from MD5(key). Same key + same timestamp → same ID (determinism preserved), but the ID now sorts chronologically and its embedded time matches start_time.
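
A rough sketch of the layout described above, for readers who have not opened the diff. This is not the code from the PR: the choice of which MD5 bytes fill the non-timestamp positions, and the string formatting, are assumptions made for illustration. Only the `toV7(key, tsMillis)` signature, the 48-bit timestamp prefix, and the MD5-derived remainder come from the description.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"time"
)

// toV7 is an illustrative sketch, not the PR's implementation.
// Bytes 0-5 hold tsMillis (big-endian); bytes 6-15 come from MD5(key),
// with the version and variant bits overwritten.
func toV7(key string, tsMillis int64) string {
	sum := md5.Sum([]byte(key))

	var b [16]byte
	ts := uint64(tsMillis)
	for i := 0; i < 6; i++ {
		b[i] = byte(ts >> (8 * (5 - i))) // low 48 bits, most significant byte first
	}
	copy(b[6:], sum[6:])    // assumption: reuse MD5 bytes 6..15 as the entropy bytes
	b[6] = b[6]&0x0f | 0x70 // version nibble = 7
	b[8] = b[8]&0x3f | 0x80 // variant bits = 10

	h := hex.EncodeToString(b[:])
	return fmt.Sprintf("%s-%s-%s-%s-%s", h[0:8], h[8:12], h[12:16], h[16:20], h[20:32])
}

func main() {
	ts := time.Date(2026, time.June, 9, 8, 25, 29, 754_000_000, time.UTC).UnixMilli()
	// Same key and same timestamp yield the same ID on every call.
	fmt.Println(toV7("example-key", ts))
	fmt.Println(toV7("example-key", ts))
}
```

Because the timestamp occupies the leading bytes, IDs built this way compare in time order as plain strings, while the MD5-derived tail keeps them reproducible for a given key.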

Callers updated:

  • Trace — embeds the bucket-aligned time (bucket*5*1000). Bucket alignment is deliberate: both concurrent onPrompt fires share the same bucket, so they still compute an identical ID and the dedup guarantee is unchanged. Trade-off: embedded time can trail start_time by up to the ~5s dedup window (exact ms and perfect determinism are mutually exclusive under bucketed dedup).
  • Spans — embed the message's own transcript timestamp via millisFromISO(p.Timestamp), matching each span's start_time exactly.
  • Parent-span FK reference (onSubagentStop) — looks up the parent Task entry's timestamp so the reference reproduces the exact ID the parent span was built with. Hoisted an already-present transcript read; no extra I/O on the common path.

Notes

  • Existing traces/spans keep their old IDs — this only affects newly generated IDs.
  • New src/uuid_test.go covers: timestamp round-trips through the first 48 bits, version/variant bits, determinism, and millisFromISO parsing incl. the failure→0 fallback.
  • go build, go vet, full go test ./... pass; all four platform binaries rebuilt.

🤖 Generated with Claude Code

toV7 built fake UUIDv7s — it used the first 48 bits of MD5(key) as the
timestamp field and only forced the version nibble to 7, so every derived
trace/span ID carried a garbage timestamp (decoding to ~year 5241).

Embed a real millisecond timestamp in the first 48 bits while keeping the
entropy/node bytes from MD5(key), so IDs stay deterministic (idempotent
upserts + duplicate-fire dedup) but now sort chronologically and match the
entity's start_time:

- trace ID: bucket-aligned ms (keeps both concurrent onPrompt fires on the
  same timestamp, so the dedup guarantee is unchanged)
- span IDs: the message's own transcript timestamp via millisFromISO
- parent-span FK reference in onSubagentStop now looks up the parent entry's
  timestamp so it reproduces the exact ID the parent span was built with

Existing IDs are unaffected; only newly generated IDs decode correctly.

Adds uuid_test.go and rebuilds all platform binaries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jverre jverre merged commit 4de656f into main Jun 9, 2026
1 check passed
@jverre jverre deleted the jacques/uuidv7-embed-timestamp branch June 12, 2026 21:15