Add reactive agents timeline query#4347

Draft
samwillis wants to merge 10 commits into main from entity-timeline-query
@samwillis
Contributor

Summary

  • Adds createEntityTimelineQuery, a multi-source TanStack DB timeline query for agent streams.
  • Migrates the agents UI timeline to consume the query directly, with live child collections for run items so streamed text updates through DB's fine-grained reactivity instead of rematerializing the whole chat timeline.
  • Adds stable timeline ordering for streamed and optimistic rows, improves pending queued-message placement, and keeps the chat pinned to the bottom during streaming.
  • Adds a patch changeset for @electric-ax/agents-runtime and @electric-ax/agents-server-ui.

Dependencies

This PR depends on TanStack DB support from:

Test plan

  • pnpm --filter @electric-ax/agents-runtime typecheck
  • pnpm --filter @electric-ax/agents-runtime build
  • pnpm --filter @electric-ax/agents-server-ui typecheck
  • Manual browser smoke testing of sending messages and streamed agent responses in the agents UI

Made with Cursor

@netlify

netlify Bot commented May 18, 2026

Deploy Preview for electric-next ready!

Name Link
🔨 Latest commit 065f4fc
🔍 Latest deploy log https://app.netlify.com/projects/electric-next/deploys/6a0b23669d8fce000822cb13
😎 Deploy Preview https://deploy-preview-4347--electric-next.netlify.app

samwillis and others added 10 commits May 18, 2026 15:29
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Use a fine-grained timeline query for the agents UI so streamed run items update through TanStack DB instead of rematerializing the whole chat timeline.

Co-authored-by: Cursor <cursoragent@cursor.com>
Rely on TanStack DB's virtual $synced prop instead of carrying a custom _optimistic field through inbox rows.

Co-authored-by: Cursor <cursoragent@cursor.com>
Use the timeline order token for pending local inbox rows so optimistic timeline ordering no longer depends on the legacy _seq field.

Co-authored-by: Cursor <cursoragent@cursor.com>
Bridge the first pending queued message into the timeline while there is no active run so it does not briefly appear in the pending drawer.

Co-authored-by: Cursor <cursoragent@cursor.com>
Document the multi-source row structure and live child collections returned by createEntityTimelineQuery.

Co-authored-by: Cursor <cursoragent@cursor.com>
Pin the chat timeline on content resize while near the bottom and force a final bottom scroll when a streaming run completes.

Co-authored-by: Cursor <cursoragent@cursor.com>
Mark the agents runtime and server UI for a patch release because the timeline query now uses fine-grained TanStack DB reactivity.

Co-authored-by: Cursor <cursoragent@cursor.com>
@samwillis force-pushed the entity-timeline-query branch from 78078c2 to 065f4fc on May 18, 2026 14:34
@codecov

codecov Bot commented May 18, 2026

❌ 4 Tests Failed:

Tests completed: 391 | Failed: 4 | Passed: 387 | Skipped: 2
View the full list of 4 ❄️ flaky test(s)
test/horton-pull-wake-e2e.test.ts > pull-wake Horton e2e with mocked LLM > dispatches explicit runner-policy wakes and Horton writes mocked responses

Flake rate in main: 100.00% (Passed 0 times, Failed 8 times)

Stack Traces | 20.1s run time
Error: Timed out after 20000ms
 ❯ waitFor test/test-utils.ts:31:9
 ❯ test/horton-pull-wake-e2e.test.ts:185:5
test/wake-registry.test.ts > Wake Registry Integration > WakeEvent in subscriber stream includes source and change details

Flake rate in main: 100.00% (Passed 0 times, Failed 8 times)

Stack Traces | 10s run time
Error: Timed out waiting for 1 wakes (got 0)
 ❯ Timeout._onTimeout test/wake-registry.test.ts:838:13
test/wake-registry.test.ts > Wake Registry Integration > event append triggers wake delivery for change condition

Flake rate in main: 100.00% (Passed 0 times, Failed 8 times)

Stack Traces | 10s run time
Error: Timed out waiting for 1 wakes (got 0)
 ❯ Timeout._onTimeout test/wake-registry.test.ts:838:13
test/wake-registry.test.ts > Wake Registry Integration > spawn with wake registers condition and delivers wake on child run completion

Flake rate in main: 100.00% (Passed 0 times, Failed 8 times)

Stack Traces | 10.1s run time
Error: Timed out waiting for 1 wakes (got 0)
 ❯ Timeout._onTimeout test/wake-registry.test.ts:838:13


@KyleAMathews
Contributor

Overall I think this is the right approach. Moving the agents timeline from the aggregate createEntityIncludesQuery -> normalize -> buildTimelineEntries path to a row-oriented createEntityTimelineQuery is the right architecture for streaming. Modeling the timeline as a multi-source TanStack DB query also seems like the right use of the DB PRs: source alias as the discriminant, outer $key as timeline identity, and live child collections for run contents.

A few concrete things I noticed / would double-check:

1. Synthetic pending row $key may not match TanStack’s union key encoding

I saw code constructing a bridged pending timeline row with a key like:

$key: `inbox:${inlinePendingInbox.key}`

But the TanStack multi-source key encoding in the dependency PR appears to encode branch keys more carefully than plain ${alias}:${key} — e.g. distinguishing string/number keys.

If this synthetic row ever needs to reconcile with, de-dupe against, or be replaced by the real live-query row, the key may not match the real outer $key. That could cause a remount, duplicate bubble, or failed de-dupe.

I’d prefer not to manually reconstruct TanStack’s internal union $key format in Electric. Can this use the actual live-query row, or de-dupe using the source-local inbox key instead?
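
A minimal sketch of the second option, de-duping on the source-local inbox key rather than on a hand-built `$key`. The row shape and the `mergeBridgedRow` helper are hypothetical; only the `$key` and inbox `key` fields are taken from the discussion above, and the outer `$key` is treated as opaque.

```typescript
// Hypothetical row shape: rows from other sources simply have no `inbox` field.
interface TimelineRow {
  $key: string // outer union key, produced by the query; treated as opaque
  inbox?: { key: string } // source-local inbox key, present on inbox rows only
}

// Returns the rows to render: all live rows, plus the bridged pending row
// only when no live row already carries the same source-local inbox key.
function mergeBridgedRow(
  liveRows: readonly TimelineRow[],
  bridged: (TimelineRow & { inbox: { key: string } }) | undefined
): TimelineRow[] {
  if (!bridged) return liveRows.slice()
  const alreadyLive = liveRows.some(
    (row) => row.inbox?.key === bridged.inbox.key
  )
  return alreadyLive ? liveRows.slice() : [...liveRows, bridged]
}
```

Because the comparison never looks at `$key`, it keeps working whatever encoding TanStack DB uses for union keys.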

2. Tool-call run items may be keyed inconsistently

For nested run items, I think text rows use the union row key:

key={item.$key}

but tool calls may use the source-local tool-call key:

key={item.toolCall.key}

Since run.items is itself a multi-source union, I think rendered children should consistently use item.$key. Otherwise text/tool-call rows with overlapping source-local keys could collide or remount incorrectly.

This seems like a small concrete fix.
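
To illustrate the general hazard, here is a small sketch comparing the two keying strategies on a union of text and tool-call rows. The `RunItem` shape and helper names are assumptions made for illustration, not the PR's actual types.

```typescript
// Hypothetical shape of a nested run item in a multi-source union.
interface RunItem {
  $key: string // union row key, unique across sources
  text?: { key: string } // source-local key, unique only within text rows
  toolCall?: { key: string } // source-local key, unique only within tool calls
}

// Keying by source-local key: two sources can produce the same value.
const sourceLocalKey = (item: RunItem): string =>
  item.toolCall?.key ?? item.text?.key ?? item.$key

// Keying by the union row key: one namespace for every rendered child.
const unionKey = (item: RunItem): string => item.$key

// Reports every key value that is produced more than once.
function duplicateKeys(
  items: readonly RunItem[],
  keyOf: (item: RunItem) => string
): string[] {
  const seen = new Set<string>()
  const dupes = new Set<string>()
  for (const item of items) {
    const key = keyOf(item)
    if (seen.has(key)) dupes.add(key)
    seen.add(key)
  }
  return Array.from(dupes)
}
```

A check like `duplicateKeys(run.items, unionKey)` could also serve as a development-time assertion that rendered keys never collide.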

3. Ordering probably needs an explicit tie-breaker

The _timeline_order direction is good — especially because streaming updates should not move old timeline rows.

But the ordering looks like it often relies on a single order token with fallback "~":

coalesce(..., `~`)

If two rows have the same _timeline_order, or if multiple rows fall back to "~", then ordering may depend on underlying insertion/subscription behavior unless TanStack DB guarantees stable ordering for equal sort values.

This is probably worth making deterministic with a secondary stable key, especially at the top-level timeline and nested run-items level.

Not a design objection — just trying to avoid nondeterministic timeline order in edge cases.
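
As a sketch of what a deterministic order could look like, the comparator below falls back to `$key` when two rows share an order token. It is written as a plain client-side comparator for illustration; inside the query the equivalent would presumably be a secondary sort key. The `OrderedRow` shape is an assumption.

```typescript
// Hypothetical minimal row shape for ordering purposes.
interface OrderedRow {
  $key: string
  _timeline_order?: string | null
}

// Same fallback as the coalesce(..., `~`) quoted above.
const FALLBACK_ORDER = "~"

// Primary sort: order token. Secondary sort: $key, so rows with equal or
// missing tokens always come out in the same relative order.
function compareTimelineRows(a: OrderedRow, b: OrderedRow): number {
  const aOrder = a._timeline_order ?? FALLBACK_ORDER
  const bOrder = b._timeline_order ?? FALLBACK_ORDER
  if (aOrder !== bOrder) return aOrder < bOrder ? -1 : 1
  if (a.$key !== b.$key) return a.$key < b.$key ? -1 : 1
  return 0
}
```

With the tie-breaker in place, the result no longer depends on insertion or subscription order for rows that fall back to `"~"`.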

4. Optimistic reconciliation may flicker if server keys differ

The optimistic inbox keys look like they’re generated client-side, e.g. something like:

optimistic-${Date.now()}-${index}

If the server-confirmed inbox row comes back with a different key, does TanStack DB reconcile it as the same row, or does the optimistic row disappear and the server row appear as a new row?

If it’s the latter, this could produce a transient duplicate/remount/flicker. Maybe the mutation layer already handles this, but I’d want to confirm because it’s the most likely place for visible weirdness in the optimistic path.

5. $synced === false may be broader than “optimistic pending insert”

Replacing custom _optimistic with TanStack’s $synced virtual prop seems right.

One question: does the timeline filter include any unsynced inbox row? If so, is that intentionally equivalent to “local optimistic pending message”? $synced === false could also describe local edits/cancels/other unsynced inbox mutations.

If only optimistic pending inserts should appear inline, the condition may need to be a bit narrower than just unsynced.
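
One way the narrower condition might look, sketched under a strong assumption: that the UI can track which inbox keys it inserted locally. The `locallyInsertedKeys` set and the `isOptimisticPendingInsert` helper are hypothetical; only `$synced` comes from the PR.

```typescript
// Hypothetical inbox row: `$synced` is the virtual prop mentioned above.
interface InboxRow {
  key: string
  $synced: boolean
}

// True only for rows that are unsynced AND were inserted by this client.
// Unsynced edits or cancels of server-known rows do not qualify.
function isOptimisticPendingInsert(
  row: InboxRow,
  locallyInsertedKeys: ReadonlySet<string> // assumption: tracked by the UI layer
): boolean {
  return row.$synced === false && locallyInsertedKeys.has(row.key)
}
```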

6. First pending-message bridge may affect edit/cancel affordances

The “bridge the first pending queued message into the timeline” behavior makes sense.

One edge I’d check in code: if that first pending message is removed from the pendingMessages list passed to the composer/drawer, can the user still edit/cancel it while it is displayed inline?

If not, this may be a small UX regression: the first queued message is visible inline, but no longer editable/cancellable like the rest of the queue.

7. Completion scroll looks like it may force-scroll even after user scrolled up

The scroll changes mostly look good, but this effect looked suspicious:

if (!previousStreamingAgentKey || lastStreamingAgentKey) return

isNearBottom.current = true
setShowJumpToBottom(false)
scrollToTimelineEnd()

If I’m reading this correctly, when a streaming run completes, it resets isNearBottom and scrolls to bottom unconditionally.

That means a user who scrolls up during streaming may get yanked back down when the run completes. Usually I’d expect completion to scroll only if the user was still pinned.

Similarly, if scrollToTimelineEnd() schedules a requestAnimationFrame (RAF) callback and the user scrolls up before that callback fires, it might still write scrollTop = scrollHeight. For non-forced auto-scroll paths, it may be safer to re-check “still pinned” inside the RAF callback.
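
A rough sketch of the re-check, with the scheduler injected so it can be exercised without a browser. The `ScrollTarget` shape mirrors the DOM properties involved; `isPinned`, `scheduleScrollToEnd`, and the 48px threshold are illustrative assumptions, not the PR's code.

```typescript
// Structural subset of an HTMLElement, so the logic is testable in isolation.
interface ScrollTarget {
  scrollTop: number
  scrollHeight: number
  clientHeight: number
}

// "Pinned" means the viewport bottom is within `thresholdPx` of the content end.
function isPinned(el: ScrollTarget, thresholdPx = 48): boolean {
  return el.scrollHeight - el.scrollTop - el.clientHeight <= thresholdPx
}

// Schedules a scroll to the end. For non-forced calls the pinned check runs
// inside the callback, so a user who scrolled up in the meantime is left alone.
function scheduleScrollToEnd(
  el: ScrollTarget,
  opts: { force: boolean },
  schedule: (cb: () => void) => void // in a browser: (cb) => requestAnimationFrame(cb)
): void {
  schedule(() => {
    if (!opts.force && !isPinned(el)) return
    el.scrollTop = el.scrollHeight
  })
}
```

Under this shape, the run-completion effect could call `scheduleScrollToEnd` with `force: false`, which would scroll only for users who are still pinned at the moment the callback runs.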

8. caseWhen cast should go away before this is released

I saw this kind of cast:

const { caseWhen } = TanStackDB as typeof TanStackDB & {
  caseWhen: <T>(condition: unknown, value: T) => T
}

Totally fine while stacked on unreleased TanStack DB work, but I wouldn’t want that in the final version. Once the TanStack PRs land, Electric should consume the real public export/types.

Things I would not block on:

  • The general multi-source from approach. That looks like the right model to me.
