feat: Subqueries in SELECT for hierarchical data (includes) by kevin-dp · Pull Request #1294 · TanStack/db

kevin-dp · 2026-02-25T08:45:04Z

Summary

Adds support for subqueries inside .select() that produce hierarchical results — each parent row gets a child Collection (e.g., projects with nested issues, issues with nested comments)
Child queries are inner-joined with the parent pipeline so only children matching filtered parents flow through
Supports ORDER BY and LIMIT/OFFSET on child queries (uses grouped ORDER BY so limits are per-parent)
Nested includes work recursively (projects → issues → comments)
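
The per-parent LIMIT semantics described above can be illustrated outside the library. What follows is a minimal sketch, not the PR's implementation: `Issue`, `limitPerParent`, and the sample data are hypothetical names introduced only for illustration. It groups child rows by their correlation key, orders each group, and applies OFFSET/LIMIT inside each group rather than across the whole result.

```typescript
// Hypothetical child row shape, for illustration only.
interface Issue {
  id: number
  projectId: number
  title: string
}

// Group children by parent key, order each group, then apply
// OFFSET/LIMIT inside each group rather than across all rows.
function limitPerParent(
  issues: Issue[],
  limit: number,
  offset: number = 0,
): Map<number, Issue[]> {
  const groups = new Map<number, Issue[]>()
  for (const issue of issues) {
    const group = groups.get(issue.projectId) ?? []
    group.push(issue)
    groups.set(issue.projectId, group)
  }

  const result = new Map<number, Issue[]>()
  groups.forEach((group, projectId) => {
    // Each group is a fresh array, so sorting in place leaves the input untouched.
    group.sort((a, b) => {
      if (a.title < b.title) return -1
      return a.title > b.title ? 1 : 0
    })
    result.set(projectId, group.slice(offset, offset + limit))
  })
  return result
}

const sampleIssues: Issue[] = [
  { id: 10, projectId: 1, title: "Crash on save" },
  { id: 11, projectId: 1, title: "Add dark mode" },
  { id: 12, projectId: 1, title: "Broken link" },
  { id: 20, projectId: 2, title: "Typo in docs" },
]

// With limit 2, project 1 keeps its first two issues by title and project 2
// keeps its only issue. A global LIMIT 2 would have left project 2 empty.
const limited = limitPerParent(sampleIssues, 2)
```

In the PR itself this behaviour comes from the grouped ORDER BY operator in the query pipeline; the sketch only shows the intended result shape.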

Closes #288

Test plan

Basic includes: parent rows have child Collections with correct items
Reactivity: adding/removing children updates child Collections without touching parents
Parent remove + re-add: child Collection resets correctly
Inner join filtering: children only shown for parents matching WHERE
Nested includes: two levels deep (projects → issues → comments)
Ordered child queries: child Collections respect ORDER BY
Ordered + LIMIT: limit applied per parent, not globally; insertions displace correctly
All 1815 existing tests pass (no regressions)

🤖 Generated with Claude Code

changeset-bot · 2026-02-25T08:45:10Z

🦋 Changeset detected

Latest commit: fc269d5

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 14 packages

Name	Type
@tanstack/db	Minor
@tanstack/angular-db	Patch
@tanstack/electric-db-collection	Patch
@tanstack/offline-transactions	Patch
@tanstack/powersync-db-collection	Patch
@tanstack/query-db-collection	Patch
@tanstack/react-db	Patch
@tanstack/rxdb-db-collection	Patch
@tanstack/solid-db	Patch
@tanstack/svelte-db	Patch
@tanstack/trailbase-db-collection	Patch
@tanstack/vue-db	Patch
todos	Patch
@tanstack/db-example-paced-mutations-demo	Patch

pkg-pr-new · 2026-02-25T08:47:59Z

More templates

@tanstack/angular-db

npm i https://pkg.pr.new/@tanstack/angular-db@1294

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@1294

@tanstack/db-ivm

npm i https://pkg.pr.new/@tanstack/db-ivm@1294

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@1294

@tanstack/offline-transactions

npm i https://pkg.pr.new/@tanstack/offline-transactions@1294

@tanstack/powersync-db-collection

npm i https://pkg.pr.new/@tanstack/powersync-db-collection@1294

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@1294

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@1294

@tanstack/rxdb-db-collection

npm i https://pkg.pr.new/@tanstack/rxdb-db-collection@1294

@tanstack/solid-db

npm i https://pkg.pr.new/@tanstack/solid-db@1294

@tanstack/svelte-db

npm i https://pkg.pr.new/@tanstack/svelte-db@1294

@tanstack/trailbase-db-collection

npm i https://pkg.pr.new/@tanstack/trailbase-db-collection@1294

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@1294

commit: fc269d5

github-actions · 2026-02-25T08:48:37Z

Size Change: +3.38 kB (+3.65%)

Total Size: 96 kB

Filename	Size	Change
`./packages/db/dist/esm/query/builder/index.js`	4.59 kB	+485 B (+11.83%)	⚠️
`./packages/db/dist/esm/query/compiler/group-by.js`	2.24 kB	+9 B (+0.4%)
`./packages/db/dist/esm/query/compiler/index.js`	2.68 kB	+641 B (+31.5%)	🚨
`./packages/db/dist/esm/query/compiler/order-by.js`	1.5 kB	+52 B (+3.58%)
`./packages/db/dist/esm/query/compiler/select.js`	1.11 kB	+20 B (+1.83%)
`./packages/db/dist/esm/query/ir.js`	738 B	+65 B (+9.66%)	⚠️
`./packages/db/dist/esm/query/live/collection-config-builder.js`	7.65 kB	+2.11 kB (+37.97%)	🚨

ℹ️ View Unchanged

Filename	Size
`./packages/db/dist/esm/collection/change-events.js`	1.39 kB
`./packages/db/dist/esm/collection/changes.js`	1.22 kB
`./packages/db/dist/esm/collection/events.js`	388 B
`./packages/db/dist/esm/collection/index.js`	3.32 kB
`./packages/db/dist/esm/collection/indexes.js`	1.1 kB
`./packages/db/dist/esm/collection/lifecycle.js`	1.75 kB
`./packages/db/dist/esm/collection/mutations.js`	2.34 kB
`./packages/db/dist/esm/collection/state.js`	3.49 kB
`./packages/db/dist/esm/collection/subscription.js`	3.71 kB
`./packages/db/dist/esm/collection/sync.js`	2.41 kB
`./packages/db/dist/esm/deferred.js`	207 B
`./packages/db/dist/esm/errors.js`	4.7 kB
`./packages/db/dist/esm/event-emitter.js`	748 B
`./packages/db/dist/esm/index.js`	2.69 kB
`./packages/db/dist/esm/indexes/auto-index.js`	742 B
`./packages/db/dist/esm/indexes/base-index.js`	766 B
`./packages/db/dist/esm/indexes/btree-index.js`	2.17 kB
`./packages/db/dist/esm/indexes/lazy-index.js`	1.1 kB
`./packages/db/dist/esm/indexes/reverse-index.js`	538 B
`./packages/db/dist/esm/local-only.js`	808 B
`./packages/db/dist/esm/local-storage.js`	2.1 kB
`./packages/db/dist/esm/optimistic-action.js`	359 B
`./packages/db/dist/esm/paced-mutations.js`	496 B
`./packages/db/dist/esm/proxy.js`	3.75 kB
`./packages/db/dist/esm/query/builder/functions.js`	733 B
`./packages/db/dist/esm/query/builder/ref-proxy.js`	1.05 kB
`./packages/db/dist/esm/query/compiler/evaluators.js`	1.43 kB
`./packages/db/dist/esm/query/compiler/expressions.js`	430 B
`./packages/db/dist/esm/query/compiler/joins.js`	2.11 kB
`./packages/db/dist/esm/query/expression-helpers.js`	1.43 kB
`./packages/db/dist/esm/query/live-query-collection.js`	360 B
`./packages/db/dist/esm/query/live/collection-registry.js`	264 B
`./packages/db/dist/esm/query/live/collection-subscriber.js`	2.42 kB
`./packages/db/dist/esm/query/live/internal.js`	145 B
`./packages/db/dist/esm/query/optimizer.js`	2.62 kB
`./packages/db/dist/esm/query/predicate-utils.js`	2.97 kB
`./packages/db/dist/esm/query/subset-dedupe.js`	921 B
`./packages/db/dist/esm/scheduler.js`	1.3 kB
`./packages/db/dist/esm/SortedMap.js`	1.3 kB
`./packages/db/dist/esm/strategies/debounceStrategy.js`	247 B
`./packages/db/dist/esm/strategies/queueStrategy.js`	428 B
`./packages/db/dist/esm/strategies/throttleStrategy.js`	246 B
`./packages/db/dist/esm/transactions.js`	2.9 kB
`./packages/db/dist/esm/utils.js`	924 B
`./packages/db/dist/esm/utils/browser-polyfills.js`	304 B
`./packages/db/dist/esm/utils/btree.js`	5.61 kB
`./packages/db/dist/esm/utils/comparison.js`	952 B
`./packages/db/dist/esm/utils/cursor.js`	457 B
`./packages/db/dist/esm/utils/index-optimization.js`	1.51 kB
`./packages/db/dist/esm/utils/type-guards.js`	157 B

github-actions · 2026-02-25T08:49:13Z

Size Change: 0 B

Total Size: 3.7 kB

ℹ️ View Unchanged

Filename	Size
`./packages/react-db/dist/esm/index.js`	225 B
`./packages/react-db/dist/esm/useLiveInfiniteQuery.js`	1.17 kB
`./packages/react-db/dist/esm/useLiveQuery.js`	1.34 kB
`./packages/react-db/dist/esm/useLiveSuspenseQuery.js`	559 B
`./packages/react-db/dist/esm/usePacedMutations.js`	401 B

Replace O(n) parent collection scans with a reverse index (correlationKey → Set<parentKey>) for attaching child Collections to parent rows. The index is populated during parent INSERTs and cleaned up on parent DELETEs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
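
As a rough illustration of the reverse index described in this commit message, here is a minimal sketch. The class name `ReverseIndex`, its method names, and the key types are assumptions made for this example; the PR's actual data structure may differ in shape and naming.

```typescript
// Hypothetical key types; the PR's actual types may differ.
type CorrelationKey = string | number
type ParentKey = string | number

class ReverseIndex {
  private byCorrelation = new Map<CorrelationKey, Set<ParentKey>>()

  // Called when a parent row is inserted.
  add(correlationKey: CorrelationKey, parentKey: ParentKey): void {
    let parents = this.byCorrelation.get(correlationKey)
    if (!parents) {
      parents = new Set<ParentKey>()
      this.byCorrelation.set(correlationKey, parents)
    }
    parents.add(parentKey)
  }

  // Called when a parent row is deleted; empty sets are dropped
  // so stale correlation keys do not accumulate.
  remove(correlationKey: CorrelationKey, parentKey: ParentKey): void {
    const parents = this.byCorrelation.get(correlationKey)
    if (!parents) return
    parents.delete(parentKey)
    if (parents.size === 0) this.byCorrelation.delete(correlationKey)
  }

  // Direct lookup of the parents for a correlation key,
  // instead of scanning every parent row.
  parentsFor(correlationKey: CorrelationKey): ParentKey[] {
    const parents = this.byCorrelation.get(correlationKey)
    return parents ? Array.from(parents) : []
  }

  get size(): number {
    return this.byCorrelation.size
  }
}
```

Keeping a Set per correlation key is what allows several parents to share one key, which is the case the later "multiple parents share same correlationKey" test exercises.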

samwillis

This is looking really great. Awesome work!

Depending on whether we release before or after the follow-up PRs, it may make sense to add defensive errors for unsupported queries (groupBy, referencing multiple fields on the parent).

samwillis · 2026-02-25T13:49:38Z

packages/db/src/query/builder/index.ts

+          ? where.expression
+          : where
+
+      // Look for eq(a, b) where one side references parent and other references child


I believe this finds the first expression that references both sides, which is correct. We should consider what something like this does:

q.from({ p: projects }).select(({ p }) => ({
  id: p.id,
  name: p.name,
  issues: q
    .from({ i: issues })
    .where(({ i }) => and(eq(i.projectId, p.id), eq(i.createdBy, p.createdBy)))
    .select(({ i }) => ({
      id: i.id,
      title: i.title,
    })),
}))

I suspect it breaks at the moment, and so we may want to throw if there is more than one expression matching both sources.

I think it's possible to make this work though by pulling the parent project value temporarily into the child issue pipeline.

kevin-dp (Feb 26, 2026):

Indeed, it breaks right now because the parent row is not in the child pipeline. I added support for this in #1307.

packages/db/src/query/live/collection-config-builder.ts

samwillis · 2026-02-25T13:58:33Z

packages/db/tests/query/includes.test.ts

+      // Re-add project Alpha — should get a fresh child collection
+      projects.utils.begin()
+      projects.utils.write({
+        type: `insert`,
+        value: { id: 1, name: `Alpha Reborn` },
+      })
+      projects.utils.commit()
+
+      const alpha = collection.get(1) as any
+      expect(alpha).toMatchObject({ id: 1, name: `Alpha Reborn` })
+      expect(childItems(alpha.issues)).toEqual([
+        { id: 10, title: `Bug in Alpha` },
+        { id: 11, title: `Feature for Alpha` },
+      ])


samwillis · 2026-02-25T14:14:33Z

ChatGPT review:

Here’s my review of TanStack/db PR #1294 (adds “includes” subqueries / nested child collections). ([GitHub]1)

What this PR is doing (as I understand it)

New “includes subquery” syntax: returning a QueryBuilder from inside a parent .select() field now becomes an IncludesSubquery IR node, by detecting a correlating eq(child.fk, parent.pk) in the child query’s where. ([GitHub]2)
Compiler support: the compiler extracts those IncludesSubquery nodes, compiles the child query “per parent correlation key”, and plumbs an extra correlationKey through the result tuples so the output layer can route rows into the correct child collection. ([GitHub]3)
Output/runtime support: live query builder wires child pipelines via output() callbacks, creates per-parent child Collections, attaches them onto parent rows, and handles nested includes via a shared-buffer + routing-index approach. ([GitHub]4)
Tests: good coverage for basic includes, reactivity (insert/delete), ordered children, per-parent limit, and 2-level nesting (projects → issues → comments). ([GitHub]5)
Related fix: containsAggregate is now defensive against nested Select objects (important because includes introduces nested select-shaped objects). ([GitHub]6)

Overall: the API is very ergonomic, and the nested routing solution is clever.

Things I like

The user-facing API is dead simple and reads like a real ORM include. ([GitHub]5)
The compiler approach (extract includes early, compile child with a parent-key stream, and then let the live layer attach real child collections) is a solid separation of concerns. ([GitHub]3)
The tests hit the highest-risk areas: ordering + limit per parent + nested includes. ([GitHub]5)

Key concerns / suggested changes

1) Alias collisions between parent and child queries (correctness bug risk)

extractCorrelation() decides “parent vs child” purely by membership in parentAliases / childAliases. If an alias name appears in both sets (e.g. user reuses p inside the child query), the correlation detection can mis-classify and/or silently do the wrong thing. ([GitHub]2)

Suggestion

Enforce disjoint alias sets for includes subqueries at build time:
- If childAliases intersects parentAliases, throw a dedicated error (ideally the same family as DuplicateAliasInSubqueryError used elsewhere). ([GitHub]3)
Also consider extending validateQueryStructure to walk select and validate nested IncludesSubquery nodes too (right now it looks focused on from/join QueryRefs). ([GitHub]3)
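
The disjoint-alias check suggested here could look roughly like the sketch below. `assertDisjointAliases` is a hypothetical helper written for this example, and it throws a plain `Error` rather than the `DuplicateAliasInSubqueryError` mentioned above, since that class's constructor signature is not shown in this thread.

```typescript
// Hypothetical build-time check: the child query must not reuse any parent alias.
function assertDisjointAliases(
  parentAliases: string[],
  childAliases: string[],
): void {
  const parentSet = new Set(parentAliases)
  const overlap = childAliases.filter((alias) => parentSet.has(alias))
  if (overlap.length > 0) {
    throw new Error(
      "Includes subquery reuses parent alias(es): " + overlap.join(", "),
    )
  }
}

// Disjoint aliases pass silently.
assertDisjointAliases(["p"], ["i"])
```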

2) Correlation extraction only matches top-level `eq(ref, ref)`

Right now the includes correlation must be a direct where(() => eq(child.fk, parent.pk)). If someone writes:

.where(({ i }) => and(eq(i.projectId, p.id), eq(i.status, 'open')))

…correlation won’t be found (because it doesn’t traverse and() trees). That’s fine for v1, but it needs to be explicit.

Suggestion

Either:
1. document “correlation must be a top-level where(eq(...))”, and add a clearer error type/message; or
2. traverse boolean expression trees (and/or) to find the first correlating eq.

The current thrown Error(...) is a bit “raw” for public API behavior. ([GitHub]2)
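
To make option 2 concrete, here is a minimal sketch of walking an expression tree to find the first correlating `eq`. The `Ref`/`Val`/`Func` shapes are simplified stand-ins invented for this example, not the library's real IR types, and `findCorrelation` is a hypothetical name. The sketch descends only into `and()`; treating an equality inside `or()` as a correlation would be unsound, because the equality is not guaranteed to hold for every matching row.

```typescript
// Simplified, hypothetical expression IR for illustration only.
type Ref = { kind: "ref"; alias: string; field: string }
type Val = { kind: "val"; value: unknown }
type Func = { kind: "func"; name: string; args: Expr[] }
type Expr = Ref | Val | Func

interface Correlation {
  parentRef: Ref
  childRef: Ref
}

// Return the first eq(ref, ref) that links a parent alias to a child alias,
// searching through nested and() calls.
function findCorrelation(
  expr: Expr,
  parentAliases: Set<string>,
  childAliases: Set<string>,
): Correlation | undefined {
  if (expr.kind !== "func") return undefined

  if (expr.name === "eq" && expr.args.length === 2) {
    const a = expr.args[0]
    const b = expr.args[1]
    if (a && b && a.kind === "ref" && b.kind === "ref") {
      if (parentAliases.has(a.alias) && childAliases.has(b.alias)) {
        return { parentRef: a, childRef: b }
      }
      if (parentAliases.has(b.alias) && childAliases.has(a.alias)) {
        return { parentRef: b, childRef: a }
      }
    }
    return undefined
  }

  if (expr.name === "and") {
    for (const arg of expr.args) {
      const found = findCorrelation(arg, parentAliases, childAliases)
      if (found) return found
    }
  }
  return undefined
}
```

Because it returns on the first match, a second correlating `eq` in the same `and()` would be ignored silently, which is the situation raised in the inline review comment earlier in this thread.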

3) Compiler mutates the query IR’s `select` in-place

replaceIncludesInSelect(query.select, key) replaces includes entries with Val(null) so processSelect() doesn’t see them. That’s convenient, but in-place mutation of a query object that may be cached/reused is a footgun (especially if you ever recompile without cache, or if other passes expect to see includes). ([GitHub]3)

Suggestion

Treat IR as immutable here:
- clone select into selectWithoutIncludes,
- compile using the clone,
- keep the original IR intact.
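
A minimal sketch of the non-mutating variant suggested above, assuming a flat select object. `SelectValue`, `Select`, and `splitIncludes` are hypothetical names for this example; the PR's `replaceIncludesInSelect` works on the real IR and may handle nesting that this sketch does not.

```typescript
// Simplified, hypothetical select IR: a flat map of field name to node.
type SelectValue =
  | { type: "ref"; path: string[] }
  | { type: "val"; value: unknown }
  | { type: "includes"; fieldName: string }
type Select = Record<string, SelectValue>

// Build a copy of the select with includes entries replaced by a null
// value node, and report which fields were includes. The input is not modified.
function splitIncludes(select: Select): { select: Select; includes: string[] } {
  const copy: Select = {}
  const includes: string[] = []
  for (const key of Object.keys(select)) {
    const value = select[key]
    if (value === undefined) continue
    if (value.type === "includes") {
      includes.push(key)
      copy[key] = { type: "val", value: null }
    } else {
      copy[key] = value
    }
  }
  return { select: copy, includes }
}

const originalSelect: Select = {
  id: { type: "ref", path: ["p", "id"] },
  issues: { type: "includes", fieldName: "issues" },
}
const split = splitIncludes(originalSelect)
```

The copy is shallow: untouched nodes are shared between the original and the clone, which is safe only as long as later passes do not mutate those nodes either.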

4) Parent “correlation key” changes on update aren’t handled cleanly

In flushIncludesState, Phase 1 runs on changes.inserts > 0, so it will also run for updates (which often show as delete+insert deltas internally). But Phase 5 only cleans up on “pure delete” (deletes > 0 && inserts === 0). ([GitHub]4)

If a parent row’s correlation value can change (even if rare), you can end up with:

stale correlationToParentKeys membership under the old key,
orphaned child collections / routing entries,
parent row still pointing at the old child collection.

Suggestion

Either explicitly state/validate that the correlation field must be stable (usually the parent primary key), or
Track the last correlationKey per parent key and on update:
- remove old mapping + routing entries when the correlationKey changes,
- attach the new child collection.

Given the current code already supports “multiple parents per correlationKey”, it’s close—just missing the “move” case. ([GitHub]4)
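
The "move" case could be handled along the lines of the sketch below, which remembers the last correlation key seen for each parent. `CorrelationTracker` and its methods are hypothetical names for this example; only the idea of a `correlationToParentKeys`-style mapping comes from the review text above.

```typescript
type Key = string | number

class CorrelationTracker {
  private lastKeyByParent = new Map<Key, Key>()
  private parentsByCorrelation = new Map<Key, Set<Key>>()

  // Record the parent's current correlation key. If the key changed since the
  // last call, unlink the old mapping and return the old key so the caller can
  // detach the old child collection.
  upsert(parentKey: Key, correlationKey: Key): Key | undefined {
    const previous = this.lastKeyByParent.get(parentKey)
    let movedFrom: Key | undefined = undefined
    if (previous !== undefined && previous !== correlationKey) {
      this.unlink(parentKey, previous)
      movedFrom = previous
    }
    this.lastKeyByParent.set(parentKey, correlationKey)
    let parents = this.parentsByCorrelation.get(correlationKey)
    if (!parents) {
      parents = new Set<Key>()
      this.parentsByCorrelation.set(correlationKey, parents)
    }
    parents.add(parentKey)
    return movedFrom
  }

  // Remove a deleted parent from both maps.
  remove(parentKey: Key): void {
    const previous = this.lastKeyByParent.get(parentKey)
    if (previous === undefined) return
    this.unlink(parentKey, previous)
    this.lastKeyByParent.delete(parentKey)
  }

  parentsFor(correlationKey: Key): Key[] {
    const parents = this.parentsByCorrelation.get(correlationKey)
    return parents ? Array.from(parents) : []
  }

  private unlink(parentKey: Key, correlationKey: Key): void {
    const parents = this.parentsByCorrelation.get(correlationKey)
    if (!parents) return
    parents.delete(parentKey)
    if (parents.size === 0) this.parentsByCorrelation.delete(correlationKey)
  }
}
```

The extra map costs one entry per parent row, which is the complexity trade-off weighed in the reply further down.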

5) Child collection IDs use `String(correlationKey)`

id: `${parentId}-${fieldName}-${String(correlationKey)}`

If correlationKey is an object/composite, String() becomes "[object Object]" → collisions.
If it contains awkward chars, IDs get messy.

Suggestion

If you want IDs to be stable + readable, consider serializeValue(correlationKey) (you’re already using serializeValue elsewhere) or a tiny hash of a stable serialization. ([GitHub]6)
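
The collision is easy to reproduce. The sketch below uses a hand-written `stableSerialize` as a stand-in for the library's `serializeValue`, whose exact behaviour is not shown in this thread; both `stableSerialize` and `childCollectionId` are hypothetical names for this example.

```typescript
// Stand-in for a stable serializer: sorts object keys so that equal values
// always produce the same string. Not the library's serializeValue.
function stableSerialize(value: unknown): string {
  if (value === null || typeof value !== "object") {
    // Quote strings so that "1" and 1 serialize differently.
    return typeof value === "string" ? JSON.stringify(value) : String(value)
  }
  if (value instanceof Date) return `Date(${value.getTime()})`
  if (Array.isArray(value)) return `[${value.map(stableSerialize).join(",")}]`
  const record = value as Record<string, unknown>
  const parts = Object.keys(record)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${stableSerialize(record[key])}`)
  return `{${parts.join(",")}}`
}

function childCollectionId(
  parentId: string,
  fieldName: string,
  correlationKey: unknown,
): string {
  return `${parentId}-${fieldName}-${stableSerialize(correlationKey)}`
}

// String() maps every plain object to the same text, so these two collide:
const collidingA = String({ tenant: 1 })
const collidingB = String({ tenant: 2 })

// The serialized IDs stay distinct.
const idA = childCollectionId("live-1", "issues", { tenant: 1 })
const idB = childCollectionId("live-1", "issues", { tenant: 2 })
```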

6) Internal `__correlationKey` property naming

Compiler/runtime appears to rely on a magic __correlationKey on the source row for grouping/limit-per-parent logic. If user data can contain that field, there’s collision risk. ([GitHub]3)

Suggestion

Use a Symbol, or a namespaced internal key that can’t realistically collide, or store correlation metadata outside user rows.
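
As an illustration of the Symbol option, the sketch below stamps correlation metadata under a symbol key so it cannot clash with a user field that happens to be called `__correlationKey`. `CORRELATION_KEY`, `stampRow`, and `readCorrelationKey` are hypothetical names for this example and do not reflect how the PR stores the value.

```typescript
// A module-private symbol cannot collide with any string-keyed user field.
const CORRELATION_KEY = Symbol("correlationKey")

type Row = Record<string, unknown>
type StampedRow = Row & { [CORRELATION_KEY]?: string | number }

// Return a copy of the row carrying the correlation key under the symbol.
function stampRow(row: Row, key: string | number): StampedRow {
  return { ...row, [CORRELATION_KEY]: key }
}

function readCorrelationKey(row: StampedRow): string | number | undefined {
  return row[CORRELATION_KEY]
}

// A user row that already has a string field named __correlationKey.
const userRow: Row = { id: 1, __correlationKey: "user data" }
const stampedRow = stampRow(userRow, 42)
```

Symbol-keyed properties are skipped by `Object.keys` and `JSON.stringify`, so the stamp also stays out of most serialized output.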

Tests I’d add (small but high value)

Alias collision: parent uses alias p, child also uses alias p → assert it throws a clear error.
Correlation inside and(): demonstrate either supported traversal or a friendly “must be top-level eq” error.
Parent update that changes correlation value (even if discouraged): verify the old child collection detaches and the new one attaches (or verify it’s rejected).
Multiple parents share same correlationKey (since your reverse index supports it): ensure attach/update works for all parents in the set. ([GitHub]4)

Bottom line

This is a strong feature with a very nice API and solid initial test coverage. The biggest things I’d address before merging are:

alias overlap validation (likely correctness),
avoid mutating query IR in compiler,
define behavior/constraints for correlation extraction (top-level eq vs expression traversal),
and handle or forbid parent correlationKey changes.

kevin-dp · 2026-02-26T14:52:29Z

@samwillis response to Codex's review:

Alias collisions between parent and child queries (correctness bug risk)

Valid concern but I don't think it's a real risk in practice. The child query is built via the builder API where the user explicitly declares aliases in .from({ i: issues }). If they reuse p as a child alias, it would shadow the parent's p in the closure scope — so p.id in the child's .where() would already refer to the child's p, not the parent's.

Correlation extraction only matches top-level eq(ref, ref)

Fixed in #1307

Compiler mutates the query IR’s select in-place

This is a fair observation but the compiler already runs on the output of optimizeQuery(), which returns a new object. And the cache is keyed by the raw query, with queryMapping linking optimized → raw. So in practice the mutation happens on a fresh optimized copy, not the user's original IR.

Still a valid code hygiene concern, but it's out of scope for this PR. If it were to be fixed, it should be its own cleanup.

Parent “correlation key” changes on update aren’t handled cleanly

The correlation field is almost always the parent's primary key (e.g., p.id in eq(i.projectId, p.id)). PKs don't change by definition.

Could a user correlate on a non-PK field? Technically yes, but it would be semantically wrong — the correlation field determines how children are grouped to parents. If it's not stable, the entire grouping model is broken, not just the cleanup logic.

I'd lean toward the first suggestion: document/validate that the correlation field should be a stable key (which it naturally is in every real use case). The "move" case handling would add complexity for a scenario that doesn't really make sense to support.

Child collection IDs use String(correlationKey)

True — the correlation field doesn't have to be the PK (which are restricted to string | number). Someone could correlate on any field, and arbitrary field values could be objects, arrays, dates, etc. Even though it's unusual, String() would silently produce "[object Object]" and cause collisions.

Using serializeValue is a cheap one-line fix that makes it robust. Will do.

Internal __correlationKey property naming

This is fine. We have a couple of these reserved properties around the code. It's unlikely someone would use this name.

Regarding the additional tests:

Alias collisions: not needed because, as explained above, this is handled by standard shadowing in TS.
Correlation inside and(): these tests already exist in the follow-up PRs, since we introduced support for this there.
Parent update that changes correlation value: As discussed, the correlation field is practically always the PK, which doesn't change. Testing this would be testing undefined behavior. I'd rather document the constraint than test around it.
Multiple parents sharing same correlationKey: This is a good one. It tests a real scenario (e.g., multiple projects with the same foreign key value). Added this one.

kevin-dp added 6 commits February 24, 2026 14:06

Add support for subqueries in select

45f1065

Unit tests for includes

2467505

Unit tests for ordered subqueries

e702f80

Add support for ordered subquery

f592b81

Unit tests for subqueries with limit

697502c

Support LIMIT and OFFSET in subqueries

1c2728c

ci: apply automated fixes

16ac62b

Add changeset for includes subqueries

d117fea

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kevin-dp force-pushed the kevin/includes branch from 260157e to d117fea · February 25, 2026 09:53

kevin-dp and others added 5 commits February 25, 2026 11:16

ci: apply automated fixes

54830d4

Unit tests for changes to deeply nested collections

d324941

Move from top-down to bottom-up approach for flushing changes to nested collections.

f809526

ci: apply automated fixes

15b9862

samwillis reviewed Feb 25, 2026

kevin-dp added 4 commits February 26, 2026 14:45

Prefix child collection names to avoid clashes

ec21dbf

Properly serialize correlation key before using it in collection ID

3d74cf5

Additional test as suggested by Codex review

f590917

Unit test to ensure that correlation field does not need to be in the parent select

3ea52ef

Stamp __includesCorrelationKeys on the result before output, and flushIncludesState reads from that stamp. The stamp is cleaned up at the end of flush so it never leaks to the user

3ca70b9

kevin-dp requested a review from samwillis February 26, 2026 15:06

ci: apply automated fixes

fc269d5

kevin-dp wants to merge 19 commits into main from kevin/includes

2 participants
