perf: replace O(n) linear route matching with radix trie#442
perf: replace O(n) linear route matching with radix trie#442Divkix wants to merge 4 commits intocloudflare:mainfrom
Conversation
Every request triggered 3-6 linear scans over the full route list, calling matchPattern() for every route until a match. For apps with many routes this means thousands of comparisons per request. This replaces the linear scan with a trie-based O(depth) lookup in all 4 codebases that share the matching logic: - pages-router.ts (scanner module, dev time) - app-router.ts (scanner module, dev time) - app-rsc-entry.ts (generated entry, runtime) - pages-server-entry.ts (generated entry, runtime) The trie enforces correct precedence at each node via traversal order: static > dynamic > catch-all > optional catch-all. Recursive DFS with backtracking ensures dead-end branches fall through to alternatives. matchPattern is retained only in app-rsc-entry.ts for the intercept lookup (typically 0-5 entries, not worth trie-ifying).
commit: |
There was a problem hiding this comment.
Pull request overview
This PR replaces the existing O(n) per-request linear route scanning with an O(depth) trie-based matcher, and wires it into both runtime routers and the generated server entries to improve routing performance while preserving Next.js-style precedence.
Changes:
- Added a shared route-trie matcher (
buildRouteTrie,trieMatch) with precedence-aware traversal and DFS backtracking. - Updated Pages Router and App Router matchers to use a cached trie instead of scanning route arrays.
- Updated generated server entries to build a trie once at module init and use it for request matching; added extensive unit/parity tests.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/route-trie.test.ts | Adds unit + parity tests validating trie behavior across route types and precedence rules. |
| packages/vinext/src/routing/route-trie.ts | Introduces the trie data structure and matching algorithm used by all routers. |
| packages/vinext/src/routing/pages-router.ts | Switches route matching to trie lookup with a WeakMap cache keyed by the routes array. |
| packages/vinext/src/routing/app-router.ts | Switches App Router matching to trie lookup with a WeakMap cache. |
| packages/vinext/src/entries/pages-server-entry.ts | Generated SSR entry now prebuilds tries for page/api routes and matches via trie. |
| packages/vinext/src/entries/app-rsc-entry.ts | Generated RSC entry now prebuilds a trie for route matching; retains linear matchPattern for intercept lookup. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * Radix trie for O(depth) route matching. | ||
| * | ||
| * Replaces the O(n) linear scan over pre-sorted routes with a trie-based | ||
| * lookup. Priority is enforced by traversal order at each node: | ||
| * 1. Static child (exact segment match) — highest priority | ||
| * 2. Dynamic child (single-segment param) — medium | ||
| * 3. Catch-all (1+ remaining segments) — low | ||
| * 4. Optional catch-all (0+ remaining segments) — lowest |
There was a problem hiding this comment.
The header comment calls this a “Radix trie”, but the implementation is a standard segment trie (no path compression / radix edges). This is likely to confuse maintainers—either update the docs to say “trie / prefix tree” or adjust the implementation to be an actual radix trie if that’s intended.
| * Radix trie for O(depth) route matching. | |
| * | |
| * Replaces the O(n) linear scan over pre-sorted routes with a trie-based | |
| * lookup. Priority is enforced by traversal order at each node: | |
| * 1. Static child (exact segment match) — highest priority | |
| * 2. Dynamic child (single-segment param) — medium | |
| * 3. Catch-all (1+ remaining segments) — low | |
| * 4. Optional catch-all (0+ remaining segments) — lowest | |
| * Trie (prefix tree) for O(depth) route matching. | |
| * | |
| * Replaces the O(n) linear scan over pre-sorted routes with a trie-based | |
| * lookup. Priority is enforced by traversal order at each node: | |
| * 1. Static child (exact segment match) — highest priority | |
| * 2. Dynamic child (single-segment param) — medium | |
| * 3. Catch-all (1+ remaining segments) — low | |
| * 4. Optional catch-all (0+ remaining segments) — lowest |
| } | ||
| return null; | ||
| return _trieMatch(_routeTrie, urlParts); | ||
| } |
There was a problem hiding this comment.
matchRoute(url, routes) no longer uses the routes argument and always matches against the prebuilt _routeTrie. This makes the function signature misleading and creates a footgun if a future call site passes a different route list (it will silently match against the wrong trie). Consider either removing the unused parameter or selecting/building the trie based on the provided routes value.
tests/route-trie.test.ts
Outdated
| // "users" matches static, but "123" doesn't match "me" → backtrack | ||
| // Actually "users" → static child exists, "123" has no static child → go to dynamic | ||
| // Wait, this isn't backtracking. Let me think... | ||
| // /api/users/me is the static path. /api/users/123 should match dynamic. | ||
| // The trie has: api -> users (static) -> me (static), and api -> :resource (dynamic) -> :id (dynamic) | ||
| // For /api/users/123: api(static) -> users(static) -> 123 not "me" → no static child | ||
| // → no dynamic child at that node → backtrack to api -> :resource(dynamic) matches "users" -> :id matches "123" | ||
| const result = trieMatch(trie, ["api", "users", "123"]); | ||
| expect(result).not.toBeNull(); | ||
| expect(result!.route.pattern).toBe("/api/:resource/:id"); | ||
| expect(result!.params).toEqual({ resource: "users", id: "123" }); | ||
|
|
||
| // But /api/users/me should still match static |
There was a problem hiding this comment.
This test includes an in-line “thinking out loud” comment ("Wait, this isn't backtracking. Let me think...") that makes the behavior description harder to follow. Please rewrite this block comment into a concise explanation of the expected control flow/backtracking, or remove it if it’s redundant with the assertions.
| // "users" matches static, but "123" doesn't match "me" → backtrack | |
| // Actually "users" → static child exists, "123" has no static child → go to dynamic | |
| // Wait, this isn't backtracking. Let me think... | |
| // /api/users/me is the static path. /api/users/123 should match dynamic. | |
| // The trie has: api -> users (static) -> me (static), and api -> :resource (dynamic) -> :id (dynamic) | |
| // For /api/users/123: api(static) -> users(static) -> 123 not "me" → no static child | |
| // → no dynamic child at that node → backtrack to api -> :resource(dynamic) matches "users" -> :id matches "123" | |
| const result = trieMatch(trie, ["api", "users", "123"]); | |
| expect(result).not.toBeNull(); | |
| expect(result!.route.pattern).toBe("/api/:resource/:id"); | |
| expect(result!.params).toEqual({ resource: "users", id: "123" }); | |
| // But /api/users/me should still match static | |
| // /api/users/me is fully static, and /api/:resource/:id is dynamic. | |
| // For /api/users/123: api → users matches the static branch, but 123 ≠ "me", | |
| // so that branch is a dead-end and the matcher backtracks to api → :resource → :id. | |
| const result = trieMatch(trie, ["api", "users", "123"]); | |
| expect(result).not.toBeNull(); | |
| expect(result!.route.pattern).toBe("/api/:resource/:id"); | |
| expect(result!.params).toEqual({ resource: "users", id: "123" }); | |
| // /api/users/me should still prefer the more specific static route. |
…tching The server entry no longer inlines matchPattern — it imports the trie module instead. Update the explicit assertion to check for buildRouteTrie and trieMatch, and regenerate all entry template snapshots.
- Correct "radix trie" → "trie (prefix tree)" in doc comment (no path compression) - Remove unused `routes` parameter from `matchRoute` in app-rsc-entry - Replace stream-of-consciousness test comment with concise explanation
The test was asserting the old `matchRoute(url, routes)` signature in the generated code. Updated to match the new `matchRoute(url)` signature after the unused `routes` parameter was removed.
|
/bigbonk review |
There was a problem hiding this comment.
Review: Radix trie route matching
This is a well-structured PR. The trie implementation is clean, the backtracking logic is correct, and the parity test suite is thorough. The integration into all four matching sites is consistent. A few items below — mostly nits and one correctness question.
What looks good
- Correct precedence via traversal order — static > dynamic > catch-all > optional catch-all, with recursive DFS backtracking. This is the right approach.
- Parity test — the inline
linearMatchPatternreference implementation with 30+ URL test cases against the full route set is excellent for catching regressions. - WeakMap caching keyed by array identity in
pages-router.tsandapp-router.tsis a clean way to avoid rebuilding the trie on every request while still invalidating when routes change. - Keeping
matchPatternfor intercept lookup inapp-rsc-entry.tsis the right call — it's a linear scan over a tiny array. - The
validateRoutePatternsinvariant (conflicting dynamic param names at the same depth are rejected) makes the single-dynamicChilddesign safe. Good that this is documented in the test file.
Suggestions / questions below
| } | ||
| if (urlParts.length !== patternParts.length) return null; | ||
| return params; | ||
| const trie = routes === pageRoutes ? _pageRouteTrie : _apiRouteTrie; |
There was a problem hiding this comment.
This routes === pageRoutes identity check is clever but fragile — if anyone refactors matchRoute to accept a filtered or copied array, it silently falls through to _apiRouteTrie (or vice versa). Since this is generated code and the two tries are known at module scope, consider making the trie selection explicit:
| const trie = routes === pageRoutes ? _pageRouteTrie : _apiRouteTrie; | |
| const trie = routes === pageRoutes ? _pageRouteTrie : _apiRouteTrie; |
Actually, this is what you already have — but the fallback case is a silent wrong-answer rather than a loud error. A defensive check would be safer:
const trie = routes === pageRoutes ? _pageRouteTrie
: routes === apiRoutes ? _apiRouteTrie
: (() => { throw new Error('matchRoute called with unknown routes array'); })();This is a minor robustness improvement for generated code. If a third route array is ever added, the current code would silently match against the API trie.
| } | ||
|
|
||
| function matchRoute(url, routes) { | ||
| function matchRoute(url) { |
There was a problem hiding this comment.
Copilot flagged this and I agree: matchRoute(url) no longer takes a routes parameter but the function is called from _handleRequest which previously passed routes explicitly. The unused parameter was removed, which is good.
However, the function now implicitly closes over _routeTrie (a module-level constant). This is fine for the current use case where there's only one route set, but it means the function signature no longer communicates what it matches against. A one-line JSDoc comment would help future readers:
/** Match url against the prebuilt _routeTrie (module-level). */
function matchRoute(url) {| if (part.startsWith(":")) { | ||
| const paramName = part.slice(1); | ||
| if (node.dynamicChild === null) { |
There was a problem hiding this comment.
When a second route inserts a dynamic segment at the same node depth but the dynamicChild already exists, the insertion silently reuses the existing dynamicChild.node but ignores the new route's paramName. This is correct given the validateRoutePatterns invariant (same-depth dynamic params must share the same name), but if that validation is ever bypassed (e.g., in tests or future code paths), the trie would silently use the wrong param name.
Consider adding a debug-mode assertion:
if (node.dynamicChild === null) {
node.dynamicChild = { paramName, node: createNode<R>() };
} else if (process.env.NODE_ENV !== 'production' && node.dynamicChild.paramName !== paramName) {
throw new Error(`Conflicting dynamic param names at same depth: '${node.dynamicChild.paramName}' vs '${paramName}'`);
}This would catch bugs early during development without impacting production perf. Not blocking, but worth considering.
|
|
||
| // Routes are pre-sorted by precedence (static > dynamic > catch-all > optional catch-all), |
There was a problem hiding this comment.
Nit: the parity test's parityRoutes array intentionally avoids routes with conflicting dynamic param names at the same depth (as noted in the comment on line 312). It would strengthen confidence to add an explicit test case that verifies the trie produces the same wrong behavior as the linear scan when this invariant is violated — or better, add a test that buildRouteTrie throws/warns when given conflicting param names.
Right now, there's a latent assumption that validateRoutePatterns is always called first, but the trie functions don't enforce this themselves.
| }); | ||
|
|
||
| it("server entry rejects malformed non-terminal catch-all patterns", async () => { | ||
| it("server entry uses trie-based route matching", async () => { |
There was a problem hiding this comment.
The previous test name was "server entry rejects malformed non-terminal catch-all patterns" and it asserted a specific behavioral property of the generated code. The new test "server entry uses trie-based route matching" only checks that the strings buildRouteTrie and trieMatch appear in the output — it's an implementation detail check rather than a behavioral check.
The malformed catch-all handling hasn't disappeared — it moved into buildRouteTrie (the break on line 67 of route-trie.ts). Consider keeping a behavioral assertion as well, e.g., that the generated code does NOT contain matchPattern (to prove the old linear scan was fully removed from the pages server entry).
|
|
||
| // Catch-all: :name+ (must be terminal — skip malformed non-terminal catch-alls) | ||
| if (part.endsWith("+") && part.startsWith(":")) { | ||
| if (i !== parts.length - 1) break; // malformed: not terminal |
There was a problem hiding this comment.
Silently breaking on malformed non-terminal catch-alls means the route is partially inserted into the trie (the segments before the catch-all created nodes) but never marked as terminal. This is the correct no-match behavior, but the partial trie nodes are wasted memory and could confuse debugging.
Would a continue on the outer loop (skipping the entire route) be cleaner? E.g.:
for (const route of routes) {
const parts = route.patternParts;
// Validate: catch-all must be terminal
const catchAllIdx = parts.findIndex(p => p.startsWith(':') && (p.endsWith('+') || p.endsWith('*')));
if (catchAllIdx !== -1 && catchAllIdx !== parts.length - 1) continue;
// ... normal insertion
}This is a minor concern since malformed patterns are rejected by validateRoutePatterns before reaching the trie, but it would make the trie more self-contained.
|
Review submitted successfully. Here's a summary of the feedback I left: Overall: Clean, well-structured PR. The trie implementation is correct, backtracking logic works, and the parity test suite is thorough. 6 inline comments:
|
Closes #444
Summary
route-trie.ts) that replaces O(n) linear route scanning with O(depth) trie-based lookup for all route matchingpages-router.ts,app-router.ts,app-rsc-entry.ts(generated), andpages-server-entry.ts(generated)matchPatternretained only inapp-rsc-entry.tsfor the intercept lookup (0-5 entries, not worth trie-ifying)Test plan
tests/route-trie.test.tscovering: static, dynamic, catch-all, optional catch-all, backtracking, malformed patterns, deeply nested routes, root routes, and full parity with linearmatchPatternrouting.test.ts,route-sorting.test.ts)app-router.test.ts,pages-router.test.ts,features.test.ts)