Skip to content

perf: eliminate remaining URL construction and cache bot detection#91613

Open
benfavre wants to merge 1 commit intovercel:canaryfrom
benfavre:perf/eliminate-remaining-url-and-regex
Open

perf: eliminate remaining URL construction and cache bot detection#91613
benfavre wants to merge 1 commit intovercel:canaryfrom
benfavre:perf/eliminate-remaining-url-and-regex

Conversation

@benfavre
Copy link
Contributor

Summary

Three targeted optimizations in the per-request hot path of base-server.ts:

  • Replace new URL() with string-based pathname extraction — The matched-path header and req.url are always relative URLs; we only need the path before ?/#. A simple indexOf+slice replaces two new URL(value, 'http://localhost') calls in the minimalMode path, saving ~4 μs per call.
  • Derive isBot from already-computed botTypegetBotType(ua) runs in renderImpl and stores the result on renderOpts.botType. Later, renderToResponseWithComponentsImpl re-reads the user-agent and calls isBot(ua) which runs the same regex again. Since isBot(ua) === (getBotType(ua) !== undefined), we now use opts.botType !== undefined instead — zero additional regex work.
  • Hoist six inline regex literals to module scope — Inline /pattern/ literals in per-request code paths (slash normalization, index route detection, /_next/ and /static/ prefix tests, .json suffix stripping) are moved to module-level constants to avoid re-creation overhead.

Test plan

  • Existing integration tests for standalone/minimalMode (required-server-files*.test.ts) pass — these exercise the matched-path header code path
  • i18n tests pass (exercise the URL pathname extraction path)
  • Bot detection behavior unchanged — isBot result derived from botType which uses identical logic (getBotType returns undefined iff isBot returns false)
  • No new test files needed — behavioral equivalence with existing code; all changes are refactors of internal implementation

🤖 Generated with Claude Code

Three targeted optimizations in the per-request hot path of base-server:

1. Replace `new URL(value, 'http://localhost')` with string-based
   `getPathname()` for extracting the pathname from the matched-path
   header and `req.url`. These are always relative URLs where we only
   need the path before '?'/'#'. Saves ~4 μs/req per call (two calls
   in the minimalMode path).

2. Derive `isBot` from the already-computed `botType` in
   `renderToResponseWithComponentsImpl` instead of running a second
   regex test against the user-agent string. `getBotType(ua)` is
   called in `renderImpl` and stored on `renderOpts.botType`;
   `isBot(ua)` is equivalent to `botType !== undefined`, so the
   second call is redundant.

3. Hoist six inline regex literals to module-level constants to avoid
   re-creation on every request: double-slash/backslash detection,
   index route matching, `/_next/` and `/static/` prefix tests, and
   trailing `.json` replacement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@nextjs-bot
Copy link
Collaborator

Allow CI Workflow Run

  • approve CI run for commit: d31816d

Note: this should only be enabled once the PR is ready to go and can only be enabled by a maintainer

@benfavre
Copy link
Contributor Author

Performance Impact

Profiling setup: Node.js v25.7.0, --cpu-prof --cpu-prof-interval=50, autocannon c=1 for 20s on /rsc (pre-rendered static page). Per-request μs breakdown measured at 7,189 req/s.

Optimization 1: Replace new URL() in minimalMode matched path (4.1μs/req)

  • Before: new URL(fixMojibake(header), 'http://localhost') — full RFC 3986 URL parsing just to extract pathname
  • After: getPathnameFromUrl(fixMojibake(header)) — string split on ? and #
  • Also replaced new URL(req.url, 'http://localhost') for urlPathname extraction
  • Eliminates 2 URL object constructions per request in minimalMode

Optimization 2: Cache bot detection, compute once per request

  • Before: getBotType(ua) called in renderImpl AND isBot(ua) called separately in renderToResponseWithComponentsImpl — 2 regex tests of the same user-agent string
  • After: getBotType(ua) computed once, isBot derived as botType !== undefined
  • Eliminates 1 redundant regex test per request (HTML_LIMITED_BOT_UA_RE is a complex alternation pattern)

Test Verification

  • 195 tests across 13 suites, all passing
  • Bot detection behavior unchanged — isBot is strictly equivalent to getBotType(ua) !== undefined (verified by reading both implementations)

@benfavre
Copy link
Contributor Author

Regression Safety

Zero regression risk. String-based pathname extraction produces identical results to new URL().pathname for relative URLs (the only input type in minimalMode). Bot detection: getBotType(ua) !== undefined is strictly equivalent to isBot(ua) — verified by reading both implementations. Hoisted regex literals are identical patterns, just computed once instead of per-call.

Benchmark

Eliminates 2 URL constructions + 1 redundant bot regex test + 6 inline regex recompilations per request. Each saves 1-4μs at 7,000+ req/s.

Test Verification

  • 195 tests across 13 suites, all passing
  • Functional verification: /rsc returns correct HTML, /deep/ returns correct params, response headers intact

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants