-
Notifications
You must be signed in to change notification settings - Fork 2
[lib] Mitigate regex ReDoS paths #532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughRefactors sheet formula reference adjustment to a manual parser, adds batch sheet update utility, and hardens regex page search with limits, timeout-aware transactions, literal-only line previews, and expanded tests including a large-input performance case. Changes
Sequence DiagramsequenceDiagram
actor Client
participant RegexSearchService
participant Database
participant ErrorHandler
Client->>RegexSearchService: requestRegexSearch(pattern, searchIn, maxResults)
RegexSearchService->>RegexSearchService: validatePatternLength() / isLiteralRegexPattern()
RegexSearchService->>Database: begin transaction + set local statement_timeout
RegexSearchService->>Database: execute regex search query
alt Query Succeeds
Database-->>RegexSearchService: matchingPages
RegexSearchService->>RegexSearchService: for each page: compute path, if literal -> extractLiteralMatchingLines()
RegexSearchService-->>Client: results (pages, matchingLines, totalMatches)
else Statement Timeout / DB Error
Database-->>ErrorHandler: error
ErrorHandler->>RegexSearchService: isRegexQueryTimeoutError?
RegexSearchService->>RegexSearchService: buildRegexTimeoutResponse(driveSlug, pattern, searchIn)
RegexSearchService-->>Client: timeoutResponse (empty results, guidance, stats=0)
end
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@packages/lib/src/services/drive-search-service.ts`:
- Around line 295-324: extractLiteralMatchingLines currently lowercases both
pattern and lines for matching, causing previews to be case-insensitive while
the upstream PostgreSQL query uses the case-sensitive ~ operator; to fix, make
the function do case-sensitive matching by removing .toLowerCase() and using the
original literalPattern (e.g., set needle = literalPattern) and replace
line.toLowerCase().includes(needle) with line.includes(needle) so previews match
DB behavior; keep use of MAX_REGEX_LINE_PREVIEWS and
MAX_REGEX_LINE_CONTENT_LENGTH unchanged.
- Around line 386-400: The transaction block using db.transaction currently
executes sql`SET LOCAL statement_timeout = ${REGEX_QUERY_TIMEOUT_MS}` which
cannot accept bind parameters; replace that SET LOCAL call with a parameterized
set_config call so the timeout is passed as a parameter (e.g., call sql`SELECT
set_config('statement_timeout', ${REGEX_QUERY_TIMEOUT_MS}::text, true)` before
the select). Update the code around db.transaction / matchingPages so the
set_config SELECT runs inside the same transaction prior to the tx.select,
keeping the third argument true to make the setting local to the transaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/lib/src/services/drive-search-service.ts (1)
370-376:⚠️ Potential issue | 🔴 CriticalSQL operator precedence bug:
ORwithout parentheses bypasses drive/trash filters.In the
bothbranch, thesqlfragmentcontent ~ $1 OR title ~ $2is combined withand(eq(driveId, …), eq(isTrashed, false), …). Drizzle'sand()emits… AND … AND <fragment>, so the generated SQL becomes:"driveId" = $1 AND "isTrashed" = false AND "content" ~ $2 OR "title" ~ $3Because
ANDbinds tighter thanOR, this is parsed as(… AND content ~ $2) OR (title ~ $3), meaning a title match skips the drive and trash filters entirely and can return pages from any drive.Wrap the
ORin parentheses:Proposed fix
whereConditions = and( eq(pages.driveId, driveId), eq(pages.isTrashed, false), - sql`${pages.content} ~ ${pgPattern} OR ${pages.title} ~ ${pgPattern}` + sql`(${pages.content} ~ ${pgPattern} OR ${pages.title} ~ ${pgPattern})` );
🧹 Nitpick comments (2)
packages/lib/src/services/drive-search-service.ts (2)
389-389: Minor:set_configexpects all-text arguments — pass the timeout as a string.
REGEX_QUERY_TIMEOUT_MSis anumber(3000). While the::textSQL cast should handle this, it's cleaner to pass the value as a string directly to avoid relying on implicit PG casts and to match whatset_config(text, text, boolean)expects.Suggested tweak
await tx.execute( - sql`SELECT set_config('statement_timeout', ${REGEX_QUERY_TIMEOUT_MS}::text, true)` + sql`SELECT set_config('statement_timeout', ${String(REGEX_QUERY_TIMEOUT_MS)}, true)` );
424-443: N+1 parent-chain queries per matched page.Each result page walks up its parent chain with individual
SELECTqueries. For deep hierarchies or many results (up to 100), this can generate a large number of round-trips. This appears to be pre-existing behavior, so no action required for this PR, but it's worth noting for future optimization (e.g., a recursive CTE or pre-fetching parents in bulk).
- Add JSDoc for security constants explaining ReDoS mitigation strategy - Fix SQL OR precedence bug: wrap OR clause in parentheses to prevent bypassing drive/trash filters (critical security fix) - Pass statement_timeout as string to set_config for cleaner PG compat - Add observability: console.warn on regex query timeouts - Align test with case-sensitive PostgreSQL ~ operator behavior Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Addressing Code Review FeedbackThanks for the thorough review @coderabbitai! I've addressed the feedback in commit Critical Issues Fixed:
Minor Issues Fixed:
Observability Added:
Documentation Added:
The N+1 parent-chain query issue is noted as pre-existing behavior - will address in a future optimization PR. |
|
Tip For best results, initiate chat on the files or code changes.
The fixes look great:
The approach to defer the N+1 parent-chain query optimization to a future PR makes sense since it's pre-existing behavior and separate from the security fixes in this PR. The security hardening looks solid now! 🐰 (_/) If you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
Summary
regexSearchPagesagainst user-controlled regex execution by removing dynamicnew RegExp(pattern, ...)usestatement_timeouthandling and stable timeout responsesadjustFormulaReferenceswith a linear scanner/parser to avoid polynomial backtrackingAlert status
js/regex-injection), Fix/typescript test errors #90 (js/polynomial-redos)Testing
pnpm --filter @pagespace/lib testpnpm --filter @pagespace/lib typecheckpnpm --filter web test -- 'src/app/api/drives/[driveId]/search/regex/__tests__/route.test.ts'Reason: dependency install is blocked by network (
ENOTFOUND registry.npmjs.org), so localvitest/tscare unavailable in this worktree.Summary by CodeRabbit
New Features
Bug Fixes
Tests