@2witstudios 2witstudios commented Feb 10, 2026

Summary

Alert status

Testing

  • not run in this environment: pnpm --filter @pagespace/lib test
  • not run in this environment: pnpm --filter @pagespace/lib typecheck
  • not run in this environment: pnpm --filter web test -- 'src/app/api/drives/[driveId]/search/regex/__tests__/route.test.ts'

Reason: dependency install is blocked by network (ENOTFOUND registry.npmjs.org), so local vitest/tsc are unavailable in this worktree.

Summary by CodeRabbit

  • New Features

    • Batch cell update API for efficient sheet modifications
    • Safer, capped regex search with line-preview support and query timeout handling
  • Bug Fixes

    • Improved formula reference adjustment to reliably handle complex patterns and long tokens
  • Tests

    • Expanded test coverage for sheet formulas and regex-driven search behavior, including timeout scenarios

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@coderabbitai
Contributor

coderabbitai bot commented Feb 10, 2026

📝 Walkthrough

Refactors sheet formula reference adjustment to a manual parser, adds batch sheet update utility, and hardens regex page search with limits, timeout-aware transactions, literal-only line previews, and expanded tests including a large-input performance case.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Sheet Formula Handling: packages/lib/src/sheets/sheet.ts, packages/lib/src/__tests__/sheet.test.ts | Replaced regex-based adjustFormulaReferences with a manual parsing loop and helpers (isUpperAsciiLetter, isAsciiDigit, splitEncodedCellAddress). Added updateSheetCells for batch updates. Expanded tests to include a long-uppercase-run performance case. |
| Drive Search Service: packages/lib/src/services/drive-search-service.ts, packages/lib/src/services/__tests__/drive-search-service.test.ts | Added regex-related limits and constants, timeout detection, literal-pattern detection, and helper extractLiteralMatchingLines. Refactored regexSearchPages to set statement_timeout in transactions, cap results, and return timeout-aware responses. New tests cover literal matches, unsafe regex handling, and PostgreSQL statement-timeout behavior using mocked transactions. |
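
The manual parsing loop itself is not shown in this thread. As a rough sketch of the idea only: the helper names isUpperAsciiLetter and isAsciiDigit come from the summary above, but their bodies, the CellReference type, and the findCellReferences function are assumptions for illustration, not the code in sheet.ts. A single forward pass can pick out A1-style references without a regex, so a long run of uppercase letters is scanned in linear time with no backtracking:

```typescript
// Hypothetical sketch; not the implementation in packages/lib/src/sheets/sheet.ts.
function isUpperAsciiLetter(ch: string): boolean {
  return ch.length === 1 && ch >= "A" && ch <= "Z";
}

function isAsciiDigit(ch: string): boolean {
  return ch.length === 1 && ch >= "0" && ch <= "9";
}

// Assumed shape for a parsed reference (illustration only).
interface CellReference {
  column: string; // e.g. "AB"
  row: number; // e.g. 12
  start: number; // index of the first character in the formula
  end: number; // index one past the last character
}

// Scans left to right; the index only ever moves forward.
function findCellReferences(formula: string): CellReference[] {
  const refs: CellReference[] = [];
  let i = 0;
  while (i < formula.length) {
    if (!isUpperAsciiLetter(formula.charAt(i))) {
      i++;
      continue;
    }
    const start = i;
    while (i < formula.length && isUpperAsciiLetter(formula.charAt(i))) i++;
    const digitsStart = i;
    while (i < formula.length && isAsciiDigit(formula.charAt(i))) i++;
    // A reference needs at least one letter followed by at least one digit.
    if (i > digitsStart) {
      refs.push({
        column: formula.slice(start, digitsStart),
        row: Number(formula.slice(digitsStart, i)),
        start,
        end: i,
      });
    }
  }
  return refs;
}

const refs = findCellReferences("=SUM(A1:B22)");
console.log(refs.map((r) => `${r.column}${r.row}`));
```

This sketch deliberately ignores absolute markers, sheet prefixes, string literals, and function names that end in digits; a real parser would need to handle those cases.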

Sequence Diagram

sequenceDiagram
    actor Client
    participant RegexSearchService
    participant Database
    participant ErrorHandler

    Client->>RegexSearchService: requestRegexSearch(pattern, searchIn, maxResults)
    RegexSearchService->>RegexSearchService: validatePatternLength() / isLiteralRegexPattern()
    RegexSearchService->>Database: begin transaction + set local statement_timeout
    RegexSearchService->>Database: execute regex search query
    alt Query Succeeds
        Database-->>RegexSearchService: matchingPages
        RegexSearchService->>RegexSearchService: for each page: compute path, if literal -> extractLiteralMatchingLines()
        RegexSearchService-->>Client: results (pages, matchingLines, totalMatches)
    else Statement Timeout / DB Error
        Database-->>ErrorHandler: error
        ErrorHandler->>RegexSearchService: isRegexQueryTimeoutError?
        RegexSearchService->>RegexSearchService: buildRegexTimeoutResponse(driveSlug, pattern, searchIn)
        RegexSearchService-->>Client: timeoutResponse (empty results, guidance, stats=0)
    end
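
The error branch of the diagram can be sketched roughly as follows. The function name isRegexQueryTimeoutError comes from the walkthrough, but its body, the RegexSearchResponse shape, and the searchWithTimeoutHandling wrapper are assumptions for illustration. The one firm fact used here is that PostgreSQL reports a statement_timeout cancellation with SQLSTATE 57014 (query_canceled):

```typescript
// Hypothetical sketch; response field names are assumptions, not the service's API.
const PG_QUERY_CANCELED = "57014"; // SQLSTATE raised when statement_timeout fires

function isRegexQueryTimeoutError(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const code = (error as { code?: unknown }).code;
  return code === PG_QUERY_CANCELED;
}

interface RegexSearchResponse {
  success: boolean;
  timedOut: boolean;
  results: string[];
  totalMatches: number;
  message?: string;
}

// runQuery stands in for the transaction that sets the timeout and runs the search.
async function searchWithTimeoutHandling(
  runQuery: () => Promise<string[]>,
): Promise<RegexSearchResponse> {
  try {
    const results = await runQuery();
    return { success: true, timedOut: false, results, totalMatches: results.length };
  } catch (error) {
    if (isRegexQueryTimeoutError(error)) {
      // Return an empty, well-formed response instead of surfacing a raw DB error.
      return {
        success: false,
        timedOut: true,
        results: [],
        totalMatches: 0,
        message: "Search timed out; try a simpler or more specific pattern.",
      };
    }
    throw error; // Anything else is a genuine failure and should propagate.
  }
}
```

SQLSTATE 57014 covers any canceled query, not only timeouts, so a stricter check might also inspect the error message.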

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through formulas, letters long and grand,

Tuned regex fences, and held timeouts by the hand.
Lines now safe to peek, updates quick and neat,
Tests jump, tails flick—performance stays upbeat!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title '[lib] Mitigate regex ReDoS paths' directly and clearly describes the main objective of the pull request, which is to address ReDoS (Regular Expression Denial of Service) vulnerabilities in regex handling, as evidenced by the primary changes to regexSearchPages and adjustFormulaReferences. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@packages/lib/src/services/drive-search-service.ts`:
- Around line 295-324: extractLiteralMatchingLines currently lowercases both
pattern and lines for matching, causing previews to be case-insensitive while
the upstream PostgreSQL query uses the case-sensitive ~ operator; to fix, make
the function do case-sensitive matching by removing .toLowerCase() and using the
original literalPattern (e.g., set needle = literalPattern) and replace
line.toLowerCase().includes(needle) with line.includes(needle) so previews match
DB behavior; keep use of MAX_REGEX_LINE_PREVIEWS and
MAX_REGEX_LINE_CONTENT_LENGTH unchanged.
- Around line 386-400: The transaction block using db.transaction currently
executes sql`SET LOCAL statement_timeout = ${REGEX_QUERY_TIMEOUT_MS}` which
cannot accept bind parameters; replace that SET LOCAL call with a parameterized
set_config call so the timeout is passed as a parameter (e.g., call sql`SELECT
set_config('statement_timeout', ${REGEX_QUERY_TIMEOUT_MS}::text, true)` before
the select). Update the code around db.transaction / matchingPages so the
set_config SELECT runs inside the same transaction prior to the tx.select,
keeping the third argument true to make the setting local to the transaction.
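
To illustrate the first item, here is a minimal sketch of a case-sensitive line-preview helper. The function and constant names are taken from the comment above, but the constant values, the MatchingLine shape, and the body are assumptions; the actual code in drive-search-service.ts is not shown in this thread:

```typescript
// Assumed values for illustration; the real constants live in drive-search-service.ts.
const MAX_REGEX_LINE_PREVIEWS = 5;
const MAX_REGEX_LINE_CONTENT_LENGTH = 200;

// Assumed result shape (hypothetical).
interface MatchingLine {
  lineNumber: number; // 1-based
  content: string; // truncated preview of the line
}

function extractLiteralMatchingLines(content: string, literalPattern: string): MatchingLine[] {
  if (literalPattern.length === 0) return [];
  const matches: MatchingLine[] = [];
  const lines = content.split("\n");
  for (let index = 0; index < lines.length; index++) {
    if (matches.length >= MAX_REGEX_LINE_PREVIEWS) break; // cap the number of previews
    const line = lines[index] ?? "";
    // Case-sensitive on purpose, to agree with PostgreSQL's ~ operator.
    if (line.includes(literalPattern)) {
      matches.push({
        lineNumber: index + 1,
        content: line.slice(0, MAX_REGEX_LINE_CONTENT_LENGTH),
      });
    }
  }
  return matches;
}

console.log(extractLiteralMatchingLines("Hello world\nhello again\nbye", "hello"));
```

With case-sensitive matching, only the second line is previewed here, which is the same set of lines the ~ operator would treat as matching.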

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/lib/src/services/drive-search-service.ts (1)

370-376: ⚠️ Potential issue | 🔴 Critical

SQL operator precedence bug: OR without parentheses bypasses drive/trash filters.

In the "both" branch, the sql fragment content ~ <pattern> OR title ~ <pattern> is combined with and(eq(driveId, …), eq(isTrashed, false), …). Drizzle's and() emits … AND … AND <fragment> without adding parentheses around a raw sql fragment, so the generated SQL becomes:

"driveId" = $1 AND "isTrashed" = false AND "content" ~ $2 OR "title" ~ $3

Because AND binds tighter than OR, this is parsed as (… AND content ~ $2) OR (title ~ $3), meaning a title match skips the drive and trash filters entirely and can return pages from any drive.

Wrap the OR in parentheses:

Proposed fix
     whereConditions = and(
       eq(pages.driveId, driveId),
       eq(pages.isTrashed, false),
-      sql`${pages.content} ~ ${pgPattern} OR ${pages.title} ~ ${pgPattern}`
+      sql`(${pages.content} ~ ${pgPattern} OR ${pages.title} ~ ${pgPattern})`
     );
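
The same precedence rule can be reproduced outside SQL. The sketch below is an illustration only, with a hypothetical PageFlags type standing in for the four predicates; TypeScript's && binds tighter than ||, just as SQL's AND binds tighter than OR:

```typescript
// Hypothetical flags standing in for the four SQL predicates.
interface PageFlags {
  inDrive: boolean; // "driveId" = $1
  notTrashed: boolean; // "isTrashed" = false
  contentMatches: boolean; // "content" ~ pattern
  titleMatches: boolean; // "title" ~ pattern
}

// Mirrors the generated SQL without parentheses: a AND b AND c OR d.
function unparenthesized(p: PageFlags): boolean {
  return p.inDrive && p.notTrashed && p.contentMatches || p.titleMatches;
}

// Mirrors the proposed fix: a AND b AND (c OR d).
function parenthesized(p: PageFlags): boolean {
  return p.inDrive && p.notTrashed && (p.contentMatches || p.titleMatches);
}

// A page in some other drive whose title happens to match the pattern.
const foreignPage: PageFlags = {
  inDrive: false,
  notTrashed: true,
  contentMatches: false,
  titleMatches: true,
};

console.log(unparenthesized(foreignPage)); // true: the page leaks past the drive filter
console.log(parenthesized(foreignPage)); // false: the drive filter applies
```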
🧹 Nitpick comments (2)
packages/lib/src/services/drive-search-service.ts (2)

389-389: Minor: set_config expects its setting name and value as text, so pass the timeout as a string.

REGEX_QUERY_TIMEOUT_MS is a number (3000). While the ::text SQL cast should handle this, it's cleaner to pass the value as a string directly, which avoids relying on a cast inside the query and matches the set_config(text, text, boolean) signature.

Suggested tweak
       await tx.execute(
-        sql`SELECT set_config('statement_timeout', ${REGEX_QUERY_TIMEOUT_MS}::text, true)`
+        sql`SELECT set_config('statement_timeout', ${String(REGEX_QUERY_TIMEOUT_MS)}, true)`
       );
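
For readers less familiar with why set_config is used here: SET LOCAL does not accept bind parameters, while set_config is an ordinary function call, so the value can travel as a parameter. A minimal sketch of the resulting parameterized statement, written as a plain { text, values } object rather than through Drizzle (the buildStatementTimeoutQuery helper and the ParameterizedQuery type are hypothetical):

```typescript
const REGEX_QUERY_TIMEOUT_MS = 3000; // value quoted in the review comment above

// Assumed shape, similar to the query objects many PostgreSQL clients accept.
interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Hypothetical helper: builds the statement that scopes a timeout to the
// current transaction (third argument true means "local to this transaction").
function buildStatementTimeoutQuery(timeoutMs: number): ParameterizedQuery {
  if (!Number.isInteger(timeoutMs) || timeoutMs < 0) {
    throw new RangeError("timeoutMs must be a non-negative integer");
  }
  return {
    text: "SELECT set_config('statement_timeout', $1, true)",
    // A bare number is interpreted by PostgreSQL as milliseconds.
    values: [String(timeoutMs)],
  };
}

console.log(buildStatementTimeoutQuery(REGEX_QUERY_TIMEOUT_MS));
```

The statement only has an effect when it runs inside the same transaction as the query it is meant to bound.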

424-443: N+1 parent-chain queries per matched page.

Each result page walks up its parent chain with individual SELECT queries. For deep hierarchies or many results (up to 100), this can generate a large number of round-trips. This appears to be pre-existing behavior, so no action required for this PR, but it's worth noting for future optimization (e.g., a recursive CTE or pre-fetching parents in bulk).
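
As a sketch of the bulk pre-fetch alternative mentioned above: once the relevant pages have been loaded in one query (or via a recursive CTE), each path can be resolved in memory with no further round-trips. The PageRow shape, the buildPagePath name, and the slash-separated path format are all assumptions for illustration:

```typescript
// Hypothetical row shape; the real pages table has more columns.
interface PageRow {
  id: string;
  title: string;
  parentId: string | null;
}

// Resolves a page path from rows that were already fetched in bulk.
function buildPagePath(pageId: string, pagesById: Map<string, PageRow>): string {
  const segments: string[] = [];
  const visited = new Set<string>(); // guards against accidental parent cycles
  let currentId: string | null = pageId;
  while (currentId !== null && !visited.has(currentId)) {
    visited.add(currentId);
    const page = pagesById.get(currentId);
    if (!page) break; // parent was not in the pre-fetched set
    segments.unshift(page.title);
    currentId = page.parentId;
  }
  return "/" + segments.join("/");
}

const rows: PageRow[] = [
  { id: "a", title: "Root", parentId: null },
  { id: "b", title: "Docs", parentId: "a" },
  { id: "c", title: "Page", parentId: "b" },
];
const pagesById = new Map<string, PageRow>();
for (const row of rows) pagesById.set(row.id, row);

console.log(buildPagePath("c", pagesById));
```

The trade-off is memory for round-trips: the map must hold every ancestor of every matched page, which is usually small compared with up to 100 sequential parent lookups per search.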

- Add JSDoc for security constants explaining ReDoS mitigation strategy
- Fix SQL OR precedence bug: wrap OR clause in parentheses to prevent
  bypassing drive/trash filters (critical security fix)
- Pass statement_timeout as string to set_config for cleaner PG compat
- Add observability: console.warn on regex query timeouts
- Align test with case-sensitive PostgreSQL ~ operator behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@2witstudios
Owner Author

Addressing Code Review Feedback

Thanks for the thorough review @coderabbitai! I've addressed the feedback in commit 77e25155:

Critical Issues Fixed:

  1. SQL OR Precedence Bug - Wrapped the OR clause in parentheses to prevent bypassing drive/trash filters:
    (content ~ pattern OR title ~ pattern)

Minor Issues Fixed:

  1. Case-sensitivity alignment - The test now uses lowercase 'hello' in both lines to match PostgreSQL's case-sensitive ~ operator behavior. The implementation was already case-sensitive.

  2. String timeout parameter - Changed REGEX_QUERY_TIMEOUT_MS::text to String(REGEX_QUERY_TIMEOUT_MS) for cleaner PostgreSQL compatibility.

Observability Added:

  1. Added console.warn for regex query timeouts to enable monitoring of potential attack patterns.

Documentation Added:

  1. Added JSDoc explaining the security constants and ReDoS mitigation strategy.

The N+1 parent-chain query issue is noted as pre-existing behavior - will address in a future optimization PR.

@coderabbitai
Contributor

coderabbitai bot commented Feb 10, 2026

@2witstudios, excellent work addressing the feedback! 🎉

The fixes look great:

  1. SQL OR Precedence - Wrapping the OR clause in parentheses is exactly the right fix to ensure the drive/trash filters aren't bypassed.

  2. Case-sensitivity - Good catch aligning the test expectations with PostgreSQL's ~ operator behavior.

  3. String timeout parameter - Using String(REGEX_QUERY_TIMEOUT_MS) is cleaner and more idiomatic.

  4. Observability - Adding console.warn for timeouts is smart for detecting potential attack patterns in production.

  5. Documentation - JSDoc explaining the security constants will help future maintainers understand the ReDoS mitigation strategy.

The approach to defer the N+1 parent-chain query optimization to a future PR makes sense since it's pre-existing behavior and separate from the security fixes in this PR.

The security hardening looks solid now! 🐰

@2witstudios 2witstudios merged commit 2b35bc9 into master Feb 10, 2026
3 checks passed