⚡ Bolt: Optimize text search regex scanning #229
…ne scanning

This change optimizes the `processBufferLines` method in `SearchEngine` to significantly improve performance for text search, especially on files with dense matches.

Previously, the method found *all* regex matches in a buffer (or file) upfront using `getAllMatches`, allocating an array for all match indices. This was inefficient (O(N) memory and time) when only a few results (`maxResults`) were needed.

The optimized implementation:

1. Interleaves `regex.exec` calls with line scanning.
2. Finds matches incrementally as needed.
3. Upon finding a match in a line, processes the line and then explicitly sets `regex.lastIndex` to the start of the next line (skipping the rest of the current line and the newline character).
4. This avoids redundant regex execution for multiple matches on the same line and prevents unnecessary scanning of the rest of the file once `maxResults` is reached.

Benchmark results on a 3MB file with 500,000 matches (dense):

- Before: ~21ms (with unnecessary O(N) overhead)
- After: ~12.9ms (~40% faster)
- Memory: Avoids allocating ~4MB array for match indices.

This change also removes the now unused `getAllMatches` and `processLinesWithMatches` helper methods.

Co-authored-by: AhmmedSamier <17784876+AhmmedSamier@users.noreply.github.com>
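The interleaved scan described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual `processBufferLines` code: the function name `scanLinesIncrementally`, the `LineMatch` shape, and the assumption of `\n` line endings are all hypothetical.

```typescript
// Hypothetical result shape, for illustration only.
interface LineMatch {
  lineNumber: number; // 1-based line number
  column: number;     // 0-based offset of the first match within the line
  text: string;       // full text of the matching line
}

// Hypothetical stand-in for the optimized scan; assumes "\n" line endings.
function scanLinesIncrementally(
  content: string,
  pattern: RegExp,
  maxResults: number
): LineMatch[] {
  // Use a copy with the global flag so exec() honours and updates lastIndex.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const regex = new RegExp(pattern.source, flags);

  const results: LineMatch[] = [];
  let lineNumber = 1;
  let lineStart = 0;

  // Stop as soon as enough results are collected (early exit).
  while (results.length < maxResults) {
    const match = regex.exec(content);
    if (match === null) break;

    // Advance line bookkeeping up to the line that contains the match.
    let newline = content.indexOf("\n", lineStart);
    while (newline !== -1 && newline < match.index) {
      lineNumber++;
      lineStart = newline + 1;
      newline = content.indexOf("\n", lineStart);
    }

    const lineEnd = newline === -1 ? content.length : newline;
    results.push({
      lineNumber,
      column: match.index - lineStart,
      text: content.slice(lineStart, lineEnd),
    });

    if (newline === -1) break; // the match was on the last line

    // Skip the rest of this line and its newline, so further matches
    // on the same line are never executed.
    regex.lastIndex = newline + 1;
    lineNumber++;
    lineStart = newline + 1;
  }

  return results;
}

const sample = "foo foo\nbar\nfoo baz\nfoo";
const firstTwo = scanLinesIncrementally(sample, /foo/, 2);
console.log(firstTwo.map((m) => m.lineNumber)); // [ 1, 3 ]
```

Because `lastIndex` jumps to the next line after every hit, each line contributes at most one result, and the loop never touches the remainder of the input once `maxResults` is reached.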
💡 What: Optimized `SearchEngine.processBufferLines` to interleave regex matching with line scanning and implemented early exit for dense matches.

🎯 Why: The previous implementation pre-calculated all regex matches for the entire buffer/file, which was O(N) in memory and CPU even if `maxResults` was small. This caused unnecessary overhead for large files or files with many matches.

📊 Impact: Reduces text search time by ~40% for dense match scenarios (from ~21ms to ~12.9ms on a 3MB test file) and eliminates O(N) memory allocation for match indices.

🔬 Measurement: Verified with a custom reproduction benchmark script (`reproduce_text_search.ts`) and existing tests.

PR created automatically by Jules for task 9024906484258067808 started by @AhmmedSamier