fix: escape regex metacharacters in glob exclude patterns #235

Merged: askpt merged 3 commits into main from repo-assist/improve-glob-escape-2026-04-05-a3f43f45ae3607f0 on Apr 5, 2026
@github-actions github-actions bot commented Apr 5, 2026

🤖 This is an automated PR from Repo Assist.

Problem

isExcluded() in codeLensProvider.ts converts glob exclude patterns to regular expressions but does not escape regex metacharacters in the literal portions of the pattern.

For example, the default pattern **/*.min.js was compiled into the regex:

^.*/[^/]*.min.js$

Each literal . acts as a regex wildcard (matches any character), so a file like path/to/fooXminXjs would incorrectly match and be excluded. Similarly, patterns like **/*.test.* or **/*.spec.* could exclude unexpected files.
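The false positive can be reproduced in isolation. The following is a minimal sketch, assuming the conversion result is anchored with `^` and `$`; `naiveGlobToRegex` is a hypothetical stand-in for the pre-fix logic and is not the actual `isExcluded()` code.

```typescript
// Hypothetical reconstruction of the pre-fix conversion (name is illustrative only).
function naiveGlobToRegex(pattern: string): RegExp {
  const body = pattern
    .replace(/\*\*/g, "___DOUBLESTAR___") // protect ** from the single-* step
    .replace(/\*/g, "[^/]*")              // * stays within one path segment
    .replace(/___DOUBLESTAR___/g, ".*");  // ** may cross directory separators
  // Assumption: the pattern must match the whole path.
  return new RegExp(`^${body}$`);
}

const naive = naiveGlobToRegex("**/*.min.js");

// Intended match: the dots are real dots.
console.log(naive.test("path/to/foo.min.js")); // true

// Unintended match: each unescaped "." also accepts "X".
console.log(naive.test("path/to/fooXminXjs")); // true
```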

Fix

Save ** and * wildcard tokens as null-byte placeholders before calling the standard regex-escape step, then restore the placeholders as their intended regex tokens (.* / [^/]*). This ensures only the intended wildcard tokens are special; all other characters in the pattern (including .) are matched literally.

Before:

const regexPattern = normalizedPattern
  .replace(/\*\*/g, "___DOUBLESTAR___")
  .replace(/\*/g, "[^/]*")
  .replace(/___DOUBLESTAR___/g, ".*");

After:

const regexPattern = normalizedPattern
  .replace(/\*\*/g, "\x00DS\x00")          // placeholder for **
  .replace(/\*/g, "\x00S\x00")             // placeholder for *
  .replace(/[.+?^${}()|[\]\\]/g, "\\$&")  // escape regex metacharacters
  .replace(/\x00DS\x00/g, ".*")            // ** matches across directories
  .replace(/\x00S\x00/g, "[^/]*");         // single * matches within directory

The same fix is applied to the filename-only branch.
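As a self-contained illustration of the "After" snippet, the sketch below wraps the same replace chain in a function. The name `globToRegex` and the `^...$` anchoring are assumptions made for the example; the real method may structure this differently.

```typescript
// Hypothetical helper wrapping the post-fix replace chain (not the project's actual API).
function globToRegex(pattern: string): RegExp {
  const body = pattern
    .replace(/\*\*/g, "\x00DS\x00")         // placeholder for **
    .replace(/\*/g, "\x00S\x00")            // placeholder for *
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")  // escape regex metacharacters in literals
    .replace(/\x00DS\x00/g, ".*")           // ** matches across directories
    .replace(/\x00S\x00/g, "[^/]*");        // * matches within one path segment
  // Assumption: the pattern must match the whole path.
  return new RegExp(`^${body}$`);
}

const fixed = globToRegex("**/*.min.js");
console.log(fixed.test("path/to/foo.min.js")); // true
console.log(fixed.test("path/to/fooXminXjs")); // false

// Other metacharacters in literal portions are matched literally as well.
console.log(globToRegex("src/(a+b).ts").test("src/(a+b).ts")); // true
```

The ordering matters: the restored tokens `.*` and `[^/]*` themselves contain characters from the escape class (`.`, `[`, `^`, `]`), so they have to be substituted in after the escape step, not before it.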

Also included

Corrected the JSDoc on UnifiedFunctionMetrics.startLine / endLine from "(1-based)" to "(0-based)". The language analyzers return 0-based line numbers for function boundaries (as confirmed by the existing assertion results[0].startLine === 1 // 0-based line number on line 610 of csharpAnalyzer.test.ts), and createAnalyzer passes them through without normalization.

Added unit tests to codeLensProvider.test.ts verifying that literal dots in glob patterns are matched literally and not as regex wildcards — covering both the full-path branch (patterns with a /) and the filename-only branch (patterns without a /):

  • Full-path: **/*.min.js excludes foo.min.js but does not exclude fooXminXjs; **/*.spec.* excludes app.spec.ts but does not exclude appXspecXts
  • Filename-only: *.generated.* does not exclude fooXgeneratedXcs; *.min.js excludes foo.min.js but does not exclude fooXminXjs
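The cases listed above can be expressed as a small table-driven check. This is only a sketch: `isExcludedSketch` is a hypothetical stand-in that assumes patterns containing a `/` are tested against the full path and all others against the file name, and the `src/` paths are made up for illustration. The real tests live in codeLensProvider.test.ts and exercise the provider itself.

```typescript
// Hypothetical stand-in for the two branches of isExcluded(); not the project's code.
function isExcludedSketch(filePath: string, patterns: string[]): boolean {
  const normalizedPath = filePath.replace(/\\/g, "/");
  const fileName = normalizedPath.split("/").pop() || normalizedPath;
  return patterns.some((pattern) => {
    const body = pattern
      .replace(/\*\*/g, "\x00DS\x00")
      .replace(/\*/g, "\x00S\x00")
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\x00DS\x00/g, ".*")
      .replace(/\x00S\x00/g, "[^/]*");
    const regex = new RegExp(`^${body}$`);
    // Assumed split: full-path branch for patterns with "/", filename-only otherwise.
    return pattern.indexOf("/") !== -1 ? regex.test(normalizedPath) : regex.test(fileName);
  });
}

// [pattern, file path, expected exclusion]
const cases: Array<[string, string, boolean]> = [
  ["**/*.min.js", "src/foo.min.js", true],
  ["**/*.min.js", "src/fooXminXjs", false],
  ["**/*.spec.*", "src/app.spec.ts", true],
  ["**/*.spec.*", "src/appXspecXts", false],
  ["*.generated.*", "src/fooXgeneratedXcs", false],
  ["*.min.js", "src/foo.min.js", true],
  ["*.min.js", "src/fooXminXjs", false],
];

for (const [pattern, file, expected] of cases) {
  const actual = isExcludedSketch(file, [pattern]);
  if (actual !== expected) {
    throw new Error(`${pattern} vs ${file}: expected ${expected}, got ${actual}`);
  }
}
console.log("all cases passed");
```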

Test Status

  • npm run compile — passes with no errors
  • npm run lint — passes with no warnings
  • ✅ Unit tests added covering regex metacharacter escaping for both full-path and filename-only pattern branches
  • ⚠️ npm test: not run, because it requires a VS Code binary download that the sandboxed environment does not allow. The existing isExcluded tests (patterns *.generated.*, **/bin/**, and test*) were traced by hand against the new conversion and are expected to keep passing.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@7c7feb61a52b662eb2089aa2945588b7a200d404

The isExcluded() method in codeLensProvider.ts converted glob patterns to
regular expressions but did not escape regex metacharacters (. + ? ^ $ etc.)
present in the literal parts of the pattern. As a result a pattern like
`**/*.min.js` compiled to `^.*/[^/]*.min.js$` where each literal `.` acted
as a wildcard instead of matching a literal dot. This meant a file such as
`fooXminXjs` would incorrectly be excluded.

Fix: save `**` and `*` wildcards as null-byte placeholders before calling
the standard regex escape (replace special chars with their escaped form),
then restore the placeholders as their intended regex tokens. This ensures
only the wildcard tokens are special; all other characters are matched
literally.

Also correct the JSDoc on UnifiedFunctionMetrics.startLine / endLine from
"1-based" to "0-based", matching the actual values returned by all language
analyzers (and confirmed by the unit test assertion on line 610 of
csharpAnalyzer.test.ts).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions bot added the automation, bug (Something isn't working), enhancement (New feature or request), and repo-assist labels Apr 5, 2026
@askpt askpt changed the title from "[Repo Assist] fix: escape regex metacharacters in glob exclude patterns" to "fix: escape regex metacharacters in glob exclude patterns" Apr 5, 2026
@askpt askpt marked this pull request as ready for review April 5, 2026 13:31
@askpt askpt self-requested a review as a code owner April 5, 2026 13:31
Copilot AI review requested due to automatic review settings April 5, 2026 13:31
Copilot AI left a comment

Pull request overview

Fixes incorrect exclusions caused by unescaped regex metacharacters when converting glob exclude patterns to RegExp in the CodeLens provider, and aligns UnifiedFunctionMetrics position docs with actual 0-based analyzer output.

Changes:

  • Escape regex metacharacters in glob-to-regex conversion while preserving * / ** wildcard semantics via placeholders.
  • Apply the same escaping logic to the filename-only matching branch.
  • Correct UnifiedFunctionMetrics.startLine / endLine JSDoc to indicate 0-based line numbers.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
src/providers/codeLensProvider.ts | Makes glob exclude matching safe by escaping regex metacharacters while preserving wildcard tokens.
src/metricsAnalyzer/metricsAnalyzerFactory.ts | Updates JSDoc to reflect 0-based function boundary line numbering used throughout analyzers/consumers.

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

codecov bot commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 12.50000% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.26%. Comparing base (ac67ea2) to head (5f57546).
⚠️ Report is 2 commits behind head on main.

Files with missing lines | Patch % | Lines
src/providers/codeLensProvider.ts | 0.00% | 14 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #235      +/-   ##
==========================================
- Coverage   67.45%   67.26%   -0.19%     
==========================================
  Files           8        8              
  Lines        2928     2936       +8     
  Branches      276      276              
==========================================
  Hits         1975     1975              
- Misses        951      959       +8     
  Partials        2        2              

askpt commented Apr 5, 2026

@copilot increase the code coverage: #235 (comment)

Copilot AI commented Apr 5, 2026

@copilot increase the code coverage: #235 (comment)

Added coverage for the filename-only branch (patterns without a /) in commit 5f57546:

  • *.generated.* should not exclude fooXgeneratedXcs (literal dots, filename-only branch)
  • *.min.js should exclude foo.min.js
  • *.min.js should not exclude fooXminXjs

Combined with the earlier full-path branch cases (**/*.min.js, **/*.spec.*), both code paths in isExcluded now have regression coverage for regex metacharacter escaping.

Copilot finished work on behalf of askpt April 5, 2026 15:04
@askpt askpt merged commit dd06c47 into main Apr 5, 2026
24 of 25 checks passed
@askpt askpt deleted the repo-assist/improve-glob-escape-2026-04-05-a3f43f45ae3607f0 branch April 5, 2026 15:24