regex surface API #3222

Merged: bobzhang merged 4 commits into main from yuxiang/regex-api on Feb 26, 2026
hackwaly (Contributor) commented Feb 12, 2026

Summary

This PR introduces a public Regex surface API for string processing, hardens Unicode/surrogate correctness in regex matching, and aligns substring APIs with panic-on-invalid-index behavior.

What changed

  • Exposed Regex as a public type and added pattern-based construction with lazy internal compilation.
  • Added higher-level Regex capabilities for:
    • literal regex construction
    • regex composition (sequence and alternation)
    • quantifier-based repetition
    • iterative matching, regex-based splitting, and callback-based replacement
  • Expanded match result helpers for full match content, group access, named-group access, and before/after slices.
  • Re-exported Regex from prelude for easier usage.
  • Added user-facing documentation and comprehensive tests for API behavior and edge cases.
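
The PR description does not show the MoonBit signatures, so the following is only an analogy written against Python's standard `re` module. It sketches the listed capabilities: literal construction, sequence and alternation, bounded repetition, iterative matching, splitting, callback-based replacement, and the match helpers. The helper names `literal`, `seq`, `alt`, and `repeat` are hypothetical and are not taken from the PR.

```python
import re

# Hypothetical combinators that build pattern text; not the MoonBit API.
def literal(text):
    # Escape metacharacters so the text only matches itself.
    return re.escape(text)

def seq(*parts):
    # Sequence: each part must match, one after another.
    return "".join(f"(?:{p})" for p in parts)

def alt(*parts):
    # Alternation: any one of the parts may match.
    return "(?:" + "|".join(f"(?:{p})" for p in parts) + ")"

def repeat(part, lo, hi):
    # Quantifier: between lo and hi repetitions of part.
    return f"(?:{part}){{{lo},{hi}}}"

digits = repeat("[0-9]", 1, 3)
key = "(?P<key>[a-z]+)"
value = f"(?P<value>{digits})"
pair = re.compile(seq(key, alt(literal("="), literal(":")), value))

text = "a=1, bb:22; ccc=333"

# Iterative matching with named-group access.
found = [(m.group("key"), m.group("value")) for m in pair.finditer(text)]

# Full match content plus the slices before and after the first match.
first = pair.search(text)
before, whole, after = text[:first.start()], first.group(0), text[first.end():]

# Regex-based splitting.
parts = re.split(r"[,;]\s*", text)

# Callback-based replacement: the callback receives each match.
doubled = pair.sub(lambda m: f"{m.group('key')}={int(m.group('value')) * 2}", text)
```

Here `found` holds one `(key, value)` tuple per match, and `doubled` is the input with every value doubled and every separator normalized to `=`.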

Unicode and boundary fixes

  • Character-class parsing now clips positive ranges to valid profile code-point sets.
  • Unicode profile handling excludes surrogate holes to avoid half-surrogate matches.
  • Word-boundary handling now distinguishes start/end non-word categories around surrogate pairs, fixing incorrect non-word-boundary behavior in those positions.
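
As a rough illustration of the range clipping described above, the Python sketch below intersects a character-class range with the set of Unicode scalar values, that is, every code point except the surrogate block U+D800 to U+DFFF. It assumes the Unicode profile's valid set is exactly that; the function name `clip_range` is hypothetical and the PR's actual implementation may differ.

```python
SURROGATE_LO, SURROGATE_HI = 0xD800, 0xDFFF
MAX_CODE_POINT = 0x10FFFF

# Assumed valid set for a Unicode profile: all code points minus the surrogate hole.
VALID_RANGES = [(0x0000, SURROGATE_LO - 1), (SURROGATE_HI + 1, MAX_CODE_POINT)]

def clip_range(lo, hi):
    """Return the parts of the inclusive range [lo, hi] that lie in VALID_RANGES."""
    clipped = []
    for valid_lo, valid_hi in VALID_RANGES:
        start, end = max(lo, valid_lo), min(hi, valid_hi)
        if start <= end:  # keep only non-empty intersections
            clipped.append((start, end))
    return clipped
```

Under this assumption, a class such as `[\u{D000}-\u{E000}]` would be split into two ranges on either side of the hole, and a range that lies entirely inside the surrogate block would clip to nothing, so it could never match half of a surrogate pair.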

Behavioral change

  • String::sub and StringView::sub now panic on invalid indices instead of raising a view-creation error.
  • The corresponding view-creation error type is removed from the public surface.
  • Related call sites, tests, and changelog entries are updated.
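
The PR text does not define what counts as an invalid index. The Python sketch below assumes two cases over UTF-16 code units: an index outside the string, and an index that falls between the two halves of a surrogate pair. Python has no panic, so an uncaught `RuntimeError` stands in for one; `utf16_units`, `is_valid_boundary`, and `sub` are hypothetical names used only for illustration.

```python
def utf16_units(text):
    # Encode to UTF-16 and return the 16-bit code units as integers.
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

def is_valid_boundary(units, index):
    # Assumption: an index is invalid if it is out of range or splits a surrogate pair.
    if index < 0 or index > len(units):
        return False
    if 0 < index < len(units):
        high = 0xD800 <= units[index - 1] <= 0xDBFF
        low = 0xDC00 <= units[index] <= 0xDFFF
        if high and low:
            return False
    return True

def sub(units, start, end):
    # Stand-in for panic-on-invalid-index: fail immediately instead of
    # returning an error value that every caller has to handle.
    ok = start <= end and is_valid_boundary(units, start) and is_valid_boundary(units, end)
    if not ok:
        raise RuntimeError("invalid index")
    return units[start:end]

units = utf16_units("a\U0001F600b")  # 'a', one surrogate pair, 'b'
```

With this input, `sub(units, 1, 3)` returns the whole surrogate pair, while `sub(units, 0, 2)` fails because index 2 sits in the middle of the pair.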

Validation

  • Added and updated extensive tests covering compile/execute paths, split/find/replace behavior, zero-width matches, capture and named-capture access, and surrogate-pair edge cases.
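
Zero-width matches are a common edge case because a match that consumes no input can stall an iteration loop. The Python sketch below, written against the standard `re` module rather than the MoonBit API, shows one way to guarantee progress: after an empty match, resume the search one position further on. The function name `find_all_spans` is hypothetical.

```python
import re

def find_all_spans(pattern, text):
    """Collect (start, end) spans of successive matches, including empty ones."""
    regex = re.compile(pattern)
    spans = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        spans.append(m.span())
        if m.end() == m.start():
            # Empty match: step forward so the loop cannot match here forever.
            # Over UTF-16 code units this step would need to skip a whole
            # surrogate pair rather than a single unit.
            pos = m.end() + 1
        else:
            pos = m.end()
    return spans
```

The same stepping rule matters for split and replace, since both are typically built on top of the match iteration.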

coveralls (Collaborator) commented Feb 12, 2026

Pull Request Test Coverage Report for Build 2643

Details

  • 83 of 91 (91.21%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.7%) to 96.711%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| bytes/regex.mbt | 5 | 6 | 83.33% |
| string/regex.mbt | 6 | 7 | 85.71% |
| bytes/regex_methods.mbt | 29 | 32 | 90.63% |
| string/regex_methods.mbt | 33 | 36 | 91.67% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 2631: | 0.7% |
| Covered Lines: | 12086 |
| Relevant Lines: | 12497 |

💛 - Coveralls

hackwaly force-pushed the yuxiang/regex-api branch 3 times, most recently from aeabf96 to ace9749 on February 12, 2026 14:44
hackwaly marked this pull request as ready for review February 12, 2026 14:56
hackwaly requested a review from bobzhang February 12, 2026 14:59
hackwaly marked this pull request as draft February 12, 2026 15:14
hackwaly removed the request for review from bobzhang February 12, 2026 15:14
hackwaly marked this pull request as ready for review February 12, 2026 15:48
hackwaly requested a review from bobzhang February 13, 2026 02:52
hackwaly force-pushed the yuxiang/regex-api branch 2 times, most recently from 8db3295 to 029a2b5 on February 13, 2026 13:45
hackwaly marked this pull request as draft February 13, 2026 13:45
hackwaly force-pushed the yuxiang/regex-api branch 2 times, most recently from 8c34579 to 0b7b786 on February 14, 2026 09:10
hackwaly marked this pull request as ready for review February 14, 2026 09:20
hackwaly force-pushed the yuxiang/regex-api branch 2 times, most recently from 54a4506 to 7f76f35 on February 25, 2026 10:17
hackwaly force-pushed the yuxiang/regex-api branch 2 times, most recently from 63aa610 to 81d798d on February 25, 2026 10:37
hackwaly force-pushed the yuxiang/regex-api branch 2 times, most recently from 57d3463 to a093415 on February 26, 2026 06:11
bobzhang merged commit 0eb6372 into main Feb 26, 2026
13 of 14 checks passed
bobzhang deleted the yuxiang/regex-api branch February 26, 2026 06:37