[BUGFIX] Fix escape handling in TextRoleRule #1

Draft
CybotTM wants to merge 1 commit into main from fix/textrole-escape

CybotTM commented Feb 24, 2026

Summary

Fixes escape handling bugs in TextRoleRule for code-type text roles (:rst:, :php:, :code:, etc.).

Resolves: TYPO3-Documentation/render-guides#1188

Bug 1: Escaped backslash stays literal in rawPart

Code-type text roles use $rawContent (via phpDocumentor#533). When the author writes \\ to display a single backslash, $rawPart preserved the literal \\ instead of resolving it.

RST input:

:code:`a\\b`

Before (broken):

<code>a\\b</code>

After (fixed):

<code>a\b</code>

Other escapes like \T or \* are intentionally preserved raw in $rawPart — code contexts need literal backslash-letter sequences (e.g., PHP namespaces like \App\Entity).
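
As a rough illustration of the rule described above (this is a Python sketch, not the PHP code in TextRoleRule.php), the raw-content handling could look like the following. The function name resolve_raw_escapes is hypothetical; only the behaviour comes from the description: an escaped backslash collapses to a single backslash, and every other backslash sequence is left untouched.

```python
def resolve_raw_escapes(raw: str) -> str:
    """Collapse an escaped backslash to one backslash; keep other escapes verbatim.

    Hypothetical helper for illustration only, not part of the PR.
    """
    out = []
    i = 0
    while i < len(raw):
        # Two consecutive backslashes are the only escape that gets resolved.
        if raw[i] == "\\" and i + 1 < len(raw) and raw[i + 1] == "\\":
            out.append("\\")
            i += 2
        else:
            # Anything else (including \T, \* or \App\Entity) passes through raw.
            out.append(raw[i])
            i += 1
    return "".join(out)
```

With this sketch, the raw text a\\b becomes a\b, while \App\Entity and \T come back unchanged.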

Bug 2: Escaped backtick swallows closing delimiter

The lexer has a catchable pattern that tokenizes a 3-char sequence (backslash + two backticks) as a single ESCAPED_SIGN token. This swallows the closing backtick of the text role, so TextRoleRule never finds a BACKTICK token to close — it rolls back and the role breaks entirely.

RST input:

:code:`text\``

Before (broken): role is not recognized, rendered as literal text:

:code:`text\``

After (fixed): post-loop recovery detects the swallowed backtick. The 3-char token is split semantically:

\``  =  \`  (escaped backtick → literal `)  +  `  (closing delimiter)

The escape character is consumed, the backtick becomes content:

<code>text`</code>

The recovery is narrowed to only the 3-char token — the only lexer pattern that swallows backticks. Regular 2-char escapes at end-of-input (like \T or \\ without a closing backtick) correctly roll back as genuinely unterminated roles:

RST input (genuinely unterminated — no closing backtick):

:role:`content\T

Result: rolls back, no role node produced (correct — the role was never closed).
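
The swallowing and the post-loop recovery can be modelled with a toy lexer. This is a Python sketch under stated assumptions, not the actual InlineLexer or TextRoleRule code: the regex, the TEXT token name, and the functions tokenize and parse_role_body are hypothetical. Only the ESCAPED_SIGN and BACKTICK token names, the 3-char pattern, and the end-of-input recovery come from the description above. The sketch keeps other escapes raw and does not model the escaped-backslash resolution from Bug 1.

```python
import re
from typing import List, Optional, Tuple

# The 3-char pattern (backslash + two backticks) is tried before the generic
# 2-char escape, which is what lets it swallow the closing backtick.
TOKEN_RE = re.compile(
    r"(?P<ESCAPED_SIGN>\\``|\\.)|(?P<BACKTICK>`)|(?P<TEXT>[^\\`]+|\\)",
    re.DOTALL,
)

SWALLOWING_TOKEN = "\\``"  # backslash followed by two backticks


def tokenize(body: str) -> List[Tuple[str, str]]:
    """Split the text after the opening backtick into (type, value) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(body)]


def parse_role_body(body: str) -> Optional[str]:
    """Return the role content, or None when the role must roll back."""
    content = ""
    last_escaped: Optional[str] = None
    for kind, value in tokenize(body):
        if kind == "BACKTICK":
            return content  # regular closing delimiter
        last_escaped = value if kind == "ESCAPED_SIGN" else None
        content += value
    # Post-loop recovery: no BACKTICK token was seen. Only the 3-char token
    # can hide a closing backtick, so only that case is rescued.
    if last_escaped == SWALLOWING_TOKEN:
        return content[: -len(SWALLOWING_TOKEN)] + "`"
    return None  # genuinely unterminated role
```

In this model, the body text\`` yields the content text followed by a literal backtick, while content\T yields None, which corresponds to the rollback case.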

Changes

TextRoleRule.php:

  • Resolve \\ → \ in $rawPart (other escapes preserved raw)
  • Add EOF fallback for the 3-char token: escape consumed, literal backtick in content
  • Extract createTextRoleNode() helper to deduplicate role-building logic
  • Simplify escape tracking from two variables to one ($lastEscapedToken)

TextRoleRuleTest.php:

  • 3 new positive cases: escaped backslash, escaped backtick at end, only escaped backtick
  • 2 new negative cases: unterminated roles with \T and \\ at end → assert rollback

Functional tests:

  • Updated code-textrole-no-escape.html expected output
  • New code-textrole-escape/ test fixture

Test plan

  • All 8 unit tests pass (6 positive + 2 negative rollback cases)
  • All 3 textrole functional tests pass
  • No regressions in existing text role behavior

@gemini-code-assist

Summary of Changes

Hello @CybotTM, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses two critical bugs in the TextRoleRule that impacted the correct handling of escape sequences within code-type text roles. The changes ensure that escaped backslashes are properly resolved and that escaped backticks do not prematurely terminate text roles, thereby improving the accuracy and robustness of reStructuredText parsing for code snippets.

Highlights

  • Resolved \\ escape in code-type text roles: The TextRoleRule now correctly resolves \\ to a single \ in $rawPart for code contexts, addressing a bug where it previously remained literal. Other escapes like \T are preserved for literal interpretation.
  • Fixed \`` consuming closing backtick: Implemented logic to handle cases where an ESCAPED_SIGN token for \`` incorrectly consumed the closing backtick. The rule now rolls back the escaped content, treats the backslash as literal, and successfully closes the role, preventing parsing failures.

Changelog

  • packages/guides-restructured-text/src/RestructuredText/Parser/Productions/InlineRules/TextRoleRule.php
    • Added strlen function import.
    • Introduced lastEscapedPartLen and lastEscapedRawPartLen variables to track the length of escaped content.
    • Modified ESCAPED_SIGN token handling to resolve \\ to \ in $rawPart while preserving other escape sequences.
    • Implemented a new conditional block to manage scenarios where an ESCAPED_SIGN token, specifically \``, consumes the closing backtick, including a rollback and re-processing mechanism.
  • packages/guides-restructured-text/tests/unit/Parser/Productions/InlineRules/TextRoleRuleTest.php
    • Added three new data provider cases to thoroughly test the corrected escape handling for text roles, including a\\b, text\``, and \``.
  • tests/Functional/tests/code-textrole-escape/code-textrole-escape.html
    • Added a new functional test output file to demonstrate the correct HTML rendering of code text roles with various escape sequences.
  • tests/Functional/tests/code-textrole-escape/code-textrole-escape.rst
    • Added a new functional test input file containing reStructuredText examples with a\\b, text\``, and \`` to verify escape sequence parsing.
  • tests/Functional/tests/code-textrole-no-escape/code-textrole-no-escape.html
    • Updated the expected HTML output for the code-textrole-no-escape test, specifically changing \\ to \ within a <code> tag to reflect the corrected behavior.

Activity

  • All 255 unit tests passed.
  • All 116 functional tests passed, including updated and newly added tests.
  • No regressions were identified in existing text role behavior.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses two bugs in TextRoleRule related to escape sequence handling. The logic for resolving \\ in $rawPart and the fallback for escaped backticks at the end of a role are well-implemented. The new unit and functional tests provide good coverage for these fixes. I have one suggestion to improve code maintainability by refactoring duplicated logic.

Copilot AI left a comment

Pull request overview

Fixes escape-sequence handling for reStructuredText text roles (notably code-like roles such as :code:), ensuring escaped backslashes are represented correctly in raw role content and adding a recovery path when the lexer swallows a closing backtick via \``-style tokenization.

Changes:

  • Adjust TextRoleRule to resolve escaped backslashes (\\ → \) in $rawPart, while preserving other escapes verbatim for code contexts.
  • Add an EOF fallback in TextRoleRule to handle the lexer’s special \` + closing-backtick tokenization case.
  • Extend unit + functional coverage with new cases and updated expected HTML outputs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/guides-restructured-text/src/RestructuredText/Parser/Productions/InlineRules/TextRoleRule.php | Fixes raw escape handling and adds EOF fallback logic to close roles when backticks are swallowed. |
| packages/guides-restructured-text/tests/unit/Parser/Productions/InlineRules/TextRoleRuleTest.php | Adds unit test cases for escaped backslash and escaped-backtick-at-end scenarios. |
| tests/Functional/tests/code-textrole-no-escape/code-textrole-no-escape.html | Updates expected output to reflect \\ → \ behavior in :code: roles. |
| tests/Functional/tests/code-textrole-escape/code-textrole-escape.rst | New functional input covering backslash + backtick edge cases in :code: roles. |
| tests/Functional/tests/code-textrole-escape/code-textrole-escape.html | New functional expected output validating the new escape behavior. |

CybotTM commented Feb 24, 2026

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses two critical bugs related to escape sequence handling in TextRoleRule. The changes correctly resolve \\ to \ in $rawPart for code contexts and implement a robust fallback mechanism to handle cases where an escaped backtick (\``) incorrectly consumes the closing delimiter. The introduction of the createTextRoleNode private method enhances code maintainability by encapsulating node creation logic. Furthermore, the addition of comprehensive unit and functional tests, along with the update of an existing functional test, demonstrates thorough validation of the fixes. The solution is well-implemented and directly resolves the reported issues, improving the robustness of text role parsing.

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.



CybotTM changed the title from "WIP: [BUGFIX] Fix escape handling in TextRoleRule" to "[BUGFIX] Fix escape handling in TextRoleRule" on Feb 24, 2026
CybotTM requested a review from Copilot on February 24, 2026 13:20

CybotTM commented Feb 24, 2026

/gemini review

@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.



Fix two bugs causing code-type text roles (:rst:, :php:, :code:,
etc.) to mishandle escape sequences:

1. $rawPart preserved \\ instead of resolving to \. Now resolved
   for escaped-backslash; other escapes (\T, \*) stay raw for code
   contexts (e.g. PHP namespaces).

2. The lexer's \`` catchable pattern (3 chars) swallows the closing
   backtick as part of an ESCAPED_SIGN token. The role never closes
   and rolls back. Now detected post-loop: the escape is consumed
   and a literal backtick becomes content. Narrowed to only the
   3-char token; regular 2-char escapes at EOF correctly roll back.

Resolves: TYPO3-Documentation/render-guides#1188
CybotTM force-pushed the fix/textrole-escape branch from 6be902c to de35afa on February 24, 2026 13:53

Development

Successfully merging this pull request may close these issues.

[BUG] :rst: text role cannot display a single backslash character

1 participant