Skip to content
/ server Public

MDEV-38904 Strings: Fix my_convert infinite loop and latin7 collation#4749

Open
itzanway wants to merge 2 commits intoMariaDB:mainfrom
itzanway:fix-mdev-38904
Open

MDEV-38904 Strings: Fix my_convert infinite loop and latin7 collation#4749
itzanway wants to merge 2 commits intoMariaDB:mainfrom
itzanway:fix-mdev-38904

Conversation

@itzanway
Copy link

@itzanway itzanway commented Mar 6, 2026

This PR addresses MDEV-38904, which involves a transitivity violation causing false index corruption in latin7 tables, and a subsequent vulnerability where the server hangs at 100% CPU during string conversion of malformed data.

1. Failsafe Loop Fix: my_convert Infinite Loop (strings/ctype.c)

The Problem: Even in cases of severe table corruption or malformed byte sequences, the server should never hang. The previous implementation of my_convert_using_func and my_convert_fix failed to explicitly force a pointer advancement when the character set's mb_wc function returned a length of 0 (which occurs when reading corrupted byte sequences).
The Fix: Added an explicit from++ advancement when cnvres == 0. This ensures the conversion loop always strictly terminates, replacing the unreadable/malformed bytes with a ? placeholder instead of looping infinitely.

2. Collation Logic Fix: latin7_general_ci (strings/ctype-extra.c)

The Problem:
A transitivity violation existed in the latin7_general_ci collation. In the sort_order_latin7_general_ci array, the hyphen (-, 0x2D) and the space ( , 0x20) were assigned overlapping weights. During MyISAM/Aria index compression, these collisions resulted in circular B-tree pointers, causing CHECK TABLE to falsely report "Key in wrong position" and flagging healthy tables as corrupted.
The Proposed Fix:
Adjusted the sort_order_latin7_general_ci weights to ensure that the space character (Index 32) is uniquely weighted as the minimum printable value, and that the hyphen (Index 45) has a strictly distinct weight to prevent B-tree sorting collisions.


I have updated this PR to address the requested housekeeping items:

  • Squashed all changes into this single, clean commit.
  • Formatted the commit message to strictly comply with CODING_STANDARDS.md.
  • Cleaned the patch to ensure there are no space-only/tab-conversion changes.
  • Added MTR test cases (mdev_38904.test) to reliably verify the loop termination and CHECK TABLE behavior.

Regarding the Collation Weights (Backwards Compatibility):
I completely understand and agree with your warning. Adjusting the existing weights on latin7_general_ci will break backwards compatibility for on-disk B-tree indexes across existing deployments.

I have left the adjusted logic in this commit solely so you can review the proposed transitivity fix. Moving forward to the final review, how would you prefer to architect this to protect user data? Should we implement these corrected weights under a brand new collation ID/name, or is there a preferred versioning mechanism MariaDB uses for patching existing collations?

Looking forward to your guidance!

itzanway added 2 commits March 7, 2026 02:02
The server hangs at 100% CPU when encountering malformed bytes because my_convert_using_func and my_convert_fix did not explicitly force pointer advancement when mb_wc returned a length of 0. This patch adds an explicit from++ advancement when cnvres == 0 to ensure termination.

Additionally, this adjusts the sort_order_latin7_general_ci weights to resolve a transitivity violation where the hyphen (0x2D) and space (0x20) caused index compression collisions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant