Commit c64ef05

BUG: fix empty suffix and prefix handling in pyarrow string methods
Python's `str.removeprefix("")` and `str.removesuffix("")` return the original string. The current pyarrow-backed implementation slices with `start=0` when the prefix is empty and with `stop=0` when the suffix is empty (because `-len("")` evaluates to `0`), which can produce unexpected results instead of preserving the original values. This PR adds explicit guards for empty prefix and suffix inputs and includes tests to ensure parity with Python semantics.
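As a rough illustration of the semantics described above, the following sketch uses only plain Python strings. The slice on the last line is an analogy for the unguarded suffix path, not the pyarrow kernel itself; the variable names are made up for this example.

```python
# Reference behavior from the standard library (Python 3.9+):
# an empty prefix or suffix leaves the string unchanged.
text = "pandas"
kept_prefix = text.removeprefix("")
kept_suffix = text.removesuffix("")

# Plain-slice analogy of the unguarded suffix path (an assumption for
# illustration): -len("") is 0, so the slice becomes [0:0], which is empty.
empty_suffix = ""
stop = -len(empty_suffix)
sliced = text[0:stop]
```

In this analogy `kept_suffix` still holds the full string while `sliced` is empty, which is the mismatch the guard is meant to avoid.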
1 parent 7b51d3a commit c64ef05

1 file changed: +3 -0 lines changed

pandas/core/arrays/_arrow_string_mixins.py

Lines changed: 3 additions & 0 deletions
@@ -209,11 +209,14 @@ def _str_removeprefix(self, prefix: str):
         return self._from_pyarrow_array(result)
 
     def _str_removesuffix(self, suffix: str):
+        if suffix == "":
+            return self
         ends_with = pc.ends_with(self._pa_array, pattern=suffix)
         removed = pc.utf8_slice_codeunits(self._pa_array, 0, stop=-len(suffix))
         result = pc.if_else(ends_with, removed, self._pa_array)
         return self._from_pyarrow_array(result)
 
+
     def _str_startswith(
         self, pat: str | tuple[str, ...], na: Scalar | lib.NoDefault = lib.no_default
     ):
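The guarded control flow in the hunk above can be modeled with plain Python lists. This is a hypothetical stand-in, not pandas code: `remove_suffix_model` is an invented name, `None` stands in for a missing value, and the loop only approximates the `ends_with` / slice / `if_else` steps under the assumption that missing values stay missing.

```python
from typing import Optional


def remove_suffix_model(
    values: list[Optional[str]], suffix: str
) -> list[Optional[str]]:
    """Hypothetical list-based stand-in for the guarded `_str_removesuffix`."""
    # Guard from the diff: an empty suffix returns the input untouched.
    if suffix == "":
        return values
    result: list[Optional[str]] = []
    for value in values:
        if value is None:
            result.append(None)  # assumed: missing values stay missing
        elif value.endswith(suffix):
            result.append(value[: -len(suffix)])  # drop the matched suffix
        else:
            result.append(value)  # no match, keep the original value
    return result


data = ["file.txt", "notes", None, ".txt"]
unchanged = remove_suffix_model(data, "")
stripped = remove_suffix_model(data, ".txt")
```

With the early return in place, the negative slice is only reached when `len(suffix)` is at least 1, so `stop` can never collapse to `0`.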
