Fix incorrect Content-Length for StringIO with multi-byte characters #7201
Summary
Fixes #6917.
super_len() uses seek/tell to measure the length of file-like objects such as StringIO and BytesIO. However, StringIO.tell() returns the character position, not the byte offset. For strings containing multi-byte UTF-8 characters (e.g. emoji), this produces an incorrect Content-Length header that violates RFC 9110 section 8.6.
For example, io.StringIO("\U0001F4A9") (a single emoji) previously returned a length of 1 (character count) instead of 4 (UTF-8 byte count), causing the server to receive a Content-Length: 1 header while 4 bytes are actually sent.
This is the same class of bug that was fixed for plain str bodies in #6586 -- str is encoded to UTF-8 before measuring, but StringIO was not. This PR makes StringIO handling consistent with str by reading the remaining text, encoding it to UTF-8, and measuring the byte length.
Before
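The mismatch described in the summary can be reproduced with the standard library alone. The sketch below is illustrative, not the code from requests: `seek_tell_length` is a hypothetical helper that measures a stream the seek/tell way, simplified from what the summary describes.

```python
import io


def seek_tell_length(stream):
    """Hypothetical helper: measure the unread part of a stream via seek/tell."""
    current = stream.tell()
    stream.seek(0, io.SEEK_END)   # jump to the end of the stream
    end = stream.tell()
    stream.seek(current)          # restore the caller's position
    return end - current


body = io.StringIO("\U0001F4A9")  # one character, four UTF-8 bytes

measured = seek_tell_length(body)                 # counts characters
actual = len(body.getvalue().encode("utf-8"))     # counts bytes on the wire

print(measured, actual)  # 1 4
```

A Content-Length built from `measured` would announce 1 byte while 4 bytes are written to the socket.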
After
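A minimal sketch of the approach the summary describes (read the remaining text, encode it to UTF-8, measure the bytes, and leave the stream position untouched). The helper name `remaining_utf8_length` is an assumption made for illustration; the actual change lives inside super_len() and may be structured differently.

```python
import io


def remaining_utf8_length(stream):
    """Hypothetical helper: UTF-8 byte length of the unread text in a StringIO."""
    position = stream.tell()
    try:
        remaining = stream.read()   # everything from the current position on
    finally:
        stream.seek(position)       # put the position back where it was
    return len(remaining.encode("utf-8"))


single = io.StringIO("\U0001F4A9")
print(remaining_utf8_length(single))   # 4

partial = io.StringIO("a\U0001F4A9b")
partial.read(1)                        # consume the leading "a"
print(remaining_utf8_length(partial))  # 5 (4 bytes for the emoji + 1 for "b")
print(partial.tell())                  # 1, position preserved
```

Reading the whole remainder costs memory proportional to the body size, which is the trade-off for getting an exact byte count from a text stream.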
Changes
src/requests/utils.py: In super_len(), detect io.StringIO and read+encode the remaining text to compute the UTF-8 byte length instead of relying on tell().
tests/test_utils.py: Added test_super_len_stringio_multibyte covering single emoji, mixed content, partially-read StringIO, and position preservation.
Test plan
TestSuperLen tests pass (ASCII StringIO, BytesIO, partially-read files, etc.)
super_len() call
🤖 Generated with Claude Code