test_bz2.testDecompressorChunksMaxsize is flaky due to non-deterministic BIG_TEXT #145607

@colesbury

Description

Bug report

test_bz2 concatenates a bunch of Python test files to get 128 KiB of test data:

# Some tests need more than one block of uncompressed data. Since one block
# is at least 100,000 bytes, we gather some data dynamically and compress it.
# Note that this assumes that compression works correctly, so we cannot
# simply use the bigger test data for all tests.
test_size = 0
BIG_TEXT = bytearray(128*1024)
for fname in glob.glob(os.path.join(glob.escape(os.path.dirname(__file__)), '*.py')):
    with open(fname, 'rb') as fh:
        test_size += fh.readinto(memoryview(BIG_TEXT)[test_size:])
    if test_size > 128*1024:
        break
BIG_DATA = bz2.compress(BIG_TEXT, compresslevel=1)

The exact contents depend on the order of results returned by glob.glob(), which is arbitrary but typically consistent on a single machine. Some orderings of the globbed files lead to test failures.

Below is mostly Claude's summary, which seems right to me:

The testDecompressorChunksMaxsize test feeds BIG_DATA[:len(BIG_DATA)-64] to BZ2Decompressor.decompress with max_length=100 and asserts needs_input is False. This assumes the truncated data contains at least one complete bz2 block so the decompressor can produce output. But bz2 is a block compressor - it cannot produce any output until an entire compressed block is available.

With certain file orderings, the first bz2 block's compressed data extends into the last 64 bytes of BIG_DATA. The truncation then produces an incomplete block, the decompressor consumes all input, returns 0 bytes, and correctly sets needs_input=True.
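The behavior described above can be reproduced without test_bz2's globbed data. The sketch below uses seeded pseudo-random (hence effectively incompressible) input, so each ~100,000-byte bz2 block stays large in compressed form; it contrasts the passing case (truncation leaves the first block intact) with the failing case (truncation cuts inside the first block):

```python
import bz2
import random

random.seed(0)
data = random.randbytes(256 * 1024)  # incompressible: compressed blocks stay large
compressed = bz2.compress(data, compresslevel=1)

# Passing case: dropping 64 bytes from the end still leaves the first
# compressed block complete, so decompress() can emit output and
# needs_input stays False (unconsumed input is buffered internally).
d = bz2.BZ2Decompressor()
out = d.decompress(compressed[:-64], max_length=100)
assert len(out) == 100
assert d.needs_input is False

# Failing case: truncating inside the first block leaves no complete
# block. The decompressor consumes everything, produces nothing, and
# correctly reports needs_input=True.
d = bz2.BZ2Decompressor()
out = d.decompress(compressed[:1000], max_length=100)
assert out == b''
assert d.needs_input is True
```

With BIG_TEXT's concatenated source files the data is compressible, so whether the last 64 bytes fall inside the first block depends on the file ordering, which is why the test only fails for some orderings.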

Metadata

Labels: tests (Tests in the Lib/test dir), type-bug (An unexpected behavior, bug, or error)
