Improve FlakyStrategyDefinition error messages with specific details#4676
Draft
ianhi wants to merge 7 commits into HypothesisWorks:master from
FlakyStrategyDefinition errors now describe what changed between runs (type mismatch, constraint mismatch, forced value difference, more/fewer draws) instead of a generic "inconsistent data generation" message.
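As a rough illustration of the categories listed above, here is a minimal sketch of how two recorded runs could be compared to produce a specific message. The `Draw` record and `describe_mismatch` helper are hypothetical names invented for this sketch; they are not Hypothesis internals and not the code in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Draw:
    """Hypothetical record of one draw (not a Hypothesis type)."""
    choice_type: str
    constraints: dict = field(default_factory=dict)
    forced: Optional[Any] = None


def describe_mismatch(previous, current):
    """Return a message for the first difference between two runs, or None."""
    for index, (old, new) in enumerate(zip(previous, current)):
        if old.choice_type != new.choice_type:
            return (f"draw {index}: type changed from "
                    f"{old.choice_type} to {new.choice_type}")
        if old.constraints != new.constraints:
            return (f"draw {index}: constraints changed from "
                    f"{old.constraints} to {new.constraints}")
        if old.forced != new.forced:
            return (f"draw {index}: forced value changed from "
                    f"{old.forced!r} to {new.forced!r}")
    # Every shared position matched, so only the number of draws can differ.
    if len(current) != len(previous):
        relation = "more" if len(current) > len(previous) else "fewer"
        return (f"{relation} draws than before "
                f"({len(current)} vs {len(previous)})")
    return None
```

Reporting only the first difference keeps the message short, on the assumption that later draws are unreliable once the two runs have diverged.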
I moved this back to draft. Looking back, I was pushing through some hunger when I submitted this, riding the thrill of trying to get it to the end. In retrospect, I need to spend more time looking over the tests as carefully as I did the code and behavior. I suppose that is ironic given this is a testing library, but alas.
Moved into a draft. See: #4676 (comment)
🤖 - I used claude extensively for this PR, but have personally reviewed every line to the best of my ability.
Fix for #4673
This PR implements four tightly related changes to the output of Hypothesis when there is a flaky failure:
- Ensure the seed is printed.
- Print the actual differing choices in the strategy.
  - The error now says what was different (different constraints, different type, fewer/more draws) instead of just "data generation was inconsistent".
- For stateful tests, give the replay or info on how to trigger observability.
- Fix duplicate `FlakyStrategyDefinition` errors.
  - When a mismatch was detected during a draw, a second `FlakyStrategyDefinition` could be raised from `conclude_test` if the mismatch also resulted in fewer draws. Now the observer has a `flaky` flag to prevent this redundant second raise.

One of the tricky things is that a `FlakyStrategyDefinition` error can be thrown with or without a real test failure. When it accompanies a real test failure, you would get messy output with nested errors ("During handling of the above exception, another exception occurred..."), which made it hard to notice the first error with so much text on the screen. Now the flaky error is temporarily suppressed and reported in the Hypothesis output, keeping the test failure more visible (see the final example below).

I cooked up this demo script with claude to test out the various combinations of failure modes and output (e.g. observability), which I found quite helpful. Here is the script:
demo_flaky.py (click to expand)
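Separately from the demo script, the duplicate-raise guard described in the change list can be sketched roughly as follows. This is a simplified model under stated assumptions: `FlakyReplay` stands in for `FlakyStrategyDefinition`, and the `Observer` class, its `draw` method, and its bookkeeping are hypothetical. Only the `flaky` flag and the `conclude_test` name come from the PR description.

```python
class FlakyReplay(Exception):
    """Stand-in for FlakyStrategyDefinition in this sketch."""


class Observer:
    """Hypothetical observer that checks a replay against recorded draws."""

    def __init__(self, recorded):
        self.recorded = list(recorded)
        self.index = 0
        self.flaky = False  # set once a mismatch has already been reported

    def draw(self, choice_type):
        expected = (
            self.recorded[self.index] if self.index < len(self.recorded) else None
        )
        if expected != choice_type:
            self.flaky = True
            raise FlakyReplay(
                f"draw {self.index}: expected {expected}, got {choice_type}"
            )
        self.index += 1

    def conclude_test(self):
        if self.flaky:
            # The mismatch was already raised during draw(); do not raise again.
            return
        if self.index < len(self.recorded):
            raise FlakyReplay(
                f"fewer draws than before ({self.index} vs {len(self.recorded)})"
            )
```

Without the flag, a mismatch on the first draw would also leave `index` short of the recorded length, so `conclude_test` would report a second, redundant error.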
With these outputs (formatted by claude to elide portions of the error and focus on what's relevant for this PR):
1. Stateful test - constraint mismatch (observability off)
python -m pytest demo_flaky.py -s -k constraint
2. Stateful test - constraint mismatch (observability on)
HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY=1 python -m pytest demo_flaky.py -s -k constraint
3. Non-stateful - type mismatch
python -m pytest demo_flaky.py -s -k type_mismatch
4. Non-stateful - constraint mismatch
python -m pytest demo_flaky.py -s -k plain
5. Real bug + suppressed flaky error
python -m pytest demo_flaky.py -s -k more
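For context on the last case, the nested-error noise comes from Python's implicit exception chaining: raising a new exception inside an `except` block prints both tracebacks, joined by "During handling of the above exception, another exception occurred". The sketch below shows that behavior next to one way of deferring the second error as a note. `FlakyReplay`, `run_chained`, `run_deferred`, and `render` are hypothetical names for illustration, and this is not necessarily how the PR implements the suppression.

```python
import traceback


class FlakyReplay(Exception):
    """Stand-in for FlakyStrategyDefinition in this sketch."""


def render(exc):
    """Format an exception the way the interpreter would print it."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


def run_chained():
    # Raising inside the except block implicitly chains both errors,
    # so the rendered traceback contains two stacked reports.
    try:
        raise AssertionError("real test failure")
    except AssertionError:
        raise FlakyReplay("data generation changed between runs")


def run_deferred(notes):
    # Record the flaky error as a short note and re-raise the original
    # failure unchanged, so only one traceback is printed.
    try:
        raise AssertionError("real test failure")
    except AssertionError:
        notes.append("flaky: data generation changed between runs")
        raise
```

Rendering the exception from `run_chained` includes both tracebacks, while the one from `run_deferred` shows only the `AssertionError`, with the flaky detail left in `notes` to be reported separately.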