🚧 wip: System derived negative tests for BAL#2755
**Codecov Report**

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##  forks/amsterdam   #2755     +/-  ##
========================================
+ Coverage    88.17%  88.62%   +0.45%
========================================
  Files          577     577
  Lines        35659   35659
  Branches      3490    3490
========================================
+ Hits         31442   31604    +162
+ Misses        3654    3492    -162
  Partials       563     563
```
Hey @fselmo, I see two kinds of invalid tests: (1) dynamic invalid tests that depend on an input BAL, and (2) static invalid tests that check static properties and can be covered by standalone tests, such as checking for duplicate entries or checking ordering by swapping entries. The pure function I wrote generates type (1) invalid tests. I now need to wire this into the fill logic. I'm thinking the framework expands a valid test into 1 valid + N invalid fixtures. Perhaps a marker. What do you think?
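As a rough sketch of the "1 valid + N invalid fixtures" idea, the expansion could look like the following. All names here (`expand_fixtures`, the fixture dict shape) are hypothetical and exist only for illustration; the real wiring into the fill logic would live in the framework:

```python
# Hypothetical sketch only: none of these names exist in the framework.
# One valid fixture is expanded into itself plus one invalid fixture
# per corruption, each tagged with the expected client verdict.

def expand_fixtures(valid_bal, corruptions):
    """Return one valid fixture plus one invalid fixture per corruption.

    `corruptions` is an iterable of (label, corrupted_bal) pairs, e.g.
    produced by a pure corruption function over the valid BAL.
    """
    fixtures = [{"name": "valid", "bal": valid_bal, "valid": True}]
    for label, corrupted_bal in corruptions:
        # Clients must reject every corrupted variant.
        fixtures.append({"name": label, "bal": corrupted_bal, "valid": False})
    return fixtures


# Usage with a single hypothetical corruption:
fixtures = expand_fixtures(
    {"alice": {"nonce": 1}},
    [("tamper:alice.nonce", {"alice": {"nonce": 2}})],
)
# -> 1 valid fixture + 1 invalid fixture
```

A marker could then simply opt a test into this expansion at fill time, keeping the corruption generation itself a pure function.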
I love this. We will be brainstorming some ways that we can fuzz test using EELS, so I think this might play really nicely with that paradigm as well.

One thing I can think of is either a marker or a flag when you fill. Let's say I want to test, locally at least, some tests with these corruptions: I could run with this flag and execute against hive or even the In this sense I think it'd also be valuable for filling and consuming.

I just worry a bit that we have ~50k tests for Amsterdam or so (maybe more, I forget, but something like this), and if we corrupt all of them by N on releases we'd have some insane amount. This is of course a good problem to have, and we can decide when and how to control the amount of fuzzing / corruption we do here... but something to think about with the sheer volume that this might produce. On the other hand, if we fill these particularly for consumption via

I'd love to hear your thoughts on the above and on any volume / control knob ideas.
These test cases are intentional, yet generated, so it's somewhere between the usual tests and fuzzing. I worry about the bloat this creates, but we need to make space for these for EIPs that cover a large protocol surface area. Maybe an adjacent "guardrail" workflow that's meant for important milestones (pre devnet/mainnet launch, etc.). Fuzzing can be part of it too. Let me think through to find a home for this bloat.
🗒️ Description
Negative tests ensure clients prevent abuse by rejecting a corrupted BAL. Existing negative tests are hand-written and provide limited coverage.
This PR scales negative test coverage systematically by corrupting a valid BAL, eliminating the need to hand-pick "interesting" corruptions.
How a valid BAL is corrupted
A valid BAL is corrupted to ensure clients verify **Correctness** and **Completeness**:
that is, we either tamper with a value or omit it from the BAL. Additionally, we also omit whole accounts.
The total number of negative tests $N$ is:

$$N = 2C + A$$

Where:
- $C$ is the number of changed values in the BAL (each yields one tampered and one omitted variant),
- $A$ is the number of accounts (each yields one omitted-account variant).
Example: Alice transfers 1 ETH to Bob. The BAL has two accounts and three changes (Alice's nonce and balance; Bob's balance), giving 3 tampered + 3 omitted-value + 2 omitted-account variants, i.e. 8 negative tests.
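The corruption strategy above can be sketched as a pure function. The BAL shape and names here are simplified and hypothetical, purely for illustration; the real structure in this PR differs:

```python
from copy import deepcopy

# Hypothetical, simplified BAL for illustration only: each account maps
# a changed field to its post-transaction value.
valid_bal = {
    "alice": {"nonce": 1, "balance": 9 * 10**18},  # nonce and balance changed
    "bob": {"balance": 1 * 10**18},                # balance changed
}


def corrupt_bal(bal):
    """Yield (label, corrupted_bal) pairs from a valid BAL.

    Corruptions:
      * tamper each changed value   (clients must verify Correctness)
      * omit each changed value     (clients must verify Completeness)
      * omit each whole account     (clients must verify Completeness)
    """
    for account, changes in bal.items():
        for field, value in changes.items():
            tampered = deepcopy(bal)
            tampered[account][field] = value + 1  # any wrong value works
            yield (f"tamper:{account}.{field}", tampered)

            omitted = deepcopy(bal)
            del omitted[account][field]
            yield (f"omit:{account}.{field}", omitted)

        dropped = deepcopy(bal)
        del dropped[account]
        yield (f"omit-account:{account}", dropped)


variants = list(corrupt_bal(valid_bal))
# 3 tampered + 3 omitted values + 2 omitted accounts = 8 variants
```

Each variant is a full, self-contained invalid BAL, so every one can become its own negative fixture without any hand-picking.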
🔗 Related Issues or PRs
closes #2705
✅ Checklist
- Code passes `just static` checks.
- Commit messages follow the `type(scope):` convention.