Interval Analysis #965
Conversation
|
Thanks for the PR. Have you ever tried only adding the bound constraints for registers that have small intervals? |
|
Hi @tcherrou ... happy to see the PR from the work you did for your thesis. Can you please rebase on top of the latest |
|
Small is anything that does not get too close to the bit vector sizes, but I think your analysis reaches TOP in most of those cases anyways. From what I have seen in some local tests, all your intervals are basically small or TOP.
After testing a little bit locally, I noticed that your approach already avoids encoding intervals crossing a sign boundary, at least when doing BV encodings. So in that sense, you get that result already. However, what could be interesting is to only encode the signedness (>0, <0, =0) as an extremely lightweight encoding. IIRC, this information was useful on the fib_bench benchmarks (the unsat versions), especially when doing the lazy/caat method (and integer encoding?).
For BVs, they don't make sense, I agree. I think they should also be avoided on integers, cause IIRC the bounds just degrade the performance of the solver heavily. But maybe you can try once to see for yourself :) |
|
@hernanponcedeleon It should be pushed now. |
|
@tcherrou after #964, you need to register all classes using reflection in order for the options to work in native mode. You need at least the following patch, maybe even more if you added some further options (I have not yet checked the code in detail) |
3cfadf6 to c36ec66
|
There were duplicate commits in the commit history after rebasing. Removed them with a force push |
hernanponcedeleon
left a comment
First quick pass up to Interval.java:L241
hernanponcedeleon
left a comment
Minor comments after a quick check.
@tcherrou please fix the conflicts
99f8c46 to f1d4570
c3a0dc5 to fdbed3b
d2688e8 to 762c7d4
Also remove computing redundant information: total number of reads Suggested-by: Hernan Ponce de Leon <hernanl.leon@huawei.com>
Suggested-by: Hernan Ponce de Leon <hernanl.leon@huawei.com>
The analysis does not require the task. Global analysis only needs the memory model to get the `RF` relation. Suggested-by: Thomas Haas <t.haas@tu-bs.de>
Suggested-by: Hernan Ponce de Leon <hernanl.leon@huawei.com>
Change interval analysis option NAIVE to NONE. Introduces optional analyses where an analysis may not be present. Optional analyses are registered via `Context.registerOptional` method. Users must check if an optional analysis has been registered. Suggested-by: Hernan Ponce de Leon <hernanl.leon@huawei.com> and Thomas Haas <t.haas@tu-bs.de>
Suggested-by: Thomas Haas <t.haas@tu-bs.de>
Extract `AbstractionExpressionEvaluator` from `IntervalAnalysisWorklist`
Rename `RegisterStateVisitor`. Move `RegisterState` class to separate file.
Removed unnecessary cast Suggested-by: Hernan Ponce de Leon <hernanl.leon@huawei.com>
More descriptive names. Eliminate local variables that don't help with readability. Remove redundant assignments.
Grouped related methods together. Removed unnecessary `getType` method.
Match relation analysis naming convention. Suggested-by: Thomas Haas <t.haas@tu-bs.de>
Remove custom exception classes and replace them with Preconditions.
7ecbf5f to 8957a2c
|
@tcherrou there is only the open discussion in |
Changed `double` counters to `int`. Suggested-by: Hernan Ponce de Leon <hernanl.leon@huawei.com> Use more descriptive labels for #Register reads/top.
|
There is one more operator I want to try and implement and I want to run some benchmarks with some of the suggestions in this conversation. |
Yes, I would suggest getting this one merged (I am fine with the current status, so unless @ThomasHaas has further suggestions I will merge). If you come up with improvements, PRs will always be welcomed :D |
|
I am fine with merging.
What I would find interesting is if you bounded the result of register writes rather than register reads, because the bounds on register writes imply bounds on register reads, but there are fewer writers than readers in general. If you ever feel bored, you can try out these things :P. |
This sounded interesting to me too.
This was actually present in an early version of the analysis but removed for simplicity. |
9594f82
into
hernanponcedeleon:development
|
Thanks @tcherrou for bearing with us over the long review. Happy to see your thesis work finally merged (@apaolillo might also be happy about this). |
|
Excellent! Congratulations @tcherrou ! 🎉 |
|
I am happy to see it finally merged. |
Interval Analysis
Overview
This pull request adds interval analysis (also known as range analysis or value analysis) to
`dartagnan`'s analysis pipeline after preprocessing. The analysis determines the intervals of integer registers in the intermediate representation.
These intervals can be used for program simplification.
Here, they are used to add bounds to variables in the SMT encoding to speed up verification.
This work was done for my master's thesis and concluded that using interval analysis.
master_thesis.pdf
The experimental results are a bit outdated as the most recent version is more conservative during the encoding phase (mostly with the bit vector encoding).
Features
Specifics
Intervals
Representation
An interval for a register r consists of a lower bound and an upper bound such that the value of r lies within that range.
Since registers have a certain bitwidth, the length of the interval is bounded by the number of values representable with that bitwidth.
For example, for an 8-bit register, the widest possible interval is `[-128,255]`.
Signedness
To my knowledge, registers are neither signed nor unsigned, and signedness is determined by the operations applied to them.
This ambiguity is dealt with by not making a choice on signedness.
A drawback of this approach is that some intervals are sensitive to the sign while others are not.
For example, the interval `[-1,1]` is sound when interpreted as signed but unsound when interpreted as unsigned. These types of intervals can occur when
In contrast, the interval `[-5,-1]` has the same set of values unsigned as signed in terms of their bit vector representation.
No wrapping
The "top" value of the interval lattice is `[MIN_SIGNED,MAX_UNSIGNED]`. Additionally, intervals are not wrapped.
This means that overflows and underflows are handled conservatively by widening to the top value for that specific bitwidth.
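As a rough illustration of the representation and the no-wrapping rule above, here is a minimal Java sketch. It is not the `Interval` class from this pull request: the record `IntervalSketch`, its methods, and the 32-bit limit are all hypothetical, and "overflow" is assumed to mean that a bound leaves `[MIN_SIGNED,MAX_UNSIGNED]`.

```java
/** Hypothetical sketch of a non-wrapping interval; not the Interval class of this pull request. */
record IntervalSketch(long lower, long upper, int bitWidth) {

    IntervalSketch {
        // Restricting to 32 bits keeps all arithmetic below safely inside a long.
        if (bitWidth < 1 || bitWidth > 32) {
            throw new IllegalArgumentException("unsupported bit width: " + bitWidth);
        }
        if (lower > upper || lower < minSigned(bitWidth) || upper > maxUnsigned(bitWidth)) {
            throw new IllegalArgumentException("bounds outside [MIN_SIGNED, MAX_UNSIGNED]");
        }
    }

    static long minSigned(int bitWidth) {
        return -(1L << (bitWidth - 1));
    }

    static long maxUnsigned(int bitWidth) {
        return (1L << bitWidth) - 1;
    }

    /** The widest interval for the given bit width, e.g. [-128,255] for 8 bits. */
    static IntervalSketch top(int bitWidth) {
        return new IntervalSketch(minSigned(bitWidth), maxUnsigned(bitWidth), bitWidth);
    }

    boolean isTop() {
        return lower == minSigned(bitWidth) && upper == maxUnsigned(bitWidth);
    }

    /** Adds two intervals; a result that leaves the representable range widens to top instead of wrapping. */
    IntervalSketch add(IntervalSketch other) {
        if (bitWidth != other.bitWidth) {
            throw new IllegalArgumentException("bit widths differ");
        }
        long lo = lower + other.lower;
        long hi = upper + other.upper;
        if (lo < minSigned(bitWidth) || hi > maxUnsigned(bitWidth)) {
            return top(bitWidth);
        }
        return new IntervalSketch(lo, hi, bitWidth);
    }

    @Override
    public String toString() {
        return "[" + lower + "," + upper + "]";
    }
}
```

For instance, adding the 8-bit intervals `[200,250]` and `[10,10]` would give an upper bound of 260, which exceeds 255, so the sketch returns the 8-bit top value rather than a wrapped interval.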
Note
As noted in #955, LLVM represents values above the signed upper bound (e.g. 128 for an 8-bit integer) as their signed versions.
This can lead to unsound intervals as values are interpreted as they are represented in the IR.
Suggestions to remedy this are always welcome.
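To make the issue in this note concrete, the following Java sketch shows how one bit pattern maps to two different numbers depending on the signedness chosen. The class `SignViews` and its methods are hypothetical helpers for illustration only; they are not part of the pull request and do not by themselves fix the unsoundness described above.

```java
/** Hypothetical helpers showing the signed and unsigned reading of the same bits. */
final class SignViews {

    private SignViews() {
    }

    /** Reads the low bitWidth bits of value as a two's-complement signed number (1 <= bitWidth <= 62 assumed). */
    static long asSigned(long value, int bitWidth) {
        long bits = value & ((1L << bitWidth) - 1);
        long signBit = 1L << (bitWidth - 1);
        // If the sign bit is set, the signed reading is the unsigned one minus 2^bitWidth.
        return (bits & signBit) != 0 ? bits - (1L << bitWidth) : bits;
    }

    /** Reads the low bitWidth bits of value as an unsigned number (1 <= bitWidth <= 62 assumed). */
    static long asUnsigned(long value, int bitWidth) {
        return value & ((1L << bitWidth) - 1);
    }
}
```

With 8 bits, `asSigned(128, 8)` is -128 and `asUnsigned(-128, 8)` is 128, which is the mismatch between a value and its printed form that the note refers to.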
Analysis
The pull request introduces an interface for querying the interval of an integer register at a specific program event.
There are three analyses, which differ in their precision.
`Local` events and propagating information using a work list algorithm.
`dartagnan`'s relation analysis and employing `read-from` pairs to analyse `Load` instructions for additional precision.
Note
Since Global analysis depends on Relation analysis, it must be performed after Relation analysis.
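The work list propagation mentioned above can be sketched generically as follows. This is only an illustration under strong assumptions, not the `IntervalAnalysisWorklist` class of this pull request: it tracks a single register, models events as hypothetical `Node` objects with a transfer function, and assumes an acyclic control flow (with loops, growing intervals would need widening to guarantee termination).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical work list propagation of intervals for one register over an acyclic graph. */
final class WorklistSketch {

    /** Closed interval [lo, hi]; a null Range stands for "not reached yet". */
    record Range(long lo, long hi) {
        /** Smallest interval containing both operands. */
        Range join(Range other) {
            return new Range(Math.min(lo, other.lo), Math.max(hi, other.hi));
        }
    }

    /** A hypothetical event: how it changes the interval, and which events follow it. */
    record Node(UnaryOperator<Range> transfer, List<Integer> successors) {
    }

    /** Returns the interval of the tracked register on entry to each node; node 0 is the entry. */
    static Range[] analyse(List<Node> nodes, Range entry) {
        Range[] in = new Range[nodes.size()];
        in[0] = entry;
        Deque<Integer> worklist = new ArrayDeque<>();
        worklist.add(0);
        while (!worklist.isEmpty()) {
            int current = worklist.poll();
            Range out = nodes.get(current).transfer().apply(in[current]);
            for (int succ : nodes.get(current).successors()) {
                Range merged = in[succ] == null ? out : in[succ].join(out);
                // Only revisit a successor when its incoming interval actually changed.
                if (!merged.equals(in[succ])) {
                    in[succ] = merged;
                    worklist.add(succ);
                }
            }
        }
        return in;
    }
}
```

On a diamond-shaped graph where one branch adds 1 and the other adds 5 to a register that starts at `[0,0]`, the join point receives `[1,5]`.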
Encoding
The interval of a register is used in the encoding phase by using the bounds to constrain the associated SMT variable.
For example, a register `r0` with interval `[0,5]` will have the SMT constraints (assuming the integer encoding) `(assert (<= 0 r0))` and `(assert (<= r0 5))`.
Notes
To my knowledge, the sign bit must be known when encoding bounds on bit vectors.
As registers inherently have no sign, only sign-insensitive intervals are encoded.
Maybe it is possible to add some sort of sign analysis here, but I am unsure.
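The following Java sketch combines the two ideas of this section: emitting bound constraints for the integer encoding, and skipping intervals that are sign sensitive. It is an assumption-laden illustration, not the encoder from this pull request. The class `BoundEncodingSketch` and its methods are hypothetical, the constraints are produced as SMT-LIB text in prefix form, and "sign insensitive" is assumed to mean that the interval crosses neither the boundary between -1 and 0 nor the one between `MAX_SIGNED` and `MAX_SIGNED + 1`.

```java
import java.util.List;

/** Hypothetical sketch: textual bound constraints for sign-insensitive intervals only. */
final class BoundEncodingSketch {

    private BoundEncodingSketch() {
    }

    /** True if [lo, hi] covers the same bit patterns under the signed and the unsigned reading. */
    static boolean isSignInsensitive(long lo, long hi, int bitWidth) {
        long maxSigned = (1L << (bitWidth - 1)) - 1;
        boolean crossesZero = lo < 0 && hi >= 0;                      // contains both -1 and 0
        boolean crossesMaxSigned = lo <= maxSigned && hi > maxSigned; // contains MAX_SIGNED and MAX_SIGNED + 1
        return !crossesZero && !crossesMaxSigned;
    }

    /** SMT-LIB integer literal; negative numbers are written as (- n). */
    static String literal(long value) {
        return value < 0 ? "(- " + (-value) + ")" : Long.toString(value);
    }

    /** Returns the two bound assertions, or an empty list if the interval is sign sensitive. */
    static List<String> integerBounds(String register, long lo, long hi, int bitWidth) {
        if (!isSignInsensitive(lo, hi, bitWidth)) {
            return List.of();
        }
        return List.of(
                "(assert (<= " + literal(lo) + " " + register + "))",
                "(assert (<= " + register + " " + literal(hi) + "))");
    }
}
```

Under these assumptions, the 8-bit interval `[0,5]` for `r0` yields the two assertions from the example above, `[-5,-1]` is also encoded, and `[-1,1]` is skipped.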
Usage
By default, naive analysis is enabled. One can choose a specific analysis by using the option `program.analysis.interval`. The following choices are available:
`naive`
`local`
`global`
Tests
Existing tests pass on my machine besides the ones that ran out of memory.
The `RelationAnalysisTest` file has been updated to include the updated analysis pipeline.