[ldv-project] Benchmarking new predicate analysis with UFs

Tue Aug 27 11:00:57 MSK 2013

Hello Mikhail,

thanks for sharing these interesting results.
I couldn't look at them in full detail yet, but here are my first thoughts.

Am 26.08.2013 18:16, schrieb Mikhail Mandrykin:
> -- Predicate analysis with bit-vector variables and recursion skipping.
>     The same as the previous configuration, but with
>
>     # Turn on bit-vectors
>     cpa.predicate.replaceBitvectorWithRationalAndFunctionTheory = false
>     cpa.predicate.ignoreExtractConcat=false
>
>     # Turn on field aliasing
>     cpa.predicate.handlePointerAliasing = true
>     cpa.predicate.handleFieldAliasing = true
>     cpa.predicate.handleFieldAccess = true
>
>     # Using 32-bit Linux machine model
>     analysis.machineModel = LINUX32

Did you have
cfa.simplifyPointerExpressions = true
here (as in config/predicateAnaylsis-bitprecise.properties)?

Also note that the first configuration you mentioned also already has 
cpa.predicate.handlePointerAliasing=true
(this is the default actually).

> -- Device Drivers 64 category (partial). Attached as dd64-table.html
>
> In this category I have only analyzed single-file non-cillified drivers.
>
> Overall in this category the new analysis looses 6 points in the real case and
> the results for the BV case are too strange because
> the variable-based analysis failed on too many benchmarks. Though I know we've
> already tried bit-precise analysis on the drivers and
> haven't got so many UNKNOWNs. It seems due to the non-cillified filed being
> used, which turned out to be poorly supported by the
> bit-precise configuration.

That would be explainable if the option
cfa.simplifyPointerExpressions was not set to true.

> == The implementation ==
>
> Current implementation is based upon the r8330 of the svn master branch. So
> the changes in the LHS representation (splitting of
> CUnaryExpression and introduction of CLeftHandSide) aren't merged.
>
> Besides merging this the preliminary CFA transformations:
>     cfa.transformUnsizedArrays = true transforms declarations like int a[] =
> {1, 2, 3}, b[]; into  int a[3] = {1, 2, 3}, *b;
>     cfa.transformPointerArithmetic = true transforms pointer arithmetic into
> array subscripts (e.g.
>            a - 2 + i into a[i - 2] if a is a pointer).
>     cfa.transformArrows = true eliminates all ->s, but doesn't do any
> simplifications of the LHS (unlike cfa.simplifyPointerExpressions)
>     cfa.transformStarAmper = true transforms *&s into just s
>     cfa.transformFunctionPointers = true transforms *f and &f for function
> pointers to just f (to eliminate implicit conversions)
>     cfa.ignoreConst=true just drops all const qualifiers from all the types in
> the CFA
>     cfa.ignoreVolatile=true does the same for the volatile qualifier
> These transformation are currently implemented outside of ASTConverter.java as
> a separate CFA transformer base on
> 4 visitors (for expressions, types, statements and initializers). But the
> visitors transform one CFA into another with different
> expressions, types etc. and this leads to a kind of memory leak since some
> references (from CFA function heads, summary edges and
> function exit nodes) still lead to the old CFA nodes.

The memory leak is probably not a big problem,
but we have to be careful with inconsistent references. If an analysis 
decides to look at the summary edge and then sees different CFA nodes 
from what it has seen before, this could lead to nasty bugs that are 
very difficult to find.

A further problem (in my eyes) with CFA transformations that are 
specific for one analysis is that such configuration options make it 
difficult or even impossible to combine several analysis (as you suggest 
with ExplicitCPA) because each analysis might depend on different 
settings for these values.
So it would be best to not need such options at all.
I know that we already have the cfa.simplifyPointerExpressions option as 
a precedent, and I am not very happy about it (as ExplicitCPA actually 
is better when the option is set to false and PredicateCPA with 
bitvectors needs it set to true), but the temporary variables it 
introduces for expressions like **p are hard to define otherwise (we are 
still looking for a solution here).
However, the options that you present look much simpler to me, couldn't 
these simplifications be implemented directly in the analysis itself
(where C asts are transformed to formulas)?

Options that do not have any disadvantages and just simplify the AST 
could be implemented in the ASTConverter itself (without being 
user-configurable). transformUnsizedArrays could be such an option 
(although we would need to make sure that all references to that 
variable also have pointer type instead of array type because analyses 
expect that the type of variables does not change).

For transformFunctionPointers, I wonder why you need this.
We actually added code to CPAchecker to make the reverse transformation
(adding * to a function pointer when it is called; adding & to a 
function when it is converted to a function pointer) to be able to 
easily detect these cases in the AST and not confuse them with regular 
function calls or variable accesses, and this helped to make the code of 
the analysis easier and give less false answers.
(BTW CIL does the same kind of transformation and adds * and & where 
appropriate as you can see from ldv-regression/callfpointer.*).

Greetings, Philipp
-- 
Philipp Wendler
Lehrstuhl für Softwaresysteme
Universität Passau
Raum IM 110 - Tel.: 0851/509-3093