-16th Nov 2020
+Monday 16th Nov 2020
Formally proving tiny bits of qemu using Frama-C
-[in medias res]
+\f
In early October I was talking to one of the developers of Frama-C,
which is a modular framework for verifying C programs. It's been on
I will provide links to tutorials etc at the end.
+\f
+= OVERVIEW OF FRAMA-C ECOSYSTEM =
+
+The name stands for "Framework for Static Analysis of the C language".
+
+It's modular, with a core program that reads C source code
+and turns it into Abstract Syntax Trees, and a set of plugins
+that do static analysis by annotating these syntax trees.
+Plugins can cooperate, so analysis can be passed between
+plugins.
+
+The following slides are taken from David Mentré's 2016 presentation.
+
+\f
I decided to spend a day or two last month seeing if I could formally
prove some code inside qemu, and I arbitrarily picked one of the
smallest pieces of code in the "util/" subdirectory:
- I'm using structs directly from the C code.
- - The comments in the upstream code translate into predicates.
+ - The upstream comments translate into machine-checkable code.
The first upstream function is:
and using the predicates we can write a specification:
- $ less snippets/range_is_empty.c
+ $ cat snippets/range_is_empty.c
And we can compile and prove that:
Frama-C parsed the C code and the formal specification and machine
checked it, and it's correct - the code is bug-free.
-= OVERVIEW OF FRAMA-C ECOSYSTEM =
-
-
-XXX Modular Framework for analysis of C
-
-XXX Take some slides from David Mentre's presentation.
-
-XXX Explain which companies are using Frama-C.
-
-XXX WP plugin
-
-XXX ACSL language
-
-
-
-= BACK TO RANGE.C =
-
+\f
Going back to what we proved so far:
/*@
- Given those assumptions, the code is bug free - you don't need to
write any tests.
+\f
Obviously this is a single line, very trivial function, but I was
quite pleased that I was able to prove it quickly. I kept going on
the range file. The next function is:
return val >= range->lob && val <= range->upb;
}
+\f
The next function is range_make_empty, again easy to prove using the
already existing empty_range predicate. Notice how we declare which
memory locations this function assigns to:
assert(range_is_empty(range));
}
+\f
I'm going to skip forward a few functions to get to an interesting one.
This seems trivial:
One way to find the problem would be to find a COUNTEREXAMPLE. A
counterexample is an instance of an input that satisfies all of the
-preconditions, but makes the postcondition false. Now Frama-C has
+preconditions, but makes the postcondition false. Frama-C has
pluggable provers, and one prover called Z3, originally written by
Microsoft, can be used with Frama-C and can sometimes find
-counterexamples. However for some reason the version of Z3 in Fedora
-does not like to work with the version of Frama-C in Fedora and I
-don't know why.
+counterexamples.
+
+ $ frama-c -wp -wp-rte snippets/range_size.c -wp-prover alt-ergo,why3:z3-ce
+
+Unfortunately Z3 cannot find a counterexample in this case. I even
+upped the timeout to run Z3 longer but it still couldn't find one.
+
+[Z3 model_compact issue: https://git.frama-c.com/pub/frama-c/-/issues/33]
So this is the counterexample which I worked out myself:
$ frama-c -wp -wp-rte snippets/range_size-good.c
+\f
On to the next function. Again this seems very simple, but in fact it
contains a serious problem:
modifications, so it would be possible to maintain the annotations
upstream, and run the proof checker as a CI test.
- - It probably doesn't make sense for qemu right now though, unless we
- could prove more substantial pieces of code.
+ - It probably doesn't make sense for qemu right now though.  It would need:
+ * a concerted effort to prove more substantial pieces of code
+ * upstream effort to modularise qemu
+ * modules and proofs to be co-developed
+\f
= POWER OF 2 =
This is a function from nbdkit:
Essentially bitwise tricks like this are a hard case for automated
theorem proving. I gave up.
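
To make the difficulty concrete, here is a minimal sketch. It assumes
the function uses the common "v & (v - 1)" test, and the uint64_t
parameter type is my choice for illustration. is_power_of_2_ref is a
hypothetical reference definition, not something from nbdkit. A proof
would have to show that the two always agree, which means connecting a
bit-level operation to an arithmetic statement about shifts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed form of the bitwise trick: clearing the lowest set bit
 * leaves zero exactly when only one bit was set.  The leading
 * "v &&" rejects zero, which is not a power of 2.
 */
static bool
is_power_of_2 (uint64_t v)
{
  return v && ((v & (v - 1)) == 0);
}

/* Hypothetical reference definition, closer to what a specification
 * would say: v is a power of 2 if it equals 1 << i for some i in
 * the range 0..63.
 */
static bool
is_power_of_2_ref (uint64_t v)
{
  for (int i = 0; i < 64; i++)
    if (v == (UINT64_C (1) << i))
      return true;
  return false;
}
```

Checking that the two agree on a handful of inputs is easy to do with
ordinary tests. Proving that they agree for every 64 bit input is the
part that is hard for the automated provers.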
+\f
= TIMEVAL DIFFERENCE =
This is another nbdkit function:
understand them. We could contribute these to the Frama-C standard
library.
+\f
= STRING FUNCTIONS =
Uli sent me this function from glibc:
So a proof of the glibc function eludes me.
+\f
There is a set of open source licensed string functions with Frama-C
proofs available:
and this is what the strlen function with proof looks like from that:
- [https://github.com/evdenis/verker/blob/master/src/strlen.c
+ [https://github.com/evdenis/verker/blob/master/src/strlen.c]
+\f
Now you might be asking what happens when you write a function that
uses strlen, for example this trivial function with a working
specification:
strings cannot exist on real computers, but they can exist on
theoretical ones!
+\f
= IN CONCLUSION =
* Frama-C is a real open source tool used by companies to verify