2 Formally proving tiny bits of qemu using Frama-C
6 In early October I was talking to one of the developers of Frama-C,
7 which is a modular framework for verifying C programs. It's been on
8 my radar to learn for a long time, and I'm a co-maintainer of the
9 package in Fedora. I tried to follow one tutorial but never really
10 got past installation and UI issues. But the developer pointed me to
11 a very good, short set of blog postings about using the command line
14 I will provide links to tutorials etc at the end.
17 I decided to spend a day or two last month seeing if I could formally
18 prove some code inside qemu, and I arbitrarily picked one of the
19 smallest pieces of code in the "util/" subdirectory:
21 $ wc -l util/*.c | sort -nr | tail -20
23 Actually there are two files:
25 $ ls -l util/range.c include/qemu/range.h
27 This is what the Range type from the header file looks like. This is
28 the complete definition upstream, with comments which are supposed to
32 * Operations on 64 bit address ranges.
34 * - Ranges must not wrap around 0, but can include UINT64_MAX.
38 * Do not access members directly, use the functions!
39 * A non-empty range has @lob <= @upb.
40 * An empty range has @lob == @upb + 1.
42 uint64_t lob; /* inclusive lower bound */
43 uint64_t upb; /* inclusive upper bound */
45 typedef struct Range Range;
47 +-----------+-----------+------------- - - - - -
49 +-----------+-----------+------------- - - - - -
53 Using the tutorial as my guide I wrote some simple predicates, a
54 "predicate" being a statement which can be true or false:
56 * A non-empty range has @lob <= @upb.
57 * An empty range has @lob == @upb + 1.
60 predicate valid_range(struct Range range) =
61 range.lob <= range.upb + 1;
63 predicate empty_range(struct Range range) =
64 range.lob == range.upb + 1;
67 Notice a few things here:
69 - These are statements written in a formal language.
71 - I'm using structs directly from the C code.
73 - The upstream comments translate into machine-checkable code.
75 The first upstream function is:
78 static inline bool range_is_empty(const Range *range)
80 return range->lob > range->upb;
83 and using the predicates we can write a specification:
85 $ cat snippets/range_is_empty.c
87 And we can compile and prove that:
89 $ frama-c -wp -wp-rte snippets/range_is_empty.c
91 Frama-C parsed the C code and the formal specification and machine
92 checked it, and it's correct - the code is bug-free.
95 = OVERVIEW OF FRAMA-C ECOSYSTEM =
97 Let's interrupt this and talk about Frama-C. The name stands
98 for "Framework for Static Analysis of the C language".
100 It's modular, with a core program that reads C source code
101 and turns it into Abstract Syntax Trees. And a set of plugins
102 to do static analysis by annotating these syntax trees.
103 Plugins can cooperate, so analysis can be passed between
106 The following slides are taken from David Mentré‘s 2016 presentation.
111 Going back to what we proved so far:
114 predicate valid_range(struct Range range) =
115 range.lob <= range.upb + 1;
116 predicate empty_range(struct Range range) =
117 range.lob == range.upb + 1;
120 /* Is range empty? */
122 requires \valid_read(range);
123 requires valid_range(*range);
125 ensures \result <==> empty_range(*range);
127 static inline bool range_is_empty(const Range *range)
129 return range->lob > range->upb;
132 $ frama-c -wp -wp-rte snippets/range_is_empty.c
134 The "@"-comments are ACSL. The C code is parsed by Frama-C into an
135 abstract syntax tree. We told Frama-C to apply the WP plugin. The
136 specification and C code were translated into first-order logic
137 statements and passed to the Alt-Ergo SMT prover to see if it could
138 find a Proof Obligation (PO).
140 We need to be very clear about what happened:
142 - The preconditions must be fulfilled in order for the proof to be
143 correct. If you are checking a whole program then that would check
144 that all callers of this function are fulfilling the preconditions,
145 but if you check only single functions then you are not doing that.
147 - The proof only shows that the post-condition holds.
149 - This only shows that the source code is correct. The compiler
150 could still compile the program incorrectly. If you really want to
151 get into this you have to use a verified C compiler like CompCert,
152 and there are even verified CPU implementations.
154 - Given those assumptions, the code is bug free - you don't need to
158 Obviously this is a single line, very trivial function, but I was
159 quite pleased that I was able to prove it quickly. I kept going on
160 the range file. The next function is:
162 /* Does range contain val? */
163 static inline bool range_contains(const Range *range, uint64_t val)
165 return val >= range->lob && val <= range->upb;
168 This is similarly easy to prove after adding another predicate at the
172 predicate value_in_range(struct Range range, uint64_t value) =
173 range.lob <= value <= range.upb;
176 /* Does range contain val? */
178 requires \valid_read(range);
179 requires valid_range(*range);
181 ensures \result <==> value_in_range(*range, val);
183 static inline bool range_contains(const Range *range, uint64_t val)
185 return val >= range->lob && val <= range->upb;
189 The next function is range_make_empty, again easy to prove using the
190 already existing empty_range predicate. Notice how we declare which
191 memory locations this function assigns to:
193 /* Initialize range to the empty range */
195 requires \valid(range);
196 requires valid_range(*range);
198 ensures empty_range(*range);
200 static inline void range_make_empty(Range *range)
202 *range = range_empty;
203 assert(range_is_empty(range));
207 I'm going to skip forward a few functions to get to an interesting one.
210 /* Get the size of range. */
211 static inline uint64_t range_size(const Range *range)
213 return range->upb - range->lob + 1;
216 My first attempt at a proof was:
219 logic integer size_of_bounds(integer lob, integer upb) =
222 logic integer size_of_range(struct Range range) =
223 size_of_bounds(range.lob, range.upb);
226 /* Get the size of range. */
228 requires \valid(range);
229 requires valid_range(*range);
231 ensures \result == size_of_range(*range);
233 static inline uint64_t range_size(const Range *range)
235 return range->upb - range->lob + 1;
238 A few things to notice about this:
240 - I'm using ACSL "logic" statements, which are similar to functions.
242 - ACSL has an "integer" type which is an unbounded integer.
244 Let's try to prove this one:
246 $ frama-c -wp -wp-rte snippets/range_size.c
249 This was confusing to me because at this point I'd probably spent an
250 hour, and I'd proven about 5 functions successfully, and this function
251 looks equally trivial and yet it cannot be proven. Why?
253 One way to find the problem would be to find a COUNTEREXAMPLE. A
254 counterexample is an instance of an input that satisfies all of the
255 preconditions, but makes the postcondition false. Now Frama-C has
256 pluggable provers, and one prover called Z3, originally written by
257 Microsoft, can be used with Frama-C and can sometimes find
258 counterexamples. However for some reason the version of Z3 in Fedora
259 does not like to work with the version of Frama-C in Fedora and I
262 So this is the counterexample which I worked out myself:
265 logic integer size_of_bounds(integer lob, integer upb) =
268 logic integer size_of_range(struct Range range) =
269 size_of_bounds(range.lob, range.upb);
272 /* Get the size of range. */
274 requires \valid(range);
275 requires valid_range(*range);
277 ensures \result == size_of_range(*range);
279 static inline uint64_t range_size(const Range *range)
281 return range->upb - range->lob + 1;
284 +-------------------------------------------------------+
285 |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
286 +-------------------------------------------------------+
287 range->lob == 0 range->upb == UINT64_MAX
289 In this case size_of_range() returns an integer. Recall that integers
290 are an unbounded type. So it returns UINT64_MAX + 1 (a 65 bit
291 number). Our C function on the other hand returns 0.
293 This is a bug in the function. There are really two ways to
296 (1) The function shouldn't be called on empty ranges (size 0), and so
297 if the function returns 0 it means "the total range". Or:
299 (2) The function should have a different signature so it can return a
300 separate indication for "empty range" vs "total range".
302 I went with the first, so the final working specification is this. If
303 we were proving the entire program then we could statically verify
304 that callers do not try to call range_size() on the empty range.
307 requires \valid(range);
308 requires valid_range(*range);
309 requires !empty_range(*range);
311 // case UINT64_MAX+1 -> 0
312 ensures \result == (uint64_t)size_of_range(*range);
314 static inline uint64_t range_size(const Range *range)
316 return range->upb - range->lob + 1;
319 $ frama-c -wp -wp-rte snippets/range_size-good.c
322 On to the next function. Again this seems very simple, but in fact it
323 contains a serious problem:
326 * Initialize range to span the interval [lob,lob + size - 1].
327 * size may be 0. Range must not overflow.
329 static inline void range_init_nofail(Range *range, uint64_t lob,
333 range->upb = lob + size - 1;
336 What does lob == 0, size == 0 mean? It will create a total range.
338 +-------------------------------------------------------+
339 |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
340 +-------------------------------------------------------+
341 range->lob == 0 range->upb == UINT64_MAX
343 What does lob == 1, size == 0 mean? It will create an empty range.
345 Because I didn't want to change the code, that makes the proof very
346 long, and it demonstrates another feature of ACSL specifications
350 requires \valid(range);
351 requires lob + (integer)size <= UINT64_MAX;
354 assumes lob > 0 && size == 0;
355 ensures empty_range(*range);
358 ensures range->lob == lob;
359 ensures size_of_bounds(lob, range->upb) == size;
361 assumes lob == 0 && size == 0;
362 ensures total_range(*range);
367 I also had to modify my earlier empty_range predicate:
370 predicate empty_range(struct Range range) =
371 range.lob == range.upb + 1;
375 predicate empty_range(struct Range range) =
376 range.lob > 0 && range.lob == range.upb + 1;
378 predicate total_range(struct Range range) =
379 range.lob == 0 && range.upb == UINT64_MAX;
382 What can we say about this? If you were developing the Range type and
383 the proof at the same time, you probably wouldn't define the Range
384 type in this way, and you probably wouldn't choose an initialization
385 function which had this complex set of behaviours.
387 I did prove most of the rest of the Range functions, but I wanted to
388 point out a few things about the final proof:
390 - The original file includes glib headers, but those cannot be parsed
391 by Frama-C's parser. The problem appears to be some unsupported
392 __attribute__s. Some attributes seem to work, and others are not
393 supported, and this of course limits what you can prove if it
394 depends on any external library that might use these.
396 - For the same reason I had to exclude a glib function for operating
397 on lists of ranges from the proof.
399 - You can run frama-c on the original source file, with a few #ifdef
400 modifications, so it would be possible to maintain the annotations
401 upstream, and run the proof checker as a CI test.
403 - It probably doesn't make sense for qemu right now though, unless we
404 could prove more substantial pieces of code.
409 This is a function from nbdkit:
411 /* Returns true if v is a power of 2.
413 * Uses the algorithm described at
414 * http://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
417 is_power_of_2 (unsigned long v)
419 return v && ((v & (v - 1)) == 0);
422 and my initial specification started by defining:
425 predicate positive_power_of_2 (integer i) =
427 (i == 1 || ((i & 1) == 0 && positive_power_of_2 (i >> 1)));
429 lemma positive_power_of_2 (1);
430 lemma positive_power_of_2 (2);
431 lemma positive_power_of_2 (4);
432 lemma !positive_power_of_2 (7);
435 I was using the lemmas (which are like assertions, except statically
436 verified), to test my predicate, but sadly Frama-C was not able to
437 prove them even though they appear to be trivial.
439 This led to some discussion on stackoverflow:
441 https://stackoverflow.com/questions/64268418/how-do-i-write-an-is-power-of-2-predicate-in-acsl
443 Ultimately it was possible to fix this, but I discovered a few things
444 about Alt-Ergo (the default prover):
446 - It can't handle bitwise operations very well.
448 - You can guide Alt-Ergo, but it requires using the Frama-C GUI which
449 is very non-intuitive. You do end up with a script but I couldn't
450 work out how to integrate this with the command line.
452 - Z3 should be able to solve this, but I was not able to get
453 Z3 working with Frama-C on Fedora.
455 Essentially bitwise tricks like this are a hard case for automated
456 theorem proving. I gave up.
459 = TIMEVAL DIFFERENCE =
461 This is another nbdkit function:
463 /* Return the number of µs (microseconds) in y - x. */
464 static inline int64_t
465 tvdiff_usec (const struct timeval *x, const struct timeval *y)
469 usec = (y->tv_sec - x->tv_sec) * 1000000;
470 usec += y->tv_usec - x->tv_usec;
474 My first attempt at a proof was:
477 predicate valid_timeval (struct timeval tv) =
478 tv.tv_sec >= 0 && tv.tv_usec >= 0 && tv.tv_usec < 1000000;
479 logic integer tv_to_microseconds (struct timeval tv) =
480 tv.tv_sec * 1000000 + tv.tv_usec;
483 /* Return the number of µs (microseconds) in y - x. */
485 requires \valid_read (x) && \valid_read (y);
486 requires valid_timeval (*x) && valid_timeval (*y);
487 ensures \result == tv_to_microseconds (*y) - tv_to_microseconds (*x);
489 static inline int64_t
490 tvdiff_usec (const struct timeval *x, const struct timeval *y)
494 usec = (y->tv_sec - x->tv_sec) * 1000000;
495 usec += y->tv_usec - x->tv_usec;
499 This is not provable, and again the problem is integer overflow. The
500 tv_sec field is a 64 bit integer so it's quite easy to construct two
501 inputs which overflow the output.
503 Fixing this involves a considerably more complex specification:
506 requires \valid_read (x) && \valid_read (y);
507 requires valid_timeval (*x) && valid_timeval (*y);
511 ensures \result == 0;
512 ensures \let diff = tv_to_microseconds (*y) - tv_to_microseconds (*x);
513 INT64_MIN <= diff <= INT64_MAX &&
516 ensures \result == -1;
518 ensures \let diff = tv_to_microseconds (*y) - tv_to_microseconds (*x);
519 !(INT64_MIN <= diff <= INT64_MAX);
524 tvdiff_usec (const struct timeval *x, const struct timeval *y,
528 and of course that is not sufficient because I also had to fix the
529 code so it can return an overflow indication. And actually writing
530 that code is left as an exercise for the reader as it is not at all
533 A few things I learned from this:
535 - This is a function we have had and used for years, and I don't
536 think anyone ever thought it had a problem.
538 - Co-developing the specification alongside the code could help
541 - You can end up with functions that are much harder to use. If we
542 were proof-checking the whole program we could check all callers.
544 - GCC has __builtin_add_overflow and similar which can be used to do
545 mathematical operations and detect overflow, but Frama-C does not
546 understand them. We could contribute these to the Frama-C standard
552 Uli sent me this function from glibc:
554 [https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strverscmp.c]
556 To be honest I had problems even fully understanding the semantics of
557 this function, let alone trying to translate that into ACSL.
559 So I thought I'd look at a simpler function:
561 [https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strlen.c]
563 I want to point out first that this "strlen" is a fallback which is
564 only used if you don't have an optimized function for your hardware,
565 and for the vast majority of users they'll be using something like an
566 AVX-optimized strlen, not this one.
568 This function works in two halves, the first part iterates over single
569 characters until we reach an aligned boundary. And then the second
570 part looks at whole words and uses a clever trick to determine if any
571 byte in the word is zero.
573 In theory a simple specification which would apply to any strlen-style
577 requires valid_read_string (str);
579 ensures (\forall integer i; 0 <= i < \result ==> str[i] != '\0') &&
580 str[\result] == '\0';
582 size_t strlen (const char *str)
586 It would be possible to use this to prove a simple strlen
587 implementation that just worked character by character.
589 Unfortunately for the actual glibc function we hit another limitation
590 of the WP plugin, called the Typed Memory Model.
592 WP models memory as a set of cells, with a type attached to each cell.
593 So a string for example is a set of char cells, each cell being typed
594 as char. You cannot cast these cells to something like a word and try
595 to do a proof over that. Simply a limitation which is well
598 So a proof of the glibc function eludes me.
601 There is a set of open source licensed string functions with Frama-C
604 https://github.com/evdenis/verker
606 and this is what the strlen function with proof looks like from that:
608 [https://github.com/evdenis/verker/blob/master/src/strlen.c]
611 Now you might be asking what happens when you write a function that
612 uses strlen, for example this trivial function with a working
615 $ cat snippets/last_char.c
616 $ frama-c -wp -wp-rte snippets/last_char.c
618 The questions here are:
620 - Is this proving the glibc strlen function?
622 - Are we calling glibc strlen() from the specification?
624 $ less /usr/share/frama-c/libc/string.h
625 $ less /usr/share/frama-c/libc/__fc_string_axiomatic.h
627 And the answers are no and no. In fact what's happening here is we
628 are using Frama-C's own "string.h" header. This header defines and
629 contains a specification for how strlen. But this is also not a proof
630 of strlen: in fact a second file defines what's known as an "axiomatic
631 definition" or theory of how strlen works.
633 Essentially what's happening is we're assuming that strlen in your
634 standard library (which might not even be glibc) works. To make a
635 complete proof you'd also need to additionally prove all libraries you
638 Another little fun thing is Frama-C's strlen function can return -1,
639 which is used to represent an unbounded string. Of course unbounded
640 strings cannot exist on real computers, but they can exist on
646 * Frama-C is a real open source tool used by companies to verify
647 safety-critical software. If you've flown in an Airbus or used
648 electricity from a French nuclear power plant then you've used this
651 * Not as hard to use as I imagined. I spent about 6-12 hours from
652 zero to being able to fully prove a whole file of production code.
654 * Mostly useful for small functions. There are other tools, like TLA+
655 for proving distributed algorithms.
657 * Forces you to think very hard about assumptions and corner cases,
658 but that's the nature of the problem.
660 * Very well documented, questions answered quickly on stackoverflow.
662 But it has its limits:
664 * Typed Memory Model turns out to be restrictive.
666 * No support for malloc.
668 * Not good at bitwise ops (can Z3 help?).
670 * Cannot parse some non-standard __attribute__s (glib).
672 * Lacks some __builtin_* functions.
674 Alternatives and related programs:
676 * Ada SPARK Pro (https://www.adacore.com/sparkpro) is a formally
677 verified subset of the Ada programming language
678 (tagline: "You simply can't write better code").
680 * KLEE (http://klee.github.io/) has similarities but is not a formal
683 * CompCert (http://compcert.inria.fr/) is INRIA's verified C compiler
688 * Tutorial that I followed in October which I thought was a good
690 https://maniagnosis.crsr.net/2017/06/AFL-brute-force-search.html
691 https://maniagnosis.crsr.net/2017/06/AFL-bug-in-quicksearch.html
692 https://maniagnosis.crsr.net/2017/07/AFL-correctness-of-quicksearch.html
694 * Frama-C on Stack Overflow:
695 https://stackoverflow.com/questions/tagged/frama-c
697 * Allan Blanchard's tutorial:
698 https://allan-blanchard.fr/publis/frama-c-wp-tutorial-en.pdf
700 * David Mentre's introduction:
701 https://www.cs.umd.edu/class/spring2016/cmsc838G/frama-c/introduction-to-frama-c_v2.pdf
703 * Upstream documentation:
704 Manual: https://frama-c.com/download/frama-c-user-manual.pdf
705 WP manual: https://frama-c.com/download/frama-c-wp-manual.pdf
706 ACSL: https://frama-c.com/acsl.html
708 * ACSL specifications for various string functions:
709 https://github.com/evdenis/verker