2 Formally proving tiny bits of qemu using Frama-C
6 In early October I was talking to one of the developers of Frama-C,
7 which is a modular framework for verifying C programs. It's been on
8 my radar to learn for a long time, and I'm a co-maintainer of the
9 package in Fedora. I tried to follow one tutorial but never really
10 got past installation and UI issues. But the developer pointed me to
11 a very good, short set of blog postings about using the command line
14 I will provide links to tutorials etc at the end.
16 I decided to spend a day or two seeing if I could formally prove some
17 production code inside qemu, and I arbitrarily picked one of the
18 smallest pieces of code in the "util/" subdirectory:
20 $ wc -l util/*.c | sort -nr | tail -20
22 Actually there are two files:
24 $ ls -l util/range.c include/qemu/range.h
26 This is what the Range type looks like. This is the complete
27 definition upstream, with comments which are supposed to explain how
31 * Operations on 64 bit address ranges.
33 * - Ranges must not wrap around 0, but can include UINT64_MAX.
37 * Do not access members directly, use the functions!
38 * A non-empty range has @lob <= @upb.
39 * An empty range has @lob == @upb + 1.
41 uint64_t lob; /* inclusive lower bound */
42 uint64_t upb; /* inclusive upper bound */
44 typedef struct Range Range;
46 +-----------+-----------+------------- - - - - -
48 +-----------+-----------+------------- - - - - -
52 Using the tutorial as my guide I wrote some simple predicates, a
53 "predicate" being a statement which can be true or false:
55 * A non-empty range has @lob <= @upb.
56 * An empty range has @lob == @upb + 1.
59 predicate valid_range(struct Range range) =
60 range.lob <= range.upb + 1;
62 predicate empty_range(struct Range range) =
63 range.lob == range.upb + 1;
66 Notice a few things here:
68 - These are statements written in a formal language.
70 - I'm using structs directly from the C code.
72 - The comments in the upstream code translate into predicates.
74 The first upstream function is:
77 static inline bool range_is_empty(const Range *range)
79 return range->lob > range->upb;
82 and using the predicates we can write a specification:
86 requires \valid_read(range);
87 requires valid_range(*range);
89 ensures \result <==> empty_range(*range);
91 static inline bool range_is_empty(const Range *range)
93 return range->lob > range->upb;
96 And we can compile and prove that:
98 $ frama-c -wp -wp-rte snippets/range_is_empty.c
100 Frama-C parsed the C code and the formal specification and machine
101 checked it, and it's correct - the code is bug-free.
103 = OVERVIEW OF FRAMA-C ECOSYSTEM =
106 XXX Modular Framework for analysis of C
108 XXX Take some slides from David Mentre's presentation.
110 XXX Explain which companies are using Frama-C.
120 Going back to what we proved so far:
123 predicate valid_range(struct Range range) =
124 range.lob <= range.upb + 1;
126 predicate empty_range(struct Range range) =
127 range.lob == range.upb + 1;
130 /* Is range empty? */
132 requires \valid_read(range);
133 requires valid_range(*range);
135 ensures \result <==> empty_range(*range);
137 static inline bool range_is_empty(const Range *range)
139 return range->lob > range->upb;
142 $ frama-c -wp -wp-rte snippets/range_is_empty.c
143 XXX OUTPUT OF FRAMA-C XXX
145 The "@"-comments are ACSL. The C code is parsed by Frama-C into an
146 abstract syntax tree. We told Frama-C to apply the WP plugin. The
147 specification and C code were translated into first-order logic
148 statements and passed to the Alt-Ergo SMT prover to see if it could
149 find a Proof Obligation (PO).
151 We need to be very clear about what happened:
153 - The preconditions must be fulfilled in order for the proof to be
154 correct. If you are checking a whole program then that would check
155 that all callers of this function are fulfilling the preconditions,
156 but if you check only single functions then you are not doing that.
158 - The proof only shows that the post-condition holds.
160 - This only shows that the source code is correct. The compiler
161 could still compile the program incorrectly. If you really want to
162 get into this you have to use a verified C compiler like CompCert,
163 and there are even verified CPU implementations.
165 - Given those assumptions, the code is bug free - you don't need to
168 Obviously this is a single line, very trivial function, but I was
169 quite pleased that I was able to prove it quickly. I kept going on
170 the range file. The next function is:
172 /* Does range contain val? */
173 static inline bool range_contains(const Range *range, uint64_t val)
175 return val >= range->lob && val <= range->upb;
178 This is similarly easy to prove after adding another predicate at the
182 predicate value_in_range(struct Range range, uint64_t value) =
183 range.lob <= value <= range.upb;
186 /* Does range contain val? */
188 requires \valid_read(range);
189 requires valid_range(*range);
191 ensures \result <==> value_in_range(*range, val);
193 static inline bool range_contains(const Range *range, uint64_t val)
195 return val >= range->lob && val <= range->upb;
198 The next function is range_make_empty, again easy to prove using the
199 already existing empty_range predicate. Notice how we declare which
200 memory locations this function assigns to:
202 /* Initialize range to the empty range */
204 requires \valid(range);
205 requires valid_range(*range);
207 ensures empty_range(*range);
209 static inline void range_make_empty(Range *range)
211 *range = range_empty;
212 assert(range_is_empty(range));
215 I'm going to skip forward a few functions to get to an interesting one.
218 /* Get the size of range. */
219 static inline uint64_t range_size(const Range *range)
221 return range->upb - range->lob + 1;
224 My first attempt at a proof was:
227 logic integer size_of_bounds(integer lob, integer upb) =
230 logic integer size_of_range(struct Range range) =
231 size_of_bounds(range.lob, range.upb);
234 /* Get the size of range. */
236 requires \valid(range);
237 requires valid_range(*range);
239 ensures \result == size_of_range(*range);
241 static inline uint64_t range_size(const Range *range)
243 return range->upb - range->lob + 1;
246 A few things to notice about this:
248 - I'm using ACSL "logic" statements, which are similar to functions.
250 - ACSL has an "integer" type which is an unbounded integer.
252 Let's try to prove this one:
254 $ frama-c ... snippets/range_size.c
257 This was confusing to me because at this point I'd probably spent an
258 hour, and I'd proven about 5 functions successfully, and this function
259 looks equally trivial and yet it cannot be proven. Why?
261 One way to find the problem would be to find a COUNTEREXAMPLE. A
262 counterexample is an instance of an input that satisfies all of the
263 preconditions, but makes the postcondition false. Now Frama-C has
264 pluggable provers, and one prover called Z3, originally written by
265 Microsoft, can be used with Frama-C and can sometimes find
266 counterexamples. However for some reason the version of Z3 in Fedora
267 does not like to work with the version of Frama-C in Fedora and I
270 So this is the counterexample which I worked out myself:
273 logic integer size_of_bounds(integer lob, integer upb) =
276 logic integer size_of_range(struct Range range) =
277 size_of_bounds(range.lob, range.upb);
280 /* Get the size of range. */
282 requires \valid(range);
283 requires valid_range(*range);
285 ensures \result == size_of_range(*range);
287 static inline uint64_t range_size(const Range *range)
289 return range->upb - range->lob + 1;
292 +-------------------------------------------------------+
293 |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
294 +-------------------------------------------------------+
295 range->lob == 0 range->upb == UINT64_MAX
297 In this case size_of_range() returns an integer. Recall that integers
298 are an unbounded type. So it returns UINT64_MAX + 1 (a 65 bit
299 number). Our C function on the other hand returns 0.
301 This is a bug in the function. There are really two ways to
304 (1) The function shouldn't be called on empty ranges (size 0), and so
305 if the function returns 0 it means "the total range". Or:
307 (2) The function should have a different signature so it can return a
308 separate indication for "empty range" vs "total range".
310 I went with the first, so the final working specification is this. If
311 we were proving the entire program then we could statically verify
312 that callers do not try to call range_size() on the empty range.
315 requires \valid(range);
316 requires valid_range(*range);
317 requires !empty_range(*range);
319 // case UINT64_MAX+1 -> 0
320 ensures \result == (uint64_t)size_of_range(*range);
322 static inline uint64_t range_size(const Range *range)
324 return range->upb - range->lob + 1;
327 On to the next function. Again this seems very simple, but in fact it
328 contains a serious problem:
331 * Initialize range to span the interval [lob,lob + size - 1].
332 * size may be 0. Range must not overflow.
334 static inline void range_init_nofail(Range *range, uint64_t lob,
338 range->upb = lob + size - 1;
341 What does lob == 0, size == 0 mean? It will create a total range.
343 +-------------------------------------------------------+
344 |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
345 +-------------------------------------------------------+
346 range->lob == 0 range->upb == UINT64_MAX
348 What does lob == 1, size == 0 mean? It will create an empty range.
350 Because I didn't want to change the code, that makes the proof very
351 long, and it demonstrates another feature of ACSL specifications
355 requires \valid(range);
356 requires lob + (integer)size <= UINT64_MAX;
359 assumes lob > 0 && size == 0;
360 ensures empty_range(*range);
363 ensures range->lob == lob;
364 ensures size_of_bounds(lob, range->upb) == size;
366 assumes lob == 0 && size == 0;
367 ensures total_range(*range);
372 I also had to modify my earlier empty_range predicate:
375 predicate empty_range(struct Range range) =
376 range.lob == range.upb + 1;
380 predicate empty_range(struct Range range) =
381 range.lob > 0 && range.lob == range.upb + 1;
384 What can we say about this? If you were developing the Range type and
385 the proof at the same time, you probably wouldn't define the Range
386 type in this way, and you probably wouldn't choose an initialization
387 function which had this complex set of behaviours.
389 I did prove most of the rest of the Range functions, but I wanted to
390 point out a few things about the final proof:
392 - The original file includes glib headers, but those cannot be parsed
393 by Frama-C's parser. The problem appears to be some unsupported
394 __attribute__s. Some attributes seem to work, and others are not
395 supported, and this of course limits what you can prove if it
396 depends on any external library that might use these.
398 - For the same reason I had to exclude a glib function for operating
399 on lists of ranges from the proof.
401 - You can run frama-c on the original source file, with a few #ifdef
402 modifications, so it would be possible to maintain the annotations
403 upstream, and run the proof checker as a CI test.
405 - It probably doesn't make sense for qemu right now though, unless we
406 could prove more substantial pieces of code.
410 This is a function from nbdkit:
412 /* Returns true if v is a power of 2.
414 * Uses the algorithm described at
415 * http://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
418 is_power_of_2 (unsigned long v)
420 return v && ((v & (v - 1)) == 0);
423 and my initial specification started by defining:
426 predicate positive_power_of_2 (integer i) =
428 (i == 1 || ((i & 1) == 0 && positive_power_of_2 (i >> 1)));
430 lemma positive_power_of_2 (1);
431 lemma positive_power_of_2 (2);
432 lemma positive_power_of_2 (4);
433 lemma !positive_power_of_2 (7);
436 I was using the lemmas (which are like assertions, except statically
437 verified), to test my predicate, but sadly Frama-C was not able to
438 prove them even though they appear to be trivial.
440 This led to some discussion on stackoverflow:
442 https://stackoverflow.com/questions/64268418/how-do-i-write-an-is-power-of-2-predicate-in-acsl
444 Ultimately it was possible to fix this, but I discovered a few things
445 about Alt-Ergo (the default prover):
447 - It can't handle bitwise operations very well.
449 - You can guide Alt-Ergo, but it requires using the Frama-C GUI which
450 is very non-intuitive. You do end up with a script but I couldn't
451 work out how to integrate this with the command line.
453 - Z3 should be able to solve this, but I was not able to get
454 Z3 working with Frama-C on Fedora.
456 Essentially bitwise tricks like this are a hard case for automated
457 theorem proving. I gave up.
459 = TIMEVAL DIFFERENCE =
461 This is another nbdkit function:
463 /* Return the number of µs (microseconds) in y - x. */
464 static inline int64_t
465 tvdiff_usec (const struct timeval *x, const struct timeval *y)
469 usec = (y->tv_sec - x->tv_sec) * 1000000;
470 usec += y->tv_usec - x->tv_usec;
474 My first attempt at a proof was:
477 predicate valid_timeval (struct timeval tv) =
478 tv.tv_sec >= 0 && tv.tv_usec >= 0 && tv.tv_usec < 1000000;
479 logic integer tv_to_microseconds (struct timeval tv) =
480 tv.tv_sec * 1000000 + tv.tv_usec;
483 /* Return the number of µs (microseconds) in y - x. */
485 requires \valid_read (x) && \valid_read (y);
486 requires valid_timeval (*x) && valid_timeval (*y);
487 ensures \result == tv_to_microseconds (*y) - tv_to_microseconds (*x);
489 static inline int64_t
490 tvdiff_usec (const struct timeval *x, const struct timeval *y)
494 usec = (y->tv_sec - x->tv_sec) * 1000000;
495 usec += y->tv_usec - x->tv_usec;
499 This is not provable, and again the problem is integer overflow. The
500 tv_sec field is a 64 bit integer so it's quite easy to construct two
501 inputs which overflow the output.
503 Fixing this involves a considerably more complex specification:
506 requires \valid_read (x) && \valid_read (y);
507 requires valid_timeval (*x) && valid_timeval (*y);
511 ensures \result == 0;
512 ensures \let diff = tv_to_microseconds (*y) - tv_to_microseconds (*x);
513 INT64_MIN <= diff <= INT64_MAX &&
516 ensures \result == -1;
518 ensures \let diff = tv_to_microseconds (*y) - tv_to_microseconds (*x);
519 !(INT64_MIN <= diff <= INT64_MAX);
524 tvdiff_usec (const struct timeval *x, const struct timeval *y,
529 and of course that is not sufficient because I also had to fix the
530 code so it can return an overflow indication. And actually writing
531 that code is left as an exercise for the reader as it is not at all
534 A few things I learned from this:
536 - This is a function we have had and used for years, and I don't
537 think anyone ever thought it had a problem.
539 - Co-developing the specification alongside the code could help
542 - You can end up with functions that are much harder to use. If we
543 were proof-checking the whole program we could check all callers.
545 - GCC has __builtin_add_overflow and similar which can be used to do
546 mathematical operations and detect overflow, but Frama-C does not
547 understand them. We could contribute these to the Frama-C standard
559 * Frama-C is a real open source tool used by companies to verify
560 safety-critical software. If you've flown in an Airbus or used
561 electricity from a French nuclear power plant then you've used this
564 * Not as hard to use as I imagined. I spent about 6-12 hours from
565 zero to being able to fully prove a whole file of production code.
566 (But I did study logic under Prof Crispin Wright at university).
568 * Mostly useful for small functions. There are other tools, like TLA+
569 for proving distributed algorithms.
571 * Forces you to think very hard about assumptions and corner cases,
572 but that's the nature of the problem.
574 * Very well documented, questions answered quickly on stackoverflow.
576 But it has its limits:
578 * Typed Memory Model turns out to be restrictive.
580 * No support for malloc.
582 * Not good at bitwise ops (can Z3 help?).
584 * Cannot parse some non-standard __attribute__s (glib).