2020-frama-c/notes.txt

   1 16th Nov 2020
   2 Formally proving tiny bits of qemu using Frama-C
   3
   4 [in medias res]
   5
   6 In early October I was talking to one of the developers of Frama-C,
   7 which is a modular framework for verifying C programs.  It's been on
   8 my radar to learn for a long time, and I'm a co-maintainer of the
   9 package in Fedora.  I tried to follow one tutorial but never really
  10 got past installation and UI issues.  But the developer pointed me to
  11 a very good, short set of blog postings about using the command line
  12 tool.
  13
  14 I will provide links to tutorials etc at the end.
  15
  16 I decided to spend a day or two seeing if I could formally prove some
  17 production code inside qemu, and I arbitrarily picked one of the
  18 smallest pieces of code in the "util/" subdirectory:
  19
  20   $ wc -l util/*.c | sort -nr | tail -20
  21
  22 Actually there are two files:
  23
  24   $ ls -l util/range.c include/qemu/range.h
  25
  26 This is what the Range type looks like.  This is the complete
  27 definition upstream, with comments which are supposed to explain how
  28 it works:
  29
  30   /*
  31    * Operations on 64 bit address ranges.
  32    * Notes:
  33    * - Ranges must not wrap around 0, but can include UINT64_MAX.
  34    */
  35   struct Range {
  36       /*
  37        * Do not access members directly, use the functions!
  38        * A non-empty range has @lob <= @upb.
  39        * An empty range has @lob == @upb + 1.
  40        */
  41       uint64_t lob;        /* inclusive lower bound */
  42       uint64_t upb;        /* inclusive upper bound */
  43   };
  44   typedef struct Range Range;
  45
  46   +-----------+-----------+------------- - - - - -
  47   |           |           |
  48   +-----------+-----------+------------- - - - - -
  49                ^         ^
  50               lob ----- upb
  51
  52 Using the tutorial as my guide I wrote some simple predicates, a
  53 "predicate" being a statement which can be true or false:
  54
  55        * A non-empty range has @lob <= @upb.
  56        * An empty range has @lob == @upb + 1.
  57
  58   /*@
  59     predicate valid_range(struct Range range) =
  60       range.lob <= range.upb + 1;
  61
  62     predicate empty_range(struct Range range) =
  63       range.lob == range.upb + 1;
  64   */
  65
  66 Notice a few things here:
  67
  68  - These are statements written in a formal language.
  69
  70  - I'm using structs directly from the C code.
  71
  72  - The comments in the upstream code translate into predicates.
  73
  74 The first upstream function is:
  75
  76   /* Is range empty? */
  77   static inline bool range_is_empty(const Range *range)
  78   {
  79       return range->lob > range->upb;
  80   }
  81
  82 and using the predicates we can write a specification:
  83
  84   /* Is range empty? */
  85   /*@
  86     requires \valid_read(range);
  87     requires valid_range(*range);
  88     assigns \nothing;
  89     ensures \result <==> empty_range(*range);
  90    */
  91   static inline bool range_is_empty(const Range *range)
  92   {
  93       return range->lob > range->upb;
  94   }
  95
  96 And we can compile and prove that:
  97
  98   $ frama-c -wp -wp-rte snippets/range_is_empty.c
  99
 100 Frama-C parsed the C code and the formal specification and machine
 101 checked it, and it's correct - the code is bug-free.
 102
 103 = OVERVIEW OF FRAMA-C ECOSYSTEM =
 104
 105
 106 XXX Modular Framework for analysis of C
 107
 108 XXX Take some slides from David Mentre's presentation.
 109
 110 XXX Explain which companies are using Frama-C.
 111
 112 XXX WP plugin
 113
 114 XXX ACSL language
 115
 116
 117
 118 = BACK TO RANGE.C =
 119
 120 Going back to what we proved so far:
 121
 122   /*@
 123     predicate valid_range(struct Range range) =
 124       range.lob <= range.upb + 1;
 125
 126     predicate empty_range(struct Range range) =
 127       range.lob == range.upb + 1;
 128   */
 129
 130   /* Is range empty? */
 131   /*@
 132     requires \valid_read(range);
 133     requires valid_range(*range);
 134     assigns \nothing;
 135     ensures \result <==> empty_range(*range);
 136    */
 137   static inline bool range_is_empty(const Range *range)
 138   {
 139       return range->lob > range->upb;
 140   }
 141
 142   $ frama-c -wp -wp-rte snippets/range_is_empty.c
 143   XXX OUTPUT OF FRAMA-C XXX
 144
 145 The "@"-comments are ACSL.  The C code is parsed by Frama-C into an
 146 abstract syntax tree.  We told Frama-C to apply the WP plugin.  The
 147 specification and C code were translated into first-order logic
 148 statements and passed to the Alt-Ergo SMT prover to see if it could
 149 find a Proof Obligation (PO).
 150
 151 We need to be very clear about what happened:
 152
 153  - The preconditions must be fulfilled in order for the proof to be
 154    correct.  If you are checking a whole program then that would check
 155    that all callers of this function are fulfilling the preconditions,
 156    but if you check only single functions then you are not doing that.
 157
 158  - The proof only shows that the post-condition holds.
 159
 160  - This only shows that the source code is correct.  The compiler
 161    could still compile the program incorrectly.  If you really want to
 162    get into this you have to use a verified C compiler like CompCert,
 163    and there are even verified CPU implementations.
 164
 165  - Given those assumptions, the code is bug free - you don't need to
 166    write any tests.
 167
 168 Obviously this is a single line, very trivial function, but I was
 169 quite pleased that I was able to prove it quickly.  I kept going on
 170 the range file.  The next function is:
 171
 172   /* Does range contain val? */
 173   static inline bool range_contains(const Range *range, uint64_t val)
 174   {
 175       return val >= range->lob && val <= range->upb;
 176   }
 177
 178 This is similarly easy to prove after adding another predicate at the
 179 top of the file:
 180
 181   /*@
 182     predicate value_in_range(struct Range range, uint64_t value) =
 183       range.lob <= value <= range.upb;
 184   */
 185
 186   /* Does range contain val? */
 187   /*@
 188     requires \valid_read(range);
 189     requires valid_range(*range);
 190     assigns \nothing;
 191     ensures \result <==> value_in_range(*range, val);
 192    */
 193   static inline bool range_contains(const Range *range, uint64_t val)
 194   {
 195       return val >= range->lob && val <= range->upb;
 196   }
 197
 198 The next function is range_make_empty, again easy to prove using the
 199 already existing empty_range predicate.  Notice how we declare which
 200 memory locations this function assigns to:
 201
 202   /* Initialize range to the empty range */
 203   /*@
 204     requires \valid(range);
 205     requires valid_range(*range);
 206     assigns *range;
 207     ensures empty_range(*range);
 208    */
 209   static inline void range_make_empty(Range *range)
 210   {
 211       *range = range_empty;
 212       assert(range_is_empty(range));
 213   }
 214
 215 I'm going to skip forward a few functions to get to an interesting one.
 216 This seems trivial:
 217
 218   /* Get the size of range. */
 219   static inline uint64_t range_size(const Range *range)
 220   {
 221       return range->upb - range->lob + 1;
 222   }
 223
 224 My first attempt at a proof was:
 225
 226   /*@
 227     logic integer size_of_bounds(integer lob, integer upb) =
 228       upb - lob + 1;
 229
 230     logic integer size_of_range(struct Range range) =
 231       size_of_bounds(range.lob, range.upb);
 232   */
 233
 234   /* Get the size of range. */
 235   /*@
 236     requires \valid(range);
 237     requires valid_range(*range);
 238     assigns \nothing;
 239     ensures \result == size_of_range(*range);
 240   */
 241   static inline uint64_t range_size(const Range *range)
 242   {
 243       return range->upb - range->lob + 1;
 244   }
 245
 246 A few things to notice about this:
 247
 248  - I'm using ACSL "logic" statements, which are similar to functions.
 249
 250  - ACSL has an "integer" type which is an unbounded integer.
 251
 252 Let's try to prove this one:
 253
 254   $ frama-c ... snippets/range_size.c
 255   XXX It fails!
 256
 257 This was confusing to me because at this point I'd probably spent an
 258 hour, and I'd proven about 5 functions successfully, and this function
 259 looks equally trivial and yet it cannot be proven.  Why?
 260
 261 One way to find the problem would be to find a COUNTEREXAMPLE.  A
 262 counterexample is an instance of an input that satisfies all of the
 263 preconditions, but makes the postcondition false.  Now Frama-C has
 264 pluggable provers, and one prover called Z3, originally written by
 265 Microsoft, can be used with Frama-C and can sometimes find
 266 counterexamples.  However for some reason the version of Z3 in Fedora
 267 does not like to work with the version of Frama-C in Fedora and I
 268 don't know why.
 269
 270 So this is the counterexample which I worked out myself:
 271
 272   /*@
 273     logic integer size_of_bounds(integer lob, integer upb) =
 274       upb - lob + 1;
 275
 276     logic integer size_of_range(struct Range range) =
 277       size_of_bounds(range.lob, range.upb);
 278   */
 279
 280   /* Get the size of range. */
 281   /*@
 282     requires \valid(range);
 283     requires valid_range(*range);
 284     assigns \nothing;
 285     ensures \result == size_of_range(*range);
 286   */
 287   static inline uint64_t range_size(const Range *range)
 288   {
 289       return range->upb - range->lob + 1;
 290   }
 291
 292   +-------------------------------------------------------+
 293   |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
 294   +-------------------------------------------------------+
 295   range->lob == 0                 range->upb == UINT64_MAX
 296
 297 In this case size_of_range() returns an integer.  Recall that integers
 298 are an unbounded type.  So it returns UINT64_MAX + 1 (a 65 bit
 299 number).  Our C function on the other hand returns 0.
 300
 301 This is a bug in the function.  There are really two ways to
 302 look at this:
 303
 304 (1) The function shouldn't be called on empty ranges (size 0), and so
 305 if the function returns 0 it means "the total range".  Or:
 306
 307 (2) The function should have a different signature so it can return a
 308 separate indication for "empty range" vs "total range".
 309
 310 I went with the first, so the final working specification is this.  If
 311 we were proving the entire program then we could statically verify
 312 that callers do not try to call range_size() on the empty range.
 313
 314   /*@
 315     requires \valid(range);
 316     requires valid_range(*range);
 317     requires !empty_range(*range);
 318     assigns \nothing;
 319                         // case UINT64_MAX+1 -> 0
 320     ensures \result == (uint64_t)size_of_range(*range);
 321    */
 322   static inline uint64_t range_size(const Range *range)
 323   {
 324       return range->upb - range->lob + 1;
 325   }
 326
 327 On to the next function.  Again this seems very simple, but in fact it
 328 contains a serious problem:
 329
 330   /*
 331    * Initialize range to span the interval [lob,lob + size - 1].
 332    * size may be 0. Range must not overflow.
 333    */
 334   static inline void range_init_nofail(Range *range, uint64_t lob,
 335                                        uint64_t size)
 336   {
 337       range->lob = lob;
 338       range->upb = lob + size - 1;
 339   }
 340
 341 What does lob == 0, size == 0 mean?  It will create a total range.
 342
 343   +-------------------------------------------------------+
 344   |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
 345   +-------------------------------------------------------+
 346   range->lob == 0                 range->upb == UINT64_MAX
 347
 348 What does lob == 1, size == 0 mean?  It will create an empty range.
 349
 350 Because I didn't want to change the code, that makes the proof very
 351 long, and it demonstrates another feature of ACSL specifications
 352 ("behaviors"):
 353
 354   /*@
 355     requires \valid(range);
 356     requires lob + (integer)size <= UINT64_MAX;
 357     assigns *range;
 358     behavior empty:
 359       assumes lob > 0 && size == 0;
 360       ensures empty_range(*range);
 361     behavior non_empty:
 362       assumes size > 0;
 363       ensures range->lob == lob;
 364       ensures size_of_bounds(lob, range->upb) == size;
 365     behavior total:
 366       assumes lob == 0 && size == 0;
 367       ensures total_range(*range);
 368     complete behaviors;
 369     disjoint behaviors;
 370    */
 371
 372 I also had to modify my earlier empty_range predicate:
 373
 374   /*@ // before:
 375     predicate empty_range(struct Range range) =
 376       range.lob == range.upb + 1;
 377   */
 378
 379   /*@ // after:
 380     predicate empty_range(struct Range range) =
 381       range.lob > 0 && range.lob == range.upb + 1;
 382   */
 383
 384 What can we say about this?  If you were developing the Range type and
 385 the proof at the same time, you probably wouldn't define the Range
 386 type in this way, and you probably wouldn't choose an initialization
 387 function which had this complex set of behaviours.
 388
 389 I did prove most of the rest of the Range functions, but I wanted to
 390 point out a few things about the final proof:
 391
 392  - The original file includes glib headers, but those cannot be parsed
 393    by Frama-C's parser.  The problem appears to be some unsupported
 394    __attribute__s.  Some attributes seem to work, and others are not
 395    supported, and this of course limits what you can prove if it
 396    depends on any external library that might use these.
 397
 398  - For the same reason I had to exclude a glib function for operating
 399    on lists of ranges from the proof.
 400
 401  - You can run frama-c on the original source file, with a few #ifdef
 402    modifications, so it would be possible to maintain the annotations
 403    upstream, and run the proof checker as a CI test.
 404
 405  - It probably doesn't make sense for qemu right now though, unless we
 406    could prove more substantial pieces of code.
 407
 408 = POWER OF 2 =
 409
 410 This is a function from nbdkit:
 411
 412   /* Returns true if v is a power of 2.
 413    *
 414    * Uses the algorithm described at
 415    * http://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
 416    */
 417   static inline bool
 418   is_power_of_2 (unsigned long v)
 419   {
 420     return v && ((v & (v - 1)) == 0);
 421   }
 422
 423 and my initial specification started by defining:
 424
 425   /*@
 426     predicate positive_power_of_2 (integer i) =
 427       i > 0 &&
 428       (i == 1 || ((i & 1) == 0 && positive_power_of_2 (i >> 1)));
 429
 430     lemma positive_power_of_2 (1);
 431     lemma positive_power_of_2 (2);
 432     lemma positive_power_of_2 (4);
 433     lemma !positive_power_of_2 (7);
 434    */
 435
 436 I was using the lemmas (which are like assertions, except statically
 437 verified), to test my predicate, but sadly Frama-C was not able to
 438 prove them even though they appear to be trivial.
 439
 440 This led to some discussion on stackoverflow:
 441
 442   https://stackoverflow.com/questions/64268418/how-do-i-write-an-is-power-of-2-predicate-in-acsl
 443
 444 Ultimately it was possible to fix this, but I discovered a few things
 445 about Alt-Ergo (the default prover):
 446
 447  - It can't handle bitwise operations very well.
 448
 449  - You can guide Alt-Ergo, but it requires using the Frama-C GUI which
 450    is very non-intuitive.  You do end up with a script but I couldn't
 451    work out how to integrate this with the command line.
 452
 453  - Z3 should be able to solve this, but I was not able to get
 454    Z3 working with Frama-C on Fedora.
 455
 456 Essentially bitwise tricks like this are a hard case for automated
 457 theorem proving.  I gave up.
 458
 459 = TIMEVAL DIFFERENCE =
 460
 461 This is another nbdkit function:
 462
 463   /* Return the number of µs (microseconds) in y - x. */
 464   static inline int64_t
 465   tvdiff_usec (const struct timeval *x, const struct timeval *y)
 466   {
 467     int64_t usec;
 468
 469     usec = (y->tv_sec - x->tv_sec) * 1000000;
 470     usec += y->tv_usec - x->tv_usec;
 471     return usec;
 472   }
 473
 474 My first attempt at a proof was:
 475
 476   /*@
 477     predicate valid_timeval (struct timeval tv) =
 478       tv.tv_sec >= 0 && tv.tv_usec >= 0 && tv.tv_usec < 1000000;
 479     logic integer tv_to_microseconds (struct timeval tv) =
 480       tv.tv_sec * 1000000 + tv.tv_usec;
 481    */
 482
 483   /* Return the number of µs (microseconds) in y - x. */
 484   /*@
 485     requires \valid_read (x) && \valid_read (y);
 486     requires valid_timeval (*x) && valid_timeval (*y);
 487     ensures \result == tv_to_microseconds (*y) - tv_to_microseconds (*x);
 488    */
 489   static inline int64_t
 490   tvdiff_usec (const struct timeval *x, const struct timeval *y)
 491   {
 492     int64_t usec;
 493
 494     usec = (y->tv_sec - x->tv_sec) * 1000000;
 495     usec += y->tv_usec - x->tv_usec;
 496     return usec;
 497   }
 498
 499 This is not provable, and again the problem is integer overflow.  The
 500 tv_sec field is a 64 bit integer so it's quite easy to construct two
 501 inputs which overflow the output.
 502
 503 Fixing this involves a considerably more complex specification:
 504
 505   /*@
 506     requires \valid_read (x) && \valid_read (y);
 507     requires valid_timeval (*x) && valid_timeval (*y);
 508     requires \valid (r);
 509     assigns *r;
 510     behavior success:
 511       ensures \result == 0;
 512       ensures \let diff = tv_to_microseconds (*y) - tv_to_microseconds (*x);
 513               INT64_MIN <= diff <= INT64_MAX &&
 514               *r == diff;
 515     behavior failure:
 516       ensures \result == -1;
 517       ensures *r == 0;
 518       ensures \let diff = tv_to_microseconds (*y) - tv_to_microseconds (*x);
 519               !(INT64_MIN <= diff <= INT64_MAX);
 520     complete behaviors;
 521     disjoint behaviors;
 522    */
 523   static inline int
 524   tvdiff_usec (const struct timeval *x, const struct timeval *y,
 525                int64_t *r)
 526   ...
 527
 528
 529 and of course that is not sufficient because I also had to fix the
 530 code so it can return an overflow indication.  And actually writing
 531 that code is left as an exercise for the reader as it is not at all
 532 simple.
 533
 534 A few things I learned from this:
 535
 536  - This is a function we have had and used for years, and I don't
 537    think anyone ever thought it had a problem.
 538
 539  - Co-developing the specification alongside the code could help
 540    here.
 541
 542  - You can end up with functions that are much harder to use.  If we
 543    were proof-checking the whole program we could check all callers.
 544
 545  - GCC has __builtin_add_overflow and similar which can be used to do
 546    mathematical operations and detect overflow, but Frama-C does not
 547    understand them.  We could contribute these to the Frama-C standard
 548    library.
 549
 550
 551
 552
 553
 554
 555
 556
 557 = IN CONCLUSION =
 558
 559 * Frama-C is a real open source tool used by companies to verify
 560   safety-critical software.  If you've flown in an Airbus or used
 561   electricity from a French nuclear power plant then you've used this
 562   software already.
 563
 564 * Not as hard to use as I imagined.  I spent about 6-12 hours from
 565   zero to being able to fully prove a whole file of production code.
 566   (But I did study logic under Prof Crispin Wright at university).
 567
 568 * Mostly useful for small functions.  There are other tools, like TLA+
 569   for proving distributed algorithms.
 570
 571 * Forces you to think very hard about assumptions and corner cases,
 572   but that's the nature of the problem.
 573
 574 * Very well documented, questions answered quickly on stackoverflow.
 575
 576 But it has its limits:
 577
 578 * Typed Memory Model turns out to be restrictive.
 579
 580 * No support for malloc.
 581
 582 * Not good at bitwise ops (can Z3 help?).
 583
 584 * Cannot parse some non-standard __attribute__s (glib).