Improve zeroing and detection of zeroes.
This code modifies zero, zero-device, is-zero, is-zero-device.
zero and zero-device are modified so that if the blocks of the device
already contain zeroes, then we don't write zeroes. The reason for
this is to avoid unnecessarily making the underlying storage
non-sparse or (in the qcow2 case) growing it.
is-zero and is-zero-device are modified so that zero detection is
faster. This is a nice side effect of making the first change.
Since avoiding unnecessary zeroing involves reading the blocks before
writing them, whereas before we just blindly wrote, this can be
slower. As you can see from the tests below, in the case where the
disk is sparse, it actually turns out to be faster, because we avoid
allocating the underlying blocks.
However in the case where the disk is non-sparse and full of existing
data, it is much slower. There might be a case for an API flag to
adjust whether or not we perform the zero check. I did not add this
flag because it is unlikely that the caller would have enough
information to be able to set the flag correctly.
(Elapsed time in seconds)
Format Test case Before After
Raw Sparse 16.4 5.3
Preallocated zero 17.0 18.8
Preallocated random 16.0 41.3
Qcow2 preallocation=off 18.7 5.6
preallocation=metadata 17.4 5.8
The current code uses a fixed block size of 4K for reading and
writing. I also tried the same tests with a block size of 64K but it
didn't make any significant difference.
(Thanks to Federico Simoncelli for suggesting this change)