Case study: Adding fast zero to NBD
[About 5 mins]
-* Heading: Copying disk images
+* Heading: Case study baseline
+- link to https://www.redhat.com/archives/libguestfs/2019-August/msg00322.html
+- shell to pre-create source file
-As Rich mentioned, qemu-img convert is a great for copying guest
-images. However, most guest images are sparse, and we want to avoid
-naively reading lots of zeroes on the source then writing lots of
-zeroes on the destination; although this setup makes a great baseline.
+As Rich mentioned, qemu-img convert is a great tool for copying guest
+images, with support for NBD on both source and destination. However,
+most guest images are sparse, and we want to avoid naively reading
+lots of zeroes on the source then writing lots of zeroes on the
+destination. Here's a case study of optimizing that, starting with a
+baseline of straight-line copying.
* Heading: Nothing to see here
for every hole turns our linear walk into an O(n^2) ordeal, so we
don't want to use it more than once. So for the rest of my case
study, I investigated what happens when BLOCK_STATUS is unavailable
-(which is in fact the case with qemu 3.1).
+(which is in fact the case with qemu 3.0).
* Heading: Tale of two servers
the wire should result in better performance, right? But in practice,
we discovered an odd effect - some servers were indeed faster this
way, but others were actually slower than the baseline of just writing
-the entire image in a single pass.
+the entire image in a single pass. This pessimization appeared in
+qemu 3.1.
* Heading: The problem
status disabled) and observe behavior (log to see what the client
requests based on handshake results, stats to get timing numbers for
overall performance). Then by tweaking the 'nozero' filter
-parameters, I was able to recreate qemu 2.11 behavior (baseline
+parameters, I was able to recreate qemu 3.0 behavior (baseline
straight copy), qemu 3.1 behavior (blind pre-zeroing pass with speedup
or slowdown based on zeroing implementation), qemu 4.0 behavior (no
speedups without detection of fast zero support, but at least nothing