From fee4463214d94b151751befa76e2a96c8bd29102 Mon Sep 17 00:00:00 2001
From: Eric Blake
Date: Wed, 23 Oct 2019 20:59:52 -0500
Subject: [PATCH] Add setup slide for fastzero demo.

---
 2019-kvm-forum/.gitignore                    |  1 +
 2019-kvm-forum/4000-case-study-baseline.term | 17 +++++++++++++++++
 2019-kvm-forum/fastzero.d/convert            |  7 +++++++
 2019-kvm-forum/notes-04-fast-zero            | 21 +++++++++++++--------
 4 files changed, 38 insertions(+), 8 deletions(-)
 create mode 100755 2019-kvm-forum/4000-case-study-baseline.term
 create mode 100755 2019-kvm-forum/fastzero.d/convert

diff --git a/2019-kvm-forum/.gitignore b/2019-kvm-forum/.gitignore
index d678620..397e2db 100644
--- a/2019-kvm-forum/.gitignore
+++ b/2019-kvm-forum/.gitignore
@@ -1,2 +1,3 @@
 /bindings
 /history
+/fastzero.d/src.qcow2
diff --git a/2019-kvm-forum/4000-case-study-baseline.term b/2019-kvm-forum/4000-case-study-baseline.term
new file mode 100755
index 0000000..cb70eb1
--- /dev/null
+++ b/2019-kvm-forum/4000-case-study-baseline.term
@@ -0,0 +1,17 @@
+#!/bin/bash
+
+source functions
+
+# Title.
+export title="Case Study: copying sparse images"
+
+# History.
+remember 'qemu-img create -f qcow2 src.qcow2 100m'
+remember 'for i in $(seq 0 2 99); do qemu-io -f qcow2' \
+    '-c "w ${i}m 1m" src.qcow2; done >/dev/null'
+remember 'cat ./convert'
+remember './convert'
+
+pushd $talkdir/fastzero.d >/dev/null
+terminal
+popd >/dev/null
diff --git a/2019-kvm-forum/fastzero.d/convert b/2019-kvm-forum/fastzero.d/convert
new file mode 100755
index 0000000..212afca
--- /dev/null
+++ b/2019-kvm-forum/fastzero.d/convert
@@ -0,0 +1,7 @@
+#!/bin/bash
+nbdkit -U - --filter=log --filter=nozero --filter=blocksize \
+  --filter=delay --filter=stats --filter=noextents memory 100m \
+  logfile=>(sed -n '/Zero.*\.\./ s/.*\(fast=.\).*/\1/p' |sort|uniq -c) \
+  statsfile=/dev/stderr delay-write=20ms delay-zero=5ms maxdata=256k \
+  --run 'qemu-img convert -n -f qcow2 -O raw src.qcow2 $nbd' "$@"
+st=$?; sleep .2; exit $st
diff --git a/2019-kvm-forum/notes-04-fast-zero b/2019-kvm-forum/notes-04-fast-zero
index ee1271a..48c3ac8 100644
--- a/2019-kvm-forum/notes-04-fast-zero
+++ b/2019-kvm-forum/notes-04-fast-zero
@@ -1,12 +1,16 @@
 Case study: Adding fast zero to NBD
 [About 5 mins]

-* Heading: Copying disk images
+* Heading: Case study baseline
+- link to https://www.redhat.com/archives/libguestfs/2019-August/msg00322.html
+- shell to pre-create source file

-As Rich mentioned, qemu-img convert is a great for copying guest
-images. However, most guest images are sparse, and we want to avoid
-naively reading lots of zeroes on the source then writing lots of
-zeroes on the destination; although this setup makes a great baseline.
+As Rich mentioned, qemu-img convert is a great tool for copying guest
+images, with support for NBD on both source and destination. However,
+most guest images are sparse, and we want to avoid naively reading
+lots of zeroes on the source then writing lots of zeroes on the
+destination. Here's a case study of optimizing that, starting with a
+baseline of straightline copying.

 * Heading: Nothing to see here

@@ -31,7 +35,7 @@ where lseek(SEEK_HOLE) is O(n) rather than O(1), so querying status
 for every hole turns our linear walk into an O(n^2) ordeal, so we
 don't want to use it more than once. So for the rest of my case
 study, I investigated what happens when BLOCK_STATUS is unavailable
-(which is in fact the case with qemu 3.1).
+(which is in fact the case with qemu 3.0).

 * Heading: Tale of two servers

@@ -42,7 +46,8 @@ source data portions, and not revisit the holes; fewer commands over
 the wire should result in better performance, right? But in practice,
 we discovered an odd effect - some servers were indeed faster this
 way, but others were actually slower than the baseline of just writing
-the entire image in a single pass.
+the entire image in a single pass. This pessimization appeared in
+qemu 3.1.

 * Heading: The problem

@@ -110,7 +115,7 @@ writes slower than zeroes, and no data write larger than 256k, block
 status disabled) and observe behavior (log to see what the client
 requests based on handshake results, stats to get timing numbers for
 overall performance). Then by tweaking the 'nozero' filter
-parameters, I was able to recreate qemu 2.11 behavior (baseline
+parameters, I was able to recreate qemu 3.0 behavior (baseline
 straight copy), qemu 3.1 behavior (blind pre-zeroing pass with speedup
 or slowdown based on zeroing implementation), qemu 4.0 behavior (no
 speedups without detection of fast zero support, but at least nothing
--
1.8.3.1