Where to go from here: Adding resize to NBD
[About 4-5 mins]

- based somewhat on https://lists.debian.org/nbd/2017/01/msg00016.html

* Heading: Resize: where getting bigger is better
- 8000- slide
  - qemu -> (raw) -> qemu-nbd -> (qcow2) -> image.qcow2
  - qemu -> (qcow2) -> qemu-nbd -> (raw) -> image.qcow2

XXX With all the things we've added to NBD, what do we want to add
next?  Our biggest goal (pardon the pun) is to allow dynamic growth of
image sizes.

There are two ways to consume qcow2 images over NBD.  In the first,
the server reads the qcow2 file and exposes only the raw guest-visible
content to the client.  If the guest writes a lot, the server may grow
the .qcow2 file as needed, but the guest cannot change the size of the
guest-visible address range, and cannot access any qcow2 features such
as backing files, dirty bitmaps, or internal snapshots.

In the second, the server exposes the qcow2 file as-is, and the client
must then parse that metadata into guest content.  The client now has
access to all qcow2 features (including the QMP block_resize command
for altering the size reported to the guest).  However, it cannot
change the size of the underlying .qcow2 container; if more guest
writes and metadata actions occur than the original server size
supports, the operation fails with ENOSPC.  Use of preallocation can
work around this limitation, but it is painful enough to pre-size
things correctly that current documentation recommends always running
in the first mode (raw over the wire) rather than this mode (qcow2
over the wire).
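
As a rough sketch of how the two modes differ on the server side, the
Python helper below builds a qemu-nbd command line for each.  The
helper name qemu_nbd_argv and the choice of port are illustrative
assumptions; only the --format and --port options come from qemu-nbd
itself.  The client then has to open the export with the opposite
interpretation (raw in the first mode, qcow2 in the second).

```python
def qemu_nbd_argv(image, wire_format, port=10809):
    """Build a qemu-nbd command line for one of the two modes.

    wire_format="raw":   the server parses qcow2, the client sees raw
                         guest-visible data (first mode).
    wire_format="qcow2": the server passes the file through untouched,
                         the client parses the qcow2 metadata (second mode).
    """
    if wire_format == "raw":
        server_side_format = "qcow2"   # server interprets the image
    elif wire_format == "qcow2":
        server_side_format = "raw"     # server treats the file as plain bytes
    else:
        raise ValueError(f"unknown wire format: {wire_format!r}")
    return ["qemu-nbd", f"--format={server_side_format}",
            f"--port={port}", image]


for mode in ("raw", "qcow2"):
    print(mode, "over the wire:",
          " ".join(qemu_nbd_argv("image.qcow2", mode)))
```

The only difference between the two invocations is which side is told
that the file is qcow2, which is exactly what decides who may grow what.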

The next few slides will discuss design tradeoffs to be considered
when adding a resize extension.

* Heading: Automatic or explicit
- 8100- slide
 - automatic: NBD_CMD_WRITE past EOF -> server auto-resizes if possible
 - explicit: NBD_CMD_WRITE past EOF fails, NBD_CMD_RESIZE to update,
   NBD_CMD_WRITE now succeeds.

POSIX files support automatic growth, insofar as the underlying file
system still has room.  However, block devices do not.  Should NBD
require an explicit NBD_CMD_RESIZE before allowing access to
additional size, or can NBD_CMD_WRITE extending past EOF trigger an
automatic resize?  Should we guarantee zero contents, or may a server
leave unspecified contents in not-yet-written offsets added by a
resize?  If resize can be automatic, should the server advertise this
capability to the client?  Or should automatic resize be something the
client must opt in to using?
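
To make the two policies concrete, here is a toy in-memory model in
Python.  It is only a sketch: ToyExport, its methods, and the choice to
fail with ENOSPC and to zero-fill new space are assumptions made for
illustration, not behavior that the NBD protocol has settled on.

```python
import errno


class ToyExport:
    """In-memory stand-in for an NBD export; not a real server."""

    def __init__(self, size, auto_resize=False):
        self.data = bytearray(size)
        self.auto_resize = auto_resize

    @property
    def size(self):
        return len(self.data)

    def resize(self, new_size):
        """Explicit resize, standing in for a hypothetical NBD_CMD_RESIZE."""
        if new_size < self.size:
            del self.data[new_size:]
        else:
            # This toy chooses to guarantee zeroes in the added range.
            self.data.extend(bytes(new_size - self.size))
        return self.size

    def write(self, offset, payload):
        """Model NBD_CMD_WRITE; returns 0 on success or an errno value."""
        end = offset + len(payload)
        if end > self.size:
            if not self.auto_resize:
                return errno.ENOSPC   # explicit policy: caller must resize
            self.resize(end)          # automatic policy: grow on demand
        self.data[offset:end] = payload
        return 0
```

With auto_resize=False, a write past EOF fails until the caller invokes
resize(); with auto_resize=True the same write succeeds and the export
grows silently, which is why the client may need a way to opt in or at
least to find out.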

* Heading: Simple or structured
- 8200- slide
  - simple: NBD_CMD_RESIZE -> simple reply
  - structured: NBD_CMD_RESIZE -> NBD_REPLY_CHUNK_SIZE+DONE

Sometimes, the client knows when it needs more space, and wants to
inform the server about a new requested size (this includes the case
when resize is automatic).  But even when the client requests one
size, the server may pick a different one (due to rounding to
granularities or to quotas).  In other setups, the server can't resize
on the fly at the request of the client, but can be resized by other
means and will thus need a way for the client to learn whether the
size has changed.  However, returning the server's notion of the
current size requires a structured reply; servers that lack structured
replies would be limited to a boolean success or failure result.  Is
it worth requiring structured replies to implement a resize command?
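
The difference is easy to see on the wire.  The Python sketch below
packs both reply shapes; the magic numbers, the DONE flag, and the
header layouts follow the existing NBD protocol document, while the
chunk type value and the 8-byte size payload are purely hypothetical
placeholders, since no size chunk has been specified.

```python
import struct

NBD_SIMPLE_REPLY_MAGIC = 0x67446698
NBD_STRUCTURED_REPLY_MAGIC = 0x668E33EF
NBD_REPLY_FLAG_DONE = 1 << 0

# Hypothetical placeholder: the NBD spec assigns no chunk type for sizes.
HYPOTHETICAL_CHUNK_SIZE = 0x7000


def simple_reply(handle, error=0):
    """Simple reply: magic (32), error (32), handle (64). No payload."""
    return struct.pack(">IIQ", NBD_SIMPLE_REPLY_MAGIC, error, handle)


def structured_size_reply(handle, size):
    """One final structured chunk carrying the server's current size."""
    payload = struct.pack(">Q", size)   # assumed payload: 64-bit size
    # Header: magic (32), flags (16), type (16), handle (64), length (32).
    header = struct.pack(">IHHQI", NBD_STRUCTURED_REPLY_MAGIC,
                         NBD_REPLY_FLAG_DONE, HYPOTHETICAL_CHUNK_SIZE,
                         handle, len(payload))
    return header + payload


def parse_size_chunk(buf):
    """Return (handle, size, done) from a chunk built above."""
    magic, flags, chunk_type, handle, length = struct.unpack_from(
        ">IHHQI", buf, 0)
    if magic != NBD_STRUCTURED_REPLY_MAGIC:
        raise ValueError("not a structured reply")
    if chunk_type != HYPOTHETICAL_CHUNK_SIZE or length != 8:
        raise ValueError("not a size chunk")
    (size,) = struct.unpack_from(">Q", buf, 20)
    return handle, size, bool(flags & NBD_REPLY_FLAG_DONE)
```

The simple reply has room only for an error code, so a client learns
nothing beyond success or failure; the structured chunk has a payload
in which the server can report the size it actually picked.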

* Heading: Polling or notification
- 8300- slide
  - NBD_CMD_RESIZE(FLAG_NOTIFY) -> NBD_REPLY_CHUNK_RESIZE+NOT_DONE
    -> NBD_REPLY_CHUNK_RESIZE+NOT_DONE ...

If resize is automatic, or if the server supports external means for
resizing, the client will want some way to learn the server's current
size.  The NBD protocol currently requires that all traffic be
command/response pairs initiated by the client, with no means for the
server to initiate a message unrequested by the client.  However, as
just mentioned, getting a size back would already require a structured
reply, and structured replies allow the server to send back more than
one response before declaring the response complete.  Is it worth
setting up a command flag that lets the client make an open-ended
request (perhaps good-until-canceled) for notification of size
changes?  The server could then send a reply to that command on each
size change, giving the client events rather than forcing it to poll
periodically.  Do we need to think about how a client protects itself
against a denial of service from a malicious server that sends too
many responses?
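
One possible client-side guard is a sliding-window limit on how many
size notifications it will accept.  The Python sketch below is an
assumption-laden illustration: SizeWatcher, its thresholds, and the
decision to abort the connection are all hypothetical, and nothing here
is part of the NBD protocol.

```python
from collections import deque


class SizeWatcher:
    """Client-side bookkeeping for size-change notification chunks."""

    def __init__(self, initial_size, max_events=8, window=1.0):
        self.size = initial_size
        self.max_events = max_events   # notifications tolerated per window
        self.window = window           # window length in seconds
        self._stamps = deque()         # arrival times of accepted chunks

    def on_chunk(self, new_size, now):
        """Handle one notification; `now` is a monotonic timestamp."""
        # Forget notifications that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_events:
            # Too chatty: treat the server as hostile or broken.
            raise ConnectionAbortedError("too many size notifications")
        self._stamps.append(now)
        self.size = new_size
        return self.size
```

A real client would pass time.monotonic() as `now`; taking the
timestamp as a parameter just keeps the sketch easy to test.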

* Heading: Complexity tradeoffs
- 8400-

Should we specify all of the previous choices, with appropriate
handshaking for each knob?  Integration testing becomes more difficult
the more knobs there are to test against.  On the other hand,
additional flexibility lets each server support as much or as little
as it easily can, a model that has already proven worthwhile with
nbdkit plugins.  Requiring support for
structured replies may be necessary for some features (such as server
notification), but is definitely overkill for an implementation where
polling is adequate.
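
To get a feel for how quickly the test matrix grows, the Python sketch
below enumerates the combinations of three knobs.  The knob names are
hypothetical labels for the choices on the previous slides, and the one
constraint encoded (notification depends on structured replies) is the
dependency described above.

```python
from itertools import product

# Hypothetical knob names standing in for the design choices above.
KNOBS = ("auto_resize", "structured_replies", "notification")


def valid_configs():
    """List every knob combination an implementation could offer."""
    configs = []
    for values in product((False, True), repeat=len(KNOBS)):
        cfg = dict(zip(KNOBS, values))
        # Server notification rides on structured replies.
        if cfg["notification"] and not cfg["structured_replies"]:
            continue
        configs.append(cfg)
    return configs


def interop_pairs():
    """Client/server pairings if each side may pick any valid config."""
    n = len(valid_configs())
    return n * n


print(len(valid_configs()), "configurations,", interop_pairs(), "pairings")
```

Every additional independent knob roughly doubles the number of
configurations, and the number of client/server pairings grows with the
square of that.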

As with fast zeroes, the way forward will be to implement something
that works in each of qemu, nbdkit, and libnbd, and show that they are
interoperable, so that the NBD protocol specification can then
document how other implementations may also interoperably add the same
support.

* conclusion: XXX
- 9000- wrapup

Thanks for your time this afternoon.  We hope this has been
informative, and welcome any questions at this time.