X-Git-Url: http://git.annexia.org/?p=libguestfs.git;a=blobdiff_plain;f=src%2Fguestfs.pod;h=b50f608959960561b2b3e2b3b09ba726e6d790e3;hp=d034c8e3c9d5844ddeca29cae6bbd390f634fd38;hb=cd96cca38cea638a6db76afceeed76babc9e763c;hpb=a7070682932717f318f57f9aca6188a954a7e9aa diff --git a/src/guestfs.pod b/src/guestfs.pod index d034c8e..b50f608 100644 --- a/src/guestfs.pod +++ b/src/guestfs.pod @@ -124,7 +124,22 @@ disk, an actual block device, or simply an empty file of zeroes that you have created through L. Libguestfs lets you do useful things to all of these. -You can add a disk read-only using L, in which +The call you should use in modern code for adding drives is +L. To add a disk image, allowing writes, and +specifying that the format is raw, do: + + guestfs_add_drive_opts (g, filename, + GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw", + -1); + +You can add a disk read-only using: + + guestfs_add_drive_opts (g, filename, + GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw", + GUESTFS_ADD_DRIVE_OPTS_READONLY, 1, + -1); + +or by calling the older function L. In either case libguestfs won't modify the file. Be extremely cautious if the disk image is in use, eg. if it is being @@ -375,6 +390,23 @@ an X86 host). For SELinux guests, you may need to enable SELinux and load policy first. See L in this manpage. +=item * + +I It is not safe to run commands from untrusted, possibly +malicious guests. These commands may attempt to exploit your program +by sending unexpected output. They could also try to exploit the +Linux kernel or qemu provided by the libguestfs appliance. They could +use the network provided by the libguestfs appliance to bypass +ordinary network partitions and firewalls. They could use the +elevated privileges or different SELinux context of your program +to their advantage. 
+ +A secure alternative is to use libguestfs to install a "firstboot" +script (a script which runs when the guest next boots normally), and +to have this script run the commands you want in the normal context of +the running guest, network security and so on. For information about +other security issues, see L. + =back The two main API calls to run commands are L and @@ -686,6 +718,9 @@ Note that in L autosync is the default. So quick and dirty guestfish scripts that forget to sync will work just fine, which can make this very puzzling if you are trying to debug a problem. +Update: Autosync is enabled by default for all API users starting from +libguestfs 1.5.24. + =item Mount option C<-o sync> should not be the default. If you use L, then C<-o sync,noatime> are added @@ -736,17 +771,6 @@ The error message you get from this is also a little obscure. This could be fixed in the generator by specially marking parameters and return values which take bytes or other units. -=item Library should return errno with error messages. - -It would be a nice-to-have to be able to get the original value of -'errno' from inside the appliance along error paths (where set). -Currently L goes through hoops to try to reverse the -error message string into an errno, see the function error() in -fuse/guestmount.c. - -In libguestfs 1.5.4, the protocol was changed so that the -Linux errno is sent back from the daemon. - =item Ambiguity between devices and paths There is a subtle ambiguity in the API between a device name @@ -818,6 +842,323 @@ libguestfs might end up being written out to the swap partition. If this is a concern, scrub the swap partition or don't use libguestfs on encrypted devices. +=head2 MULTIPLE HANDLES AND MULTIPLE THREADS + +All high-level libguestfs actions are synchronous. If you want +to use libguestfs asynchronously then you must create a thread. + +Only use the handle from a single thread. 
Either use the handle +exclusively from one thread, or provide your own mutex so that two +threads cannot issue calls on the same handle at the same time. + +See the graphical program guestfs-browser for one possible +architecture for multithreaded programs using libvirt and libguestfs. + +=head2 PATH + +Libguestfs needs a kernel and initrd.img, which it finds by looking +along an internal path. + +By default it looks for these in the directory C<$libdir/guestfs> +(eg. C or C). + +Use L or set the environment variable +L to change the directories that libguestfs will +search in. The value is a colon-separated list of paths. The current +directory is I searched unless the path contains an empty element +or C<.>. For example C would +search the current directory and then C. + +=head2 QEMU WRAPPERS + +If you want to compile your own qemu, run qemu from a non-standard +location, or pass extra arguments to qemu, then you can write a +shell-script wrapper around qemu. + +There is one important rule to remember: you I> as +the last command in the shell script (so that qemu replaces the shell +and becomes the direct child of the libguestfs-using program). If you +don't do this, then the qemu process won't be cleaned up correctly. + +Here is an example of a wrapper, where I have built my own copy of +qemu from source: + + #!/bin/sh - + qemudir=/home/rjones/d/qemu + exec $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios "$@" + +Save this script as C (or wherever), C, +and then use it by setting the LIBGUESTFS_QEMU environment variable. +For example: + + LIBGUESTFS_QEMU=/tmp/qemu.wrapper guestfish + +Note that libguestfs also calls qemu with the -help and -version +options in order to determine features. + +=head2 ABI GUARANTEE + +We guarantee the libguestfs ABI (binary interface), for public, +high-level actions as outlined in this section. 
Although we will +deprecate some actions, for example if they get replaced by newer +calls, we will keep the old actions forever. This allows you the +developer to program in confidence against the libguestfs API. + +=head2 BLOCK DEVICE NAMING + +In the kernel there is now quite a profusion of schemata for naming +block devices (in this context, by I I mean a physical +or virtual hard drive). The original Linux IDE driver used names +starting with C. SCSI devices have historically used a +different naming scheme, C. When the Linux kernel I +driver became a popular replacement for the old IDE driver +(particularly for SATA devices) those devices also used the +C scheme. Additionally we now have virtual machines with +paravirtualized drivers. This has created several different naming +systems, such as C for virtio disks and C for Xen +PV disks. + +As discussed above, libguestfs uses a qemu appliance running an +embedded Linux kernel to access block devices. We can run a variety +of appliances based on a variety of Linux kernels. + +This causes a problem for libguestfs because many API calls use device +or partition names. Working scripts and the recipe (example) scripts +that we make available over the internet could fail if the naming +scheme changes. + +Therefore libguestfs defines C as the I. Internally C names are translated, if necessary, +to other names as required. For example, under RHEL 5 which uses the +C scheme, any device parameter C is translated to +C transparently. + +Note that this I applies to parameters. The +L, L and similar calls +return the true names of the devices and partitions as known to the +appliance. + +=head3 ALGORITHM FOR BLOCK DEVICE NAME TRANSLATION + +Usually this translation is transparent. However in some (very rare) +cases you may need to know the exact algorithm. Such cases include +where you use L to add a mixture of virtio and IDE +devices to the qemu-based appliance, so have a mixture of C +and C devices. 
+ +The algorithm is applied only to I which are known to be +either device or partition names. Return values from functions such +as L are never changed. + +=over 4 + +=item * + +Is the string a parameter which is a device or partition name? + +=item * + +Does the string begin with C? + +=item * + +Does the named device exist? If so, we use that device. +However if I then we continue with this algorithm. + +=item * + +Replace initial C string with C. + +For example, change C to C. + +If that named device exists, use it. If not, continue. + +=item * + +Replace initial C string with C. + +If that named device exists, use it. If not, return an error. + +=back + +=head3 PORTABILITY CONCERNS WITH BLOCK DEVICE NAMING + +Although the standard naming scheme and automatic translation is +useful for simple programs and guestfish scripts, for larger programs +it is best not to rely on this mechanism. + +Where possible for maximum future portability programs using +libguestfs should use these future-proof techniques: + +=over 4 + +=item * + +Use L or L to list +actual device names, and then use those names directly. + +Since those device names exist by definition, they will never be +translated. + +=item * + +Use higher level ways to identify filesystems, such as LVM names, +UUIDs and filesystem labels. + +=back + +=head1 SECURITY + +This section discusses security implications of using libguestfs, +particularly with untrusted or malicious guests or disk images. + +=head2 GENERAL SECURITY CONSIDERATIONS + +Be careful with any files or data that you download from a guest (by +"download" we mean not just the L command but any +command that reads files, filenames, directories or anything else from +a disk image). An attacker could manipulate the data to fool your +program into doing the wrong thing. 
Consider cases such as: + +=over 4 + +=item * + +the data (file etc) not being present + +=item * + +being present but empty + +=item * + +being much larger than normal + +=item * + +containing arbitrary 8 bit data + +=item * + +being in an unexpected character encoding + +=item * + +containing homoglyphs. + +=back + +=head2 SECURITY OF MOUNTING FILESYSTEMS + +When you mount a filesystem under Linux, mistakes in the kernel +filesystem (VFS) module can sometimes be escalated into exploits by +deliberately creating a malicious, malformed filesystem. These +exploits are very severe for two reasons. Firstly there are very many +filesystem drivers in the kernel, and many of them are infrequently +used and not much developer attention has been paid to the code. +Linux userspace helps potential crackers by detecting the filesystem +type and automatically choosing the right VFS driver, even if that +filesystem type is obscure or unexpected for the administrator. +Secondly, a kernel-level exploit is like a local root exploit (worse +in some ways), giving immediate and total access to the system right +down to the hardware level. + +That explains why you should never mount a filesystem from an +untrusted guest on your host kernel. How about libguestfs? We run a +Linux kernel inside a qemu virtual machine, usually running as a +non-root user. The attacker would need to write a filesystem which +first exploited the kernel, and then exploited either qemu +virtualization (eg. a faulty qemu driver) or the libguestfs protocol, +and finally to be as serious as the host kernel exploit it would need +to escalate its privileges to root. This multi-step escalation, +performed by a static piece of data, is thought to be extremely hard +to do, although we never say 'never' about security issues. + +In any case callers can reduce the attack surface by forcing the +filesystem type when mounting (use L). 
+ +=head2 PROTOCOL SECURITY + +The protocol is designed to be secure, being based on RFC 4506 (XDR) +with a defined upper message size. However a program that uses +libguestfs must also take care - for example you can write a program +that downloads a binary from a disk image and executes it locally, and +no amount of protocol security will save you from the consequences. + +=head2 INSPECTION SECURITY + +Parts of the inspection API (see L) return untrusted +strings directly from the guest, and these could contain any 8 bit +data. Callers should be careful to escape these before printing them +to a structured file (for example, use HTML escaping if creating a web +page). + +The inspection API parses guest configuration using two external +libraries: Augeas (Linux configuration) and hivex (Windows Registry). +Both are designed to be robust in the face of malicious data, although +denial of service attacks are still possible, for example with +oversized configuration files. + +=head2 RUNNING UNTRUSTED GUEST COMMANDS + +Be very cautious about running commands from the guest. By running a +command in the guest, you are giving CPU time to a binary that you do +not control, under the same user account as the library, albeit +wrapped in qemu virtualization. More information and alternatives can +be found in the section L. + +=head2 CVE-2010-3851 + +https://bugzilla.redhat.com/642934 + +This security bug concerns the automatic disk format detection that +qemu does on disk images. + +A raw disk image is just the raw bytes, there is no header. Other +disk images like qcow2 contain a special header. Qemu deals with this +by looking for one of the known headers, and if none is found then +assuming the disk image must be raw. + +This allows a guest which has been given a raw disk image to write +some other header. 
At next boot (or when the disk image is accessed +by libguestfs) qemu would do autodetection and think the disk image +format was, say, qcow2 based on the header written by the guest. + +This in itself would not be a problem, but qcow2 offers many features, +one of which is to allow a disk image to refer to another image +(called the "backing disk"). It does this by placing the path to the +backing disk into the qcow2 header. This path is not validated and +could point to any host file (eg. "/etc/passwd"). The backing disk is +then exposed through "holes" in the qcow2 disk image, which of course +is completely under the control of the attacker. + +In libguestfs this is rather hard to exploit except under two +circumstances: + +=over 4 + +=item 1. + +You have enabled the network or have opened the disk in write mode. + +=item 2. + +You are also running untrusted code from the guest (see +L). + +=back + +The way to avoid this is to specify the expected disk format when +adding disks (the optional C option to +L). You should always do this if the disk is +raw format, and it's a good idea for other cases too. + +For disks added from libvirt using calls like L, +the format is fetched from libvirt and passed through. + +For libguestfs tools, use the I<--format> command line parameter as +appropriate. + =head1 CONNECTION MANAGEMENT =head2 guestfs_h * @@ -835,7 +1176,8 @@ L below. Create a connection handle. -You have to call L on the handle at least once. +You have to call L (or one of the equivalent +calls) on the handle at least once. This function returns a non-NULL pointer to a handle on success or NULL on error. @@ -853,17 +1195,55 @@ This closes the connection handle and frees up all resources used. =head1 ERROR HANDLING -The convention in all functions that return C is that they return -C<-1> to indicate an error. You can get additional information on -errors by calling L and/or by setting up an error -handler with L. +API functions can return errors. 
For example, almost all functions +that return C will return C<-1> to indicate an error. + +Additional information is available for errors: an error message +string and optionally an error number (errno) if the thing that failed +was a system call. + +You can get at the additional information about the last error on the +handle by calling L, L, +and/or by setting up an error handler with +L. + +When the handle is created, a default error handler is installed which +prints the error message string to C. For small short-running +command line programs it is sufficient to do: + + if (guestfs_launch (g) == -1) + exit (EXIT_FAILURE); + +since the default error handler will ensure that an error message has +been printed to C before the program exits. + +For other programs the caller will almost certainly want to install an +alternate error handler or do error handling in-line like this: -The default error handler prints the information string to C. + g = guestfs_create (); + + /* This disables the default behaviour of printing errors + on stderr. */ + guestfs_set_error_handler (g, NULL, NULL); + + if (guestfs_launch (g) == -1) { + /* Examine the error message and print it etc. */ + char *msg = guestfs_last_error (g); + int errnum = guestfs_last_errno (g); + fprintf (stderr, "%s\n", msg); + /* ... */ + } Out of memory errors are handled differently. The default action is to call L. If this is undesirable, then you can set a handler using L. +L returns C if the handle cannot be created, +and because there is no handle if this happens there is no way to get +additional error information. However L is supposed +to be a lightweight operation which can only fail because of +insufficient memory (it returns NULL in this case). + =head2 guestfs_last_error const char *guestfs_last_error (guestfs_h *g); @@ -875,9 +1255,44 @@ returns C. The lifetime of the returned string is until the next error occurs, or L is called. -The error string is not localized (ie. 
is always in English), because -this makes searching for error messages in search engines give the -largest number of results. +=head2 guestfs_last_errno + + int guestfs_last_errno (guestfs_h *g); + +This returns the last error number (errno) that happened on C. + +If successful, an errno integer not equal to zero is returned. + +If no error, this returns 0. This call can return 0 in three +situations: + +=over 4 + +=item 1. + +There has not been any error on the handle. + +=item 2. + +There has been an error but the errno was meaningless. This +corresponds to the case where the error did not come from a +failed system call, but for some other reason. + +=item 3. + +There was an error from a failed system call, but for some +reason the errno was not captured and returned. This usually +indicates a bug in libguestfs. + +=back + +Libguestfs tries to convert the errno from inside the appliance into +a corresponding errno for the caller (not entirely trivial: the +appliance might be running a completely different operating system +from the library and error numbers are not standardized across +Un*xen). If this could not be done, then the error is translated to +C. In practice this should only happen in very rare +circumstances. =head2 guestfs_set_error_handler @@ -892,6 +1307,9 @@ The callback C will be called if there is an error. The parameters passed to the callback are an opaque data pointer and the error message string. +C is not passed to the callback. To get that the callback must +call L. + Note that the message string C is freed as soon as the callback function returns, so if you want to stash it somewhere you must make your own copy. @@ -927,40 +1345,17 @@ situations. This returns the current out of memory handler. -=head1 PATH +=head1 API CALLS -Libguestfs needs a kernel and initrd.img, which it finds by looking -along an internal path. +@ACTIONS@ -By default it looks for these in the directory C<$libdir/guestfs> -(eg. C or C).
+=head1 STRUCTURES -Use L or set the environment variable -L to change the directories that libguestfs will -search in. The value is a colon-separated list of paths. The current -directory is I searched unless the path contains an empty element -or C<.>. For example C would -search the current directory and then C. +@STRUCTS@ -=head1 HIGH-LEVEL API ACTIONS +=head1 AVAILABILITY -=head2 ABI GUARANTEE - -We guarantee the libguestfs ABI (binary interface), for public, -high-level actions as outlined in this section. Although we will -deprecate some actions, for example if they get replaced by newer -calls, we will keep the old actions forever. This allows you the -developer to program in confidence against the libguestfs API. - -@ACTIONS@ - -=head1 STRUCTURES - -@STRUCTS@ - -=head1 AVAILABILITY - -=head2 GROUPS OF FUNCTIONALITY IN THE APPLIANCE +=head2 GROUPS OF FUNCTIONALITY IN THE APPLIANCE Using L you can test availability of the following groups of functions. This test queries the @@ -1050,111 +1445,100 @@ package versioning: Requires: libguestfs >= 1.0.80 -=begin html +=head1 CALLS WITH OPTIONAL ARGUMENTS - - +A recent feature of the API is the introduction of calls which take +optional arguments. In C these are declared 3 ways. The main way is +as a call which takes variable arguments (ie. C<...>), as in this +example: -=end html + int guestfs_add_drive_opts (guestfs_h *g, const char *filename, ...); -=head1 ARCHITECTURE +Call this with a list of optional arguments, terminated by C<-1>. +So to call with no optional arguments specified: -Internally, libguestfs is implemented by running an appliance (a -special type of small virtual machine) using L. Qemu runs as -a child process of the main program. 
+ guestfs_add_drive_opts (g, filename, -1); - ___________________ - / \ - | main program | - | | - | | child process / appliance - | | __________________________ - | | / qemu \ - +-------------------+ RPC | +-----------------+ | - | libguestfs <--------------------> guestfsd | | - | | | +-----------------+ | - \___________________/ | | Linux kernel | | - | +--^--------------+ | - \_________|________________/ - | - _______v______ - / \ - | Device or | - | disk image | - \______________/ +With a single optional argument: -The library, linked to the main program, creates the child process and -hence the appliance in the L function. + guestfs_add_drive_opts (g, filename, + GUESTFS_ADD_DRIVE_OPTS_FORMAT, "qcow2", + -1); -Inside the appliance is a Linux kernel and a complete stack of -userspace tools (such as LVM and ext2 programs) and a small -controlling daemon called L. The library talks to -L using remote procedure calls (RPC). There is a mostly -one-to-one correspondence between libguestfs API calls and RPC calls -to the daemon. Lastly the disk image(s) are attached to the qemu -process which translates device access by the appliance's Linux kernel -into accesses to the image. +With two: -A common misunderstanding is that the appliance "is" the virtual -machine. Although the disk image you are attached to might also be -used by some virtual machine, libguestfs doesn't know or care about -this. (But you will care if both libguestfs's qemu process and your -virtual machine are trying to update the disk image at the same time, -since these usually results in massive disk corruption). + guestfs_add_drive_opts (g, filename, + GUESTFS_ADD_DRIVE_OPTS_FORMAT, "qcow2", + GUESTFS_ADD_DRIVE_OPTS_READONLY, 1, + -1); -=head1 STATE MACHINE +and so forth. Don't forget the terminating C<-1> otherwise +Bad Things will happen! 
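To illustrate why the terminating C<-1> matters, here is a minimal sketch of how a callee might walk such an argument list and record which options were supplied. This is not libguestfs's actual implementation: every C<demo_> and C<DEMO_> name below is hypothetical and exists only for this example, and the key values are assumptions.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical option keys and bitmask bits, for illustration only.
   They are NOT the real GUESTFS_ADD_DRIVE_OPTS_* values. */
enum { DEMO_OPT_READONLY = 0, DEMO_OPT_FORMAT = 1 };
#define DEMO_OPT_READONLY_BITMASK (UINT64_C(1) << DEMO_OPT_READONLY)
#define DEMO_OPT_FORMAT_BITMASK   (UINT64_C(1) << DEMO_OPT_FORMAT)

/* Hypothetical structure: a bitmask plus one field per option. */
struct demo_optargs {
  uint64_t bitmask;      /* which of the fields below were supplied */
  int readonly;
  const char *format;
};

/* Read (key, value) pairs from 'args' until the -1 terminator.
   Returns 0 on success, -1 if an unknown key is seen. */
static int
demo_collect_va (struct demo_optargs *out, va_list args)
{
  int key;

  out->bitmask = 0;
  out->readonly = 0;
  out->format = NULL;

  while ((key = va_arg (args, int)) != -1) {
    switch (key) {
    case DEMO_OPT_READONLY:
      out->readonly = va_arg (args, int);
      out->bitmask |= DEMO_OPT_READONLY_BITMASK;
      break;
    case DEMO_OPT_FORMAT:
      out->format = va_arg (args, const char *);
      out->bitmask |= DEMO_OPT_FORMAT_BITMASK;
      break;
    default:
      return -1;         /* unknown key: stop rather than guess its type */
    }
  }
  return 0;
}

/* Variadic front end: forwards its optional arguments to the
   va_list version above.  'filename' is unused in this sketch. */
static int
demo_collect (struct demo_optargs *out, const char *filename, ...)
{
  va_list args;
  int r;

  (void) filename;
  va_start (args, filename);
  r = demo_collect_va (out, args);
  va_end (args);
  return r;
}
```

In this sketch the loop has no other way to know where the argument list ends. If the caller omits the C<-1>, C<va_arg> reads past the arguments that were actually passed, which is undefined behaviour in C.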
-libguestfs uses a state machine to model the child process: +=head2 USING va_list FOR OPTIONAL ARGUMENTS - | - guestfs_create - | - | - ____V_____ - / \ - | CONFIG | - \__________/ - ^ ^ ^ \ - / | \ \ guestfs_launch - / | _\__V______ - / | / \ - / | | LAUNCHING | - / | \___________/ - / | / - / | guestfs_launch - / | / - ______ / __|____V - / \ ------> / \ - | BUSY | | READY | - \______/ <------ \________/ +The second variant has the same name with the suffix C<_va>, which +works the same way but takes a C. See the C manual for +details. For the example function, this is declared: -The normal transitions are (1) CONFIG (when the handle is created, but -there is no child process), (2) LAUNCHING (when the child process is -booting up), (3) alternating between READY and BUSY as commands are -issued to, and carried out by, the child process. + int guestfs_add_drive_opts_va (guestfs_h *g, const char *filename, + va_list args); -The guest may be killed by L, or may die -asynchronously at any time (eg. due to some internal error), and that -causes the state to transition back to CONFIG. +=head2 CONSTRUCTING OPTIONAL ARGUMENTS -Configuration commands for qemu such as L can only -be issued when in the CONFIG state. +The third variant is useful where you need to construct these +calls. You pass in a structure where you fill in the optional +fields. The structure has a bitmask as the first element which +you must set to indicate which fields you have filled in. For +our example function the structure and call are declared: -The API offers one call that goes from CONFIG through LAUNCHING to -READY. L blocks until the child process is READY to -accept commands (or until some failure or timeout). -L internally moves the state from CONFIG to LAUNCHING -while it is running. + struct guestfs_add_drive_opts_argv { + uint64_t bitmask; + int readonly; + const char *format; + /* ... 
*/ + }; + int guestfs_add_drive_opts_argv (guestfs_h *g, const char *filename, + const struct guestfs_add_drive_opts_argv *optargs); -API actions such as L can only be issued when in the -READY state. These API calls block waiting for the command to be -carried out (ie. the state to transition to BUSY and then back to -READY). There are no non-blocking versions, and no way to issue more -than one command per handle at the same time. +You could call it like this: -Finally, the child process sends asynchronous messages back to the -main program, such as kernel log messages. You can register a -callback to receive these messages. + struct guestfs_add_drive_opts_argv optargs = { + .bitmask = GUESTFS_ADD_DRIVE_OPTS_READONLY_BITMASK | + GUESTFS_ADD_DRIVE_OPTS_FORMAT_BITMASK, + .readonly = 1, + .format = "qcow2" + }; + + guestfs_add_drive_opts_argv (g, filename, &optargs); + +Notes: + +=over 4 + +=item * + +The C<_BITMASK> suffix on each option name when specifying the +bitmask. + +=item * + +You do not need to fill in all fields of the structure. + +=item * + +There must be a one-to-one correspondence between fields of the +structure that are filled in, and bits set in the bitmask. + +=back + +=head2 OPTIONAL ARGUMENTS IN OTHER LANGUAGES + +In other languages, optional arguments are expressed in the +way that is natural for that language. We refer you to the +language-specific documentation for more details on that. + +For guestfish, see L. =head2 SETTING CALLBACKS TO HANDLE EVENTS @@ -1314,108 +1698,111 @@ and note that only one callback can be registered for a handle). The private data area is implemented using a hash table, and should be reasonably efficient for moderate numbers of keys. -=head1 BLOCK DEVICE NAMING - -In the kernel there is now quite a profusion of schemata for naming -block devices (in this context, by I I mean a physical -or virtual hard drive). The original Linux IDE driver used names -starting with C. 
SCSI devices have historically used a -different naming scheme, C. When the Linux kernel I -driver became a popular replacement for the old IDE driver -(particularly for SATA devices) those devices also used the -C scheme. Additionally we now have virtual machines with -paravirtualized drivers. This has created several different naming -systems, such as C for virtio disks and C for Xen -PV disks. - -As discussed above, libguestfs uses a qemu appliance running an -embedded Linux kernel to access block devices. We can run a variety -of appliances based on a variety of Linux kernels. - -This causes a problem for libguestfs because many API calls use device -or partition names. Working scripts and the recipe (example) scripts -that we make available over the internet could fail if the naming -scheme changes. - -Therefore libguestfs defines C as the I. Internally C names are translated, if necessary, -to other names as required. For example, under RHEL 5 which uses the -C scheme, any device parameter C is translated to -C transparently. - -Note that this I applies to parameters. The -L, L and similar calls -return the true names of the devices and partitions as known to the -appliance. - -=head2 ALGORITHM FOR BLOCK DEVICE NAME TRANSLATION - -Usually this translation is transparent. However in some (very rare) -cases you may need to know the exact algorithm. Such cases include -where you use L to add a mixture of virtio and IDE -devices to the qemu-based appliance, so have a mixture of C -and C devices. - -The algorithm is applied only to I which are known to be -either device or partition names. Return values from functions such -as L are never changed. - -=over 4 - -=item * - -Is the string a parameter which is a device or partition name? - -=item * - -Does the string begin with C? - -=item * - -Does the named device exist? If so, we use that device. -However if I then we continue with this algorithm. - -=item * +=begin html -Replace initial C string with C. 
+ + -For example, change C to C. +=end html -If that named device exists, use it. If not, continue. +=head1 ARCHITECTURE -=item * +Internally, libguestfs is implemented by running an appliance (a +special type of small virtual machine) using L. Qemu runs as +a child process of the main program. -Replace initial C string with C. + ___________________ + / \ + | main program | + | | + | | child process / appliance + | | __________________________ + | | / qemu \ + +-------------------+ RPC | +-----------------+ | + | libguestfs <--------------------> guestfsd | | + | | | +-----------------+ | + \___________________/ | | Linux kernel | | + | +--^--------------+ | + \_________|________________/ + | + _______v______ + / \ + | Device or | + | disk image | + \______________/ -If that named device exists, use it. If not, return an error. +The library, linked to the main program, creates the child process and +hence the appliance in the L function. -=back +Inside the appliance is a Linux kernel and a complete stack of +userspace tools (such as LVM and ext2 programs) and a small +controlling daemon called L. The library talks to +L using remote procedure calls (RPC). There is a mostly +one-to-one correspondence between libguestfs API calls and RPC calls +to the daemon. Lastly the disk image(s) are attached to the qemu +process which translates device access by the appliance's Linux kernel +into accesses to the image. -=head2 PORTABILITY CONCERNS +A common misunderstanding is that the appliance "is" the virtual +machine. Although the disk image you are attached to might also be +used by some virtual machine, libguestfs doesn't know or care about +this. (But you will care if both libguestfs's qemu process and your +virtual machine are trying to update the disk image at the same time, +since this usually results in massive disk corruption).
-Although the standard naming scheme and automatic translation is -useful for simple programs and guestfish scripts, for larger programs -it is best not to rely on this mechanism. +=head1 STATE MACHINE -Where possible for maximum future portability programs using -libguestfs should use these future-proof techniques: +libguestfs uses a state machine to model the child process: -=over 4 + | + guestfs_create + | + | + ____V_____ + / \ + | CONFIG | + \__________/ + ^ ^ ^ \ + / | \ \ guestfs_launch + / | _\__V______ + / | / \ + / | | LAUNCHING | + / | \___________/ + / | / + / | guestfs_launch + / | / + ______ / __|____V + / \ ------> / \ + | BUSY | | READY | + \______/ <------ \________/ -=item * +The normal transitions are (1) CONFIG (when the handle is created, but +there is no child process), (2) LAUNCHING (when the child process is +booting up), (3) alternating between READY and BUSY as commands are +issued to, and carried out by, the child process. -Use L or L to list -actual device names, and then use those names directly. +The guest may be killed by L, or may die +asynchronously at any time (eg. due to some internal error), and that +causes the state to transition back to CONFIG. -Since those device names exist by definition, they will never be -translated. +Configuration commands for qemu such as L can only +be issued when in the CONFIG state. -=item * +The API offers one call that goes from CONFIG through LAUNCHING to +READY. L blocks until the child process is READY to +accept commands (or until some failure or timeout). +L internally moves the state from CONFIG to LAUNCHING +while it is running. -Use higher level ways to identify filesystems, such as LVM names, -UUIDs and filesystem labels. +API actions such as L can only be issued when in the +READY state. These API calls block waiting for the command to be +carried out (ie. the state to transition to BUSY and then back to +READY). 
There are no non-blocking versions, and no way to issue more +than one command per handle at the same time. -=back +Finally, the child process sends asynchronous messages back to the +main program, such as kernel log messages. You can register a +callback to receive these messages. =head1 INTERNALS @@ -1571,45 +1958,6 @@ The daemon self-limits the frequency of progress messages it sends (see C). Not all calls generate progress messages. -=head1 MULTIPLE HANDLES AND MULTIPLE THREADS - -All high-level libguestfs actions are synchronous. If you want -to use libguestfs asynchronously then you must create a thread. - -Only use the handle from a single thread. Either use the handle -exclusively from one thread, or provide your own mutex so that two -threads cannot issue calls on the same handle at the same time. - -See the graphical program guestfs-browser for one possible -architecture for multithreaded programs using libvirt and libguestfs. - -=head1 QEMU WRAPPERS - -If you want to compile your own qemu, run qemu from a non-standard -location, or pass extra arguments to qemu, then you can write a -shell-script wrapper around qemu. - -There is one important rule to remember: you I> as -the last command in the shell script (so that qemu replaces the shell -and becomes the direct child of the libguestfs-using program). If you -don't do this, then the qemu process won't be cleaned up correctly. - -Here is an example of a wrapper, where I have built my own copy of -qemu from source: - - #!/bin/sh - - qemudir=/home/rjones/d/qemu - exec $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios "$@" - -Save this script as C (or wherever), C, -and then use it by setting the LIBGUESTFS_QEMU environment variable. -For example: - - LIBGUESTFS_QEMU=/tmp/qemu.wrapper guestfish - -Note that libguestfs also calls qemu with the -help and -version -options in order to determine features. 
- =head1 LIBGUESTFS VERSION NUMBERS Since April 2010, libguestfs has started to make separate development