SUSE Linux Enterprise Server 16.0

Managing KVM Virtualization on SUSE Linux Enterprise Server

Publication Date: 28 May 2026

WHAT?: KVM (Kernel Virtual Machine) is a virtualization solution that transforms the Linux kernel into a hypervisor for running multiple isolated virtual environments.
WHY?: Use KVM virtualization to consolidate server workloads and significantly save hardware resources.
EFFORT: It takes less than 15 minutes of reading time to understand the concept of virtualization.
GOAL: By the end of this guide, you will have a configured KVM host and be able to deploy virtual machine guests with optimized storage, networking and direct hardware access.

Revision History: Managing KVM Virtualization on SUSE Linux Enterprise Server

1 Setting up a KVM VM Host Server #

This section documents how to set up and use SUSE Linux Enterprise Server 16.0 as a QEMU-KVM based virtual machine host.

Tip: Resources

The virtual guest system needs the same hardware resources as if it were installed on a physical machine. The more guests you plan to run on the host system, the more hardware resources—CPU, disk, memory and network—you need to add to the VM Host Server.

1.1 CPU support for virtualization #

To run KVM, your CPU must support virtualization, and virtualization needs to be enabled in the BIOS. The file /proc/cpuinfo includes information about your CPU features.

To find out whether your system supports virtualization, see the section Architecture Support in the article Virtualization Limits and Support.

1.2 Required software #

The KVM host requires several packages to be installed. To install all necessary packages, do the following:

Install the patterns-server-kvm_server and patterns-server-kvm_tools.
Create a Network Bridge. If you do not plan to dedicate an additional physical network card to your virtual guests, network bridging is a standard way to connect the guest machines to the network.
After all the required packages are installed (and the new network setup activated), try to load the KVM kernel module relevant for your CPU type—kvm_intel or kvm_amd:
```
# modprobe kvm_amd
```
Check if the module is loaded into memory:
```
> lsmod | grep kvm
kvm_amd               237568  20
kvm                  1376256  17 kvm_amd
```
Now the KVM host is ready to serve KVM VM Guests.

1.3 KVM host-specific features #

You can improve the performance of KVM-based VM Guests by letting them fully use specific features of the VM Host Server's hardware (paravirtualization). This section introduces techniques that allow guests to access the physical host's hardware directly, bypassing the emulation layer for optimal performance.

Tip

Examples included in this section assume basic knowledge of the qemu-system-ARCH command-line options.

1.3.1 Using the host storage with `virtio-scsi` #

virtio-scsi is an advanced storage stack for KVM. It replaces the former virtio-blk stack for SCSI device pass-through. It has several advantages over virtio-blk:

Improved scalability

KVM guests have a limited number of PCI controllers, which results in a limited number of attached devices. virtio-scsi solves this limitation by grouping multiple storage devices on a single controller. Each device on a virtio-scsi controller is represented as a logical unit, or LUN.

Standard command set

virtio-blk uses a small set of commands that need to be known to both the virtio-blk driver and the virtual machine monitor, and so introducing a new command requires updating both the driver and the monitor.

By comparison, virtio-scsi does not define commands, but rather a transport protocol for these commands following the industry-standard SCSI specification. This approach is shared with other technologies, such as Fibre Channel, ATAPI and USB devices.

Device naming

virtio-blk devices are presented inside the guest as /dev/vdX, which is different from device names in physical systems and may cause migration problems.

virtio-scsi keeps the device names identical to those on physical systems, making the virtual machines easily relocatable.

SCSI device pass-through

For virtual disks backed by a whole LUN on the host, it is preferable for the guest to send SCSI commands directly to the LUN (pass-through). This is limited in virtio-blk, as guests need to use the virtio-blk protocol instead of SCSI command pass-through, and, moreover, it is not available for Windows guests. virtio-scsi natively removes these limitations.

1.3.1.1 `virtio-scsi` usage #

KVM supports the SCSI pass-through feature with the virtio-scsi-pci device:

# qemu-system-x86_64 [...] \
-device virtio-scsi-pci,id=scsi

1.3.2 Accelerated networking with `vhost-net` #

The vhost-net module is used to accelerate KVM's paravirtualized network drivers. It provides better latency and greater network throughput. Use the vhost-net driver by starting the guest with the following example command line:

# qemu-system-x86_64 [...] \
-netdev tap,id=guest0,vhost=on,script=no \
-net nic,model=virtio,netdev=guest0,macaddr=00:16:35:AF:94:4B

guest0 is an identification string of the vhost-driven device.

1.3.3 Scaling network performance with multiqueue virtio-net #

As the number of virtual CPUs increases in VM Guests, QEMU offers a way of improving network performance using multiqueue. Multiqueue virtio-net scales network performance by allowing VM Guest virtual CPUs to transfer packets in parallel. Multiqueue support is required on both the VM Host Server and VM Guest sides.

Tip: Performance benefit

The multiqueue virtio-net solution is most beneficial in the following cases:

Network traffic packets are large.
VM Guest has many connections active at the same time, mainly between the guest systems, or between the guest and the host, or between the guest and an external system.
The number of active queues is equal to the number of virtual CPUs in the VM Guest.

Note

While multiqueue virtio-net increases the total network throughput, it increases CPU consumption as it uses the virtual CPU's power.

Procedure 1: How to enable multiqueue virtio-net #

The following procedure lists important steps to enable the multiqueue feature with qemu-system-ARCH. It assumes that a tap network device with multiqueue capability (supported since kernel version 3.8) is set up on the VM Host Server.

In qemu-system-ARCH, enable multiqueue for the tap device:
```
-netdev tap,vhost=on,queues=2*N
```
where N stands for the number of queue pairs.
In qemu-system-ARCH, enable multiqueue and specify MSI-X (Message Signaled Interrupt) vectors for the virtio-net-pci device:
```
-device virtio-net-pci,mq=on,vectors=2*N+2
```
where the formula for the number of MSI-X vectors results from: N vectors for TX (transmit) queues, N for RX (receive) queues, one for configuration purposes, and one for possible VQ (vector quantization) control.
In the VM Guest, enable multiqueue on the relevant network interface (eth0 in this example):
```
> sudo ethtool -L eth0 combined 2*N
```

The resulting qemu-system-ARCH command line looks similar to the following example:

qemu-system-x86_64 [...] -netdev tap,id=guest0,queues=8,vhost=on \
-device virtio-net-pci,netdev=guest0,mq=on,vectors=10

The id of the network device (guest0) needs to be identical for both options.

Inside the running VM Guest, specify the following command with root privileges:

> sudo ethtool -L eth0 combined 8

Now the guest system networking uses the multiqueue support from the qemu-system-ARCH hypervisor.

1.3.4 VFIO: secure direct access to devices #

Directly assigning a PCI device to a VM Guest (PCI pass-through) avoids performance issues caused by avoiding any emulation in performance-critical paths. VFIO replaces the traditional KVM PCI Pass-Through device assignment. A prerequisite for this feature is a VM Host Server configuration as described in Important: Requirements for VFIO and SR-IOV.

To be able to assign a PCI device via VFIO to a VM Guest, you need to find out which IOMMU Group it belongs to. The IOMMU (input/output memory management unit that connects a direct memory access-capable I/O bus to the main memory) API supports the notion of groups. A group is a set of devices that can be isolated from all other devices in the system. Groups are therefore the unit of ownership used by VFIO.

Procedure 2: Assigning a PCI device to a VM Guest via VFIO #

Identify the host PCI device to assign to the guest.
```
> sudo lspci -nn
[...]
00:10.0 Ethernet controller [0200]: Intel Corporation 82576 \
Virtual Function [8086:10ca] (rev 01)
[...]
```
Note down the device ID, 00:10.0 in this example, and the vendor ID (8086:10ca).

Find the IOMMU group of this device:

> sudo readlink /sys/bus/pci/devices/0000\:00\:10.0/iommu_group
../../../kernel/iommu_groups/20

The IOMMU group for this device is 20. Now you can check the devices belonging to the same IOMMU group:

> sudo ls -l /sys/bus/pci/devices/0000\:01\:10.0/iommu_group/devices/
[...] 0000:00:1e.0 -> ../../../../devices/pci0000:00/0000:00:1e.0
[...] 0000:01:10.0 -> ../../../../devices/pci0000:00/0000:00:1e.0/0000:01:10.0
[...] 0000:01:10.1 -> ../../../../devices/pci0000:00/0000:00:1e.0/0000:01:10.1

Unbind the device from the device driver:

> sudo echo "0000:01:10.0" > /sys/bus/pci/devices/0000\:01\:10.0/driver/unbind

Bind the device to the vfio-pci driver using the vendor ID from step 1:
```
> sudo echo "8086 153a" > /sys/bus/pci/drivers/vfio-pci/new_id
```
A new device /dev/vfio/IOMMU_GROUP is created as a result, /dev/vfio/20 in this case.
Change the ownership of the newly created device:
```
> sudo chown qemu.qemu /dev/vfio/DEVICE
```

Now run the VM Guest with the PCI device assigned.

> sudo qemu-system-ARCH [...] -device
     vfio-pci,host=00:10.0,id=ID

Important: No hotplugging

As of SUSE Linux Enterprise Server 16.0, hotplugging of PCI devices passed to a VM Guest via VFIO is not supported.

You can find more detailed information on the VFIO driver in the /usr/src/linux/Documentation/vfio.txt file (package kernel-source needs to be installed).

1.3.5 VirtFS: sharing directories between host and guests #

VM Guests normally run in a separate computing space—they are provided their own memory range, dedicated CPUs, and file system space. The ability to share parts of the VM Host Server's file system makes the virtualization environment more flexible by simplifying mutual data exchange. Network file systems, such as CIFS and NFS, have been the traditional way of sharing directories. But as they are not specifically designed for virtualization purposes, they suffer from major performance and feature issues.

Note

SELinux Requirement: For security_model=mapped, configure SELinux context:

#  semanage fcontext -a -t virtiofsd_t "/tmp(/.*)?"
#  restorecon -Rv /tmp

1.3.5.1 Host Configuration #

Nothing needs to be done on the host side aside from installing virtiofsd. The VM Guest xml libvirt file should have a configuration like:

<filesystem type="virtiofs" accessmode="mapped"/>
   <driver="virtiofs"/>
   <source dir="/tmp"/>
   <target dir="host_tmp"/>
   <alias name="fs0"/>
   <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
</filesystem>

Important: 9p is deprecated

9p Protocol is a legacy solution with critical flaws. Moreover, it incurs ~30-50% higher CPU overhead than virtiofs for sequential I/O due to constant context switching between user and kernel space.

1.3.5.1.1 Access Mode Options for virtiofs #

The accessmode attribute in the <filesystem> element defines how guest file permissions map to host permissions. Only two values are valid:

Table 1: virtiofs Access Mode Options #

accessmode Description Security Implications

`accessmode`	Description	Security Implications
`mapped`	The default mode. Maps guest UIDs/GIDs to host UIDs/GIDs using a translation table. Guest files appear as if owned by a dedicated "virtiofs" user on the host (typically UID 1000). Example: Guest user `uid=1000` writes to host `/tmp` → Host file appears as owned by `virtiofsd` user (not the guest user).	Recommended for all environments. Prevents guest users from directly accessing host user accounts. Does not require matching host users.
`passthrough`	Guest UIDs/GIDs are used directly on the host. The guest must have matching users on the host. Example: Guest user `uid=1000` writes to host `/tmp` → Host file appears as owned by the user with `uid=1000`.	Only for trusted guests (e.g., same-tenant cloud environments). Requires matching host users (e.g., host must have `useradd -u 1000 guestuser`). Security risk: Compromised guest can directly access host user accounts.
`none`	Invalid value (common documentation error). `virtiofsd` rejects this option with error: `security_model=none is not supported`. Always use `mapped` (default) or `passthrough`.	Causes immediate configuration failure.

mapped

The default mode. Maps guest UIDs/GIDs to host UIDs/GIDs using a translation table. Guest files appear as if owned by a dedicated "virtiofs" user on the host (typically UID 1000).

Example: Guest user uid=1000 writes to host /tmp → Host file appears as owned by virtiofsd user (not the guest user).

Recommended for all environments.

Prevents guest users from directly accessing host user accounts.

Does not require matching host users.

passthrough

Guest UIDs/GIDs are used directly on the host. The guest must have matching users on the host.

Example: Guest user uid=1000 writes to host /tmp → Host file appears as owned by the user with uid=1000.

Only for trusted guests (e.g., same-tenant cloud environments).

Requires matching host users (e.g., host must have useradd -u 1000 guestuser).

Security risk: Compromised guest can directly access host user accounts.

none

Invalid value (common documentation error).

virtiofsd rejects this option with error: security_model=none is not supported.

Always use mapped (default) or passthrough.

Causes immediate configuration failure.

Note

Key Configuration Rule: accessmode='mapped' must match the host's security_model=mapped in virtiofsd. Mismatched modes cause mount failures with errors like: Failed to set security context: Operation not permitted.

1.3.5.2 Guest Configuration #

On the guest, load the kernel module and mount the file system:

Example 1: Guest Mount Command #

The virtiofs module should be loaded automatically, if not do:

#  modprobe virtiofs

Now you can mount the target directory on your VM Guest:

#  mount -t virtiofs -o dax host_tmp /mnt/hosttmp

Options:

virtiofs
dax: Enables direct access for performance (recommended). dax cannot be used with the BTRFS file system.
host_tmp: The target directory in the VM Guest configuration.

1.3.5.3 Persistent mounts virtiofs across reboot #

Simply add this line to /etc/fstab:

host_tmp /mnt/hosttmp virtiofs  rw,nofail 0 0

1.3.5.4 Troubleshooting Common Issues #

Guest Kernel Check: verify the module virtiofs is loaded:
```
#  lsmod | grep virtiofs
```
Permission denied: Check SELinux context (see Section 1.3.5.1, “Host Configuration”)
Mount fails with 9p: Verify you used -t virtiofs (not 9p)
Guest writes not syncing: Add cache=none to mount options

1.3.6 KSM: sharing memory pages between guests #

Kernel Same Page Merging (KSM) is a Linux kernel feature that merges identical memory pages from multiple running processes into one memory region. Because KVM guests run as processes under Linux, KSM provides the memory overcommit feature to hypervisors for more efficient use of memory. Therefore, if you need to run multiple virtual machines on a host with limited memory, KSM may be helpful to you.

KSM stores its status information in the files under the /sys/kernel/mm/ksm directory:

> ls -1 /sys/kernel/mm/ksm
full_scans
merge_across_nodes
pages_shared
pages_sharing
pages_to_scan
pages_unshared
pages_volatile
run
sleep_millisecs

For more information on the meaning of the /sys/kernel/mm/ksm/* files, see /usr/src/linux/Documentation/vm/ksm.txt (package kernel-source).

To use KSM, do the following.

Although SLES includes KSM support in the kernel, it is disabled by default. To enable it, run the following command:
```
# echo 1 > /sys/kernel/mm/ksm/run
```
Now run several VM Guests under KVM and inspect the content of files pages_sharing and pages_shared, for example:
```
> while [ 1 ]; do cat /sys/kernel/mm/ksm/pages_shared; sleep 1; done
13522
13523
13519
13518
13520
13520
13528
```

2 Guest installation #

Revision History: Managing KVM Virtualization on SUSE Linux Enterprise Server

The libvirt-based tools such as virt-manager and virt-install offer convenient interfaces to set up and manage virtual machines. They act as a kind of wrapper for the qemu-system-ARCH command. However, it is also possible to use qemu-system-ARCH directly without using libvirt-based tools.

Warning: qemu-system-ARCH and libvirt

Virtual Machines created with qemu-system-ARCH are not visible to the libvirt-based tools.

2.1 Basic installation with `qemu-system-ARCH` #

In the following example, a virtual machine for a SUSE Linux Enterprise Server 11 installation is created. For detailed information on the commands, refer to the respective man pages.

If you do not already have an image of a system that you want to run in a virtualized environment, you need to create one from the installation media. In such a case, you need to prepare a hard disk image, and obtain an image of the installation media or the media itself.

Create a hard disk with qemu-img.

> qemu-img create1 -f raw2 /images/sles/hda3 8G4

1	The subcommand `create` tells `qemu-img` to create a new image.
2	Specify the disk's format with the `-f` parameter.
3	The full path to the image file.
4	The size of the image, 8 GB in this case. The image is created as a Sparse image file that grows when the disk is filled with data. The specified size defines the maximum size to which the image file can grow.

After at least one hard disk image is created, you can set up a virtual machine with qemu-system-ARCH that boots into the installation system:

# qemu-system-x86_64 -name "sles"1-machine accel=kvm -M pc2 -m 7683 \
-smp 24 -boot d5 \
-drive file=/images/sles/hda,if=virtio,index=0,media=disk,format=raw6 \
-drive file=/isos/15 SP6-Online-ARCH-GM-media1.iso,index=1,media=cdrom7 \
-net nic,model=virtio,macaddr=52:54:00:05:11:118 -net user \
-vga cirrus9 -balloon virtio10

1	Name of the virtual machine that is displayed in the window caption and can be used for the VNC server. This name must be unique.
2	Specifies the machine type. Use `qemu-system-ARCH` `-M ?` to display a list of valid parameters. `pc` is the default Standard PC.
3	Maximum amount of memory for the virtual machine.
4	Defines an SMP system with two processors.
5	Specifies the boot order. Valid values are `a`, `b` (floppy 1 and 2), `c` (first hard disk), `d` (first CD-ROM), or `n` to `p` (Ether-boot from network adapter 1-3). Defaults to `c`.
6	Defines the first (`index=0`) hard disk. It is accessed as a paravirtualized (`if=virtio`) drive in `raw` format.
7	The second (`index=1`) image drive acts as a CD-ROM.
8	Defines a paravirtualized (`model=virtio`) network adapter with the MAC address `52:54:00:05:11:11`. Be sure to specify a unique MAC address; otherwise, a network conflict may occur.
9	Specifies the graphic card. If you specify `none`, the graphic card is disabled.
10	Defines the paravirtualized balloon device that allows dynamically changing the amount of memory (up to the maximum value specified with the parameter `-m`).

After the installation of the guest operating system finishes, you can start the related virtual machine without the need to specify the CD-ROM device:

# qemu-system-x86_64 -name "sles" -machine type=pc,accel=kvm -m 768 \
-smp 2 -boot c \
-drive file=/images/sles/hda,if=virtio,index=0,media=disk,format=raw \
-net nic,model=virtio,macaddr=52:54:00:05:11:11 \
-vga cirrus -balloon virtio

2.2 Managing disk images with `qemu-img` #

In the previous section (see Section 2.1, “Basic installation with qemu-system-ARCH”), we used the qemu-img command to create an image of a hard disk. You can, however, use qemu-img for general disk image manipulation. This section introduces qemu-img subcommands to help manage disk images flexibly.

2.2.1 General information on qemu-img invocation #

qemu-img uses subcommands (like zypper does) to do specific tasks. Each subcommand understands a different set of options. Certain options are general and used by more of these subcommands, while others are unique to the related subcommand. See the qemu-img man page (man 1 qemu-img) for a list of all supported options. qemu-img uses the following general syntax:

> qemu-img subcommand [options]

and supports the following subcommands:

create: Creates a new disk image on the file system.
check: Checks an existing disk image for errors.
compare: Check if two images have the same content.
map: Dumps the metadata of the image file name and its backing file chain.
amend: Modifies the options specific to the image format for the image file.
convert: Converts an existing disk image to a new one in a different format.
info: Displays information about the relevant disk image.
snapshot: Manages snapshots of existing disk images.
commit: Applies changes made to an existing disk image.
rebase: Creates a new base image based on an existing image.
resize: Increases or decreases the size of an existing image.

2.2.2 Creating, converting, and checking disk images #

This section describes how to create disk images, check their condition, convert a disk image from one format to another, and get detailed information about a particular disk image.

2.2.2.1 qemu-img create #

Use qemu-img create to create a new disk image for your VM Guest operating system. The command uses the following syntax:

> qemu-img create -f fmt1 -o options2 fname3 size4

1	The format of the target image. Supported formats are `raw` and `qcow2`.
2	Certain image formats support additional options to be passed on the command line. You can specify them here with the `-o` option. The `raw` image format supports only the `size` option, so it is possible to insert `-o size=8G` instead of adding the size option at the end of the command.
3	Path to the target disk image to be created.
4	Size of the target disk image (if not already specified with the `-o size=<image_size>` option. Optional suffixes for the image size are `K` (kilobyte), `M` (megabyte), `G` (gigabyte), or `T` (terabyte).

To create a new disk image sles.raw in the directory /images growing up to a maximum size of 4 GB, run the following command:

> qemu-img create -f raw -o size=4G /images/sles.raw
Formatting '/images/sles.raw', fmt=raw size=4294967296

> ls -l /images/sles.raw
-rw-r--r-- 1 tux users 4294967296 Nov 15 15:56 /images/sles.raw

> qemu-img info /images/sles.raw
image: /images/sles11.raw
file format: raw
virtual size: 4.0G (4294967296 bytes)
disk size: 0

As you can see, the virtual size of the newly created image is 4 GB, but the actual reported disk size is 0, as no data has been written to the image yet.

Tip: VM Guest images on the Btrfs file system

If you need to create a disk image on the Btrfs file system, you can use nocow=on to reduce the performance overhead created by the copy-on-write feature of Btrfs:

> qemu-img create -o nocow=on test.img 8G

If you, however, want to use copy-on-write, for example, for creating snapshots or sharing them across virtual machines, then leave the command line without the nocow option.

2.2.2.2 qemu-img convert #

Use qemu-img convert to convert disk images to another format. To get a complete list of image formats supported by QEMU, run qemu-img -h and look at the last line of the output. The command uses the following syntax:

> qemu-img convert -c1 -f fmt2 -O out_fmt3 -o options4 fname5 out_fname6

1	Applies compression to the target disk image. Only `qcow` and `qcow2` formats support compression.
2	The format of the source disk image. It is normally autodetected and can therefore be omitted.
3	The format of the target disk image.
4	Specify additional options relevant to the target image format. Use `-o ?` to view the list of options supported by the target image format.
5	Path to the source disk image to be converted.
6	Path to the converted target disk image.

> qemu-img convert -O vmdk /images/sles.raw \
/images/sles.vmdk

> ls -l /images/
-rw-r--r-- 1 tux users 4294967296 16. lis 10.50 sles.raw
-rw-r--r-- 1 tux users 2574450688 16. lis 14.18 sles.vmdk

To see a list of options relevant for the selected target image format, run the following command (replace vmdk with your image format):

> qemu-img convert -O vmdk /images/sles.raw /images/sles.vmdk -o ?
Supported options:
size             Virtual disk size
backing_file     File name of a base image
compat6          VMDK version 6 image
subformat        VMDK flat extent format, can be one of {monolithicSparse \
    (default) | monolithicFlat | twoGbMaxExtentSparse | twoGbMaxExtentFlat}
scsi             SCSI image

2.2.2.3 qemu-img check #

Use qemu-img check to check the existing disk image for errors. Not all disk image formats support this feature. The command uses the following syntax:

> qemu-img check -f fmt1 fname2

1	The format of the source disk image. It is normally autodetected and can therefore be omitted.
2	Path to the source disk image to be checked.

If no error is found, the command returns no output. Otherwise, the type and number of errors found are shown.

> qemu-img check -f qcow2 /images/sles.qcow2
ERROR: invalid cluster offset=0x2af0000
[...]
ERROR: invalid cluster offset=0x34ab0000
378 errors were found on the image.

2.2.2.4 Increasing the size of an existing disk image #

When creating a new image, you must specify its maximum size before the image is created (see Section 2.2.2.1, “qemu-img create”). After you have installed the VM Guest and have been using it for a certain time, the initial size of the image may no longer be sufficient. In that case, add more space to it.

To increase the size of an existing disk image by 2 gigabytes, use:

> qemu-img resize /images/sles.raw +2GB

Note

You can resize the disk image using the formats raw and qcow2. To resize an image in another format, convert it to a supported format with qemu-img convert first.

The image now contains an empty space of 2 GB after the final partition. You can resize existing partitions or add new ones.

2.2.2.5 Advanced options for the qcow2 file format #

qcow2 is the main disk image format used by QEMU. Its size grows on demand, and disk space is only allocated when it is needed by the virtual machine.

A qcow2-formatted file is organized in units of constant size. These units are called clusters. Viewed from the guest side, the virtual disk is also divided into clusters of the same size. QEMU defaults to 64 kB clusters, but you can specify a different value when creating a new image:

> qemu-img create -f qcow2 -o cluster_size=128K virt_disk.qcow2 4G

A qcow2 image contains a set of tables organized in two levels that are called the L1 and L2 tables. There is just one L1 table per disk image, while there can be many L2 tables depending on how big the image is.

To read or write data to the virtual disk, QEMU needs to read its corresponding L2 table to find out the relevant data location. Because reading the table for each I/O operation consumes system resources, QEMU keeps a cache of L2 tables in memory to speed up disk access.

2.2.2.5.1 Choosing the right cache size #

The cache size relates to the amount of allocated space. L2 cache can map the following amount of virtual disk:

disk_size = l2_cache_size * cluster_size / 8

With the default 64 kB of cluster size, that is

disk_size = l2_cache_size * 8192

Therefore, to have a cache that maps n gigabytes of disk space with the default cluster size, you need

l2_cache_size = disk_size_GB * 131072

QEMU uses 1 MB (1048576 bytes) of L2 cache by default. Following the above formulas, 1 MB of L2 cache covers 8 GB (1048576 / 131072) of virtual disk. This means that the performance is fine with the default L2 cache size if your virtual disk size is up to 8 GB. For larger disks, you can speed up disk access by increasing the L2 cache size.

2.2.2.5.2 Configuring the cache size #

You can use the -drive option on the QEMU command line to specify the cache size. Alternatively, when communicating via QMP, use the blockdev-add command. For more information on QMP, see Section 3.11, “QMP - QEMU machine protocol”.

The following options configure the cache size for the virtual guest:

l2-cache-size: The maximum size of the L2 table cache.
refcount-cache-size: The maximum size of the refcount block cache. For more information on refcount, see https://raw.githubusercontent.com/qemu/qemu/master/docs/qcow2-cache.txt.
cache-size: The maximum size of both caches combined.

When specifying values for the options above, be aware of the following:

The size of both the L2 and refcount block caches needs to be a multiple of the cluster size.
If you only set one option, QEMU automatically adjusts the other options so that the L2 cache is 4 times bigger than the refcount cache.

The refcount cache is used much less often than the L2 cache; therefore, you can keep it small:

# qemu-system-ARCH [...] \
 -drive file=disk_image.qcow2,l2-cache-size=4194304,refcount-cache-size=262144

2.2.2.5.3 Reducing memory usage #

The larger the cache, the more memory it consumes. There is a separate L2 cache for each qcow2 file. When using a lot of big disk images, you may need a considerably large amount of memory. Memory consumption is even worse if you add backing files (Section 2.2.4, “Manipulate disk images effectively”) and snapshots (see Section 2.2.3, “Managing snapshots of virtual machines with qemu-img”) to the guest's setup chain.

This is why QEMU introduced the cache-clean-interval setting. It defines an interval in seconds after which all cache entries that have not been accessed are removed from memory.

The following example removes all unused cache entries every 10 minutes:

# qemu-system-ARCH [...] -drive file=hd.qcow2,cache-clean-interval=600

If this option is not set, the default value is 0, and it disables this feature.

2.2.3 Managing snapshots of virtual machines with qemu-img #

Virtual Machine snapshots are snapshots of the complete environment in which a VM Guest is running. The snapshot includes the state of the processor (CPU), memory (RAM), devices, and all writable disks.

Snapshots are helpful when you need to save your virtual machine in a particular state. For example, after you have configured network services on a virtualized server and want to quickly start the virtual machine in the same state you last saved it. Or you can create a snapshot after the virtual machine has been powered off to create a backup state before you try something experimental and make the VM Guest unstable. This section introduces the latter case, while the former is described in Section 3, “Virtual machine administration using QEMU monitor”.

To use snapshots, your VM Guest must contain at least one writable hard disk image in qcow2 format. This device is normally the first virtual hard disk.

Virtual Machine snapshots are created with the savevm command in the interactive QEMU monitor. To make identifying a particular snapshot easier, you can assign it a tag. For more information on the QEMU monitor, see Section 3, “Virtual machine administration using QEMU monitor”.

Once your qcow2 disk image contains saved snapshots, you can inspect them with the qemu-img snapshot command.

Warning: Shut down the VM Guest

Do not create or delete virtual machine snapshots with the qemu-img snapshot command while the virtual machine is running. Otherwise, you may damage the disk image with the state of the virtual machine saved.

2.2.3.1 Listing existing snapshots #

Use qemu-img snapshot -l DISK_IMAGE to view a list of all existing snapshots saved in the disk_image image. You can get the list even while the VM Guest is running.

> qemu-img snapshot -l /images/sles.qcow2
Snapshot list:
ID1       TAG2               VM SIZE3        DATE4          VM CLOCK5
1         booting                4.4M 2013-11-22 10:51:10   00:00:20.476
2         booted                 184M 2013-11-22 10:53:03   00:02:05.394
3         logged_in              273M 2013-11-22 11:00:25   00:04:34.843
4         ff_and_term_running    372M 2013-11-22 11:12:27   00:08:44.965

1	Unique auto-incremented identification number of the snapshot.
2	Unique description string of the snapshot. It is meant as a human-readable version of the ID.
3	The disk space occupied by the snapshot. The more memory is consumed by running applications, the bigger the snapshot is.
4	Time and date the snapshot was created.
5	The current state of the virtual machine's clock.

2.2.3.2 Creating snapshots of a powered-off virtual machine #

Use qemu-img snapshot -c SNAPSHOT_TITLE DISK_IMAGE to create a snapshot of the current state of a virtual machine that was previously powered off.

> qemu-img snapshot -c backup_snapshot /images/sles.qcow2

> qemu-img snapshot -l /images/sles.qcow2
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         booting                4.4M 2013-11-22 10:51:10   00:00:20.476
2         booted                 184M 2013-11-22 10:53:03   00:02:05.394
3         logged_in              273M 2013-11-22 11:00:25   00:04:34.843
4         ff_and_term_running    372M 2013-11-22 11:12:27   00:08:44.965
5         backup_snapshot           0 2013-11-22 14:14:00   00:00:00.000

If something breaks in your VM Guest and you need to restore the state of the saved snapshot (ID 5 in our example), power off your VM Guest and execute the following command:

> qemu-img snapshot -a 5 /images/sles.qcow2

The next time you run the virtual machine with qemu-system-ARCH, it will be in the state of snapshot number 5.

Note

The qemu-img snapshot -c command is not related to the savevm command of QEMU monitor (see Section 3, “Virtual machine administration using QEMU monitor”). For example, you cannot apply a snapshot with qemu-img snapshot -a on a snapshot created with savevm in QEMU's monitor.

2.2.3.3 Deleting snapshots #

Use qemu-img snapshot -d SNAPSHOT_ID DISK_IMAGE to delete old or unneeded snapshots of a virtual machine. This saves disk space inside the qcow2 disk image, as the space occupied by the snapshot data is restored:

> qemu-img snapshot -d 2 /images/sles.qcow2

2.2.4 Manipulate disk images effectively #

Imagine the following real-life situation: you are a server administrator who runs and manages several virtualized operating systems. One group of these systems is based on one specific distribution, while another group (or groups) is based on different versions of the distribution or even on a different (and maybe non-Unix) platform. To make the case even more complex, individual virtual guest systems based on the same distribution differ according to the department and deployment. A file server typically uses a different setup and services than a Web server does, while both may still be based on SUSE® Linux Enterprise Server.

With QEMU it is possible to create “base” disk images. You can use them as template virtual machines. These base images save you plenty of time because you do not need to install the same operating system more than once.

2.2.4.1 Base and derived images #

First, build a disk image as usual and install the target system on it. For more information, see Section 2.1, “Basic installation with qemu-system-ARCH” and Section 2.2.2, “Creating, converting, and checking disk images”. Then build a new image while using the first one as a base image. The base image is also called a backing file. After your new derived image is built, never boot the base image again, but boot the derived image instead. Several derived images may depend on one base image at the same time. Therefore, changing the base image can damage the dependencies. While using your derived image, QEMU writes changes to it and uses the base image only for reading.

It is a good practice to create a base image from a freshly installed (and, if needed, registered) operating system with no patches applied and no additional applications installed or removed. Later on, you can create another base image with the latest patches applied and based on the original base image.

2.2.4.2 Creating derived images #

Note

While you can use the raw format for base images, you cannot use it for derived images because the raw format does not support the backing_file option. Use, for example, the qcow2 format for the derived images.

For example, /images/sles_base.raw is the base image holding a freshly installed system.

> qemu-img info /images/sles_base.raw
image: /images/sles_base.raw
file format: raw
virtual size: 4.0G (4294967296 bytes)
disk size: 2.4G

The image's reserved size is 4 GB, the actual size is 2.4 GB, and its format is raw. Create an image derived from the /images/sles_base.raw base image with:

> qemu-img create -f qcow2 /images/sles_derived.qcow2 \
-o backing_file=/images/sles_base.raw
Formatting '/images/sles_derived.qcow2', fmt=qcow2 size=4294967296 \
backing_file='/images/sles_base.raw' encryption=off cluster_size=0

Look at the derived image details:

> qemu-img info /images/sles_derived.qcow2
image: /images/sles_derived.qcow2
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 140K
cluster_size: 65536
backing file: /images/sles_base.raw \
(actual path: /images/sles_base.raw)

Although the reserved size of the derived image is the same as the size of the base image (4 GB), the actual size is only 140 KB. The reason is that only changes made to the system inside the derived image are saved. Run the derived virtual machine, register it, if needed, and apply the latest patches. Do any other changes in the system, such as removing unneeded or installing new software packages. Then shut the VM Guest down and examine its details once more:

> qemu-img info /images/sles_derived.qcow2
image: /images/sles_derived.qcow2
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: 1.1G
cluster_size: 65536
backing file: /images/sles_base.raw \
(actual path: /images/sles_base.raw)

The disk size value has grown to 1.1 GB, which is the disk space occupied by the changes on the file system compared to the base image.

2.2.4.3 Rebasing derived images #

After you have modified the derived image (applied patches, installed specific applications, changed environment settings, etc.), it reaches the desired state. At that point, you can merge the original base image and the derived image to create a new base image.

Your original base image (/images/sles_base.raw) holds a freshly installed system. It can be a template for new modified base images, while the new one can contain the same system as the first one plus all security and update patches applied, for example. After you have created this new base image, you can use it as a template for more specialized derived images as well. The new base image becomes independent of the original one. The process of creating base images from derived ones is called rebasing:

> qemu-img convert /images/sles_derived.qcow2 \
-O raw /images/sles_base2.raw

This command created the new base image /images/sles_base2.raw using the raw format.

> qemu-img info /images/sles_base2.raw
image: /images/sles11_base2.raw
file format: raw
virtual size: 4.0G (4294967296 bytes)
disk size: 2.8G

The new image is 0.4 gigabytes larger than the original base image. It uses no backing file, and you can easily create new derived images based on it. This lets you create a sophisticated hierarchy of virtual disk images for your organization, saving a lot of time and work.

2.2.4.4 Mounting an image on a VM Host Server #

It can be useful to mount a virtual disk image on the host system.

Linux systems can mount an internal partition of a raw disk image using a loopback device. The first example procedure is more complex but more illustrative, while the second one is straightforward:

Procedure 3: Mounting a disk image by calculating the partition offset #

Set a loop device on the disk image whose partition you want to mount.
```
> losetup /dev/loop0 /images/sles_base.raw
```

Find the sector size and the starting sector number of the partition you want to mount.

> fdisk -lu /dev/loop0

Disk /dev/loop0: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders, total 8388608 sectors
Units = sectors of 1 * 512 = 5121 bytes
Disk identifier: 0x000ceca8

       Device Boot      Start         End      Blocks   Id  System
/dev/loop0p1              63     1542239      771088+  82  Linux swap
/dev/loop0p2   *     15422402    8385929     3421845   83  Linux

1	The disk sector size.
2	The starting sector of the partition.

Calculate the partition start offset:
sector_size * sector_start = 512 * 1542240 = 789626880

Delete the loop and mount the partition inside the disk image with the calculated offset on a prepared directory.

> losetup -d /dev/loop0
> mount -o loop,offset=789626880 \
/images/sles_base.raw /mnt/sles/
> ls -l /mnt/sles/
total 112
drwxr-xr-x   2 root root  4096 Nov 16 10:02 bin
drwxr-xr-x   3 root root  4096 Nov 16 10:27 boot
drwxr-xr-x   5 root root  4096 Nov 16 09:11 dev
[...]
drwxrwxrwt  14 root root  4096 Nov 24 09:50 tmp
drwxr-xr-x  12 root root  4096 Nov 16 09:16 usr
drwxr-xr-x  15 root root  4096 Nov 16 09:22 var

Copy one or more files onto the mounted partition and unmount it when finished.

> cp /etc/X11/xorg.conf /mnt/sles/root/tmp
> ls -l /mnt/sles/root/tmp
> umount /mnt/sles/

Warning: Do not write to images that are currently in use

Never mount a partition of an image of a running virtual machine in read-write mode. This could corrupt the partition and break the whole VM Guest.

3 Virtual machine administration using QEMU monitor #

Revision History: Managing KVM Virtualization on SUSE Linux Enterprise Server

When a virtual machine is invoked by the qemu-system-ARCH command, for example, qemu-system-x86_64, a monitor console is provided for performing interaction with the user. Using the commands available in the monitor console, it is possible to inspect the running operating system, change removable media, take screenshots or audio grabs and control other aspects of the virtual machine.

Note

The following sections list selected useful QEMU monitor commands and their purpose. To get the full list, enter help in the QEMU monitor command line.

3.1 Accessing monitor console #

Tip: No monitor console for libvirt

You can access the monitor console only if you started the virtual machine directly with the qemu-system-ARCH command and are viewing its graphical output in a built-in QEMU window.

If you started the virtual machine with libvirt, for example, using virt-manager, and are viewing its output via VNC or Spice sessions, you cannot access the monitor console directly. You can, however, send the monitor command to the virtual machine via virsh:

# virsh qemu-monitor-command COMMAND

The way you access the monitor console depends on which display device you use to view the output of a virtual machine. Find more details about displays in Section 4.3.2.2, “Display options”. For example, to view the monitor while the -display gtk option is in use, press Ctrl–Alt–2. Similarly, when the -nographic option is in use, you can switch to the monitor console by pressing the following key combination: Ctrl–AC.

To get help while using the console, use help or ?. To get help for a specific command, use help COMMAND.

3.2 Getting information about the guest system #

To get information about the guest system, use info. If used without any option, a list of possible options is printed. Options determine which part of the system is analyzed:

info version: Shows the version of QEMU.
info commands: Lists available QMP commands.
info network: Shows the network state.
info chardev: Shows the character devices.
info block: Information about block devices, such as hard disks, floppy drives, or CD-ROMs.
info blockstats: Read and write statistics on block devices.
info registers: Shows the CPU registers.
info cpus: Shows information about available CPUs.
info history: Shows the command-line history.
info irq: Shows the interrupt statistics.
info pic: Shows the i8259 (PIC) state.
info pci: Shows the PCI information.
info tlb: Shows virtual to physical memory mappings.
info mem: Shows the active virtual memory mappings.
info jit: Shows dynamic compiler information.
info kvm: Shows the KVM information.
info numa: Shows the NUMA information.
info usb: Shows the guest USB devices.
info usbhost: Shows the host USB devices.
info profile: Shows the profiling information.
info capture: Shows the capture (audio grab) information.
info snapshots: Shows the currently saved virtual machine snapshots.
info status: Shows the current virtual machine status.
info mice: Shows which guest mice are receiving events.
info vnc: Shows the VNC server status.
info name: Shows the current virtual machine name.
info uuid: Shows the current virtual machine's UUID.
info usernet: Shows the user network stack connection states.
info migrate: Shows the migration status.
info balloon: Shows the balloon device information.
info qtree: Shows the device tree.
info qdm: Shows the qdev device model list.
info roms: Shows the ROMs.
info migrate_cache_size: Shows the current migration xbzrle (“Xor Based Zero Run Length Encoding”) cache size.
info migrate_capabilities: Shows the status of the multiple migration capabilities, such as xbzrle compression.
info mtree: Shows the VM Guest memory hierarchy.
info trace-events: Shows available trace-events and their status.

3.3 Changing VNC password #

To change the VNC password, use the change vnc password command and enter the new password:

(qemu) change vnc password
Password: ********
(qemu)

3.4 Managing devices #

To add a new disk while the guest is running (hotplug), use the drive_add and device_add commands. First define a new drive to be added as a device to bus 0:

(qemu) drive_add 0 if=none,file=/tmp/test.img,format=raw,id=disk1
OK

You can confirm your new device by querying the block subsystem:

(qemu) info block
[...]
disk1: removable=1 locked=0 tray-open=0 file=/tmp/test.img ro=0 drv=raw \
encrypted=0 bps=0 bps_rd=0 bps_wr=0 iops=0 iops_rd=0 iops_wr=0

After the new drive is defined, it needs to be connected to a device so that the guest can see it. The typical device would be a virtio-blk-pci or scsi-disk. To get the full list of available values, run:

(qemu) device_add ?
name "VGA", bus PCI
name "usb-storage", bus usb-bus
[...]
name "virtio-blk-pci", bus virtio-bus

Now add the device

(qemu) device_add virtio-blk-pci,drive=disk1,id=myvirtio1

and confirm with

(qemu) info pci
[...]
Bus  0, device   4, function 0:
    SCSI controller: PCI device 1af4:1001
      IRQ 0.
      BAR0: I/O at 0xffffffffffffffff [0x003e].
      BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].
      id "myvirtio1"

Tip

Devices added with the device_add command can be removed from the guest with device_del. Enter help device_del on the QEMU monitor command line for more information.

To release the device or file connected to the removable media device, use the eject DEVICE command. Use the optional -f to force ejection.

To change removable media (like CD-ROMs), use the change DEVICE command. The name of the removable media can be determined using the info block command:

(qemu) info block
ide1-cd0: type=cdrom removable=1 locked=0 file=/dev/sr0 ro=1 drv=host_device
(qemu) change ide1-cd0 /path/to/image

3.5 Controlling keyboard and mouse #

It is possible to use the monitor console to emulate keyboard and mouse input if necessary. For example, if your graphical user interface intercepts certain key combinations at a low level (such as Ctrl–Alt–F1 in X Window System), you can still enter them using the sendkey KEYS:

sendkey ctrl-alt-f1

To list the key names used in the KEYS option, enter sendkey and press →|.

To control the mouse, the following commands can be used:

mouse_move DX dy [DZ]: Move the active mouse pointer to the specified coordinates dx, dy with the optional scroll axis dz.
mouse_button VAL: Change the state of the mouse buttons (1=left, 2=middle, 4=right).
mouse_set INDEX: Set which mouse device receives events. Device index numbers can be obtained with the info mice command.

3.6 Changing available memory #

If the virtual machine was started with the -balloon virtio option (the paravirtualized balloon device is therefore enabled), you can change the available memory dynamically. For more information about enabling the balloon device, see Section 2.1, “Basic installation with qemu-system-ARCH”.

To get information about the balloon device in the monitor console and to determine whether the device is enabled, use the info balloon command:

(qemu) info balloon

If the balloon device is enabled, use the balloon MEMORY_IN_MB command to set the requested amount of memory:

(qemu) balloon 400

3.7 Dumping virtual machine memory #

To save the content of the virtual machine memory to a disk or console output, use the following commands:

memsaveADDRSIZEFILENAME

Saves a virtual memory dump starting at ADDR of size SIZE to file FILENAME

pmemsaveADDRSIZEFILENAME

Saves a physical memory dump starting at ADDR of size SIZE to file FILENAME-

x /FMTADDR

Makes a virtual memory dump starting at address ADDR and formatted according to the FMT string. The FMT string consists of three parameters COUNTFORMATSIZE:

The COUNT parameter is the number of items to be dumped.

The FORMAT can be x (hex), d (signed decimal), u (unsigned decimal), o (octal), c (char) or i (assembly instruction).

The SIZE parameter can be b (8 bits), h (16 bits), w (32 bits) or g (64 bits). On x86, h or w can be specified with the i format to respectively select 16 or 32-bit code instruction size.

xp /FMTADDR

Makes a physical memory dump starting at address ADDR and formatted according to the FMT string. The FMT string consists of three parameters COUNTFORMATSIZE:

The COUNT parameter is the number of items to be dumped.

The FORMAT can be x (hex), d (signed decimal), u (unsigned decimal), o (octal), c (char) or i (asm instruction).

3.8 Managing virtual machine snapshots #

Managing snapshots in QEMU monitor is not supported by SUSE yet. The information found in this section may be helpful in specific cases.

Virtual Machine snapshots are snapshots of the complete virtual machine, including the state of the CPU, RAM, and the content of all writable disks. To use virtual machine snapshots, you must have at least one non-removable and writable block device using the qcow2 disk image format.

Snapshots are helpful when you need to save your virtual machine in a particular state. For example, after you have configured network services on a virtualized server and want to quickly start the virtual machine in the same state that was saved last. You can also create a snapshot after the virtual machine has been powered off to create a backup state before you try something experimental and make the VM Guest unstable. This section introduces the former case, while the latter is described in Section 2.2.3, “Managing snapshots of virtual machines with qemu-img”.

The following commands are available for managing snapshots in QEMU monitor:

savevmNAME

Creates a new virtual machine snapshot under the tag NAME or replaces an existing snapshot.

loadvmNAME

Loads a virtual machine snapshot tagged NAME.

delvm

Deletes a virtual machine snapshot.

info snapshots

Prints information about available snapshots.

(qemu) info snapshots
Snapshot list:
ID1      TAG2                 VM SIZE3   DATE4          VM CLOCK5
1         booting                4.4M 2013-11-22 10:51:10   00:00:20.476
2         booted                 184M 2013-11-22 10:53:03   00:02:05.394
3         logged_in              273M 2013-11-22 11:00:25   00:04:34.843
4         ff_and_term_running    372M 2013-11-22 11:12:27   00:08:44.965

1	Unique auto-incremented identification number of the snapshot.
2	Unique description string of the snapshot. It is meant as a human-readable version of the ID.
3	The disk space occupied by the snapshot. The more memory is consumed by running applications, the bigger the snapshot is.
4	Time and date the snapshot was created.
5	The current state of the virtual machine's clock.

3.9 Suspending and resuming virtual machine execution #

The following commands are available for suspending and resuming virtual machines:

stop: Suspends the execution of the virtual machine.
cont: Resumes the execution of the virtual machine.
system_reset: Resets the virtual machine. The effect is similar to the reset button on a physical machine. This may leave the file system in an unclean state.
system_powerdown: Sends an ACPI shutdown request to the machine. The effect is similar to the power button on a physical machine.
q or quit: Terminates QEMU immediately.

3.10 Live migration #

The live migration process allows to transmit any virtual machine from one host system to another host system without any interruption in availability. It is possible to change hosts permanently or only during maintenance.

The requirements for live migration:

libvirt requirements apply.
Live migration is only possible between VM Host Servers with the same CPU features.
AHCI interface, VirtFS feature, and the -mem-path command line option are not compatible with migration.
The guest on the source and destination hosts must be started in the same way.
-snapshot qemu command line option should not be used for migration (and this qemu command line option is not supported).

Important: Support status

The postcopy mode is not yet supported in SUSE Linux Enterprise Server. It is released as a technology preview only. For more information about postcopy, see https://wiki.qemu.org/Features/PostCopyLiveMigration.

More recommendations can be found on the following Web site: https://www.linux-kvm.org/page/Migration

The live migration process has the following steps:

The virtual machine instance is running on the source host.
The virtual machine is started on the destination host in frozen listening mode. The parameters used are the same as on the source host plus the -incoming tcp:IP:PORT parameter, where IP specifies the IP address and PORT specifies the port for listening to the incoming migration. If 0 is set as the IP address, the virtual machine listens on all interfaces.
On the source host, switch to the monitor console and use the migrate -d tcp: DESTINATION_IP:PORT command to initiate the migration.
To determine the state of the migration, use the info migrate command in the monitor console on the source host.
To cancel the migration, use the migrate_cancel command in the monitor console on the source host.
To set the maximum tolerable downtime for migration in seconds, use the migrate_set_downtime NUMBER_OF_SECONDS command.
To set the maximum speed for migration in bytes per second, use the migrate_set_speed BYTES_PER_SECOND command.

3.11 QMP - QEMU machine protocol #

QMP is a JSON-based protocol that allows applications—such as libvirt—to communicate with a running QEMU instance. There are several ways you can access the QEMU monitor using QMP commands.

3.11.1 Access QMP via standard input/output #

The most flexible way to use QMP is by specifying the -mon option. The following example creates a QMP instance using standard input/output. In the following examples, -> marks lines with commands sent from the client to the running QEMU instance, while <- marks lines with the output returned from QEMU.

> sudo qemu-system-x86_64 [...] \
-chardev stdio,id=mon0 \
-mon chardev=mon0,mode=control,pretty=on

<- {
    "QMP": {
        "version": {
            "qemu": {
                "micro": 0,
                "minor": 0,
                "major": 2
            },
            "package": ""
        },
        "capabilities": [
        ]
    }
}

When a new QMP connection is established, QMP sends its greeting message and enters capabilities negotiation mode. In this mode, only the qmp_capabilities command works. To exit capabilities negotiation mode and enter command mode, the qmp_capabilities command must be issued first:

-> { "execute": "qmp_capabilities" }
<- {
    "return": {
    }
}

"return": {} is a QMP's success response.

QMP's commands can have arguments. For example, to eject a CD-ROM drive, enter the following:

->{ "execute": "eject", "arguments": { "device": "ide1-cd0" } }
<- {
    "timestamp": {
        "seconds": 1410353381,
        "microseconds": 763480
    },
    "event": "DEVICE_TRAY_MOVED",
    "data": {
        "device": "ide1-cd0",
        "tray-open": true
    }
}
{
    "return": {
    }
}

3.11.2 Access QMP via telnet #

Instead of the standard input/output, you can connect the QMP interface to a network socket and communicate with it via a specified port:

> sudo qemu-system-x86_64 [...] \
-chardev socket,id=mon0,host=localhost,port=4444,server,nowait \
-mon chardev=mon0,mode=control,pretty=on

And then run telnet to connect to port 4444:

> telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
<- {
    "QMP": {
        "version": {
            "qemu": {
                "micro": 0,
                "minor": 0,
                "major": 2
            },
            "package": ""
        },
        "capabilities": [
        ]
    }
}

You can create several monitor interfaces at the same time. The following example creates one HMP instance—human monitor which understands “normal” QEMU monitor's commands—on the standard input/output, and one QMP instance on localhost port 4444:

> sudo qemu-system-x86_64 [...] \
-chardev stdio,id=mon0 -mon chardev=mon0,mode=readline \
-chardev socket,id=mon1,host=localhost,port=4444,server,nowait \
  -mon chardev=mon1,mode=control,pretty=on

3.11.3 Access QMP via Unix socket #

Invoke QEMU using the -qmp option and create a Unix socket:

> sudo qemu-system-x86_64 [...] \
-qmp unix:/tmp/qmp-sock,server --monitor stdio

QEMU waiting for connection on: unix:./qmp-sock,server

To communicate with the QEMU instance via the /tmp/qmp-sock socket, use nc (see man 1 nc for more information) from another terminal on the same host:

> sudo nc -U /tmp/qmp-sock
<- {"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 2} [...]

3.11.4 Access QMP via `libvirt`'s `virsh` command #

If you run your virtual machines under libvirt, you can communicate with its running guests by running the virsh qemu-monitor-command:

> sudo virsh qemu-monitor-command vm_guest1 \
--pretty '{"execute":"query-kvm"}'
<- {
    "return": {
        "enabled": true,
        "present": true
    },
    "id": "libvirt-8"
}

In the above example, we ran the simple command query-kvm, which checks if the host is capable of running KVM and if KVM is enabled.

Tip: Generating human-readable output

To use the standard human-readable output format of QEMU instead of the JSON format, use the --hmp option:

> sudo virsh qemu-monitor-command vm_guest1 --hmp "query-kvm"

4 Running virtual machines with `qemu-system` #

Revision History: Managing KVM Virtualization on SUSE Linux Enterprise Server

Once you have a virtual disk image ready (for more information on disk images, see Section 2.2, “Managing disk images with qemu-img”), you can start the related virtual machine. Section 2.1, “Basic installation with qemu-system-ARCH” introduced simple commands to install and run a VM Guest. This article focuses on a more detailed explanation of qemu-system-ARCH usage and shows solutions for more specific tasks. For a complete list of qemu-system-ARCH's options, see its man page (man 1 qemu).

4.1 Basic `qemu-system-ARCH` invocation #

The qemu-system-ARCH command uses the following syntax:

qemu-system-ARCH OPTIONS1 -drive file=DISK_IMAGE2

1	`qemu-system-ARCH` understands many options. Most of them define parameters of the emulated hardware, while others affect more general emulator behavior. If you do not supply any options, default values are used, and you need to supply the path to a disk image to be run.
2	Path to the disk image holding the guest system you want to virtualize. `qemu-system-ARCH` supports many image formats. Use `qemu-img` `--help` to list them.

Important: AArch64 architecture

KVM support is available only for 64-bit Arm® architecture (AArch64). Running QEMU on the AArch64 architecture requires you to specify:

A machine type designed for QEMU Arm® virtual machines using the -machine virt-VERSION_NUMBER option.

A firmware image file using the -bios option.

You can specify the firmware image files alternatively using the -drive options, for example:

-drive file=/usr/share/edk2/aarch64/QEMU_EFI-pflash.raw,if=pflash,format=raw
-drive file=/var/lib/libvirt/qemu/nvram/opensuse_VARS.fd,if=pflash,format=raw

A CPU of the VM Host Server using the -cpu host option (default is cortex-15).
The same Generic Interrupt Controller (GIC) version as the host using the -machine gic-version=host option (default is 2).
If a graphic mode is needed, a graphic device of type virtio-gpu-pci.

For example:

> sudo qemu-system-aarch64 [...] \
 -bios /usr/share/qemu/qemu-uefi-aarch64.bin \
 -cpu host \
 -device virtio-gpu-pci \
 -machine virt,accel=kvm,gic-version=host

4.2 General `qemu-system-ARCH` options #

This section introduces general qemu-system-ARCH options and options related to the basic emulated hardware, such as the virtual machine's processor, memory, model type, or time processing methods.

-name NAME_OF_GUEST

Specifies the name of the running guest system. The name is displayed in the window caption and used for the VNC server.

-boot OPTIONS

Specifies the order in which the defined drives are booted. Drives are represented by letters, where a and b stand for the floppy drives 1 and 2, c stands for the first hard disk, d stands for the first CD-ROM drive, and n to p stand for Ether-boot network adapters.

For example, qemu-system-ARCH [...] -boot order=ndc first tries to boot from the network, then from the first CD-ROM drive, and finally from the first hard disk.

-pidfile FILENAME

Stores QEMU's process identification number (PID) in a file. This is useful if you run QEMU from a script.

-nodefaults

By default, QEMU creates basic virtual devices even if you do not specify them on the command line. This option turns this feature off, and you must specify every single device manually, including graphical and network cards, parallel or serial ports, or virtual consoles. Even QEMU monitor is not attached by default.

-daemonize

“Daemonizes” the QEMU process after it is started. QEMU detaches from the standard input and standard output after it is ready to receive connections on any of its devices.

Note: SeaBIOS BIOS implementation

SeaBIOS is the default BIOS used. You can boot USB devices, any drive (CD-ROM, Floppy or a hard disk). It has USB mouse and keyboard support and supports multiple VGA cards. For more information about SeaBIOS, refer to the SeaBIOS Web site.

4.2.1 Basic virtual hardware #

4.2.1.1 Machine type #

You can specify the type of the emulated machine. Run qemu-system-ARCH -M help to view a list of supported machine types.

Note: ISA-PC

The machine type isapc: ISA-only-PC is unsupported.

4.2.1.2 CPU model #

To specify the type of the processor (CPU) model, run qemu-system-ARCH -cpu MODEL. Use qemu-system-ARCH -cpu help to view a list of supported CPU models.

4.2.1.3 Other basic options #

The following is a list of the most commonly used options while launching qemu from the command line. To see all options available, refer to the qemu-doc man page.

-m MEGABYTES: Specifies how many megabytes are used for the virtual RAM size.
-balloon virtio: Specifies a paravirtualized device to dynamically change the amount of virtual RAM assigned to the VM Guest. The upper limit is the amount of memory specified with -m.
-smp NUMBER_OF_CPUS: Specifies how many CPUs to emulate. QEMU supports up to 255 CPUs on the PC platform (up to 64 with KVM acceleration used). This option also takes other CPU-related parameters, such as number of sockets, the number of sockets, the number of cores per socket, or the number of threads per core.

The following is an example of a working qemu-system-ARCH command line:

> sudo qemu-system-x86_64 \
 -name "SLES 16.0" \
 -M pc-i440fx-2.7 -m 512 \
 -machine accel=kvm -cpu kvm64 -smp 2 \
 -drive format=raw,file=/images/sles.raw

Figure 1: QEMU window with SLES as VM Guest #

-no-acpi: Disables ACPI support.
-S: QEMU starts with the CPU stopped. To start the CPU, enter c in QEMU monitor. For more information, see Section 3, “Virtual machine administration using QEMU monitor”.

4.2.2 Storing and reading the configuration of virtual devices #

-readconfig CFG_FILE

Instead of entering the device configuration options on the command line each time you want to run VM Guest, qemu-system-ARCH can read them from a file that was either previously saved with -writeconfig or edited manually.

-writeconfig CFG_FILE

Dumps the current virtual machine's device configuration to a text file. It can consequently be reused with the -readconfig option.

> sudo qemu-system-x86_64 -name "SLES 16.0" \
 -machine accel=kvm -M pc-i440fx-2.7 -m 512 -cpu kvm64 \
 -smp 2 /images/sles.raw -writeconfig /images/sles.cfg
(exited)
> cat /images/sles.cfg
# qemu config file

[drive]
  index = "0"
  media = "disk"
  file = "/images/sles_base.raw"

This way, you can effectively manage the configuration of your virtual machines' devices in a well-arranged manner.

4.2.3 Guest real-time clock #

-rtc OPTIONS

Specifies the way the RTC is handled inside a VM Guest. By default, the clock of the guest is derived from that of the host system. Therefore, we recommend that the host system clock be synchronized with an accurate external clock, for example, via an NTP service.

If you need to isolate the VM Guest clock from the host one, specify clock=vm instead of the default clock=host.

You can also specify the initial time of the VM Guest's clock with the base option:

> sudo qemu-system-x86_64 [...] -rtc clock=vm,base=2010-12-03T01:02:00

Instead of a time stamp, you can specify utc or localtime. The former instructs VM Guest to start at the current UTC value (Coordinated Universal Time, see https://en.wikipedia.org/wiki/UTC), while the latter applies the local time setting.

4.3 Using devices in QEMU #

QEMU virtual machines emulate all devices needed to run a VM Guest. QEMU supports, for example, several types of network cards, block devices (hard and removable drives), USB devices, character devices (serial and parallel ports), or multimedia devices (graphic and sound cards). This section introduces options for configuring multiple types of supported devices.

Tip

If your device, such as -drive, needs a special driver and driver properties to be set, specify them with the -device option, and identify with drive= suboption. For example:

> sudo qemu-system-x86_64 [...] -drive if=none,id=drive0,format=raw \
-device virtio-blk-pci,drive=drive0,scsi=off ...

To get help on available drivers and their properties, use -device ? and -device DRIVER,?.

4.3.1 Block devices #

Block devices are vital for virtual machines. These are fixed or removable storage media called drives. One of the connected hard disks typically holds the guest operating system to be virtualized.

Virtual Machine drives are defined with -drive. This option has many suboptions, some of which are described in this section. For the complete list, see the man page (man 1 qemu).

Suboptions for the -drive option #

file=image_fname: Specifies the path to the disk image that must be used with this drive. If not specified, an empty (removable) drive is assumed.
if=drive_interface: Specifies the type of interface to which the drive is connected. Currently, only floppy, scsi, ide, or virtio are supported by SUSE. virtio defines a paravirtualized disk driver. Default is ide.
index=index_of_connector: Specifies the index number of a connector on the disk interface (see the if option) where the drive is connected. If not specified, the index is automatically incremented.
media=type: Specifies the type of media. Can be disk for hard disks, or cdrom for removable CD-ROM drives.
format=img_fmt: Specifies the format of the connected disk image. If not specified, the format is autodetected. Currently, SUSE supports raw and qcow2 formats.
cache=method: Specifies the caching method for the drive. Possible values are unsafe, writethrough, writeback, directsync, or none. To improve performance when using the qcow2 image format, select writeback. none disables the host page cache and, therefore, is the safest option. The default for image files is writeback.

Tip

To simplify defining block devices, QEMU understands several shortcuts which you may find handy when entering the qemu-system-ARCH command line.

You can use

> sudo qemu-system-x86_64 -cdrom /images/cdrom.iso

instead of

> sudo qemu-system-x86_64 -drive format=raw,file=/images/cdrom.iso,index=2,media=cdrom

and

> sudo qemu-system-x86_64 -hda /images/imagei1.raw -hdb /images/image2.raw -hdc \
/images/image3.raw -hdd /images/image4.raw

instead of

> sudo qemu-system-x86_64 -drive format=raw,file=/images/image1.raw,index=0,media=disk \
-drive format=raw,file=/images/image2.raw,index=1,media=disk \
-drive format=raw,file=/images/image3.raw,index=2,media=disk \
-drive format=raw,file=/images/image4.raw,index=3,media=disk

Tip: Using host drives instead of images

As an alternative to using disk images (see Section 2.2, “Managing disk images with qemu-img”) you can also use existing VM Host Server disks, connect them as drives, and access them from VM Guest. Use the host disk device directly instead of disk image file names.

To access the host CD-ROM drive, use

> sudo qemu-system-x86_64 [...] -drive file=/dev/cdrom,media=cdrom

To access the host hard disk, use

> sudo qemu-system-x86_64 [...] -drive file=/dev/hdb,media=disk

A host drive used by a VM Guest must not be accessed concurrently by the VM Host Server or another VM Guest.

4.3.1.1 Freeing unused guest disk space #

A Sparse image file is a type of disk image file that grows in size as the user adds data to it, taking up only as much disk space as is stored in it. For example, if you copy 1 GB of data inside the sparse disk image, its size grows by 1 GB. If you then delete, for example, 500 MB of the data, the image size does not by default decrease as expected.

This is why the discard=on option is introduced on the KVM command line. It tells the hypervisor to automatically free the “holes” after deleting data from the sparse guest image. This option is valid only for the if=scsi drive interface:

> sudo qemu-system-x86_64 [...] -drive format=img_format,file=/path/to/file.img,if=scsi,discard=on

Important: Support status

if=scsi is not supported. This interface does not map to virtio-scsi, but rather to the lsi SCSI adapter.

4.3.1.2 IOThreads #

IOThreads are dedicated event loop threads for virtio devices to perform I/O requests to improve scalability, especially on an SMP VM Host Server with SMP VM Guests using many disk devices. Instead of using QEMU's main event loop for I/O processing, IOThreads allow spreading I/O work across multiple CPUs and can improve latency when properly configured.

IOThreads are enabled by defining IOThread objects. virtio devices can then use the objects for their I/O event loops. Many virtio devices can use a single IOThread object, or virtio devices and IOThread objects can be configured in a 1:1 mapping. The following example creates a single IOThread with ID iothread0, which is then used as the event loop for two virtio-blk devices.

> sudo qemu-system-x86_64 [...] -object iothread,id=iothread0\
-drive if=none,id=drive0,cache=none,aio=native,\
format=raw,file=filename -device virtio-blk-pci,drive=drive0,scsi=off,\
iothread=iothread0 -drive if=none,id=drive1,cache=none,aio=native,\
format=raw,file=filename -device virtio-blk-pci,drive=drive1,scsi=off,\
iothread=iothread0 [...]

The following qemu command line example illustrates a 1:1 virtio device to IOThread mapping:

> sudo qemu-system-x86_64 [...] -object iothread,id=iothread0\
-object iothread,id=iothread1 -drive if=none,id=drive0,cache=none,aio=native,\
format=raw,file=filename -device virtio-blk-pci,drive=drive0,scsi=off,\
iothread=iothread0 -drive if=none,id=drive1,cache=none,aio=native,\
format=raw,file=filename -device virtio-blk-pci,drive=drive1,scsi=off,\
    iothread=iothread1 [...]

4.3.1.3 Bio-based I/O path for virtio-blk #

For better performance of I/O-intensive applications, a new I/O path was introduced for the virtio-blk interface in kernel version 3.7. This bio-based block device driver skips the I/O scheduler, and thus shortens the I/O path in the guest and has lower latency. It is especially useful for high-speed storage devices, such as SSD disks.

The driver is disabled by default. To use it, do the following:

Append virtio_blk.use_bio=1 to the kernel command line on the guest. You can do so via YaST › System › Boot Loader.
You can also do it by editing /etc/default/grub, searching for the line that contains GRUB_CMDLINE_LINUX_DEFAULT=, and adding the kernel parameter at the end. Then run grub2-mkconfig >/boot/grub2/grub.cfg to update the grub2 boot menu.
Reboot the guest with the new kernel command line active.

Tip: Bio-based driver on slow devices

The bio-based virtio-blk driver does not help on slow devices such as spin hard disks. The reason is that the benefit of scheduling is larger than what the shortened bio path offers. Do not use the bio-based driver on slow devices.

4.3.1.4 Accessing iSCSI resources directly #

QEMU now integrates with libiscsi. This allows QEMU to access iSCSI resources directly and use them as virtual machine block devices. This feature does not require any host iSCSI initiator configuration, as is needed for a libvirt iSCSI target-based storage pool setup. Instead, it directly connects guest storage interfaces to an iSCSI target LUN via the user space library libiscsi. iSCSI-based disk devices can also be specified in the libvirt XML configuration.

Note: RAW image format

This feature is only available using the RAW image format, as the iSCSI protocol has certain technical limitations.

The following is the QEMU command-line interface for iSCSI connectivity.

Note: virt-manager limitation

The use of libiscsi-based storage provisioning is not yet exposed by the virt-manager interface, but instead it would be configured by directly editing the guest XML. This new way of accessing iSCSI based storage is to be done at the command line.

> sudo qemu-system-x86_64 -machine accel=kvm \
  -drive file=iscsi://192.168.100.1:3260/iqn.2016-08.com.example:314605ab-a88e-49af-b4eb-664808a3443b/0,\
  format=raw,if=none,id=mydrive,cache=none \
  -device ide-hd,bus=ide.0,unit=0,drive=mydrive ...

Here is an example snippet of guest domain XML which uses the protocol-based iSCSI:

<devices>
...
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/2'>
      <host name='example.com' port='3260'/>
    </source>
    <auth username='myuser'>
      <secret type='iscsi' usage='libvirtiscsi'/>
    </auth>
    <target dev='vda' bus='virtio'/>
  </disk>
</devices>

Contrast that with an example which uses the host-based iSCSI initiator which virt-manager sets up:

<devices>
...
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source dev='/dev/disk/by-path/scsi-0:0:0:0'/>
    <target dev='hda' bus='ide'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>
  <controller type='ide' index='0'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01'
             function='0x1'/>
  </controller>
</devices>

4.3.1.5 Using RADOS block devices with QEMU #

RADOS Block Devices (RBD) store data in a Ceph cluster. They allow snapshotting, replication and data consistency. You can use an RBD from your KVM-managed VM Guests similarly to how you use other block devices.

For more details, refer to the SUSE Enterprise Storage Administration Guide, chapter Ceph as a Back-end for QEMU KVM Instance.

4.3.2 Graphic devices and display options #

This section describes QEMU options affecting the type of the emulated video card and the way the VM Guest graphical output is displayed.

4.3.2.1 Defining video cards #

QEMU uses -vga to define a video card used to display the VM Guest's graphical output. The -vga option understands the following values:

none: Disables video cards on VM Guest (no video card is emulated). You can still access the running VM Guest via the serial console.
std: Emulates a standard VESA 2.0 VBE video card. Use it if you intend to use a high display resolution on the VM Guest.
qxl: QXL is a paravirtual graphic card. It is VGA compatible (including VESA 2.0 VBE support). qxl is recommended when using the spice video protocol.
virtio: Paravirtual VGA graphic card.

4.3.2.2 Display options #

The following options affect the way the VM Guest's graphical output is displayed.

-display gtk

Display video output in a GTK window. This interface provides UI elements to configure and control the VM during runtime.

-display sdl

Display video output via SDL in a separate graphics window. For more information, see the SDL documentation.

-spice option[,option[,...]]

Enables the Spice remote desktop protocol.

-display vnc

Refer to Section 4.5, “Viewing a VM Guest with VNC” for more information.

-nographic

Disables QEMU's graphical output. The emulated serial port is redirected to the console.

After starting the virtual machine with -nographic, press Ctrl–A H in the virtual console to view the list of other useful shortcuts, for example, to toggle between the console and the QEMU monitor.

> sudo qemu-system-x86_64 -hda /images/sles_base.raw -nographic

C-a h    print this help
C-a x    exit emulator
C-a s    save disk data back to file (if -snapshot)
C-a t    toggle console timestamps
C-a b    send break (magic sysrq)
C-a c    switch between console and monitor
C-a C-a  sends C-a
(pressed C-a c)

QEMU 2.3.1 monitor - type 'help' for more information
(qemu)

-no-frame

Disables decorations for the QEMU window. Convenient for a dedicated desktop workspace.

-full-screen

Starts QEMU graphical output in full-screen mode.

-no-quit

Disables the close button of the QEMU window and prevents it from being closed by force.

-alt-grab, -ctrl-grab

By default, the QEMU window releases the “captured” mouse after pressing Ctrl–Alt. You can change the key combination to either Ctrl–Alt–Shift (-alt-grab), or the right Ctrl key (-ctrl-grab).

4.3.3 USB devices #

There are two ways to create USB devices usable by the VM Guest in KVM: you can either emulate new USB devices inside a VM Guest, or assign an existing host USB device to a VM Guest. To use USB devices in QEMU you first need to enable the generic USB driver with the -usb option. Then you can specify individual devices with the -usbdevice option.

4.3.3.1 Emulating USB devices in VM Guest #

SUSE currently supports the following types of USB devices: disk, host, serial, braille, net, mouse, and tablet.

Types of USB devices for the -usbdevice option #

disk

Emulates a mass storage device based on a file. The optional format option is used rather than detecting the format.

> sudo qemu-system-x86_64 [...] -usbdevice
        disk:format=raw:/virt/usb_disk.raw

host

Pass through the host device (identified by bus.addr).

serial

Serial converter to a host character device.

braille

Emulates a braille device using BrlAPI to display the braille output.

net

Emulates a network adapter that supports CDC Ethernet and RNDIS protocols.

mouse

Emulates a virtual USB mouse. This option overrides the default PS/2 mouse emulation. The following example shows the hardware status of a mouse on a VM Guest started with qemu-system-ARCH [...] -usbdevice mouse:

> sudo hwinfo --mouse
20: USB 00.0: 10503 USB Mouse
[Created at usb.122]
UDI: /org/freedesktop/Hal/devices/usb_device_627_1_1_if0
[...]
Hardware Class: mouse
Model: "Adomax QEMU USB Mouse"
Hotplug: USB
Vendor: usb 0x0627 "Adomax Technology Co., Ltd"
Device: usb 0x0001 "QEMU USB Mouse"
[...]

tablet

Emulates a pointer device that uses absolute coordinates (such as a touchscreen). This option overrides the default PS/2 mouse emulation. The tablet device is useful if you are viewing a VM Guest via the VNC protocol. See Section 4.5, “Viewing a VM Guest with VNC” for more information.

4.3.4 Character devices #

Use -chardev to create a new character device. The option uses the following general syntax:

qemu-system-x86_64 [...] -chardev BACKEND_TYPE,id=ID_STRING

where BACKEND_TYPE can be one of null, socket, udp, msmouse, vc, file, pipe, console, serial, pty, stdio, braille, tty, or parport. All character devices must have a unique identification string up to 127 characters long. It is used to identify the device in other related directives. For the complete description of all back-end's suboptions, see the man page (man 1 qemu). A brief description of the available back-ends follows:

null

Creates an empty device that outputs no data and drops any data it receives.

stdio

Connects to QEMU's process standard input and standard output.

socket

Creates a two-way stream socket. If PATH is specified, a Unix socket is created:

> sudo qemu-system-x86_64 [...] -chardev \
socket,id=unix_socket1,path=/tmp/unix_socket1,server

The SERVER suboption specifies that the socket is a listening socket.

If PORT is specified, a TCP socket is created:

> sudo qemu-system-x86_64 [...] -chardev \
socket,id=tcp_socket1,host=localhost,port=7777,server,nowait

The command creates a local listening (server) TCP socket on port 7777. QEMU does not block waiting for a client to connect to the listening port (nowait).

udp

Sends all network traffic from the VM Guest to a remote host over the UDP protocol.

> sudo qemu-system-x86_64 [...] \
-chardev udp,id=udp_fwd,host=mercury.example.com,port=7777

The command binds port 7777 on the remote host mercury.example.com and sends VM Guest network traffic there.

vc

Creates a new QEMU text console. You can optionally specify the dimensions of the virtual console:

> sudo qemu-system-x86_64 [...] -chardev vc,id=vc1,width=640,height=480 \
-mon chardev=vc1

The command creates a new virtual console called vc1 of the specified size, and connects the QEMU monitor to it.

file

Logs all traffic from the VM Guest to a file on the VM Host Server. The path is required and is automatically created if it does not exist.

> sudo qemu-system-x86_64 [...] \
-chardev file,id=qemu_log1,path=/var/log/qemu/guest1.log

By default, QEMU creates a set of character devices for serial and parallel ports, and a special console for QEMU monitor. However, you can create your own character devices and use them for the mentioned purposes. The following options may help you:

-serial CHAR_DEV

Redirects the VM Guest's virtual serial port to a character device CHAR_DEV on VM Host Server. By default, it is a virtual console (vc) in graphical mode, and stdio in non-graphical mode. The -serial understands many suboptions. See the man page man 1 qemu for a complete list of them.

You can emulate up to four serial ports. Use -serial none to disable all serial ports.

-parallel DEVICE

Redirects the VM Guest's parallel port to a DEVICE. This option supports the same devices as -serial.

Tip

With SUSE Linux Enterprise Server as a VM Host Server, you can directly use the hardware parallel port devices /dev/parportN where N is the number of the port.

You can emulate up to three parallel ports. Use -parallel none to disable all parallel ports.

-monitor CHAR_DEV

Redirects the QEMU monitor to a character device CHAR_DEV on the VM Host Server. This option supports the same devices as -serial. By default, it is a virtual console (vc) in graphical mode, and stdio in non-graphical mode.

For a complete list of available character device back-ends, see the man page (man 1 qemu).

4.4 Networking in QEMU #

Use the -netdev option in combination with -device to define a specific type of networking and a network interface card for your VM Guest. The syntax for the -netdev option is

-netdev type[,prop[=value][,...]]

Currently, SUSE supports the following network types: user, bridge, and tap. For a complete list of -netdev suboptions, see the man page (man 1 qemu).

Supported -netdev suboptions #

bridge: Uses a specified network helper to configure the TAP interface and attach it to a specified bridge. For more information, see Section 4.4.3, “Bridged networking”.
user: Specifies user-mode networking. For more information, see Section 4.4.2, “User-mode networking”.
tap: Specifies bridged or routed networking. For more information, see Section 4.4.3, “Bridged networking”.

4.4.1 Defining a network interface card #

Use -netdev together with the related -device option to add a new emulated network card:

> sudo qemu-system-x86_64 [...] \
-netdev tap1,id=hostnet0 \
-device virtio-net-pci2,netdev=hostnet0,vlan=13,\
macaddr=00:16:35:AF:94:4B4,name=ncard1

1	Specifies the network device type.
2	Specifies the model of the network card. Use `qemu-system-ARCH -device help` and search for the `Network devices:` section to get the list of all network card models supported by QEMU on your platform. Currently, SUSE supports the models `rtl8139`, `e1000` and its variants `e1000-82540em`, `e1000-82544gc` and `e1000-82545em`, and `virtio-net-pci`. To view a list of options for a specific driver, add `help` as a driver option: `>` `sudo` qemu-system-x86_64 -device e1000,help e1000.mac=macaddr e1000.vlan=vlan e1000.netdev=netdev e1000.bootindex=int32 e1000.autonegotiation=on/off e1000.mitigation=on/off e1000.addr=pci-devfn e1000.romfile=str e1000.rombar=uint32 e1000.multifunction=on/off e1000.command_serr_enable=on/off
3	Connects the network interface to VLAN number 1. You can specify your own number—it is mainly useful for identification purposes. If you omit this suboption, QEMU uses the default 0.
4	Specifies the Media Access Control (MAC) address for the network card. It is a unique identifier, and you are advised to always specify it. If not, QEMU supplies its own default MAC address and creates a possible MAC address conflict within the related VLAN.

4.4.2 User-mode networking #

The -netdev user option instructs QEMU to use user-mode networking. This is the default if no networking mode is selected. Therefore, these command lines are equivalent:

> sudo qemu-system-x86_64 -hda /images/sles_base.raw

> sudo qemu-system-x86_64 -hda /images/sles_base.raw -netdev user,id=hostnet0

This mode is useful for allowing the VM Guest to access the external network resources, such as the Internet. By default, no incoming traffic is permitted and therefore, the VM Guest is not visible to other machines on the network. No administrator privileges are required in this networking mode. The user-mode is also useful for doing a network boot on your VM Guest from a local directory on the VM Host Server.

The VM Guest allocates an IP address from a virtual DHCP server. VM Host Server (the DHCP server) is reachable at 10.0.2.2, while the IP address range for allocation starts from 10.0.2.15. You can use ssh to connect to VM Host Server at 10.0.2.2, and scp to copy files back and forth.

4.4.2.1 Command-line examples #

This section shows several examples of how to set up user-mode networking with QEMU.

Example 2: Restricted user-mode networking #

> sudo qemu-system-x86_64 [...] \
-netdev user1,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,vlan=12,name=user_net13,restrict=yes4

1	Specifies user-mode networking.
2	Connects to VLAN number 1. If omitted, defaults to 0.
3	Specifies a human-readable name of the network stack. Useful when identifying it in the QEMU monitor.
4	Isolates VM Guest. It then cannot communicate with the VM Host Server and no network packets are routed to the external network.

Example 3: User-mode networking with a custom IP range #

> sudo qemu-system-x86_64 [...] \
-netdev user,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,net=10.2.0.0/81,host=10.2.0.62,\
dhcpstart=10.2.0.203,hostname=tux_kvm_guest4

1	Specifies the IP address of the network that the VM Guest sees and optionally the netmask. Default is 10.0.2.0/8.
2	Specifies the VM Host Server IP address that the VM Guest sees. Default is 10.0.2.2.
3	Specifies the first of the 16 IP addresses that the built-in DHCP server can assign to a VM Guest. Default is 10.0.2.15.
4	Specifies the host name that the built-in DHCP server assigns to the VM Guest.

Example 4: User-mode networking with network-boot and TFTP #

> sudo qemu-system-x86_64 [...] \
-netdev user,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,tftp=/images/tftp_dir1,\
bootfile=/images/boot/pxelinux.02

1	Activates a built-in TFTP (a file transfer protocol with the functionality of a basic FTP) server. The files in the specified directory are visible to a VM Guest as the root of a TFTP server.
2	Broadcasts the specified file as a BOOTP (a network protocol that offers an IP address and the network location of a boot image, often used in diskless workstations) file. When used together with `tftp`, the VM Guest can boot via the network from the local directory on the host.

Example 5: User-mode networking with host port forwarding #

> sudo qemu-system-x86_64 [...] \
-netdev user,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,hostfwd=tcp::2222-:22

Forwards incoming TCP connections to the port 2222 on the host to the port 22 (SSH) on the VM Guest. If sshd is running on the VM Guest, enter

> ssh qemu_host -p 2222

where qemu_host is the host name or IP address of the host system, to get an SSH prompt from VM Guest.

4.4.3 Bridged networking #

With the -netdev tap option, QEMU creates a network bridge by connecting the host TAP network device to a specified VLAN of the VM Guest. Its network interface is then visible to the rest of the network. This method does not work by default and needs to be explicitly specified.

First, create a network bridge and add a VM Host Server physical network interface to it, such as eth0:

Start YaST Control Center and select System › Network Settings.
Click Add and select Bridge from the Device Type drop-down box in the Hardware Dialog window. Click Next.
Choose whether you need a dynamically or statically assigned IP address and fill in the related network settings, if applicable.
In the Bridged Devices pane, select the Ethernet device to add to the bridge.
Click Next. When asked about adapting an already configured device, click Continue.

Click OK to apply the changes. Check if the bridge has been created:

> bridge link
2: eth0 state UP : <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 \
 state forwarding priority 32 cost 100

4.4.3.1 Connecting to a bridge manually #

Use the following example script to connect the VM Guest to the newly created bridge interface br0. Several commands in the script are run via the sudo mechanism because they require root privileges.

Tip: Required software

To manage a network bridge, you need to have the tunctl package installed.

#!/bin/bash
bridge=br01
tap=$(sudo tunctl -u $(whoami) -b)2
sudo ip link set $tap up3
sleep 1s4
sudo ip link add name $bridge type bridge
sudo ip link set $bridge up
sudo ip link set $tap master $bridge5
qemu-system-x86_64 -machine accel=kvm -m 512 -hda /images/sles_base.raw \
 -netdev tap,id=hostnet0 \
 -device virtio-net-pci,netdev=hostnet0,vlan=0,macaddr=00:16:35:AF:94:4B,\
 ifname=$tap6,script=no7,downscript=no
sudo ip link set $tap nomaster8
sudo ip link set $tap down9
sudo tunctl -d $tap10

1	Name of the bridge device.
2	Prepare a new TAP device and assign it to the user who runs the script. TAP devices are virtual network devices often used for virtualization and emulation setups.
3	Bring up the newly created TAP network interface.
4	Make a 1-second pause to make sure the new TAP network interface is really up.
5	Add the new `TAP` device to the network bridge `br0`.
6	The `ifname=` suboption specifies the name of the TAP network interface used for bridging.
7	Before `qemu-system-ARCH` connects to a network bridge, it checks the `script` and `downscript` values. If it finds the specified scripts on the VM Host Server file system, it runs the `script` before it connects to the network bridge and `downscript` after it exits the network environment. You can use these scripts to set up and tear down the bridged interfaces. By default, `/etc/qemu-ifup` and `/etc/qemu-ifdown` are examined. If `script=no` and `downscript=no` are specified, the script execution is disabled and you need to take care of it manually.
8	Deletes the TAP interface from the network bridge `br0`.
9	Sets the state of the TAP device to `down`.
10	Tear down the TAP device.

4.4.3.2 Connecting to a bridge with qemu-bridge-helper #

Another way to connect a VM Guest to a network through a network bridge is via the qemu-bridge-helper helper program. It configures the TAP interface for you and attaches it to the specified bridge. The default helper executable is /usr/lib/qemu-bridge-helper. The helper executable is setuid root, which is only executable by members of the virtualization group (kvm). Therefore, the qemu-system-ARCH command itself does not need to be run under root privileges.

The helper is automatically called when you specify a network bridge:

qemu-system-x86_64 [...] \
 -netdev bridge,id=hostnet0,vlan=0,br=br0 \
 -device virtio-net-pci,netdev=hostnet0

You can specify your own custom helper script that takes care of the TAP device (de)configuration, with the helper=/path/to/your/helper option:

qemu-system-x86_64 [...] \
 -netdev bridge,id=hostnet0,vlan=0,br=br0,helper=/path/to/bridge-helper \
 -device virtio-net-pci,netdev=hostnet0

Tip

To define access privileges to qemu-bridge-helper, inspect the /etc/qemu/bridge.conf file. For example, the following directive

allow br0

allows the qemu-system-ARCH command to connect its VM Guest to the network bridge br0.

4.5 Viewing a VM Guest with VNC #

By default QEMU uses a GTK (a cross-platform toolkit library) window to display the graphical output of a VM Guest. With the -vnc option specified, you can make QEMU listen on a specified VNC display and redirect its graphical output to the VNC session.

Tip

When working with QEMU's virtual machine via a VNC session, it is useful to work with the -usbdevice tablet option.

Moreover, if you need to use a keyboard layout other than the default en-us, specify it with the -k option.

The first suboption of -vnc must be a display value. The -vnc option understands the following display specifications:

host:display: Only connections from host on the display number display are accepted. The TCP port on which the VNC session is then running is normally a 5900 + display number. If you do not specify a host, connections are accepted from any host.
unix:path: The VNC server listens for connections on Unix domain sockets. The path option specifies the location of the related Unix socket.
none: The VNC server functionality is initialized, but the server itself is not started. You can start the VNC server later with the QEMU monitor. For more information, see Section 3, “Virtual machine administration using QEMU monitor”.

Following the display value, there may be one or more option flags separated by commas. Valid options are:

reverse: Connect to a listening VNC client via a reverse connection.
websocket: Opens an additional TCP listening port dedicated to VNC WebSocket connections. By definition the WebSocket port is 5700+display.
password: Require that password-based authentication is used for client connections.
tls: Require that clients use TLS when communicating with the VNC server.
x509=/path/to/certificate/dir: Valid if TLS is specified. Require that x509 credentials are used for negotiating the TLS session.
x509verify=/path/to/certificate/dir: Valid if TLS is specified. Require that x509 credentials are used for negotiating the TLS session.
sasl: Require that the client uses SASL to authenticate with the VNC server.
acl: Turn on access control lists for checking of the x509 client certificate and SASL party.
lossy: Enable lossy compression methods (gradient, JPEG, ...).
non-adaptive: Disable adaptive encodings. Adaptive encodings are enabled by default.
share=[allow-exclusive|force-shared|ignore]: Set the display sharing policy.

Note

For more details about the display options, see the qemu-doc man page.

An example of VNC usage:

> sudo qemu-system-x86_64 [...] -vnc :5
# (on the client:)
> vncviewer venus:5 &

Figure 2: QEMU VNC session #

4.5.1 Secure VNC connections #

The default VNC server setup does not use any form of authentication. In the previous example, any user can connect and view the QEMU VNC session from any host on the network.

There are several levels of security that you can apply to your VNC client/server connection. You can either protect your connection with a password, use x509 certificates, use SASL authentication, or even combine several authentication methods in one QEMU command.

For more information about configuring x509 certificates on a VM Host Server and the client.

The Remmina VNC viewer supports advanced authentication mechanisms. For this example, let us assume that the server x509 certificates ca-cert.pem, server-cert.pem, and server-key.pem are located in the /etc/pki/qemu directory on the host. The client certificates can be placed in any custom directory, as Remmina asks for their path on connection start-up.

Example 6: Password authentication #

qemu-system-x86_64 [...] -vnc :5,password -monitor stdio

Starts the VM Guest graphical output on VNC display number 5, which corresponds to port 5905. The password suboption initializes a simple password-based authentication method. There is no password set by default and you need to set one with the change vnc password command in QEMU monitor:

QEMU 2.3.1 monitor - type 'help' for more information
(qemu) change vnc password
Password: ****

You need the -monitor stdio option here, because you would not be able to manage the QEMU monitor without redirecting its input/output.

Figure 3: Authentication dialog in Remmina #

Example 7: x509 certificate authentication #

The QEMU VNC server can use TLS encryption for the session and x509 certificates for authentication. The server asks the client for a certificate and validates it against the CA certificate. Use this authentication type if your company provides an internal certificate authority.

qemu-system-x86_64 [...] -vnc :5,tls,x509verify=/etc/pki/qemu

Example 8: x509 certificate and password authentication #

You can combine password authentication with TLS encryption and x509 certificate authentication to create a two-layer authentication model for clients. Remember to set the password in the QEMU monitor after you run the following command:

qemu-system-x86_64 [...] -vnc :5,password,tls,x509verify=/etc/pki/qemu \
-monitor stdio

Example 9: SASL authentication #

Simple Authentication and Security Layer (SASL) is a framework for authentication and data security in Internet protocols. It integrates several authentication mechanisms, like PAM, Kerberos, LDAP and more. SASL keeps its own user database, so the connecting user accounts do not need to exist on VM Host Server.

For security reasons, you are advised to combine SASL authentication with TLS encryption and x509 certificates:

qemu-system-x86_64 [...] -vnc :5,tls,x509,sasl -monitor stdio

5 I/O virtualization #

VM Guests not only share CPU and memory resources of the host system, but also the I/O subsystem. Because software I/O virtualization techniques deliver less performance than bare metal, hardware solutions that deliver almost “native” performance have been developed recently. SUSE Linux Enterprise Server supports the following I/O virtualization techniques:

Full virtualization

Fully Virtualized (FV) drivers emulate widely supported real devices, which can be used with an existing driver in the VM Guest. The guest is also called Hardware Virtual Machine (HVM). Since the physical device on the VM Host Server may differ from the emulated one, the hypervisor needs to process all I/O operations before handing them over to the physical device. Therefore all I/O operations need to traverse two software layers, a process that not only significantly impacts I/O performance, but also consumes CPU time.

Paravirtualization

Paravirtualization (PV) allows direct communication between the hypervisor and the VM Guest. With less overhead involved, performance is much better than with full virtualization. However, paravirtualization requires either the guest operating system to be modified to support the paravirtualization API, or availability of paravirtualized drivers. For a list of guest operating systems supporting paravirtualization, refer to the section Availability of paravirtualized drivers in the article Virtualization Limits and Support.

PVHVM

This type of virtualization enhances HVM (see Full virtualization) with paravirtualized (PV) drivers, and PV interrupt and timer handling.

VFIO

VFIO stands for Virtual Function I/O and is a new user-level driver framework for Linux. It replaces the traditional KVM PCI Pass-Through device assignment. The VFIO driver exposes direct device access to user space in a secure memory Input/Output Memory Management Unit (IOMMU) protected environment. With VFIO, a VM Guest can directly access hardware devices on the VM Host Server (pass-through), avoiding performance issues caused by emulation in performance critical paths. This method does not allow to share devices—each device can only be assigned to a single VM Guest. VFIO needs to be supported by the VM Host Server CPU, chipset and the BIOS/EFI.

Compared to the legacy KVM PCI device assignment, VFIO has the following advantages:

Resource access is compatible with UEFI Secure Boot.
Device is isolated and its memory access protected.
Offers a user space device driver with more flexible device ownership model.
Is independent of KVM technology, and not bound to x86 architecture only.

In SUSE Linux Enterprise Server the USB and PCI pass-through methods of device assignment are considered deprecated and are superseded by the VFIO model.

SR-IOV

The latest I/O virtualization technique, Single Root I/O Virtualization SR-IOV combines the benefits of the aforementioned techniques—performance and the ability to share a device with several VM Guests. SR-IOV requires special I/O devices, that are capable of replicating resources so they appear as multiple separate devices. Each such “pseudo” device can be directly used by a single guest. However, for network cards for example the number of concurrent queues that can be used is limited, potentially reducing performance for the VM Guest compared to paravirtualized drivers. On the VM Host Server, SR-IOV must be supported by the I/O device, the CPU and chipset, the BIOS/EFI and the hypervisor—.

Important: Requirements for VFIO and SR-IOV

To be able to use the VFIO and SR-IOV features, the VM Host Server needs to fulfill the following requirements:

IOMMU needs to be enabled in the BIOS/EFI.
For Intel CPUs, the kernel parameter intel_iommu=on needs to be provided on the kernel command line. For more information, see https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L1951.
The VFIO infrastructure needs to be available. This can be achieved by loading the kernel module vfio_pci.

6 For more information #

For further steps in virtualization, refer to the following sources:

7 Legal Notice #

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.

For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.

Managing KVM Virtualization on SUSE Linux Enterprise Server

1 Setting up a KVM VM Host Server #

1.1 CPU support for virtualization #

1.2 Required software #

1.3 KVM host-specific features #

1.3.1 Using the host storage with virtio-scsi #

1.3.1.1 virtio-scsi usage #

1.3.2 Accelerated networking with vhost-net #

1.3.3 Scaling network performance with multiqueue virtio-net #

1.3.4 VFIO: secure direct access to devices #

1.3.5 VirtFS: sharing directories between host and guests #

1.3.5.1 Host Configuration #

1.3.5.1.1 Access Mode Options for virtiofs #

1.3.5.2 Guest Configuration #

1.3.5.3 Persistent mounts virtiofs across reboot #

1.3.5.4 Troubleshooting Common Issues #

1.3.6 KSM: sharing memory pages between guests #

2 Guest installation #

2.1 Basic installation with qemu-system-ARCH #

2.2 Managing disk images with qemu-img #

2.2.1 General information on qemu-img invocation #

2.2.2 Creating, converting, and checking disk images #

2.2.2.1 qemu-img create #

2.2.2.2 qemu-img convert #

2.2.2.3 qemu-img check #

2.2.2.4 Increasing the size of an existing disk image #

2.2.2.5 Advanced options for the qcow2 file format #

2.2.2.5.1 Choosing the right cache size #

2.2.2.5.2 Configuring the cache size #

2.2.2.5.3 Reducing memory usage #

2.2.3 Managing snapshots of virtual machines with qemu-img #

2.2.3.1 Listing existing snapshots #

2.2.3.2 Creating snapshots of a powered-off virtual machine #

2.2.3.3 Deleting snapshots #

2.2.4 Manipulate disk images effectively #

2.2.4.1 Base and derived images #

2.2.4.2 Creating derived images #

2.2.4.3 Rebasing derived images #

2.2.4.4 Mounting an image on a VM Host Server #

3 Virtual machine administration using QEMU monitor #

3.1 Accessing monitor console #

3.2 Getting information about the guest system #

3.3 Changing VNC password #

3.4 Managing devices #

3.5 Controlling keyboard and mouse #

3.6 Changing available memory #

3.7 Dumping virtual machine memory #

3.8 Managing virtual machine snapshots #

3.9 Suspending and resuming virtual machine execution #

3.10 Live migration #

3.11 QMP - QEMU machine protocol #

3.11.1 Access QMP via standard input/output #

3.11.2 Access QMP via telnet #

3.11.3 Access QMP via Unix socket #

3.11.4 Access QMP via libvirt's virsh command #

4 Running virtual machines with qemu-system #

4.1 Basic qemu-system-ARCH invocation #

4.2 General qemu-system-ARCH options #

4.2.1 Basic virtual hardware #

4.2.1.1 Machine type #

4.2.1.2 CPU model #

4.2.1.3 Other basic options #

4.2.2 Storing and reading the configuration of virtual devices #

4.2.3 Guest real-time clock #

4.3 Using devices in QEMU #

4.3.1 Block devices #

4.3.1.1 Freeing unused guest disk space #

4.3.1.2 IOThreads #

4.3.1.3 Bio-based I/O path for virtio-blk #

4.3.1.4 Accessing iSCSI resources directly #

4.3.1.5 Using RADOS block devices with QEMU #

4.3.2 Graphic devices and display options #

4.3.2.1 Defining video cards #

4.3.2.2 Display options #

4.3.3 USB devices #

4.3.3.1 Emulating USB devices in VM Guest #

4.3.4 Character devices #

4.4 Networking in QEMU #

4.4.1 Defining a network interface card #

4.4.2 User-mode networking #

1.3.1 Using the host storage with `virtio-scsi` #

1.3.1.1 `virtio-scsi` usage #

1.3.2 Accelerated networking with `vhost-net` #

2.1 Basic installation with `qemu-system-ARCH` #

2.2 Managing disk images with `qemu-img` #

3.11.4 Access QMP via `libvirt`'s `virsh` command #

4 Running virtual machines with `qemu-system` #

4.1 Basic `qemu-system-ARCH` invocation #

4.2 General `qemu-system-ARCH` options #