Jump to contentJump to page navigation: previous page [access key p]/next page [access key n]
documentation.suse.com / SUSE Enterprise Storage 7 Documentation / Administration and Operations Guide / Integration with Virtualization Tools / Ceph as a back-end for QEMU KVM instance
Applies to SUSE Enterprise Storage 7

27 Ceph as a back-end for QEMU KVM instance

The most frequent Ceph use case involves providing block device images to virtual machines. For example, a user may create a 'golden' image with an OS and any relevant software in an ideal configuration. Then, the user takes a snapshot of the image. Finally, the user clones the snapshot (usually many times, see Section 20.3, “Snapshots” for details). The ability to make copy-on-write clones of a snapshot means that Ceph can provision block device images to virtual machines quickly, because the client does not need to download an entire image each time it spins up a new virtual machine.

Ceph block devices can integrate with the QEMU virtual machines. For more information on QEMU KVM, see https://documentation.suse.com/sles/15-SP2/html/SLES-all/part-virt-qemu.html.

27.1 Installing qemu-block-rbd

In order to use Ceph block devices, QEMU needs to have the appropriate driver installed. Check whether the qemu-block-rbd package is installed, and install it if needed:

# zypper install qemu-block-rbd

27.2 Using QEMU

The QEMU command line expects you to specify the pool name and image name. You may also specify a snapshot name.

qemu-img command options \
rbd:pool-name/image-name@snapshot-name:option1=value1:option2=value2...

For example, specifying the id and conf options might look like the following:

qemu-img command options \
rbd:pool_name/image_name:id=glance:conf=/etc/ceph/ceph.conf

27.3 Creating images with QEMU

You can create a block device image from QEMU. You must specify rbd, the pool name, and the name of the image you want to create. You must also specify the size of the image.

qemu-img create -f raw rbd:pool-name/image-name size

For example:

qemu-img create -f raw rbd:pool1/image1 10G
Formatting 'rbd:pool1/image1', fmt=raw size=10737418240 nocow=off cluster_size=0
Important
Important

The raw data format is really the only sensible format option to use with RBD. Technically, you could use other QEMU-supported formats such as qcow2, but doing so would add additional overhead, and would also render the volume unsafe for virtual machine live migration when caching is enabled.

27.4 Resizing images with QEMU

You can resize a block device image from QEMU. You must specify rbd, the pool name, and the name of the image you want to resize. You must also specify the size of the image.

qemu-img resize rbd:pool-name/image-name size

For example:

qemu-img resize rbd:pool1/image1 9G
Image resized.

27.5 Retrieving image info with QEMU

You can retrieve block device image information from QEMU. You must specify rbd, the pool name, and the name of the image.

qemu-img info rbd:pool-name/image-name

For example:

qemu-img info rbd:pool1/image1
image: rbd:pool1/image1
file format: raw
virtual size: 9.0G (9663676416 bytes)
disk size: unavailable
cluster_size: 4194304

27.6 Running QEMU with RBD

QEMU can access an image as a virtual block device directly via librbd. This avoids an additional context switch, and can take advantage of RBD caching.

You can use qemu-img to convert existing virtual machine images to Ceph block device images. For example, if you have a qcow2 image, you could run:

qemu-img convert -f qcow2 -O raw sles12.qcow2 rbd:pool1/sles12

To run a virtual machine booting from that image, you could run:

# qemu -m 1024 -drive format=raw,file=rbd:pool1/sles12

RBD caching can significantly improve performance. QEMU’s cache options control librbd caching:

# qemu -m 1024 -drive format=rbd,file=rbd:pool1/sles12,cache=writeback

For more information on RBD caching, refer to Section 20.5, “Cache settings”.

27.7 Enabling discard and TRIM

Ceph block devices support the discard operation. This means that a guest can send TRIM requests to let a Ceph block device reclaim unused space. This can be enabled in the guest by mounting XFS with the discard option.

For this to be available to the guest, it must be explicitly enabled for the block device. To do this, you must specify a discard_granularity associated with the drive:

# qemu -m 1024 -drive format=raw,file=rbd:pool1/sles12,id=drive1,if=none \
-device driver=ide-hd,drive=drive1,discard_granularity=512
Note
Note

The above example uses the IDE driver. The virtio driver does not support discard.

If using libvirt, edit your libvirt domain’s configuration file using virsh edit to include the xmlns:qemu value. Then, add a qemu:commandline block as a child of that domain. The following example shows how to set two devices with qemu id= to different discard_granularity values.

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
 <qemu:commandline>
  <qemu:arg value='-set'/>
  <qemu:arg value='block.scsi0-0-0.discard_granularity=4096'/>
  <qemu:arg value='-set'/>
  <qemu:arg value='block.scsi0-0-1.discard_granularity=65536'/>
 </qemu:commandline>
</domain>

27.8 Setting QEMU cache options

QEMU’s cache options correspond to the following Ceph RBD Cache settings.

Writeback:

rbd_cache = true

Writethrough:

rbd_cache = true
rbd_cache_max_dirty = 0

None:

rbd_cache = false

QEMU’s cache settings override Ceph’s default settings (settings that are not explicitly set in the Ceph configuration file). If you explicitly set RBD Cache settings in your Ceph configuration file (refer to Section 20.5, “Cache settings”), your Ceph settings override the QEMU cache settings. If you set cache settings on the QEMU command line, the QEMU command line settings override the Ceph configuration file settings.