27 Ceph as a back-end for QEMU KVM instances #
The most frequent Ceph use case involves providing block device images to virtual machines. For example, a user may create a 'golden' image with an OS and any relevant software in an ideal configuration. Then, the user takes a snapshot of the image. Finally, the user clones the snapshot (usually many times, see Section 20.3, “Snapshots” for details). The ability to make copy-on-write clones of a snapshot means that Ceph can provision block device images to virtual machines quickly, because the client does not need to download an entire image each time it spins up a new virtual machine.
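For illustration, this workflow might look like the following using the rbd command line tool; the pool and image names (pool1, golden, vm1-disk) are placeholders, and a snapshot must be protected before it can be cloned:
# rbd create pool1/golden --size 10G
# rbd snap create pool1/golden@base
# rbd snap protect pool1/golden@base
# rbd clone pool1/golden@base pool1/vm1-disk
Each clone is an independent, copy-on-write image: only blocks that the virtual machine later modifies consume additional space.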
Ceph block devices can integrate with QEMU virtual machines. For more information on QEMU KVM, see https://documentation.suse.com/sles/15-SP3/html/SLES-all/part-virt-qemu.html.
27.1 Installing qemu-block-rbd #
In order to use Ceph block devices, QEMU needs to have the appropriate driver installed. Check whether the qemu-block-rbd package is installed, and install it if needed:
# zypper install qemu-block-rbd
27.2 Using QEMU #
The QEMU command line expects you to specify the pool name and image name. You may also specify a snapshot name.
qemu-img command options \
rbd:pool-name/image-name@snapshot-name:option1=value1:option2=value2...
For example, specifying the id and conf options might look like the following:
qemu-img command options \
rbd:pool_name/image_name:id=glance:conf=/etc/ceph/ceph.conf
27.3 Creating images with QEMU #
You can create a block device image from QEMU. You must specify rbd, the pool name, and the name of the image you want to create. You must also specify the size of the image.
qemu-img create -f raw rbd:pool-name/image-name size
For example:
qemu-img create -f raw rbd:pool1/image1 10G
Formatting 'rbd:pool1/image1', fmt=raw size=10737418240 nocow=off cluster_size=0
The raw data format is really the only sensible format option to use with RBD. Technically, you could use other QEMU-supported formats such as qcow2, but doing so would add additional overhead, and would also render the volume unsafe for virtual machine live migration when caching is enabled.
27.4 Resizing images with QEMU #
You can resize a block device image from QEMU. You must specify rbd, the pool name, and the name of the image you want to resize. You must also specify the new size of the image.
qemu-img resize rbd:pool-name/image-name size
For example:
qemu-img resize rbd:pool1/image1 9G
Image resized.
27.5 Retrieving image info with QEMU #
You can retrieve block device image information from QEMU. You must specify rbd, the pool name, and the name of the image.
qemu-img info rbd:pool-name/image-name
For example:
qemu-img info rbd:pool1/image1
image: rbd:pool1/image1
file format: raw
virtual size: 9.0G (9663676416 bytes)
disk size: unavailable
cluster_size: 4194304
27.6 Running QEMU with RBD #
QEMU can access an image as a virtual block device directly via librbd. This avoids an additional context switch, and can take advantage of RBD caching.
You can use qemu-img to convert existing virtual machine images to Ceph block device images. For example, if you have a qcow2 image, you could run:
qemu-img convert -f qcow2 -O raw sles12.qcow2 rbd:pool1/sles12
To run a virtual machine booting from that image, you could run:
# qemu -m 1024 -drive format=raw,file=rbd:pool1/sles12
RBD caching can significantly improve performance. QEMU’s cache options control librbd caching:
# qemu -m 1024 -drive format=raw,file=rbd:pool1/sles12,cache=writeback
For more information on RBD caching, refer to Section 20.5, “Cache settings”.
27.7 Enabling discard and TRIM #
Ceph block devices support the discard operation. This means that a guest can send TRIM requests to let a Ceph block device reclaim unused space. This can be enabled in the guest by mounting an XFS file system with the discard option.
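For example, inside the guest, the file system on the Ceph-backed disk can be mounted with that option (the device name /dev/sda1 below is an assumption and depends on your guest configuration):
# mount -o discard /dev/sda1 /mnt
Alternatively, running fstrim periodically on the mounted file system achieves a similar effect without the per-deletion overhead of the discard mount option.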
  
For this to be available to the guest, it must be explicitly enabled for the block device. To do this, you must specify a discard_granularity associated with the drive:
# qemu -m 1024 -drive format=raw,file=rbd:pool1/sles12,id=drive1,if=none \
-device driver=ide-hd,drive=drive1,discard_granularity=512
The above example uses the IDE driver. The virtio driver does not support discard.
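If you want discard together with paravirtualized storage, one possible alternative (a sketch, not taken from this guide) is a SCSI disk attached to a virtio-scsi controller, which does support discard:
# qemu -m 1024 -device virtio-scsi-pci \
-drive format=raw,file=rbd:pool1/sles12,id=drive1,if=none \
-device scsi-hd,drive=drive1,discard_granularity=4096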
If using libvirt, edit your libvirt domain’s configuration file using virsh edit to include the xmlns:qemu value. Then, add a qemu:commandline block as a child of that domain. The following example shows how to set two devices with qemu id= to different discard_granularity values.
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
 <qemu:commandline>
  <qemu:arg value='-set'/>
  <qemu:arg value='block.scsi0-0-0.discard_granularity=4096'/>
  <qemu:arg value='-set'/>
  <qemu:arg value='block.scsi0-0-1.discard_granularity=65536'/>
 </qemu:commandline>
</domain>
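For example, to open the domain definition for editing (assuming a domain named sles12):
# virsh edit sles12
Changes made through virsh edit take effect the next time the domain is started.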
27.8 Setting QEMU cache options #
QEMU’s cache options correspond to the following Ceph RBD Cache settings.
Writeback:
rbd_cache = true
Writethrough:
rbd_cache = true
rbd_cache_max_dirty = 0
None:
rbd_cache = false
QEMU’s cache settings override Ceph’s default settings (settings that are not explicitly set in the Ceph configuration file). If you explicitly set RBD Cache settings in your Ceph configuration file (refer to Section 20.5, “Cache settings”), your Ceph settings override the QEMU cache settings. If you set cache settings on the QEMU command line, the QEMU command line settings override the Ceph configuration file settings.
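For example, to set the writethrough mapping shown above explicitly in the Ceph configuration file, the entries might look like the following (a sketch; placing them in the [client] section is an assumption, see Section 20.5, “Cache settings”):
[client]
rbd_cache = true
rbd_cache_max_dirty = 0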