Live Migration
Live migration means moving a virtual machine to a different host without downtime.
How Migration Works
Each node is labeled with the CPU models it supports. A different label key is used for each category:
- Primary CPU model: host-model-cpu.node.kubevirt.io/{cpu-model}
- Supported CPU models: cpu-model.node.kubevirt.io/{cpu-model}
- Supported CPU models for migration: cpu-model-migration.node.kubevirt.io/{cpu-model}
During live migration, the system checks the value of spec.domain.cpu.model in the VirtualMachineInstance (VMI) CR, which is derived from spec.template.spec.domain.cpu.model in the VirtualMachine (VM) CR. If the value of spec.template.spec.domain.cpu.model is not set, the system uses the default value host-model.
When host-model is used, the process fetches the value of the primary CPU model and fills spec.nodeSelector of the newly created pod with the label cpu-model-migration.node.kubevirt.io/{cpu-model}.
Alternatively, you can customize the CPU model in spec.domain.cpu.model. For example, if the CPU model is XYZ, the process fills spec.nodeSelector of the newly created pod with the label cpu-model.node.kubevirt.io/XYZ.
However, host-model only allows migration of the virtual machine to a node with the same CPU model. For more information, see Limitations.
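The label selection described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function names are hypothetical, the label value "true" is an assumption, and the sketch assumes that the primary CPU model is read from the labels of the node where the virtual machine currently runs.

```python
from typing import Dict, Optional

HOST_MODEL = "host-model"
HOST_MODEL_CPU_PREFIX = "host-model-cpu.node.kubevirt.io/"
CPU_MODEL_PREFIX = "cpu-model.node.kubevirt.io/"
CPU_MODEL_MIGRATION_PREFIX = "cpu-model-migration.node.kubevirt.io/"


def primary_cpu_model(node_labels: Dict[str, str]) -> Optional[str]:
    """Return the model named in the node's host-model-cpu label, if any."""
    for key in node_labels:
        if key.startswith(HOST_MODEL_CPU_PREFIX):
            return key[len(HOST_MODEL_CPU_PREFIX):]
    return None


def target_selector_label(vmi_cpu_model: Optional[str],
                          source_node_labels: Dict[str, str]) -> str:
    """Pick the node label that the target pod must match (hypothetical helper)."""
    # An unset CPU model falls back to the default, host-model.
    model = vmi_cpu_model or HOST_MODEL
    if model == HOST_MODEL:
        primary = primary_cpu_model(source_node_labels)
        if primary is None:
            raise ValueError("node has no host-model-cpu label")
        return CPU_MODEL_MIGRATION_PREFIX + primary
    # A custom model is matched against the general supported-models label.
    return CPU_MODEL_PREFIX + model


# Assumed example labels; the value "true" is illustrative only.
labels = {
    "host-model-cpu.node.kubevirt.io/XYZ": "true",
    "cpu-model-migration.node.kubevirt.io/XYZ": "true",
    "cpu-model.node.kubevirt.io/123": "true",
}
print(target_selector_label(None, labels))
print(target_selector_label("123", labels))
```

With no CPU model set, the sketch selects the cpu-model-migration label of the primary model. With a custom model, it selects the cpu-model label for that model.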
Starting a Migration
- Go to the Virtual Machines page.
- Find the virtual machine that you want to migrate and select ⋮ > Migrate.
- Choose the node to which you want to migrate the virtual machine. Click Apply.
If node scheduling rules are configured for a virtual machine, ensure that the target node meets the virtual machine's runtime requirements. The list of nodes that you can search and select from is generated based on the following:
- Virtual machine scheduling rules
- Possibly, node rules from the network configuration
Aborting a Migration
- Go to the Virtual Machines page.
- Find the virtual machine in migrating status that you want to abort. Select ⋮ > Abort Migration.
Migration Timeouts
Completion Timeout
The live migration process copies virtual machine memory pages and disk blocks to the destination. In some cases, the virtual machine writes to memory pages or disk blocks faster than they can be copied, which prevents the migration from completing in a reasonable amount of time.
Live migration will be aborted if it exceeds the completion timeout of 800s per GiB of data. For example, a virtual machine with 8 GiB of memory will time out after 6400 seconds.
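The timeout arithmetic above can be expressed as a short calculation. This is only a sketch: the function and constant names are hypothetical, and only the 800 seconds per GiB figure comes from the text.

```python
# Completion timeout stated above: 800 seconds per GiB of data.
COMPLETION_TIMEOUT_PER_GIB_SECONDS = 800


def completion_timeout_seconds(data_gib: float) -> float:
    """Return the total time allowed before the migration is aborted."""
    return COMPLETION_TIMEOUT_PER_GIB_SECONDS * data_gib


print(completion_timeout_seconds(8))  # 6400
```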
Limitations
host-model only allows migration of the virtual machine to a node with the same CPU model. Specifying a CPU model is not always required. However, if no CPU model is specified (so host-model applies) and the nodes have different CPU models, you must shut down the virtual machine, assign a CPU model that is supported by all nodes, and then restart the virtual machine.
Example:
- Node A:
  host-model-cpu.node.kubevirt.io/XYZ
  cpu-model-migration.node.kubevirt.io/XYZ
  cpu-model.node.kubevirt.io/123
- Node B:
  host-model-cpu.node.kubevirt.io/ABC
  cpu-model-migration.node.kubevirt.io/ABC
  cpu-model.node.kubevirt.io/123
Migrating a virtual machine with host-model is not possible because the values of host-model-cpu.node.kubevirt.io are not identical. However, both nodes support the 123 CPU model, so you can migrate any virtual machine with the 123 CPU model using either of the following methods:
- Cluster level: Run kubectl edit kubevirts.kubevirt.io -n harvester-system and add spec.configuration.cpuModel: "123". This change also affects newly created virtual machines.
- Individual virtual machines: Modify the virtual machine configuration to include spec.template.spec.domain.cpu.model: "123".
Both methods require restarting the virtual machines. If you are certain that all nodes in the cluster support a specific CPU model, you can define this at the cluster level before creating any virtual machines. In doing so, you eliminate the need to restart the virtual machines (to assign the CPU model) during live migration.
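To decide which CPU model is safe to assign, you can intersect the supported-model labels of all nodes. The sketch below does this for the two example nodes. It is an illustration only: the helper names are hypothetical, the label value "true" is an assumption, and the label data is hardcoded rather than read from a cluster.

```python
from typing import Dict, Set

CPU_MODEL_PREFIX = "cpu-model.node.kubevirt.io/"


def supported_models(node_labels: Dict[str, str]) -> Set[str]:
    """Collect the CPU models named in a node's cpu-model labels."""
    return {
        key[len(CPU_MODEL_PREFIX):]
        for key in node_labels
        if key.startswith(CPU_MODEL_PREFIX)
    }


def common_models(nodes: Dict[str, Dict[str, str]]) -> Set[str]:
    """Return the CPU models that every node supports."""
    model_sets = [supported_models(labels) for labels in nodes.values()]
    if not model_sets:
        return set()
    return set.intersection(*model_sets)


# Labels of the two example nodes; the value "true" is assumed.
nodes = {
    "A": {
        "host-model-cpu.node.kubevirt.io/XYZ": "true",
        "cpu-model-migration.node.kubevirt.io/XYZ": "true",
        "cpu-model.node.kubevirt.io/123": "true",
    },
    "B": {
        "host-model-cpu.node.kubevirt.io/ABC": "true",
        "cpu-model-migration.node.kubevirt.io/ABC": "true",
        "cpu-model.node.kubevirt.io/123": "true",
    },
}
print(common_models(nodes))  # {'123'}
```

Any model in the resulting set is a candidate for spec.configuration.cpuModel or spec.template.spec.domain.cpu.model.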