23 Live Patching the Linux Kernel Using kGraft #
This document describes the basic principles of the kGraft live patching technology and provides usage guidelines for the SLE Live Patching service.
kGraft is a live patching technology for runtime patching of the Linux kernel, without stopping the kernel. This maximizes system uptime, and thus system availability, which is important for mission-critical systems. By allowing dynamic patching of the kernel, the technology also encourages users to install critical security updates without deferring them to a scheduled downtime.
A kGraft patch is a kernel module, intended for replacing whole functions in the kernel. kGraft primarily offers in-kernel infrastructure for integration of the patched code with base kernel code at runtime.
SLE Live Patching is a service provided on top of regular SUSE Linux Enterprise Server maintenance. kGraft patches distributed through SLE Live Patching supplement regular SLES maintenance updates. The common update stack and procedures can be used to deploy SLE Live Patching.
The information provided in this document relates to the AMD64/Intel 64 and POWER architectures. In case you use a different architecture, the procedures may differ.
23.1 Advantages of kGraft #
Live kernel patching using kGraft is especially useful for quick response in emergencies (when serious vulnerabilities are known and should be fixed as soon as possible, or when there are serious system stability issues with a known fix). It is not used for scheduled updates where time is not critical.
Typical use cases for kGraft include systems like in-memory databases with huge amounts of RAM, where boot-up times of 15 minutes or more are not uncommon, large simulations that need weeks or months without a restart, or infrastructure building blocks providing continuous service to many consumers.
The main advantage of kGraft is that it never requires stopping the kernel, not even for a short time period.
A kGraft patch is a .ko kernel module in an RPM package. It is inserted into the kernel using the insmod command when the package is installed or updated. kGraft replaces whole functions in the kernel, even if they are being executed. An updated kGraft module can replace an existing patch if necessary.
kGraft is also lean: it contains only a small amount of code because it leverages other standard Linux technologies.
23.2 Low-level Function of kGraft #
kGraft uses the ftrace infrastructure to perform patching. The following describes the implementation on the AMD64/Intel 64 architecture.
To patch a kernel function, kGraft needs some space at the start of the function to insert a jump to a new function. This space is allocated during kernel compilation by GCC with function profiling turned on. In particular, a 5-byte call instruction is injected at the start of kernel functions. When such an instrumented kernel boots, the profiling calls are replaced by 5-byte NOP (no operation) instructions.
After patching starts, the first byte is replaced by the INT3 (breakpoint) instruction. This ensures atomicity of the 5-byte instruction replacement. The other four bytes are then replaced by the address of the new function. Finally, the first byte is replaced by the JMP (long jump) opcode.
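The byte-level sequence described above can be sketched as a small simulation. This is only an illustration, not kGraft code: the function name `patch_steps` and the addresses are hypothetical. It assumes the common x86-64 encodings (the 5-byte NOP `0F 1F 44 00 00`, `CC` for INT3, and `E9` for a near JMP), and that the four operand bytes of `E9` hold a little-endian displacement relative to the end of the 5-byte instruction.

```python
import struct

# Assumed standard x86-64 encodings (not taken from the kGraft sources).
NOP5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])  # 5-byte NOP at function entry
INT3 = 0xCC                                   # breakpoint opcode
JMP_REL32 = 0xE9                              # near jump with 32-bit operand


def patch_steps(site_addr, target_addr):
    """Return the successive 5-byte states of a patched function entry.

    site_addr and target_addr are hypothetical addresses of the patched
    function and of its replacement.
    """
    # E9 takes a displacement measured from the end of the instruction.
    displacement = target_addr - (site_addr + len(NOP5))
    operand = struct.pack("<i", displacement)  # raises if out of 32-bit range

    code = bytearray(NOP5)
    states = [bytes(code)]        # 0: untouched NOP

    code[0] = INT3                # 1: guard the site with a breakpoint
    states.append(bytes(code))

    code[1:5] = operand           # 2: write the four operand bytes
    states.append(bytes(code))

    code[0] = JMP_REL32           # 3: replace the breakpoint with JMP
    states.append(bytes(code))
    return states


if __name__ == "__main__":
    for step, state in enumerate(patch_steps(0x1000, 0x2000)):
        print(step, state.hex(" "))
```

At no point in the sequence does the first byte start a half-written jump: a CPU reaching the site sees either the NOP, the breakpoint, or the complete JMP.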
Inter-processor non-maskable interrupts (IPI NMI) are used throughout the process to flush speculative decoding queues of other CPUs in the system. This allows switching to the new function without ever stopping the kernel, not even for a very short moment. The interruptions by IPI NMIs can be measured in microseconds and are not considered service interruptions as they happen while the kernel is running in any case.
Callers are never patched. Instead, the callee's NOPs are replaced by a JMP to the new function. JMP instructions remain forever. This takes care of function pointers, including those stored in structures, and does not require saving any old data for the possibility of un-patching.
However, these steps alone would not be good enough: since the functions would be replaced non-atomically, a new fixed function in one part of the kernel could still be calling an old function elsewhere or vice versa. If the semantics of the function interfaces changed in the patch, chaos would ensue.
Thus, until all functions are replaced, kGraft uses an approach based on trampolines, similar to RCU (read-copy-update), to give each user space thread, kernel thread, and kernel interrupt a consistent view of the world. A per-thread flag is set on each kernel entry and exit. This way, an old function always calls another old function and a new function always calls a new one. Once all processes have the "new universe" flag set, patching is complete, the trampolines can be removed, and the code can operate at full speed with no performance impact other than an extra long jump for each patched function.
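The consistency model can be illustrated with a toy simulation. This is a minimal sketch of the idea only, not the kernel implementation: `Task`, `PatchedFunction`, `cross_kernel_boundary`, and `try_finalize` are hypothetical names, and a plain Python attribute stands in for the per-thread flag.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Stand-in for a thread; new_universe models the per-thread flag."""
    name: str
    new_universe: bool = False


def old_func():
    return "old"


def new_func():
    return "new"


class PatchedFunction:
    """Models the trampoline that sits in front of a patched function."""

    def __init__(self, old, new):
        self.old = old
        self.new = new
        self.finalized = False  # True once the trampoline is removed

    def call(self, task):
        if self.finalized:
            return self.new()   # direct jump, no per-task check needed
        # While patching is in progress, pick the version by the task's flag.
        return self.new() if task.new_universe else self.old()


def cross_kernel_boundary(task):
    """A task entering or leaving the kernel migrates to the new universe."""
    task.new_universe = True


def try_finalize(patched, tasks):
    """Finish patching once every task has the new-universe flag set."""
    if all(task.new_universe for task in tasks):
        patched.finalized = True
    return patched.finalized


if __name__ == "__main__":
    tasks = [Task("worker"), Task("sleeper")]
    func = PatchedFunction(old_func, new_func)
    cross_kernel_boundary(tasks[0])
    print([func.call(t) for t in tasks], try_finalize(func, tasks))
    cross_kernel_boundary(tasks[1])
    print(try_finalize(func, tasks), [func.call(t) for t in tasks])
```

A single task that never crosses the boundary keeps `try_finalize` returning `False`, which mirrors how a sleeping process delays completion of a patch.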
23.3 Installing kGraft Patches #
This section describes the activation of the SUSE Linux Enterprise Live Patching extension and the installation of kGraft patches.
23.3.1 Activation of SLE Live Patching #
To activate SLE Live Patching on your system, follow these steps:
1. If your SLES system is not yet registered, register it. Registration can be done during the system installation or later using the YaST registration module (yast2 registration). After registration, choose to see the list of available online updates.
2. If your SLES system is already registered, but SLE Live Patching is not yet activated, open the YaST registration module (yast2 registration) and open the list of available extensions.
3. Select the SLE Live Patching extension in the list of available extensions and continue.
4. Confirm the license terms and continue.
5. Enter the SLE Live Patching registration code and continue.
6. Check the summary and the selected patterns. The pattern Live Patching should be selected for installation.
7. Confirm to complete the installation. This installs the base kGraft components on your system together with the initial live patch.
23.3.2 Updating System #
SLE Live Patching updates are distributed in a form that allows using the standard SLE update stack for patch application. The initial live patch can be updated using zypper patch, YaST Online Update, or an equivalent method.
The kernel is patched automatically during the package installation. However, invocations of the old kernel functions are not completely eliminated until all sleeping processes wake up and get out of the way. This can take a considerable amount of time. Despite this, sleeping processes that use the old kernel functions are not considered a security issue. Nevertheless, in the current version of kGraft, it is not possible to apply another kGraft patch until all processes cross the kernel-user space boundary and stop using patched functions from the previous patch.
To see the global status of patching, check the flag in /sys/kernel/kgraft/in_progress. The value 1 signifies the existence of sleeping processes that still need to be woken (the patching is still in progress). The value 0 signifies that all processes are using solely the patched functions and patching has already finished. Alternatively, use the kgr status command to obtain the same information.
The flag can be checked on a per-process basis too. Check the number in /proc/PROCESS_NUMBER/kgr_in_progress for each process individually. Again, the value 1 signifies a sleeping process that still needs to be woken. Alternatively, use the kgr blocking command to output the list of sleeping processes.
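The two checks above can be combined in a short script. The following is a minimal sketch that uses only the file locations named in this section; the helper names (`read_flag`, `patching_in_progress`, `blocking_pids`) and the `sys_root`/`proc_root` parameters are hypothetical and exist only so the functions can be pointed at a different directory tree.

```python
from pathlib import Path


def read_flag(path):
    """Return the integer stored in path, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def patching_in_progress(sys_root="/sys"):
    """True if patching is still in progress, False if finished,
    None if the global flag file is missing or unreadable."""
    flag = read_flag(Path(sys_root) / "kernel" / "kgraft" / "in_progress")
    return None if flag is None else flag == 1


def blocking_pids(proc_root="/proc"):
    """Return the sorted PIDs whose per-process flag is still 1."""
    pids = []
    for entry in Path(proc_root).iterdir():
        # Only numeric directory names correspond to processes.
        if entry.name.isdigit() and read_flag(entry / "kgr_in_progress") == 1:
            pids.append(int(entry.name))
    return sorted(pids)


if __name__ == "__main__":
    state = patching_in_progress()
    if state is None:
        print("kGraft status flag not found")
    elif state:
        print("patching in progress, blocking PIDs:", blocking_pids())
    else:
        print("patching finished")
```

Processes that exit while the script runs simply drop out of the result, because an unreadable flag file is treated as "not blocking".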
23.4 Patch Life Cycle #
Expiration dates of live patches can be accessed with zypper lifecycle. Make sure that the package lifecycle-data-sle-live-patching is installed.
tux > zypper lifecycle
Product end of support
Codestream: SUSE Linux Enterprise Server 12             2024-10-31
    SUSE Linux Enterprise Server 12 SP2                 n/a*

Extension end of support
SUSE Linux Enterprise Live Patching                     2017-10-31

Package end of support if different from product:
SUSEConnect                 Now, installed 0.2.41-18.1, update available 0.2.42-19.3.1
apache2-utils               Now

*) See https://www.suse.com/lifecycle for latest information
When the expiration date of a patch is reached, no further live patches for this kernel version will be supplied. Plan an update of your kernel before the end of the live patch life cycle period.
23.5 Removing a kGraft Patch #
To remove a kGraft patch, use the following procedure:
First remove the patch itself using Zypper:
zypper rm kgraft-patch-3_12_32-25-default
Then reboot the machine.
23.6 Stuck Kernel Execution Threads #
Kernel threads need to be prepared to handle kGraft. Third-party software may not be ready for kGraft adoption and its kernel modules may spawn kernel execution threads. These threads will block the patching process indefinitely. As an emergency measure, kGraft offers the possibility to force finishing of the patching process without waiting for all execution threads to cross the safety checkpoint. This can be achieved by writing 0 into /sys/kernel/kgraft/in_progress. Consult SUSE Support before performing this procedure.
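For illustration, the emergency write described above could be wrapped in a guarded helper. This is a sketch under stated assumptions, not a supported tool: the function name `force_finish_patching` and its `confirmed` parameter are hypothetical, and the helper deliberately does nothing unless the caller opts in, because the procedure should only be performed after consulting SUSE Support.

```python
from pathlib import Path

# Flag file named in this section.
KGRAFT_FLAG = Path("/sys/kernel/kgraft/in_progress")


def force_finish_patching(flag_path=KGRAFT_FLAG, confirmed=False):
    """Write 0 to the in_progress flag if patching is currently stuck.

    Returns True only if the flag was 1 and the write succeeded.
    Nothing is written unless confirmed=True is passed explicitly.
    """
    flag_path = Path(flag_path)
    if not confirmed:
        return False
    try:
        if flag_path.read_text().strip() != "1":
            return False          # nothing in progress, nothing to force
        flag_path.write_text("0")
    except OSError:
        return False              # missing file or insufficient privileges
    return True
```

Calling `force_finish_patching()` without arguments is a no-op; a real invocation would need root privileges and `confirmed=True`.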
23.7 The kgr Tool #
Several kGraft management tasks can be simplified with the kgr tool. The available commands are:
kgr status
    Displays the overall status of kGraft patching (ready or in_progress).
kgr patches
    Displays the list of loaded kGraft patches.
kgr blocking
    Lists processes that are preventing kGraft patching from finishing. By default, only the PIDs are listed. Specifying -v prints command lines if available. A second -v also displays stack traces.
For detailed information, see man kgr.
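When the status needs to be consumed by a script, the kgr status command can be wrapped as in the following sketch. The helper names `parse_kgr_status` and `kgr_status` are hypothetical, and the parser assumes that the command prints exactly one of the two words listed above (ready or in_progress); check man kgr for the actual output format before relying on this.

```python
import subprocess


def parse_kgr_status(output):
    """Map the assumed one-word output of `kgr status` to a boolean.

    Returns True for in_progress, False for ready, None for anything else.
    """
    text = output.strip()
    if text == "in_progress":
        return True
    if text == "ready":
        return False
    return None


def kgr_status():
    """Run `kgr status` and return the parsed result, or None on failure."""
    try:
        result = subprocess.run(
            ["kgr", "status"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # kgr not installed, not permitted, or exited non-zero
    return parse_kgr_status(result.stdout)


if __name__ == "__main__":
    state = kgr_status()
    if state is None:
        print("could not determine kGraft status")
    else:
        print("in progress" if state else "ready")
```

Keeping the parsing separate from the subprocess call makes it easy to adjust the parser if the real output differs from the assumption.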
23.8 Scope of kGraft Technology #
kGraft is based on replacing functions. Data structures can be altered only indirectly with kGraft. As a result, changes to kernel data structures require special care and, if the change is too large, rebooting might be required. kGraft also might not be able to handle situations where one compiler is used to compile the old kernel and another compiler is used to compile the patch.
Because of the way kGraft works, support for third-party modules that are spawning kernel threads is limited.
23.9 Scope of SLE Live Patching #
Fixes for SUSE Common Vulnerability Scoring System (CVSS) level 7+ vulnerabilities and bug fixes related to system stability or data corruption will be shipped in the scope of SLE Live Patching. It might not be possible to produce a live patch for every fix that fulfills these criteria. SUSE reserves the right to skip fixes where producing a kernel live patch is not viable for technical reasons. For more information on CVSS 3.0, which is the base for the SUSE CVSS rating, see https://www.first.org/cvss/.
23.10 Interaction with the Support Processes #
While resolving a technical difficulty with SUSE Support, you may receive a so-called Program Temporary Fix (PTF). PTFs may be issued for various packages including those forming the base of SLE Live Patching.
kGraft PTFs complying with the conditions described in the previous section can be installed as usual and SUSE will ensure that the system in question does not need to be rebooted and that future live updates are applied cleanly.
PTFs issued for the base kernel disrupt the live patching process. First, installing the PTF kernel means a reboot as the kernel cannot be replaced as a whole at runtime. Second, another reboot is needed to replace the PTF with any regular maintenance updates for which the live patches are issued.
PTFs for other packages in SLE Live Patching can be treated like regular PTFs with the usual guarantees.