17 NVMe-oF #
This chapter describes how to set up an NVMe over Fabrics host and target.
17.1 Overview #
NVM Express® (NVMe®) is an interface standard for accessing non-volatile storage, commonly SSD disks. NVMe supports much higher speeds and has a lower latency than SATA.
NVMe-oF™ is an architecture to access NVMe storage over different networking fabrics, for example, RDMA, TCP, or NVMe over Fibre Channel (FC-NVMe). The role of NVMe-oF is similar to iSCSI. To increase fault tolerance, NVMe-oF has built-in support for multipathing. NVMe-oF multipathing is not based on the traditional DM-Multipathing.
The NVMe host is the machine that connects to an NVMe target. The NVMe target is the machine that shares its NVMe block devices.
NVMe is supported on SUSE Linux Enterprise Server 15 SP5. Kernel modules are available for NVMe block storage and for the NVMe-oF target and host.
To see if your hardware requires any special consideration, refer to Section 17.4, “Special hardware configuration”.
17.2 Setting up an NVMe-oF host #
To use NVMe-oF, a target must be available with one of the supported networking methods. Supported are NVMe over Fibre Channel, TCP, and RDMA. The following sections describe how to connect a host to an NVMe target.
17.2.1 Installing command line client #
To use NVMe-oF, you need the nvme command line tool. Install it with zypper:
> sudo zypper in nvme-cli
Use nvme --help to list all available subcommands. Man pages are available for nvme subcommands. Consult them by executing man nvme-SUBCOMMAND. For example, to view the man page for the discover subcommand, execute man nvme-discover.
17.2.2 Discovering NVMe-oF targets #
To list available NVMe subsystems on the NVMe-oF target, you need the discovery controller address and service ID.
> sudo nvme discover -t TRANSPORT -a DISCOVERY_CONTROLLER_ADDRESS -s SERVICE_ID
Replace TRANSPORT with the underlying transport medium: loop, rdma, tcp, or fc. Replace DISCOVERY_CONTROLLER_ADDRESS with the address of the discovery controller. For RDMA and TCP, this should be an IPv4 address. Replace SERVICE_ID with the transport service ID. If the service is IP based, like RDMA or TCP, the service ID specifies the port number. For Fibre Channel, the service ID is not required.
The NVMe hosts only see the subsystems they are allowed to connect to.
Example:
> sudo nvme discover -t tcp -a 10.0.0.3 -s 4420
For Fibre Channel, the example looks as follows:
> sudo nvme discover --transport=fc \
    --traddr=nn-0x201700a09890f5bf:pn-0x201900a09890f5bf \
    --host-traddr=nn-0x200000109b579ef6:pn-0x100000109b579ef6
For more details, see man nvme-discover.
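The discovery parameters described above can be assembled by a small helper before they are passed to nvme discover. The following is a minimal sketch, not part of nvme-cli: the function name nvme_discover_args is hypothetical, and the function only prints the argument string so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvme-cli): print the argument string
# for `nvme discover`.
# Usage: nvme_discover_args TRANSPORT ADDRESS [SERVICE_ID]
nvme_discover_args() {
    transport=$1
    address=$2
    service_id=${3:-}

    # Accept only the transports named in this section.
    case "$transport" in
        loop|rdma|tcp|fc) ;;
        *) echo "unsupported transport: $transport" >&2; return 1 ;;
    esac

    if [ -n "$service_id" ]; then
        printf '%s\n' "-t $transport -a $address -s $service_id"
    else
        # Fibre Channel does not require a service ID.
        printf '%s\n' "-t $transport -a $address"
    fi
}

# Print the arguments for the TCP example from this section.
nvme_discover_args tcp 10.0.0.3 4420
```

The printed string can then be passed on, for example with sudo nvme discover $(nvme_discover_args tcp 10.0.0.3 4420).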
17.2.3 Connecting to NVMe-oF targets #
After you have identified the NVMe subsystem, you can connect to it with the nvme connect command.
> sudo nvme connect -t TRANSPORT -a DISCOVERY_CONTROLLER_ADDRESS -s SERVICE_ID -n SUBSYSTEM_NQN
Replace TRANSPORT with the underlying transport medium: loop, rdma, tcp, or fc. Replace DISCOVERY_CONTROLLER_ADDRESS with the address of the discovery controller. For RDMA and TCP, this should be an IPv4 address. Replace SERVICE_ID with the transport service ID. If the service is IP based, like RDMA or TCP, this specifies the port number. Replace SUBSYSTEM_NQN with the NVMe qualified name of the desired subsystem as found by the discovery command. NQN is the abbreviation for NVMe Qualified Name. The NQN must be unique.
Example:
> sudo nvme connect -t tcp -a 10.0.0.3 -s 4420 -n nqn.2014-08.com.example:nvme:nvm-subsystem-sn-d78432
For Fibre Channel, the example looks as follows:
> sudo nvme connect --transport=fc \
    --traddr=nn-0x201700a09890f5bf:pn-0x201900a09890f5bf \
    --host-traddr=nn-0x200000109b579ef6:pn-0x100000109b579ef6 \
    --nqn=nqn.2014-08.org.nvmexpress:uuid:1a9e23dd-466e-45ca-9f43-a29aaf47cb21
Alternatively, use nvme connect-all to connect to all discovered namespaces. For advanced usage, see man nvme-connect and man nvme-connect-all.
In case of a path loss, the NVMe subsystem tries to reconnect for a time period defined by the ctrl-loss-tmo option of the nvme connect command. After this time (the default value is 600 seconds), the path is removed and the upper layers of the block layer (file system) are notified. By default, the file system is then mounted read-only, which usually is not the expected behavior. Therefore, it is recommended to set the ctrl-loss-tmo option so that the NVMe subsystem keeps trying to reconnect without a limit. To do so, add the following option to the nvme connect command:
> sudo nvme connect --ctrl-loss-tmo=-1
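As an illustration, the ctrl-loss-tmo option can be combined with the connection parameters of the TCP example from this section. The sketch below only builds and prints the command line for review. The variable names are chosen for this example and are not defined by nvme-cli.

```shell
#!/bin/sh
# Values taken from the TCP example in this section.
TRANSPORT=tcp
ADDRESS=10.0.0.3
SERVICE_ID=4420
NQN=nqn.2014-08.com.example:nvme:nvm-subsystem-sn-d78432

# -1 keeps the reconnect attempts going without a time limit.
CTRL_LOSS_TMO=-1

# Build the command as a string first so that it can be reviewed.
CONNECT_CMD="nvme connect -t $TRANSPORT -a $ADDRESS -s $SERVICE_ID -n $NQN --ctrl-loss-tmo=$CTRL_LOSS_TMO"
echo "$CONNECT_CMD"
```

After checking the printed line, run it with sudo.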
To make an NVMe over Fabrics subsystem available at boot, create a /etc/nvme/discovery.conf file on the host with the parameters passed to the discover command (as described in Section 17.2.2, “Discovering NVMe-oF targets”). For example, if you use the discover command as follows:
> sudo nvme discover -t tcp -a 10.0.0.3 -s 4420
Add the parameters of the discover command to the /etc/nvme/discovery.conf file:
echo "-t tcp -a 10.0.0.3 -s 4420" | sudo tee -a /etc/nvme/discovery.conf
Then enable the nvmf-autoconnect service:
> sudo systemctl enable nvmf-autoconnect.service
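Because tee -a appends on every run, repeating the command adds duplicate entries to the file. The following sketch shows one way to append an entry only when it is missing. The function name add_discovery_entry is hypothetical, and the demonstration writes to a scratch file rather than to /etc/nvme/discovery.conf.

```shell
#!/bin/sh
# Hypothetical helper: append a discovery entry only if it is not present yet.
# Usage: add_discovery_entry "ENTRY" [CONFIG_FILE]
add_discovery_entry() {
    entry=$1
    conf=${2:-/etc/nvme/discovery.conf}

    # -x matches whole lines, -F treats the entry as a fixed string,
    # and -- stops option parsing because the entry starts with a dash.
    if [ -f "$conf" ] && grep -qxF -- "$entry" "$conf"; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$conf"
}

# Demonstration on a scratch file, so no root privileges are needed.
demo_conf=$(mktemp)
add_discovery_entry "-t tcp -a 10.0.0.3 -s 4420" "$demo_conf"
add_discovery_entry "-t tcp -a 10.0.0.3 -s 4420" "$demo_conf"
cat "$demo_conf"
rm -f "$demo_conf"
```

To change the real configuration file, the function needs to run with root privileges.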
17.2.4 Multipathing #
NVMe native multipathing is enabled by default. If the CMIC option in the controller identity settings is set, the NVMe stack recognizes an NVMe drive as a multipathed device by default.
To manage the multipathing, you can use the following:
nvme list-subsys
Prints the layout of the multipath devices.
multipath -ll
The command has a compatibility mode and displays NVMe multipath devices. Bear in mind that you need to enable the enable_foreign option to use the command. For details, refer to Section 18.13, “Miscellaneous options”.
nvme-core.multipath=N
When the option is added as a boot parameter, the NVMe native multipathing will be disabled.
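To check which setting is active on a running system, the value of the module parameter can be read from sysfs. The sketch below assumes that the nvme_core module exposes the parameter as /sys/module/nvme_core/parameters/multipath. The function name native_multipath_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether NVMe native multipathing is enabled.
# Assumption: the parameter is exposed in the sysfs file used as default below.
native_multipath_status() {
    param_file=${1:-/sys/module/nvme_core/parameters/multipath}

    # Without a readable parameter file, nothing can be said.
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 0
    fi

    case "$(cat "$param_file")" in
        Y|y|1) echo "enabled" ;;
        *)     echo "disabled" ;;
    esac
}

native_multipath_status
```

The function prints unknown when the file cannot be read, for example when the nvme_core module is not loaded.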
17.3 Setting up an NVMe-oF target #
17.3.1 Installing command line client #
To configure an NVMe-oF target, you need the nvmetcli command line tool. Install it with zypper:
> sudo zypper in nvmetcli
The current documentation for nvmetcli is available at https://git.infradead.org/users/hch/nvmetcli.git/blob_plain/HEAD:/Documentation/nvmetcli.txt.
17.3.2 Configuration steps #
The following procedure provides an example of how to set up an NVMe-oF target.
The configuration is stored in a tree structure. Use the command cd to navigate. Use ls to list objects. You can create new objects with create.
Start the nvmetcli interactive shell:
> sudo nvmetcli
Create a new port:
(nvmetcli)> cd ports
(nvmetcli)> create 1
(nvmetcli)> ls 1/
o- 1
  o- referrals
  o- subsystems
Create an NVMe subsystem:
(nvmetcli)> cd /subsystems
(nvmetcli)> create nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82
(nvmetcli)> cd nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82/
(nvmetcli)> ls
o- nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82
  o- allowed_hosts
  o- namespaces
Create a new namespace and assign an NVMe device to it:
(nvmetcli)> cd namespaces
(nvmetcli)> create 1
(nvmetcli)> cd 1
(nvmetcli)> set device path=/dev/nvme0n1
Parameter path is now '/dev/nvme0n1'.
Enable the previously created namespace:
(nvmetcli)> cd ..
(nvmetcli)> enable
The Namespace has been enabled.
Display the created namespace:
(nvmetcli)> cd ..
(nvmetcli)> ls
o- nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82
  o- allowed_hosts
  o- namespaces
    o- 1
Allow all hosts to use the subsystem. Only do this in secure environments.
(nvmetcli)> set attr allow_any_host=1
Parameter allow_any_host is now '1'.
Alternatively, you can allow only specific hosts to connect:
(nvmetcli)> cd nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82/allowed_hosts/
(nvmetcli)> create hostnqn
List all created objects:
(nvmetcli)> cd /
(nvmetcli)> ls
o- /
  o- hosts
  o- ports
  | o- 1
  |   o- referrals
  |   o- subsystems
  o- subsystems
    o- nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82
      o- allowed_hosts
      o- namespaces
        o- 1
Make the target available via TCP. Use trtype=rdma for RDMA:
(nvmetcli)> cd ports/1/
(nvmetcli)> set addr adrfam=ipv4 trtype=tcp traddr=10.0.0.3 trsvcid=4420
Parameter trtype is now 'tcp'.
Parameter adrfam is now 'ipv4'.
Parameter trsvcid is now '4420'.
Parameter traddr is now '10.0.0.3'.
Alternatively, you can make it available with Fibre Channel:
(nvmetcli)> cd ports/1/
(nvmetcli)> set addr adrfam=fc trtype=fc traddr=nn-0x1000000044001123:pn-0x2000000055001123 trsvcid=none
Link the subsystem to the port:
(nvmetcli)> cd /ports/1/subsystems
(nvmetcli)> create nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82
Now you can verify that the port is enabled using dmesg:
# dmesg
...
[ 257.872084] nvmet_tcp: enabling port 1 (10.0.0.3:4420)
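The object tree shown by nvmetcli corresponds to the Kernel's nvmet configfs tree, usually mounted at /sys/kernel/config/nvmet. As an alternative to the interactive shell, the same TCP setup can be sketched by writing to that tree directly. This is a different technique from the nvmetcli procedure above, shown for illustration only. It assumes the attribute file names of the upstream nvmet configfs interface (attr_allow_any_host, device_path, enable, addr_*) and that the nvmet and nvmet-tcp modules are loaded. The function name nvmet_setup and its ROOT parameter are hypothetical. The demonstration runs against a scratch directory, not against the real tree.

```shell
#!/bin/sh
# Sketch: configure an NVMe-oF TCP target through an nvmet configfs tree.
# Usage: nvmet_setup ROOT NQN DEVICE ADDRESS PORT
# ROOT is /sys/kernel/config/nvmet on a real target (assumption).
nvmet_setup() {
    root=$1; nqn=$2; device=$3; addr=$4; svc=$5
    subsys="$root/subsystems/$nqn"
    port="$root/ports/1"

    # Create the subsystem and allow any host (secure environments only).
    mkdir -p "$subsys" || return 1
    echo 1 > "$subsys/attr_allow_any_host"

    # Create namespace 1, assign the backing device, then enable it.
    mkdir -p "$subsys/namespaces/1" || return 1
    printf '%s' "$device" > "$subsys/namespaces/1/device_path"
    echo 1 > "$subsys/namespaces/1/enable"

    # Create port 1 and set its address parameters.
    mkdir -p "$port/subsystems" || return 1
    echo ipv4    > "$port/addr_adrfam"
    echo tcp     > "$port/addr_trtype"
    echo "$addr" > "$port/addr_traddr"
    echo "$svc"  > "$port/addr_trsvcid"

    # Link the subsystem to the port to export it.
    ln -s "$subsys" "$port/subsystems/$nqn"
}

# Dry run against a scratch directory instead of the real configfs tree.
scratch=$(mktemp -d)
nvmet_setup "$scratch" \
    nqn.2014-08.org.nvmexpress:NVMf:uuid:c36f2c23-354d-416c-95de-f2b8ec353a82 \
    /dev/nvme0n1 10.0.0.3 4420
find "$scratch" | sort
rm -rf "$scratch"
```

On a real target, the function would be called as root with /sys/kernel/config/nvmet as ROOT. Using nvmetcli remains the documented way to create and save the configuration.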
17.3.3 Back up and restore target configuration #
You can save the target configuration in a JSON file with the following commands:
> sudo nvmetcli
(nvmetcli)> saveconfig nvme-target-backup.json
To restore the configuration, use:
(nvmetcli)> restore nvme-target-backup.json
You can also wipe the current configuration:
(nvmetcli)> clear
17.4 Special hardware configuration #
17.4.1 Overview #
Some hardware needs special configuration to work correctly. Skim the titles of the following sections to see if you are using any of the mentioned devices or vendors.
17.4.2 Broadcom #
If you are using the Broadcom Emulex LightPulse Fibre Channel SCSI driver, add a Kernel configuration parameter on the target and host for the lpfc module. Because the output redirection of a plain sudo echo runs without root privileges, write the file with tee:
> echo "options lpfc lpfc_enable_fc4_type=3" | sudo tee /etc/modprobe.d/lpfc.conf
Make sure that the Broadcom adapter firmware has at least version 11.4.204.33. Also make sure that you have the current versions of nvme-cli, nvmetcli and the Kernel installed.
To enable a Fibre Channel port as an NVMe target, an additional module parameter needs to be configured: lpfc_enable_nvmet=COMMA_SEPARATED_WWPNS. Enter each WWPN with a leading 0x, for example lpfc_enable_nvmet=0x2000000055001122,0x2000000055003344.
Only listed WWPNs will be configured for target mode. A Fibre Channel port can be configured either as target or as initiator.
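The two lpfc module parameters can be placed on a single options line. The following sketch composes such a line from a list of WWPNs and adds the leading 0x where it is missing. The function name lpfc_options_line is hypothetical, and the function only prints the line instead of writing /etc/modprobe.d/lpfc.conf.

```shell
#!/bin/sh
# Hypothetical helper: print the lpfc options line for modprobe.d.
# Arguments are the target-mode WWPNs, with or without a leading 0x.
lpfc_options_line() {
    line="options lpfc lpfc_enable_fc4_type=3"
    wwpns=""

    for wwpn in "$@"; do
        # Add the leading 0x when it is missing.
        case "$wwpn" in
            0x*) ;;
            *) wwpn="0x$wwpn" ;;
        esac
        # Join the WWPNs with commas.
        wwpns="${wwpns:+$wwpns,}$wwpn"
    done

    # Without WWPNs, no port is configured for target mode.
    if [ -n "$wwpns" ]; then
        line="$line lpfc_enable_nvmet=$wwpns"
    fi
    printf '%s\n' "$line"
}

lpfc_options_line 0x2000000055001122 2000000055003344
```

The printed line can be written to the configuration file by piping it to sudo tee /etc/modprobe.d/lpfc.conf.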
17.4.3 Marvell #
FC-NVMe is supported on QLE269x and QLE27xx adapters. FC-NVMe support is enabled by default in the Marvell® QLogic® QLA2xxx Fibre Channel driver.
To confirm NVMe is enabled, run the following command:
> cat /sys/module/qla2xxx/parameters/ql2xnvmeenable
A resulting 1 means NVMe is enabled, a 0 indicates it is disabled.
Next, ensure that the Marvell adapter firmware is at least version 8.08.204 by checking the output of the following command:
> cat /sys/class/scsi_host/host0/fw_version
Last, ensure that the latest versions available for SUSE Linux Enterprise Server of nvme-cli, QConvergeConsoleCLI, and the Kernel are installed. You may, for example, run
# zypper lu && zypper pchk
to check for updates and patches.
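The minimum firmware versions named in this section can be compared with a short script instead of by eye. The sketch below relies on sort -V (version sort, available in GNU coreutils). The function name version_at_least is hypothetical, and the script assumes that the first whitespace-separated field of the fw_version file holds the version number.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
    have=$1
    need=$2
    # The smaller of the two versions sorts first.
    lowest=$(printf '%s\n%s\n' "$have" "$need" | sort -V | head -n 1)
    [ "$lowest" = "$need" ]
}

# Assumption: the first field of fw_version is the firmware version.
fw_file=/sys/class/scsi_host/host0/fw_version
if [ -r "$fw_file" ]; then
    fw=$(awk '{print $1; exit}' "$fw_file")
    if version_at_least "$fw" 8.08.204; then
        echo "firmware $fw meets the minimum version"
    else
        echo "firmware $fw is older than 8.08.204"
    fi
fi
```

The script prints nothing when the fw_version file does not exist, for example when the adapter is registered under a different host number.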
For more details on installation, please refer to the FC-NVMe sections in the following Marvell user guides:
17.5 Booting from NVMe-oF over TCP #
SLES supports booting from NVMe-oF over TCP according to the NVM Express® Boot Specification 1.0.
The UEFI pre-boot environment can be configured to attempt NVMe-oF over TCP connections to remote storage servers and use these for booting. The pre-boot environment creates an ACPI table, the NVMe Boot Firmware Table (NBFT), to store information about the NVMe-oF configuration used for booting. The operating system uses this table at a later boot stage to set up networking and NVMe-oF connections to access the root file system.
17.5.1 System requirements #
To boot the system from NVMe-oF over TCP, the following requirements must be met:
SLES 15 SP5 or later.
A SAN storage array supporting NVMe-oF over TCP.
A host system with a BIOS that supports booting from NVMe-oF over TCP. Contact your hardware vendor for information about support for this feature. Booting from NVMe-oF over TCP is currently only supported on UEFI platforms.
17.5.2 Installation #
To install SLES from NVMe-oF over TCP, proceed as follows:
Use the host system's UEFI setup menus to configure NVMe-oF connections to be established at boot time. Typically, you need to configure both networking (local IP addresses, gateways, etc.) and NVMe-oF targets (remote IP address, subsystem NQN or discovery NQN). Refer to the hardware documentation for a description of the configuration. Your hardware vendor may provide means to manage the BIOS configuration centrally and remotely. Contact your hardware vendor for additional information.
Prepare the installation as described in “Introduction to deployment”.
Start the system installation using any supported installation method. You do not need to use any specific boot parameters to enable installation on NVMe-oF over TCP.
If the BIOS has been configured correctly, the disk partitioning dialog in YaST will show NVMe namespaces exported by the subsystems configured in the BIOS. They will be displayed as NVMe devices, where the tcp string indicates that the devices are connected via the TCP transport. Install the operating system (in particular the EFI boot partition and the root file system) on these namespaces.
Complete the installation.
After installation, the system should boot from NVMe-oF over TCP automatically. If it does not, check if the boot priority is set correctly in the BIOS setup.
The network interfaces used for booting are named nbft0, nbft1 and so on. To get information about the NVMe-oF boot, run the command:
# nvme nbft show
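To see which of the boot interfaces exist on a running system, the interface names can be listed from sysfs. The following is a minimal sketch. The function name list_nbft_interfaces is hypothetical, and the optional directory argument exists only to make the function easy to try out.

```shell
#!/bin/sh
# Hypothetical helper: list network interfaces named nbft0, nbft1, and so on.
# Usage: list_nbft_interfaces [NET_DIR]
list_nbft_interfaces() {
    net_dir=${1:-/sys/class/net}
    for path in "$net_dir"/nbft*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$path" ] || continue
        basename "$path"
    done
}

list_nbft_interfaces
```

On a system that was not booted from NVMe-oF over TCP, the function prints nothing.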
17.6 More information #
For more details about the abilities of the nvme command, refer to nvme help.
The following links provide a basic introduction to NVMe and NVMe-oF: