Applies to SUSE Linux Enterprise High Availability 15 SP5

23 DRBD

The distributed replicated block device (DRBD*) allows you to create a mirror of two block devices that are located at two different sites across an IP network. When used with Corosync, DRBD supports distributed high-availability Linux clusters. This chapter shows you how to install and set up DRBD.

23.1 Conceptual overview

DRBD replicates data on the primary device to the secondary device in a way that ensures that both copies of the data remain identical. Think of it as a networked RAID 1. It mirrors data in real-time, so its replication occurs continuously. Applications do not need to know that in fact their data is stored on different disks.

DRBD is a Linux kernel module that sits between the I/O scheduler at the lower end and the file system at the upper end (see Figure 23.1, “Position of DRBD within Linux”). To communicate with DRBD, use the high-level command drbdadm. For maximum flexibility, DRBD also comes with the low-level tool drbdsetup.

Figure 23.1: Position of DRBD within Linux
Important: Unencrypted data

The data traffic between mirrors is not encrypted. For secure data exchange, you should deploy a Virtual Private Network (VPN) solution for the connection.

DRBD allows you to use any block device supported by Linux, usually:

  • partition or complete hard disk

  • software RAID

  • Logical Volume Manager (LVM)

  • Enterprise Volume Management System (EVMS)

By default, DRBD uses the TCP ports 7788 and higher for communication between DRBD nodes. Make sure that your firewall does not prevent communication on the used ports.
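Because each DRBD resource uses its own port, starting at 7788 by default, it can be convenient to open a contiguous port range. The following sketch only prints the firewall-cmd calls for a given base port and number of resources so that you can review them before running them as root. The helper name drbd_firewall_cmds is hypothetical, and the sketch assumes that firewalld is the active firewall on the node.

```shell
# Hypothetical helper: print (do not run) the firewalld commands that open
# one TCP port per DRBD resource, starting at a base port.
# Assumption: firewalld is the active firewall on this node.
drbd_firewall_cmds() {
    local base_port="$1" count="$2" last_port
    last_port=$((base_port + count - 1))
    if [ "$count" -eq 1 ]; then
        echo "firewall-cmd --permanent --add-port=${base_port}/tcp"
    else
        echo "firewall-cmd --permanent --add-port=${base_port}-${last_port}/tcp"
    fi
    # Permanent rules only become active after a reload.
    echo "firewall-cmd --reload"
}

# Example: three resources that use the ports 7788 to 7790.
drbd_firewall_cmds 7788 3
```

Run the printed commands on every node that takes part in the replication.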

You must set up the DRBD devices before creating file systems on them. Everything pertaining to user data should be done solely via the /dev/drbdN device and not on the raw device, as DRBD uses the last part of the raw device for metadata. Using the raw device will cause inconsistent data.

With udev integration, you also get symbolic links in the form /dev/drbd/by-res/RESOURCES. These links are easier to use and protect you from accidentally using the wrong minor number of the device.

For example, if the raw device is 1024 MB in size, the DRBD device has only about 1023 MB available for data, with about 70 KB hidden and reserved for the metadata. Any attempt to access the remaining kilobytes via the raw disk fails because this area is not available for user data.
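The size of the reserved area can be estimated in advance. The following sketch assumes the internal metadata formula given in the upstream DRBD 9 user's guide: metadata sectors = ceil(device sectors / 2^18) × 8 × number of peers + 72, with 512-byte sectors. The helper name drbd_md_sectors is hypothetical, and the result is an estimate, not a value reported by DRBD.

```shell
# Hypothetical helper: estimate the internal metadata size in 512-byte sectors.
# Assumed formula (upstream DRBD 9 user's guide):
#   ceil(device_sectors / 2^18) * 8 * peers + 72
drbd_md_sectors() {
    local dev_sectors="$1" peers="${2:-1}"
    # Adding 262143 before the integer division rounds up to the next 2^18.
    echo $(( (dev_sectors + 262143) / 262144 * 8 * peers + 72 ))
}

# A 1024 MiB raw device has 2097152 sectors of 512 bytes; one peer.
sectors=$(drbd_md_sectors 2097152 1)
echo "metadata: ${sectors} sectors, $((sectors / 2)) KiB"
```

For the 1024 MB device with one peer, this gives 136 sectors (68 KiB), which is consistent with the figure of about 70 KB mentioned above.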

23.2 Installing DRBD services

Install the High Availability pattern on both SUSE Linux Enterprise Server machines in your networked cluster as described in Part I, “Installation and setup”. Installing the pattern also installs the DRBD program files.

If you do not need the complete cluster stack but only want to use DRBD, install the packages drbd, drbd-kmp-FLAVOR, drbd-utils, and yast2-drbd.

23.3 Setting up the DRBD service

Note: Adjustments needed

The following procedures use the server names alice and bob, and the DRBD resource name r0. They set up alice as the primary node and use /dev/disk/by-id/example-disk1 for storage. Make sure to modify the instructions to use your own nodes and file names.

The following procedures assume that the cluster nodes use the TCP port 7788. Make sure this port is open in your firewall.

To set up DRBD, perform the following procedures:

23.3.1 Preparing your system to use DRBD

Before you start configuring DRBD, you might need to perform some or all of the following steps:

Procedure 23.1: Preparing your system to use DRBD
  1. Make sure the block devices in your Linux nodes are ready and partitioned (if needed).

  2. If your disk already contains a file system that you do not need anymore, destroy the file system structure with the following command:

    # dd if=/dev/zero of=YOUR_DEVICE count=16 bs=1M

    If you have more file systems to destroy, repeat this step on all devices you want to include into your DRBD setup.

  3. If the cluster is already using DRBD, put your cluster in maintenance mode:

    # crm maintenance on

    If you skip this step when your cluster already uses DRBD, a syntax error in the live configuration leads to a service shutdown. Alternatively, you can also use drbdadm -c FILE to test a configuration file.

23.3.2 Configuring DRBD manually

Note: Restricted support for auto promote feature

The DRBD9 auto-promote feature can automatically promote a resource to the primary role when one of its devices is mounted or opened for writing.

The auto-promote feature currently has restricted support. With DRBD 9, SUSE supports the same use cases that were also supported with DRBD 8. Use cases beyond that, such as setups with more than two nodes, are not supported.

To set up DRBD manually, proceed as follows:

Procedure 23.2: Manually configuring DRBD

Beginning with DRBD version 8.3, the former configuration file is split into separate files, located under the directory /etc/drbd.d/.

  1. Open the file /etc/drbd.d/global_common.conf. It already contains global, pre-defined values. Go to the startup section and insert these lines:

    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout
        # wait-after-sb;
        wfc-timeout 100;
        degr-wfc-timeout 120;
    }

    These options reduce the timeouts when booting. For more details, see https://docs.linbit.com/docs/users-guide-9.0/#ch-configure.

  2. Create the file /etc/drbd.d/r0.res. Change the lines according to your situation and save it:

    resource r0 { # (1)
      device /dev/drbd0; # (2)
      disk /dev/disk/by-id/example-disk1; # (3)
      meta-disk internal; # (4)
      on alice { # (5)
        address 192.168.1.10:7788; # (6)
        node-id 0; # (7)
      }
      on bob { # (5)
        address 192.168.1.11:7788; # (6)
        node-id 1; # (7)
      }
      disk {
        resync-rate 10M; # (8)
      }
      connection-mesh { # (9)
        hosts alice bob;
      }
      net {
        protocol C; # (10)
        fencing resource-and-stonith; # (11)
      }
      handlers { # (12)
        fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.9.sh";
      }
    }

    1

    The DRBD resource name. Choose a name that indicates which service uses the resource, for example, nfs, http, mysql_0, or postgres_wal. Here, the more general name r0 is used.

    2

    The device name for DRBD and its minor number.

    In the example above, the minor number 0 is used for DRBD. The udev integration scripts give you a symbolic link /dev/drbd/by-res/r0/0. Alternatively, omit the device node name in the configuration and use the following line instead:

    drbd0 minor 0 (/dev/ is optional) or /dev/drbd0

    3

    The raw device that is replicated between nodes. In this example, the devices are the same on both nodes. If you need different devices, move the disk parameter into the respective on HOST section.

    4

    The meta-disk parameter usually contains the value internal, but it is possible to specify an explicit device to hold the metadata. See https://docs.linbit.com/docs/users-guide-9.0/#s-metadata for more information.

    5

    The on section states which host this configuration statement applies to.

    6

    The IP address and port number of the respective node. Each resource needs an individual port, usually starting with 7788. Both nodes must use the same port for a given DRBD resource.

    7

    The node ID is required when configuring more than two nodes. It is a unique, non-negative integer to distinguish the different nodes.

    8

    The synchronization rate. Set it to one third of the lower of the disk bandwidth and the network bandwidth. It only limits the resynchronization, not the replication.

    9

    Defines all nodes of a mesh. The hosts parameter contains all host names that share the same DRBD setup.

    10

    The protocol to use for this connection. Protocol C is the default option. It provides better data availability and does not consider a write to be complete until it has reached all local and remote disks.

    11

    Specifies the fencing policy resource-and-stonith at the DRBD level. This policy immediately suspends active I/O operations until STONITH completes.

    12

    Enables resource-level fencing to prevent Pacemaker from starting a service with outdated data. If the DRBD replication link becomes disconnected, the crm-fence-peer.9.sh script stops the DRBD resource from being promoted to another node until the replication link becomes connected again and DRBD completes its synchronization process.

  3. Check the syntax of your configuration files. If the following command returns an error, verify your files:

    # drbdadm dump all
  4. Copy the DRBD configuration files to all nodes:

    # csync2 -xv

    By default, the DRBD configuration file /etc/drbd.conf and the directory /etc/drbd.d/ are already included in the list of files that Csync2 synchronizes.
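The rule for the resync-rate value in the r0.res example above (one third of the lower of the disk bandwidth and the network bandwidth) can be sketched as a small calculation. The helper name drbd_resync_rate and the bandwidth figures are assumptions for illustration; measure the real throughput of your own disks and replication link.

```shell
# Hypothetical helper: suggest a resync-rate value from the disk and network
# bandwidth, both given in MB/s. Follows the rule of thumb from the callout:
# one third of the lower of the two values.
drbd_resync_rate() {
    local disk_mb="$1" net_mb="$2" lower
    if [ "$disk_mb" -lt "$net_mb" ]; then
        lower="$disk_mb"
    else
        lower="$net_mb"
    fi
    # Integer division rounds down, which errs on the conservative side.
    echo "$((lower / 3))M"
}

# Assumed figures: 180 MB/s disk throughput, 110 MB/s usable network bandwidth.
drbd_resync_rate 180 110
```

The printed value can be used as the argument of resync-rate in the disk section.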

23.3.3 Configuring DRBD with YaST

YaST can be used to start with an initial setup of DRBD. After you have created your DRBD setup, you can fine-tune the generated files manually.

However, after you have changed the configuration files manually, do not use the YaST DRBD module anymore. The DRBD module supports only a limited set of basic configuration options. If you use it again, the module will probably not show your changes.

To set up DRBD with YaST, proceed as follows:

Procedure 23.3: Using YaST to configure DRBD
  1. Start YaST and select the configuration module High Availability › DRBD. If you already have a DRBD configuration, YaST warns you. YaST will change your configuration and save your old DRBD configuration files as *.YaSTsave.

  2. Leave the booting flag in Start-up Configuration › Booting as it is (by default it is off); do not change that as Pacemaker manages this service.

  3. If you have a firewall running, enable Open Port in Firewall.

  4. Go to the Resource Configuration entry. Select Add to create a new resource (see Figure 23.2, “Resource configuration”).

    Figure 23.2: Resource configuration

    The following parameters need to be set:

    Resource Name

    The name of the DRBD resource (mandatory)

    Name

    The host name of the relevant node

    Address:Port

    The IP address and port number (default 7788) for the respective node

    Device

    The block device path that is used to access the replicated data. If the device contains a minor number, the associated block device is usually named /dev/drbdX, where X is the device minor number. If the device does not contain a minor number, make sure to add minor 0 after the device name.

    Disk

    The raw device that is replicated between both nodes. If you use LVM, insert your LVM device name.

    Meta-disk

    The Meta-disk is either set to the value internal or specifies an explicit device extended by an index to hold the metadata needed by DRBD.

    A real device may also be used for multiple DRBD resources. For example, if your Meta-Disk is /dev/disk/by-id/example-disk6[0] for the first resource, you may use /dev/disk/by-id/example-disk6[1] for the second resource. However, there must be at least 128 MB space for each resource available on this disk. The fixed metadata size limits the maximum data size that you can replicate.

    All these options are explained in the examples in the /usr/share/doc/packages/drbd/drbd.conf file and in the man page of drbd.conf(5).

  5. Click Save.

  6. Click Add to enter the second DRBD resource and finish with Save.

  7. Close the resource configuration with Ok and Finish.

  8. If you use LVM with DRBD, it is necessary to change some options in the LVM configuration file (see the LVM Configuration entry). This change can be done by the YaST DRBD module automatically.

    The disk name of localhost for the DRBD resource and the default filter will be rejected in the LVM filter. Only /dev/drbd can be scanned for an LVM device.

    For example, if /dev/disk/by-id/example-disk1 is used as a DRBD disk, the device name will be inserted as the first entry in the LVM filter. To change the filter manually, deactivate the Modify LVM Device Filter Automatically check box.

  9. Save your changes with Finish.

  10. Copy the DRBD configuration files to all nodes:

    # csync2 -xv

    By default, the DRBD configuration file /etc/drbd.conf and the directory /etc/drbd.d/ are already included in the list of files that Csync2 synchronizes.

23.3.4 Initializing and formatting DRBD resources

After you have prepared your system and configured DRBD, initialize your disk for the first time:

  1. On both nodes (alice and bob), initialize the metadata storage:

    # drbdadm create-md r0
    # drbdadm up r0
  2. To shorten the initial resynchronization of your DRBD resource, check the following:

    • If the DRBD devices on all nodes have the same data (for example, by destroying the file system structure with the dd command as shown in Section 23.3, “Setting up the DRBD service”), then skip the initial resynchronization with the following command (on both nodes):

      # drbdadm new-current-uuid --clear-bitmap r0/0

      The state will be Secondary/Secondary UpToDate/UpToDate

    • Otherwise, proceed with the next step.

  3. On the primary node alice, start the resynchronization process:

    # drbdadm primary --force r0
  4. Check the status with:

    # drbdadm status r0
    r0 role:Primary
      disk:UpToDate
      bob role:Secondary
      peer-disk:UpToDate
  5. Create your file system on top of your DRBD device, for example:

    # mkfs.ext3 /dev/drbd0
  6. Mount the file system and use it:

    # mount /dev/drbd0 /mnt/
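Before creating the file system, it can be useful to check the status output in a script instead of reading it manually. The following sketch reads text in the drbdadm status format shown above from standard input and succeeds only if every disk and peer-disk field reports UpToDate. The helper name drbd_all_uptodate is hypothetical, and the parsing assumes the field layout of the example output.

```shell
# Hypothetical helper: read `drbdadm status RESOURCE` output on stdin and
# return 0 only if all disk: and peer-disk: fields are UpToDate.
drbd_all_uptodate() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(disk|peer-disk):/) {
                    seen = 1
                    split($i, kv, ":")
                    if (kv[2] != "UpToDate") bad = 1
                }
            }
        }
        END {
            # Fail if no disk state was found at all, or if any was not UpToDate.
            rc = (seen && !bad) ? 0 : 1
            exit rc
        }
    '
}

# Demonstration with the sample output from the procedure above.
# On a real node: drbdadm status r0 | drbd_all_uptodate
drbd_all_uptodate <<'EOF' && echo "r0 is fully synchronized"
r0 role:Primary
  disk:UpToDate
  bob role:Secondary
  peer-disk:UpToDate
EOF
```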

23.3.5 Creating cluster resources for DRBD

After you have initialized your DRBD device, create a cluster resource to manage the DRBD device, and a promotable clone to allow this resource to run on both nodes:

Procedure 23.4: Creating cluster resources for DRBD
  1. Start the crm interactive shell:

    # crm configure
  2. Create a primitive for the DRBD resource r0:

    crm(live)configure# primitive drbd-r0 ocf:linbit:drbd \
      params drbd_resource="r0" \
      op monitor interval=15 role=Promoted \
      op monitor interval=30 role=Unpromoted
  3. Create a promotable clone for the drbd-r0 primitive:

    crm(live)configure# clone cl-drbd-r0 drbd-r0 \
      meta promotable="true" promoted-max="1" promoted-node-max="1" \
      clone-max="2" clone-node-max="1" notify="true" interleave=true
  4. Commit this configuration:

    crm(live)configure# commit
  5. Exit the interactive shell:

    crm(live)configure# quit

If you put the cluster in maintenance mode before configuring DRBD, you can now move it back to normal operation with crm maintenance off.

23.4 Creating a stacked DRBD device

A stacked DRBD device contains two other devices of which at least one device is also a DRBD resource. In other words, DRBD adds an additional node on top of an already existing DRBD resource (see Figure 23.3, “Resource stacking”). Such a replication setup can be used for backup and disaster recovery purposes.

Figure 23.3: Resource stacking

Three-way replication uses asynchronous (DRBD protocol A) and synchronous replication (DRBD protocol C). The asynchronous part is used for the stacked resource whereas the synchronous part is used for the backup.

Your production environment uses the stacked device. For example, if you have a DRBD device /dev/drbd0 and a stacked device /dev/drbd10 on top, the file system is created on /dev/drbd10. For more details, see Example 23.1, “Configuration of a three-node stacked DRBD resource”.

Example 23.1: Configuration of a three-node stacked DRBD resource
# /etc/drbd.d/r0.res
resource r0 {
  protocol C;
  device    /dev/drbd0;
  disk      /dev/disk/by-id/example-disk1;
  meta-disk internal;

  on amsterdam-alice {
    address    192.168.1.1:7900;
  }

  on amsterdam-bob {
    address    192.168.1.2:7900;
  }
}

resource r0-U {
  protocol A;
  device     /dev/drbd10;

  stacked-on-top-of r0 {
    address    192.168.2.1:7910;
  }

  on berlin-charlie {
    disk       /dev/disk/by-id/example-disk10;
    address    192.168.2.2:7910; # Public IP of the backup node
    meta-disk  internal;
  }
}

23.5 Testing the DRBD service

If the installation and configuration procedures worked as expected, you are ready to run a basic test of the DRBD functionality. This test also helps you understand how the software works.

  1. Test the DRBD service on alice.

    1. Open a terminal console, then log in as root.

    2. Create a mount point on alice, such as /srv/r0:

      # mkdir -p /srv/r0
    3. Mount the DRBD device:

      # mount -o rw /dev/drbd0 /srv/r0
    4. Create a file from the primary node:

      # touch /srv/r0/from_alice
    5. Unmount the disk on alice:

      # umount /srv/r0
    6. Downgrade the DRBD service on alice by typing the following command on alice:

      # drbdadm secondary r0
  2. Test the DRBD service on bob.

    1. Open a terminal console, then log in as root on bob.

    2. On bob, promote the DRBD service to primary:

      # drbdadm primary r0
    3. On bob, check to see if bob is primary:

      # drbdadm status r0
    4. On bob, create a mount point such as /srv/r0:

      # mkdir /srv/r0
    5. On bob, mount the DRBD device:

      # mount -o rw /dev/drbd0 /srv/r0
    6. Verify that the file you created on alice exists:

      # ls /srv/r0/from_alice

      The /srv/r0/from_alice file should be listed.

  3. If the service is working on both nodes, the DRBD setup is complete.

  4. Set up alice as the primary again.

    1. Unmount the disk on bob by typing the following command on bob:

      # umount /srv/r0
    2. Downgrade the DRBD service on bob by typing the following command on bob:

      # drbdadm secondary r0
    3. On alice, promote the DRBD service to primary:

      # drbdadm primary r0
    4. On alice, check to see if alice is primary:

      # drbdadm status r0
  5. To get the service to start automatically and fail over if the server has a problem, set up DRBD as a high availability service with Pacemaker/Corosync. For information about installing and configuring the cluster stack for SUSE Linux Enterprise 15 SP5, see Part II, “Configuration and administration”.
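The manual handover used in this test (unmount and demote on one node, then promote and mount on the other) can be wrapped in two small functions. This is a sketch, not a replacement for cluster-managed failover: the function names run, drbd_release, and drbd_takeover and the DRY_RUN variable are hypothetical. With DRY_RUN set, the commands are only printed, which allows you to review them first.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the current primary: unmount the file system and demote the resource.
drbd_release() {
    local res="$1" mnt="$2"
    run umount "$mnt" && run drbdadm secondary "$res"
}

# On the peer: promote the resource and mount the DRBD device.
drbd_takeover() {
    local res="$1" dev="$2" mnt="$3"
    run drbdadm primary "$res" &&
        run mkdir -p "$mnt" &&
        run mount -o rw "$dev" "$mnt"
}

# Preview the commands for the example resource without executing them.
DRY_RUN=1
drbd_release r0 /srv/r0              # would be run on alice
drbd_takeover r0 /dev/drbd0 /srv/r0  # would be run on bob
```

To execute the commands, leave DRY_RUN unset and call drbd_release on the current primary before calling drbd_takeover on the peer.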

23.6 Monitoring DRBD devices

DRBD comes with the utility drbdmon, which offers real-time monitoring. It shows all the configured resources and their problems.

Figure 23.4: Showing a good connection by drbdmon

In case of problems, drbdmon shows an error message:

Figure 23.5: Showing a bad connection by drbdmon

23.7 Tuning DRBD

There are several ways to tune DRBD:

  1. Use an external disk for your metadata. This might help, at the cost of maintenance ease.

  2. Tune your network connection, by changing the receive and send buffer settings via sysctl.

  3. Change the max-buffers, max-epoch-size or both in the DRBD configuration.

  4. Increase the al-extents value, depending on your IO patterns.

  5. If you have a hardware RAID controller with a BBU (Battery Backup Unit), you might benefit from setting no-disk-flushes, no-disk-barrier and/or no-md-flushes.

  6. Enable read-balancing depending on your workload. See https://www.linbit.com/en/read-balancing/ for more details.

23.8 Troubleshooting DRBD

The DRBD setup involves many components and problems may arise from different sources. The following sections cover several common scenarios and recommend various solutions.

23.8.1 Configuration

If the initial DRBD setup does not work as expected, there is probably something wrong with your configuration.

To get information about the configuration:

  1. Open a terminal console, then log in as root.

  2. Test the configuration file by running drbdadm with the -d option. Enter the following command:

    # drbdadm -d adjust r0

    In a dry run of the adjust option, drbdadm compares the actual configuration of the DRBD resource with your DRBD configuration file, but it does not execute the calls. Review the output to make sure you know the source and cause of any errors.

  3. If there are errors in the /etc/drbd.d/* and drbd.conf files, correct them before continuing.

  4. If the partitions and settings are correct, run drbdadm again without the -d option.

    # drbdadm adjust r0

    This applies the configuration file to the DRBD resource.

23.8.2 Host names

For DRBD, host names are case-sensitive (Node0 would be a different host than node0). They are compared to the host name as stored in the kernel (see the uname -n output).

If you have several network devices and want to use a dedicated network device, the host name will likely not resolve to the used IP address. In this case, use the parameter disable-ip-verification.
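Because the comparison is case-sensitive, a quick check is to test whether the output of uname -n appears literally in an on section of the resource file. The following sketch does this with awk. The helper name drbd_host_in_resource is hypothetical, and the parsing assumes that each on section starts on its own line in the form on HOST {, as in the r0.res example.

```shell
# Hypothetical helper: return 0 if HOST appears, with exact case, as a host
# name in an "on HOST {" line of the given DRBD resource file.
drbd_host_in_resource() {
    local host="$1" res_file="$2"
    awk -v host="$host" '
        $1 == "on" {
            # Compare every word between "on" and the opening brace.
            for (i = 2; i <= NF && $i != "{"; i++)
                if ($i == host) found = 1
        }
        END {
            rc = found ? 0 : 1
            exit rc
        }
    ' "$res_file"
}

# Usage on a cluster node (path taken from the example setup):
# drbd_host_in_resource "$(uname -n)" /etc/drbd.d/r0.res ||
#     echo "Host name $(uname -n) is not listed in r0.res"
```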

23.8.3 TCP port 7788

If your system cannot connect to the peer, this might be a problem with your local firewall. By default, DRBD uses the TCP port 7788 to access the other node. Make sure that this port is accessible on both nodes.

23.8.4 DRBD devices broken after reboot

If DRBD does not know which of the real devices holds the latest data, it changes to a split-brain condition. The respective DRBD subsystems then come up as secondary and do not connect to each other, and the following message can be found in the logging data:

Split-Brain detected, dropping connection!

To resolve this situation, enter the following commands on the node which has data to be discarded:

# drbdadm disconnect r0
# drbdadm secondary r0
# drbdadm connect --discard-my-data r0

On the node which has the latest data, enter the following commands:

# drbdadm disconnect r0
# drbdadm connect r0

That resolves the issue by overwriting one node's data with the peer's data, therefore getting a consistent view on both nodes.
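The recovery steps above differ per node, so it helps to make the role explicit before typing anything. The following sketch detects the split-brain message in log text read from standard input and prints the commands from this section for the chosen role. The helper names and the role labels victim and survivor are hypothetical. The sketch only prints the commands, because deciding which node's data to discard remains a manual decision.

```shell
# Hypothetical helper: return 0 if the split-brain message occurs in the log
# text on stdin. Assumption: DRBD logs to the kernel log, for example:
#   journalctl -k | drbd_split_brain_seen
drbd_split_brain_seen() {
    grep -F -q 'Split-Brain detected'
}

# Hypothetical helper: print the recovery commands from this section.
#   victim   = node whose data is discarded
#   survivor = node that holds the latest data
drbd_split_brain_cmds() {
    local role="$1" res="$2"
    case "$role" in
        victim)
            echo "drbdadm disconnect $res"
            echo "drbdadm secondary $res"
            echo "drbdadm connect --discard-my-data $res"
            ;;
        survivor)
            echo "drbdadm disconnect $res"
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: drbd_split_brain_cmds victim|survivor RESOURCE" >&2
            return 1
            ;;
    esac
}

# Print the commands for the node whose changes are discarded.
drbd_split_brain_cmds victim r0
```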

23.9 For more information

The following open source resources are available for DRBD: