Thursday, November 8, 2012

High-Availability storage using DRBD with OCFS2 on Ubuntu 12.04

This howto is designed to solve the following problems:
  • Files in sync over multiple servers in case of
    • Power outage in a single location
    • Hardware failure
    • OS lockup
  • Optimized read performance
  • Mounted filesystem on multiple servers
Other considerations specific to my solution:
  • Limited amount of writes
  • Low latency and high bandwidth between locations

Prerequisites

  • Two servers
  • Ubuntu 12.04
  • An equally sized raw hard drive partition on each server
  • NTP client installed and running

DRBD set up to act as RAID 1 (mirroring)

All steps should be done on both nodes unless stated otherwise

Install and activate DRBD tools and kernel module:
apt-get install drbd8-utils
modprobe drbd
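
To confirm that the module actually loaded, one option is to read the version line from /proc/drbd, which only exists while the module is loaded. The following is a minimal sketch; the function name drbd_module_version and its optional file argument are made up for illustration and are not part of DRBD.

```shell
# Print the DRBD kernel module version, or fail if the module is not loaded.
# The optional argument (default /proc/drbd) exists only so the function can
# be tried against a saved copy of the file.
drbd_module_version() {
    status_file="${1:-/proc/drbd}"
    [ -r "$status_file" ] || return 1
    # The first line looks like: "version: 8.3.11 (api:88/proto:86-96)"
    awk '$1 == "version:" { print $2; found = 1; exit } END { exit !found }' "$status_file"
}

# Example:
# drbd_module_version || echo "drbd module is not loaded"
```

Knowing the version is also useful later, because the comments below mention building DRBD 8.4 from source, and the configuration syntax differs between 8.3 and 8.4.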


Set up a resource config:
sudo nano /etc/drbd.d/disk.res
# Config by Jon Skarpeteig -- 06.11.2012
resource r0 {
        protocol C;
        syncer { rate 1000M; }
        startup {
                wfc-timeout  15;
                degr-wfc-timeout 60;
                become-primary-on both;
        }
        net {
# allow-two-primaries - Generally, DRBD has a primary and a secondary node.
# In this case, we will allow both nodes to have the filesystem mounted at
# the same time. Do this only with a clustered filesystem. If you do this
# with a non-clustered filesystem like ext2/ext3/ext4 or reiserfs, you will
# have data corruption.
                allow-two-primaries;

# after-sb-0pri discard-zero-changes - DRBD detected a split-brain scenario,
# but none of the nodes think they're a primary. DRBD will take the newest
# modifications and apply them to the node that didn't have any changes.
                after-sb-0pri discard-zero-changes;

# after-sb-1pri discard-secondary - DRBD detected a split-brain scenario,
# but one node is the primary and the other is the secondary. In this case,
# DRBD will decide that the secondary node is the victim and it will sync data
# from the primary to the secondary automatically.
                after-sb-1pri discard-secondary;

# after-sb-2pri disconnect - DRBD detected a split-brain scenario, but it can't
# figure out which node has the right data. It tries to protect the consistency
# of both nodes by disconnecting the DRBD volume entirely. You'll have to tell
# DRBD which node has the valid data in order to reconnect the volume.
                after-sb-2pri disconnect;

                cram-hmac-alg sha1;
                shared-secret "secret";
        }
        on server1 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.0.0.101:7788;
                meta-disk internal;
        }
        on server2 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.0.0.102:7788;
                meta-disk internal;
        }
}

Note: server1 and server2 here are the hostnames reported by 'uname -n' on each node, and must be resolvable in DNS with both A and PTR records. For good measure, we can also add them to /etc/hosts directly, so that DNS problems do not break the DRBD link.
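
As a sketch of the /etc/hosts step, the helper below appends an entry only if the name is not already present, so it can be run repeatedly without creating duplicates. The function name add_host_entry is hypothetical; the addresses and hostnames are the ones from the resource config above.

```shell
# Append "IP<TAB>NAME" to a hosts file unless NAME already has an entry.
# Usage: add_host_entry IP NAME [HOSTS_FILE]
add_host_entry() {
    ip="$1"; name="$2"; hosts_file="${3:-/etc/hosts}"
    # Look for NAME as a whole field after the address, skipping comment lines.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Run as root on both nodes:
# add_host_entry 10.0.0.101 server1
# add_host_entry 10.0.0.102 server2
```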

Initialize the DRBD metadata and start the DRBD service:
drbdadm create-md r0
/etc/init.d/drbd start
Then make server1 the primary node (run this on server1 only!):
drbdadm -- --overwrite-data-of-peer primary all

At this point, DRBD should start synchronizing the disk from server1 to server2. You can view the progress using:
/etc/init.d/drbd status
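
If you would rather script the wait than watch the status output, the disk states can be read from the ds: field in /proc/drbd. This is a rough sketch assuming the DRBD 8.3 /proc/drbd layout (a line such as " 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----"); the function names are invented for this example.

```shell
# Print the disk states ("local/peer") of a DRBD minor device, taken from the
# ds: field of /proc/drbd, e.g. "UpToDate/Inconsistent" during the first sync.
# Usage: drbd_disk_state [MINOR] [STATUS_FILE]
drbd_disk_state() {
    minor="${1:-0}"; status_file="${2:-/proc/drbd}"
    awk -v m="$minor:" '$1 == m {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^ds:/) { print substr($i, 4); exit }
    }' "$status_file"
}

# Block until both sides report UpToDate, polling every 10 seconds.
# Note: this loops forever if DRBD is not running, so use it interactively.
wait_for_drbd_sync() {
    until [ "$(drbd_disk_state "${1:-0}")" = "UpToDate/UpToDate" ]; do
        sleep 10
    done
}

# Example:
# wait_for_drbd_sync 0 && echo "initial sync finished"
```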

Once the sync is complete, you can make server2 primary as well (run on server2 only):
drbdadm primary r0
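
The resource config uses 'after-sb-2pri disconnect', so a split brain with both nodes primary leaves the recovery to you. As a reminder of what that usually involves, the sketch below only prints the commands for each side instead of running them, since the victim's changes are thrown away. It assumes DRBD 8.3 syntax (the same 'drbdadm -- --option' form used above); the function name split_brain_commands is hypothetical.

```shell
# Print (do not run) the DRBD 8.3 commands for manual split-brain recovery.
# ROLE is "victim" (the node whose changes are discarded) or "survivor".
# Usage: split_brain_commands ROLE [RESOURCE]
split_brain_commands() {
    role="$1"; res="${2:-r0}"
    case "$role" in
        victim)
            echo "drbdadm secondary $res"
            echo "drbdadm -- --discard-my-data connect $res"
            ;;
        survivor)
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: split_brain_commands victim|survivor [resource]" >&2
            return 1
            ;;
    esac
}

# Example:
# split_brain_commands victim r0
```

The filesystem has to be unmounted on the victim before it can be demoted to secondary, and the survivor only needs the connect command if it is in the StandAlone state.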

OCFS2 to allow the file system to be mounted in more than one place

All steps should be done on both nodes unless stated otherwise
 
First, install the management tools:
apt-get install ocfs2-tools

My config in /etc/ocfs2/cluster.conf (stanza headers start in the first column; parameter lines are indented with a tab):
cluster:
        node_count = 2
        name = www

node:
        ip_port = 7777
        ip_address = 10.0.0.101
        number = 1
        name = server1
        cluster = www

node:
        ip_port = 7777
        ip_address = 10.0.0.102
        number = 2
        name = server2
        cluster = www

Note: The 'name' parameter must exactly match the hostname used in the 'on' sections of the DRBD config, and must resolve to that node's IP address.
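
A quick way to catch a name mismatch is to list the node names from cluster.conf and compare them with the local hostname. This is only a sketch; ocfs2_node_names and ocfs2_knows_host are illustrative helper names, not tools shipped with ocfs2-tools.

```shell
# List the node names defined in an O2CB cluster.conf.
# Usage: ocfs2_node_names [CONF_FILE]
ocfs2_node_names() {
    conf="${1:-/etc/ocfs2/cluster.conf}"
    awk '
        /^[a-z]+:/ { in_node = ($1 == "node:") }   # stanza header in column 0
        in_node && $1 == "name" && $2 == "=" { print $3 }
    ' "$conf"
}

# Succeed if HOSTNAME (default: output of uname -n) is one of those names.
# Usage: ocfs2_knows_host [HOSTNAME] [CONF_FILE]
ocfs2_knows_host() {
    host="${1:-$(uname -n)}"
    ocfs2_node_names "$2" | grep -qxF -- "$host"
}

# Example:
# ocfs2_knows_host || echo "$(uname -n) is not listed in cluster.conf"
```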

Configure ocfs2 cluster:
sudo dpkg-reconfigure ocfs2-tools

Now, you can create the filesystem (on one server only):
mkfs.ocfs2 -L "www" /dev/drbd0

Then create the mount point, add it to /etc/fstab, and mount it:
mkdir /var/www
echo "/dev/drbd0  /var/www  ocfs2  noauto,noatime,nodiratime,_netdev  0 0" >> /etc/fstab
mount /dev/drbd0
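
To check on each node that the mount really succeeded with the expected filesystem type, you could look it up in /proc/mounts. A minimal sketch, with the hypothetical helper name is_ocfs2_mounted:

```shell
# Succeed if MOUNTPOINT is currently mounted with filesystem type ocfs2.
# Usage: is_ocfs2_mounted [MOUNTPOINT] [MOUNTS_FILE]
is_ocfs2_mounted() {
    mountpoint="${1:-/var/www}"; mounts_file="${2:-/proc/mounts}"
    # /proc/mounts fields: device, mount point, filesystem type, options, ...
    awk -v mp="$mountpoint" '
        $2 == mp && $3 == "ocfs2" { found = 1 }
        END { exit !found }' "$mounts_file"
}

# Example:
# is_ocfs2_mounted /var/www && echo "/var/www is mounted as ocfs2"
```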

Now you should be all set up! Test the setup by creating and removing files on one node, and check that the changes show up on the other.

8 comments:

  1. Re the problem of mounting at boot: you need to add the '_netdev' option in fstab, so that the mount waits until the network is up before mounting the device.

    See answer to Q42 at https://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#MOUNT

  2. For anyone trying to configure this on a virtual image (Rackspace): you'll need the drbd and ocfs2 kernel modules, which you can get from the kernel extras package:

    apt-get install linux-image-extra-3.2.0-24-virtual (change 3.2.0-24 to your kernel version "uname -r").

    If drbd doesn't work, you may need to install and compile manually:

    apt-get install -y linux-headers-server drbd8-utils build-essential psmisc bison flex
    apt-get install linux-headers-3.2.0-24-virtual (change 3.2.0-24 to your kernel version "uname -r").
    wget http://oss.linbit.com/drbd/8.4/drbd-8.4.1.tar.gz
    tar xfvz drbd-8.4.1.tar.gz
    cd drbd-8.4.1/
    ./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-km

    make KDIR=/lib/modules/3.2.0-24-virtual/build
    make install

  3. Can I use different ports for DRBD? E.g.:

    on server1 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.0.0.101:7788;
    meta-disk internal;
    }
    on server2 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.0.0.102:7799;
    meta-disk internal;
    }

  4. If you are trying to install DRBD on Ubuntu 12.04.3 LTS and DRBD takes too long to respond to commands,
    or doesn't work as expected and always fails,

    then download the DRBD source package directly from the DRBD website.
    Install gcc and make if not present (apt-get install gcc make flex),
    then decompress the source file (tar xvfz drbd......tar.gz),
    and compile and install the source:
    ./configure
    make
    make install (sudo required if you are not logged in as root)

    Note: by default, the installation is placed under:
    /usr/local/
    /usr/local/etc
    /usr/local/etc/init.d

  5. Auto-mounting at boot does not work. I set up DRBD following this guide on an Amazon EC2 instance.
    My fstab entry is as follows:

    /dev/drbd0 /var/www ocfs2 noauto,noatime,nodiratime,_netdev 0 0

    Replies
    2. This will not work; I figured that out this evening ...

      Use the label instead, then it will work ...

      in /etc/fstab

      Instead of /dev/drbd0 ...
      LABEL=(LABEL_OF_DRBD0) /var/www ocfs2 _netdev,noatime 0 0
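
As a sketch of this label-based approach: the filesystem in the article was created with the label "www", and the label of an existing volume can be read with blkid. The helper name fstab_line_for_label is made up for this example; the mount options are the ones from the reply above.

```shell
# Print an fstab line that mounts an OCFS2 volume by its label.
# Usage: fstab_line_for_label LABEL MOUNTPOINT
fstab_line_for_label() {
    label="$1"; mountpoint="$2"
    printf 'LABEL=%s  %s  ocfs2  _netdev,noatime  0 0\n' "$label" "$mountpoint"
}

# On a node where the volume exists, the label could be looked up first:
# label=$(blkid -s LABEL -o value /dev/drbd0)
# fstab_line_for_label "$label" /var/www
```

The function only prints the line, so you can review it before adding it to /etc/fstab in place of the /dev/drbd0 entry.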
