Heartbeat and DRBD

Heartbeat and DRBD are high-availability solutions. Heartbeat (part of the Linux-HA project) manages a cluster of servers and makes sure all services keep running, while DRBD (Distributed Replicated Block Device) is its storage counterpart, keeping all data replicated and available. Together they help you keep the damage after a hardware or software failure as small as possible.

The current drawback of DRBD is that you can only read from and write to the primary (master) node; the mirrored partition cannot be mounted on the secondary (slave) node. To switch over, you have to unmount the partition on the primary node (node1), tell the DRBD client on the secondary node (node2) to take over as primary, and then mount the partition there to read and write the data.

Writing on both nodes simultaneously would require a cluster file system, e.g. GFS2 or OCFS2, and a different DRBD node setup. That is explained in another document.

Assumptions and starting configuration
It is assumed you have two identical Gentoo installations. If you are only here for the information and are not setting up two or more physical boxes, you can run these in a VM like VMware. Both installations have a public static IP address, and the secondary NIC should have some kind of private IP address. You will also need an additional public static IP address to be used as the "service" IP address. Everything relying on the cluster as a whole should use this IP address for its services.

System Configuration
Start by tweaking the network devices; it may be that your current configuration already works.
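On Gentoo the interfaces are configured in /etc/conf.d/net. A sketch of what this might look like on node1 (the addresses, netmasks and the choice of eth0 for the public and eth1 for the private cluster link are assumptions; adjust them to your network):

```
# /etc/conf.d/net (node1) -- eth0: public address, eth1: private cluster link
config_eth0="192.168.0.1 netmask 255.255.255.0"
config_eth1="172.16.0.1 netmask 255.255.255.0"
```

On node2 use the matching addresses (e.g. 192.168.0.2 and 172.16.0.2). Depending on your baselayout version, the syntax may be the array form config_eth0=( "..." ) instead.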

Preparing your HD for DRBD
If you want to use DRBD for mirroring, you should create an extra partition to hold the data you want to mirror to the other nodes (e.g. /var/lib/postgresql for PostgreSQL or /var/www for Apache). In addition to the mirrored data, DRBD needs at least 128 MB to store its meta-data. For example, here's how to create an additional virtual disk and put two partitions on it, one for Apache and the other for MySQL.

Partition table for DRBD
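A sketch of such an fdisk session, assuming the extra virtual disk shows up as /dev/sdb (the device name is an assumption; pick your partition sizes to fit your data plus DRBD's meta-data):

```shell
fdisk /dev/sdb
# inside fdisk:
#   u   - switch the display/input units to sectors
#   n   - new partition /dev/sdb1 (data + meta-data for Apache)
#   n   - new partition /dev/sdb2 (data + meta-data for MySQL)
#   w   - write the partition table and exit
```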

To specify an exact partition size, you should change the units to sectors by issuing the command "u" (as shown in the command snippet above). Then create the partitions as explained in the Gentoo Handbook.

Kernel Configuration
Activate the following options:
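On kernels that ship DRBD (2.6.33 and later) the option lives under Block devices; on older kernels the module comes from a separate package instead. In menuconfig this looks roughly like:

```
Device Drivers  --->
    [*] Block devices  --->
        <M> DRBD Distributed Replicated Block Device support
```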

Installing, configuring and running DRBD
Please note that you need to do the following on each cluster node. Install DRBD:

emerge -av drbd

After you've successfully installed DRBD you'll need to create the configuration file. The following is the complete configuration.
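A minimal /etc/drbd.conf sketch for two resources. The resource names drbd0 and drbd1 match the haresources example later in this document; the backing partitions, private IP addresses and ports are assumptions to adjust to your setup:

```
# /etc/drbd.conf -- sketch for two resources
global { usage-count no; }
common { protocol C; }

resource drbd0 {
  device    /dev/drbd0;
  disk      /dev/sdb1;
  meta-disk internal;
  on testcluster1 { address 172.16.0.1:7788; }
  on testcluster2 { address 172.16.0.2:7788; }
}

resource drbd1 {
  device    /dev/drbd1;
  disk      /dev/sdb2;
  meta-disk internal;
  on testcluster1 { address 172.16.0.1:7789; }
  on testcluster2 { address 172.16.0.2:7789; }
}
```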

It should be noted that "testcluster1" and "testcluster2" must match the hostnames of your boxes.

Don't forget to copy this file to both node locations.

Now it's time to set up DRBD. Run the following commands on both nodes.
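A sketch of the setup, assuming your resources are named drbd0 and drbd1: create the on-disk meta-data for each resource, then bring DRBD up.

```shell
drbdadm create-md drbd0
drbdadm create-md drbd1
/etc/init.d/drbd start
```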

Now on the primary node run the following to synchronize both drbd disks:
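With DRBD 8 this is done by declaring this node's data authoritative (on old DRBD 0.7 the equivalent was drbdadm -- --do-what-I-say primary all); the resource names are assumptions:

```shell
drbdadm -- --overwrite-data-of-peer primary drbd0
drbdadm -- --overwrite-data-of-peer primary drbd1
```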

At this point a full synchronization should be occurring. You can monitor the progress with the following command.

monitor the synchronization progress
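The kernel exposes the replication state and sync progress in /proc/drbd:

```shell
watch cat /proc/drbd
```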

Depending on your hardware and the size of the partition, this could take some time. Later, when everything is synced, the mirroring will be very fast. See DRBD-Performance for more information.

You can use /dev/drbd0 and /dev/drbd1 as normal disks even before syncing has finished, so let's go ahead and format them. Use whatever file system you want; in this example we use ext3. Do this on the first node only.

Formatting the disks:
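A sketch of formatting both DRBD devices with ext3:

```shell
mkfs.ext3 /dev/drbd0
mkfs.ext3 /dev/drbd1
```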

Now set up the primary and secondary nodes. Notice these commands are different for each node.
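A sketch of the role assignment:

```shell
# on the primary node (testcluster1):
drbdadm primary all
# on the secondary node (testcluster2):
drbdadm secondary all
```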

Make sure you add the mount points to the fstab and that they are set to noauto. Again, this needs to be done on both nodes.
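A sketch of the fstab entries; the mount points are assumptions based on the /wwwjail/mysql path used for MySQL later in this guide:

```
# /etc/fstab additions
/dev/drbd0   /wwwjail         ext3   noauto   0 0
/dev/drbd1   /wwwjail/mysql   ext3   noauto   0 0
```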

Time to create the mount points, again on both nodes.
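Assuming /wwwjail as the base mount point:

```shell
mkdir /wwwjail
```

The /wwwjail/mysql mount point sits inside the first DRBD file system, so it can only be created once /dev/drbd0 is mounted.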

You can mount them on the first node:
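A sketch, assuming /dev/drbd0 is mounted on /wwwjail and /dev/drbd1 on /wwwjail/mysql (these mount points are assumptions):

```shell
mount /dev/drbd0 /wwwjail
mkdir -p /wwwjail/mysql
mount /dev/drbd1 /wwwjail/mysql
```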

MySQL should already be installed, but we need to configure it to use the DRBD device. We do that by simply putting all the databases and logs in /wwwjail/mysql. In a production environment you'd probably break out logs, databases and index files onto different devices, but since this is an experimental system, we'll just put everything into one resource.

Make sure no bind address is set because we need to bind to all interfaces and then limit access with iptables if need be. This needs to go on both nodes.
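A sketch of the relevant my.cnf excerpt (the file location may differ on your system):

```
# /etc/mysql/my.cnf (relevant excerpt)
[mysqld]
datadir = /wwwjail/mysql
# no bind-address line: mysqld listens on all interfaces;
# restrict access with iptables if need be
#bind-address = 127.0.0.1
```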

Now we need to install a MySQL database onto the shared drive. Issue the following command on both nodes.
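A sketch of initializing the database files in the DRBD-backed data directory (the --user value and the chown are assumptions; match them to the user mysqld runs as):

```shell
mysql_install_db --datadir=/wwwjail/mysql --user=mysql
chown -R mysql:mysql /wwwjail/mysql
```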

If everything has gone well up to this point you can add DRBD to the default runlevel.
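Adding DRBD to the default runlevel:

```shell
rc-update add drbd default
```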

After syncing (unless you're brave) you should be able to start DRBD normally:
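Starting the service through its init script:

```shell
/etc/init.d/drbd start
```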

The DRBD service should automatically load the drbd kernel module:

kernel modules listing
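You can verify the module is loaded with:

```shell
lsmod | grep drbd
```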

You can reverse the roles again to verify that syncing works in both directions. If you're fast enough you can issue these commands within a few seconds, but you'll only find out that DRBD was faster. ;-)
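A sketch of the role swap (the mount points are assumptions; unmount before demoting, since a mounted resource cannot become secondary):

```shell
# on the current primary (testcluster1):
umount /wwwjail/mysql /wwwjail
drbdadm secondary all

# on the other node (testcluster2):
drbdadm primary all
mount /dev/drbd0 /wwwjail
mount /dev/drbd1 /wwwjail/mysql
```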

If you reboot a node during testing, you will have to issue
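a promotion command such as the following sketch (use a resource name instead of all to promote a single resource):

```shell
drbdadm primary all
```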

on the node where you want the data to be retrieved from, as DRBD does not remember the roles (to DRBD, both nodes are equal). We will automate this with Heartbeat later.

If the data differs between the nodes, you may have a split brain. To fix this, run the following commands. Note that this assumes testcluster1 is more up to date than testcluster2; if the opposite is true, swap the nodes the commands are run on.
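A sketch of the split-brain recovery with DRBD 8 syntax: the outdated node discards its changes and reconnects, the authoritative node simply reconnects.

```shell
# on testcluster2 (its changes will be thrown away):
drbdadm secondary all
drbdadm -- --discard-my-data connect all

# on testcluster1 (the authoritative node):
drbdadm connect all
```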

Installing and Configuring Heartbeat
Heartbeat is based on init scripts, so setting it up to do advanced things is not that difficult, but that is not covered in this document.

Again most of the information here needs to be run on both nodes. Go ahead and emerge heartbeat.

emerging heartbeat
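Installing the package (the sys-cluster category is an assumption):

```shell
emerge -av sys-cluster/heartbeat
```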

All of the Heartbeat configuration is done in /etc/ha.d/. Most of the important configuration files are not included by default on install, so you will need to create them.

The haresources file is probably the most important one to configure. It lists which init scripts need to be run and the parameters to pass to them. Scripts are looked up first in /etc/ha.d/resource.d/ and then in /etc/init.d/.
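A sketch of /etc/ha.d/haresources for this setup. The first column names the node that should normally own the resources; the service IP, resource names, mount points and the apache2/mysql script names are assumptions to match your setup:

```
# /etc/ha.d/haresources -- one line per resource group
testcluster1 IPaddr::192.168.0.100 drbddisk::drbd0 Filesystem::/dev/drbd0::/wwwjail::ext3 apache2
testcluster1 drbddisk::drbd1 Filesystem::/dev/drbd1::/wwwjail/mysql::ext3 mysql
```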

Please note the init scripts need to follow Linux Standard Base Core Specification specifically with the function return codes.

For example, IPaddr::192.168.0.100 will run the /etc/ha.d/resource.d/IPaddr script, which creates an IP alias on eth0 with the IP address 192.168.0.100.

drbddisk will run drbdadm primary drbd0, and Filesystem is basically just a mount.

The last file tells Heartbeat how to communicate with the other nodes. Because this example is based on a simulated crossover cable connecting the two nodes, we will just use a crc check.
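A sketch of the two files involved; the interface name and timing values are assumptions:

```
# /etc/ha.d/ha.cf
udpport    694
bcast      eth1
keepalive  2
deadtime   30
auto_failback on
node testcluster1
node testcluster2

# /etc/ha.d/authkeys
auth 1
1 crc
```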

If you plan on sending the heartbeat across a network, you should use something a little stronger than crc. The following is the configuration for sha1.
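An authkeys sketch with sha1 (the passphrase is a placeholder; choose your own):

```
# /etc/ha.d/authkeys
auth 1
1 sha1 SomeSecretPassphrase
```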

Finally, because the /etc/ha.d/authkeys file may contain a plain-text password, set restrictive permissions on it on both nodes.
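Restricting the file to root only:

```shell
chmod 600 /etc/ha.d/authkeys
```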

Once the heartbeat configuration is complete, you can add it to the default runlevel and start it.
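A sketch of the final step:

```shell
rc-update add heartbeat default
/etc/init.d/heartbeat start
```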