Active-active DRBD with OCFS2

Introduction
This is a rewrite of the original article by Mathijs, edited by Mars105; most of the credit goes to them.

The main changes are the reappearance of the ocfs2 ebuild in the main Portage tree and the new built-in drbd kernel module (>=2.6.33).

System Configuration
This setup runs DRBD on two machines in active-active mode, with OCFS2 as the cluster filesystem that makes this possible. The following howto should be carried out identically on both nodes, and the config files should also be identical, except where explicitly stated otherwise.

The author used two nodes (Tweedledum and Tweedledee) to build common storage for Xen. The drbd device is built on top of an LVM volume. This is the author's choice; you may use any block device you want.

At the time of writing, the author was using =sys-kernel/xen-sources-2.6.34 as the kernel with the xen USE flag enabled. Other kernels may produce different output for these commands.

Kernel Configuration
Activate the following options:
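A rough sketch of the relevant entries is shown below; exact menu locations may differ slightly between kernel versions, so double-check against your own menuconfig:

Device Drivers --->
    [*] Block devices --->
        <*> DRBD Distributed Replicated Block Device support
File systems --->
    <*> OCFS2 file system support
    [ ]   OCFS2 expensive checks
    Pseudo filesystems --->
        <*> Userspace-driven configuration filesystem (configfs)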

You may want to leave "OCFS2 expensive checks" unchecked. For the other items, read the help.

Alternatively, you can compile these options as modules and then add the following modules to /etc/conf.d/modules: ocfs2 ocfs2_dlmfs configfs ocfs2_dlm
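A sketch of the corresponding /etc/conf.d/modules entry; the exact variable name depends on your baselayout/OpenRC version, so check the comments in that file:

modules="ocfs2 ocfs2_dlmfs configfs ocfs2_dlm"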

DRBD
Since kernel 2.6.33, we have built-in DRBD support, so we need >=sys-cluster/drbd-8.3.8. Older ebuilds will try to pull in the kernel module (sys-cluster/drbd-kernel).

After building your kernel with the drbd module, determine the module version and install the corresponding drbd package.
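One way to check which DRBD version your kernel ships is to look at the in-tree header; the path below is an assumption for a 2.6.33+ source tree, and your version string may differ:

# grep REL_VERSION /usr/src/linux/include/linux/drbd.h
#define REL_VERSION "8.3.7"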

You may need to unmask the ebuild. (Replace with your platform)
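For example, in /etc/portage/package.keywords (a sketch; replace ~amd64 with your architecture keyword):

sys-cluster/drbd ~amd64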

Install the drbd tools.
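Something along the lines of:

# emerge -av sys-cluster/drbd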

OCFS2
Unmasking the needed ebuilds. (Replace with your platform)
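Again in /etc/portage/package.keywords (a sketch; replace ~amd64 with your architecture keyword):

sys-fs/ocfs2-tools ~amd64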

Emerging ocfs2-tools
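# emerge -av sys-fs/ocfs2-tools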

DRBD
Now we tell drbd how it must operate. Here we define the two nodes (Tweedledum and Tweedledee), each with the backing disk we created earlier. Our drbd device is the same on both nodes, and the resource name is drbd_xen.
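A minimal sketch of the resource definition, for example in /etc/drbd.conf or a file included from it. The backing LVM volume and the IP addresses are placeholders, so adjust them to your own setup; the device minor 1 matches the "1:" line in the /proc/drbd output further down:

resource drbd_xen {
  protocol C;

  on Tweedledum {
    device    /dev/drbd1;
    disk      /dev/vg0/xen;       # hypothetical LVM volume, use your own
    address   192.168.0.1:7789;   # hypothetical address and port
    meta-disk internal;
  }

  on Tweedledee {
    device    /dev/drbd1;
    disk      /dev/vg0/xen;
    address   192.168.0.2:7789;
    meta-disk internal;
  }
}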

OCFS2
Some filesystems must be added to our fstab. The first two are needed by ocfs2 to work. The last one is our drbd device; it is only mandatory for automatic mounting at boot time.
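A sketch of the corresponding /etc/fstab entries; the drbd device and the mount point /mnt/ocfs2_xen are placeholders:

none        /sys/kernel/config   configfs     defaults   0 0
none        /dlm                 ocfs2_dlmfs  defaults   0 0
/dev/drbd1  /mnt/ocfs2_xen       ocfs2        _netdev    0 0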

It is a good idea to mount them now.
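Assuming the fstab entries sketched above:

# mount /sys/kernel/config
# mount /dlm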

Now configure ocfs2, using the same nodes as in the drbd configuration.
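A sketch of /etc/ocfs2/cluster.conf; the IP addresses are placeholders, the node names must match each machine's hostname, and the file is picky about whitespace (indent each key = value line with a tab):

cluster:
	node_count = 2
	name = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.0.1
	number = 0
	name = Tweedledum
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.0.2
	number = 1
	name = Tweedledee
	cluster = ocfs2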

Now we must tell the init script which cluster must be started.
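The init script used here (see the Resources section) reads the cluster name from /etc/conf.d/ocfs2; the variable name below is an assumption based on common ocfs2 init scripts, so check the script you actually install:

OCFS2_CLUSTER="ocfs2"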

Activating the DRBD resource
Now we are going to enable the DRBD resource for the first time. These commands must be issued on both nodes.

Create the device metadata. This step must be completed only on initial device creation. It initializes DRBD's metadata:
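With the resource name used above:

# drbdadm create-md drbd_xen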

Enable the resource. This step associates the resource with its backing device (or devices, in case of a multi-volume resource), sets replication parameters, and connects the resource to its peer:
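# drbdadm up drbd_xen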

The result may be checked by reading /proc/drbd:

version: 8.3.7 (api:88/proto:86-92) built-in

1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:125825244

Syncing the nodes
By now, DRBD has successfully allocated both disk and network resources and is ready for operation. What it does not know yet is which of your nodes should be used as the source of the initial device synchronization.

Start the initial full synchronization.
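This command is issued on one node only, the one you pick as the synchronization source. With the DRBD 8.3 series the syntax is roughly:

# drbdadm -- --overwrite-data-of-peer primary drbd_xen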

After issuing this command, the initial full synchronization will commence. You will be able to monitor its progress via /proc/drbd. It may take some time depending on the size of the device.

version: 8.3.7 (api:88/proto:86-92) built-in

1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r
    ns:20502072 nr:0 dw:0 dr:20510092 al:0 bm:1250 lo:146 pe:2782 ua:2048 ap:0 ep:1 wo:b oos:105334296
    [==>.................] sync'ed: 16.3% (102864/122876)M
    finish: 0:35:27 speed: 49,468 (60,444) K/sec

After this is finished and you see a consistent state on both sides, we can create the OCFS2 filesystem on the drbd device.

Building the OCFS2 filesystem
We can start drbd and ocfs2.
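Assuming the init scripts are installed as drbd and ocfs2:

# /etc/init.d/drbd start
# /etc/init.d/ocfs2 start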

Now, use OCFS2's mkfs implementation to create the file system:
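The label and the two node slots shown in the output below correspond to options like these; the device path is the placeholder used earlier:

# mkfs.ocfs2 -N 2 -L ocfs2_xen /dev/drbd1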

mkfs.ocfs2 1.4.4
Cluster stack: classic o2cb
Label: ocfs2_xen
Features: sparse backup-super unwritten inline-data strict-journal-super
Block size: 4096 (12 bits)
Cluster size: 4096 (12 bits)
Volume size: 128845049856 (31456311 clusters) (31456311 blocks)
Cluster groups: 976 (tail covers 6711 clusters, rest cover 32256 clusters)
Extent allocator size: 67108864 (16 groups)
Journal size: 268435456
Node slots: 2
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 4 block(s)
Formatting Journals: done
Growing extent allocator: done
Formatting slot map: done
Writing lost+found: done
mkfs.ocfs2 successful

This creates an OCFS2 file system with two node slots and sets the filesystem label to ocfs2_xen.

Switch the DRBD in active-active mode
The final thing we have to do is configure DRBD for active-active mode.

Update your drbd configuration by adding the startup and net sections to your new resource.
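A sketch of the two extra sections, added inside the existing resource drbd_xen { ... } block; the split-brain policies follow the DRBD documentation's dual-primary example and may need tuning for your environment:

  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }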

Restart DRBD
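On both nodes:

# /etc/init.d/drbd restart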

Check the state of the cluster by reading /proc/drbd again:

version: 8.3.7 (api:88/proto:86-92) built-in

1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r   ns:0 nr:0 dw:0 dr:408 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

If it looks fine and active-active (Primary/Primary), you can mount your new OCFS2 filesystem:
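Using the fstab entry sketched earlier (adjust the mount point to your own):

# mount /mnt/ocfs2_xen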

The cluster can be tested by creating a file on one node and reading it on the other.
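For example, with the placeholder mount point from above:

Tweedledum # echo hello > /mnt/ocfs2_xen/testfile
Tweedledee # cat /mnt/ocfs2_xen/testfile
hello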

We can safely add drbd and ocfs2 to the default runlevel.
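# rc-update add drbd default
# rc-update add ocfs2 default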

Congratulations! You should have a working cluster now.

Resources
This init script comes from the package submitted by Mathijs in his original article. The only modification made here is the removal of the module checks, since they fail when DRBD is built into the kernel; the mount check will presumably fail anyway if the required modules are not loaded.

Don't forget to make it executable.
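Assuming it was installed as /etc/init.d/ocfs2:

# chmod +x /etc/init.d/ocfs2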