RAID/Software

About the Install
Software RAID is compatible with a dual-boot environment involving Windows, but Windows will not be able to mount or read any partition involved in the pure software RAID. All pseudo-hardware (fake) RAID controllers must be turned off.

This HOWTO assumes you are using SATA drives but it should work equally well with IDE drives. If you are using IDE drives, for maximum performance make sure that each drive is a master on its own separate channel.

WARNING
Before you start thinking about partitioning you should be aware that there are different styles of RAID metadata (superblock), some of which limit your possibilities. This is an excerpt from man mdadm (see later for the installation of it):
 * 0, 0.90: Use the original 0.90 format superblock. This format limits arrays to 28 component devices and limits component devices of levels 1 and greater to 2 terabytes.
 * 1, 1.0, 1.1, 1.2 default: Use the new version-1 format superblock. This has few restrictions. The different sub-versions store the superblock at different locations on the device, either at the end (for 1.0), at the start (for 1.1) or 4K from the start (for 1.2). "1" is equivalent to "1.0". "default" is equivalent to "1.2".
 * ddf: Use the "Industry Standard" DDF (Disk Data Format) format defined by SNIA. When creating a DDF array a CONTAINER will be created, and normal arrays can be created in that container.
 * imsm: Use the Intel(R) Matrix Storage Manager metadata format. This creates a CONTAINER which is managed in a similar manner to DDF, and is supported by an option-rom on some platforms: http://www.intel.com/design/chipsets/matrixstorage_sb.htm

As you can see if you take a closer look and read up on these styles, it might be a good idea to go with the newer ones, i.e. 1.x. There are a few points you should be aware of:
 * Only <= 0.9 has autodetection on startup -> YES, you DO need an initramfs if you choose 1.x - genkernel is your friend!
 * 1.x will in all likelihood make it hard if not impossible (for me it was) to use "Grub Legacy" (versions <=0.97*) as your bootloader - it just doesn't want to get installed into the MBR. Some errors that may occur are related to cross-disk install - just google it.
 * 1.x will work with Grub2 (>=1.98) in combination with the GPT partition table and the grub_bios partition
 * 1.x didn't work for me with any of the Grub2 versions including the SVN one when trying to install into the MBR and using a classic msdos partition table!
 * If you choose 1.x, Grub2 and GPT, genkernel might have bugs that are only fixed in the most recent versions, perhaps even only in the SVN version (05.05.2011: it works just fine for me!)

If you don't want to stick with the old <=0.9 RAID metadata styles, STOP HERE and read up on GPT: http://www.rodsbooks.com/gdisk/

Upon reading all of this, you will know that you want a partition of roughly 1MB at the beginning of your disk (yes, it can also be smaller, but better safe than sorry) of type bios_grub, so that gdisk -l gives you an output like this:

Number  Start (sector)  End (sector)  Size        Code  Name
     1            2048          4095  1024.0 KiB  EF02  bios_grub

Aside from some fun with genkernel and Grub2, you can follow the guide as is!

Using mdadm RAID metadata <=0.90 (you need to make adjustments to this if using 1.x)
To partition drives similarly to how the gentoo install docs suggest:

device        mount    size
/dev/sda1     /boot    32MB
/dev/sda2     swap     >=2*RAM
/dev/sda3     /        10GB
/dev/sda4     /home    180GB (This partition is optional but recommended)

When you partition your disks, make sure that your partitions use fd (Linux RAID autodetect) as Partition Type instead of the default 83 (Linux native) or 82 (swap).

The /boot partition would be best chosen as a RAID1. Recall that in RAID1, data is mirrored on multiple disks, so if there is a problem with your RAID somehow, GRUB/LILO could point to any of the copies of the kernel on any of the partitions in the RAID1 and a normal boot will occur.

In this HOWTO, /boot, / and the swap partition will be RAID 1 (mirror). Other than for the /boot partition, for better performance you could use RAID10 in the far layout (raid10,f2) instead; this is a direct replacement using the enhanced raid10 driver, which gives double the sequential read speed compared to raid1. Use "--level=10 -p f2" as additional parameters when creating arrays with mdadm.

If you do not place your swap partition on RAID and a drive containing your swap partition dies, your system will likely crash the next time it tries to access swap.

Load Kernel Modules
Load the appropriate RAID module.

For RAID-1, RAID-0 and RAID-5 respectively.
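For example (raid456 provides RAID-5 on current kernels; very old kernels call the module raid5 instead):

modprobe raid1
modprobe raid0
modprobe raid456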

Setup Partitions
You can partition your drives with tools such as fdisk or cfdisk. There is nothing different here except to make sure:


 * 1) Your partitions are the same size on each drive. See below for instructions on copying a partition map.
 * 2) Your partitions to be included in the RAID are set to partition type fd, Linux RAID auto-detect. If not set to fd, the partitions will fail to be added to the RAID on reboot.

This might be a good time to play with the hdparm tool. It allows you to change hard drive access parameters, which might speed up disk access. Another use is if you are using a whole disk as a hot spare. You may wish to change its spin down time so that it spends most of its time in standby, thus extending its life.
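For example, to increase the spin-down timeout of a hypothetical hot-spare disk /dev/sdc (for hdparm -S, values 1-240 mean multiples of 5 seconds, so 120 = 10 minutes):

hdparm -S 120 /dev/sdc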

You can also setup the first disk partitions and then copy the entire partition table to the second disk with the following command:
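For example, copying from /dev/sda to /dev/sdb (double-check the direction; this overwrites the target's partition table):

sfdisk -d /dev/sda | sfdisk /dev/sdb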

Large Partitions
If you use disks larger than 2 TB you must use GPT (GUID Partition Table). E.g. fdisk warning message:

WARNING: The size of this disk is 3.0 TB (3000592982016 bytes). DOS partition
table format can not be used on drives for volumes larger than
(2199023255040 bytes) for 512-byte sectors. Use parted(1) and GUID
partition table format (GPT).

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Enable GPT support in your kernel (EFI GUID Partition Support, CONFIG_EFI_PARTITION).
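In menuconfig this is found roughly here (the exact menu location varies by kernel version; this is where it lives on kernels of this era):

-*- Enable the block layer --->
      Partition Types --->
        [*] Advanced partition selection
        [*]   EFI GUID Partition support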

Install parted
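emerge sys-block/parted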

Use it to make the partition table. E.g.:
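A sketch for a hypothetical 3 TB disk /dev/sdb (adjust the device name and sizes to your system):

parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary 0% 100%
(parted) set 1 raid on
(parted) print
(parted) quit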

Setup RAID
After partitioning, create the /etc/mdadm.conf file (yes, indeed, on the Installation CD environment) using mdadm, an advanced tool for RAID management. For instance, to have the boot, swap and root partitions mirrored (RAID-1) covering /dev/sda and /dev/sdb, the following commands can be used:
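A sketch following the partition layout above (the md numbering matches the rest of this HOWTO):

mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm --detail --scan >> /etc/mdadm.conf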

Or if you are lazy:
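The same, using mdadm's short options and shell globbing:

mdadm --create /dev/md1 -l1 -n2 /dev/sd[ab]1
mdadm --create /dev/md2 -l1 -n2 /dev/sd[ab]2
mdadm --create /dev/md3 -l1 -n2 /dev/sd[ab]3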

On the other hand if you want to put 4 partitions (sdc1, sdd1, sde1, sdf1) into a single RAID-5 then try this command:
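For example (md4 is an arbitrary choice of device node):

mdadm --create /dev/md4 --level=5 --raid-devices=4 /dev/sd[cdef]1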

Optional: Creating degraded arrays from the beginning
Migrating from one drive to multiple drives? Do you have only some of your drives available and want to add the others later? You can specify 'missing' instead of them (see the sketch after this list)! After you plug them in, just --add them to the array and it will sync automatically. However, you have to be aware of:
 * disk drive numbering (both in your system (sda/sdb/hda) and in your bootloader!)
 * precise sizes of the drives; you should generally make arrays slightly smaller than your whole device is, because the newer ones may be a couple of megabytes smaller, even from the same manufacturer/series
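A sketch of the degraded-then-complete workflow (device names follow the layout above):

mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda3 missing
# later, once the second drive is installed and partitioned:
mdadm /dev/md3 --add /dev/sdb3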

Mixing different vendors/models in redundant arrays is a good practice in case a whole series of drives is found to have issues. This has already happened many times, with both HDDs and SSDs!

Syncing process
You may check to see if the RAID devices are done syncing:
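cat /proc/mdstat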

you can use also:
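watch -n 0.1 cat /proc/mdstat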

which refreshes the output 10 times a second. You can cancel the output with Ctrl+C.

It should look something like this (showing one array syncing and the other one already completed):

Personalities : [raid1]
md2 : active raid1 sdb3[1] sda3[0]
      184859840 blocks [2/2] [UU]
      [======>..............]  resync = 33.1% (61296896/184859840) finish=34.3min speed=59895K/sec

md1 : active raid1 sdb1[1] sda1[0]
      10000320 blocks [2/2] [UU]

unused devices: <none>

If an array is still syncing, you may still proceed to creating filesystems, because the sync operation is completely transparent to the file system. (Note: if a drive happens to fail before the RAID sync finishes, then you're in trouble.)

Creating filesystems
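For a plain filesystem on one of the mirrors, something like this will do (the device and the choice of ext4 are examples, not prescriptions):

mkfs.ext4 /dev/md3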
or
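with the RAID-tuning extended options described below. For instance, for a hypothetical 4-disk RAID-5 with 64 KiB chunks and 4 KiB blocks (stride = 64/4 = 16, stripe_width = 16 * 3 data disks = 48; these numbers are illustrative):

mkfs.ext4 -E stride=16,stripe_width=48 /dev/md4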

stride=stride-size: Configure the filesystem for a RAID array with stride-size filesystem blocks. This is the number of blocks read or written to disk before moving to the next disk. This mostly affects placement of filesystem metadata like bitmaps at mke2fs time, to avoid placing them on a single disk, which can hurt performance. It may also be used by the block allocator.

stripe_width=stripe-width: Configure the filesystem for a RAID array with stripe-width filesystem blocks per stripe. This is typically stride-size * N, where N is the number of data disks in the RAID (e.g. a RAID 5 has N+1 disks, a RAID 6 N+2). This allows the block allocator to prevent read-modify-write of the parity in a RAID stripe if possible when the data is written.

Create the Swap Partition
Swap could be put on a RAID-0 for speed, but if one of your disks then dies, the system will most likely crash (since in a RAID-0 the swap data is split over all disks). So we use a mirrored array type here:
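For the swap mirror created above:

mkswap /dev/md2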

Your fstab could look like:
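A sketch, using md2 as above:

/dev/md2   none   swap   sw   0 0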

There is no performance reason to use RAID for swap. The kernel itself can stripe swapping across several devices if you give them the same priority in /etc/fstab. Using a mirrored RAID type such as raid1 or raid10,f2 will make writing to the swap area half the speed, as data is written twice.

A striped setup looks like:
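Here the raw partitions are used instead of md2; equal pri= values make the kernel stripe across both devices:

/dev/sda2   none   swap   sw,pri=1   0 0
/dev/sdb2   none   swap   sw,pri=1   0 0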

For reliability reasons, you may choose to use RAID for swap anyway. With a non-RAID configuration as shown above, a drive failure on any of the swap devices can crash your system. And while the above configuration may be faster than using a single drive for swap, it is also twice as likely that a drive will fail and take your system with it.

Mount Partitions
Turn the swap on:
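swapon /dev/md2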

Mount the /, /boot and /home RAIDs:
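Following the device layout used in this HOWTO (md4 for /home is an assumption; skip it if you skipped that partition):

mount /dev/md3 /mnt/gentoo
mkdir -p /mnt/gentoo/boot /mnt/gentoo/home
mount /dev/md1 /mnt/gentoo/boot
mount /dev/md4 /mnt/gentoo/home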

Copy RAID configuration
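A sketch (the target directory must already exist):

cp /etc/mdadm.conf /mnt/gentoo/etc/mdadm.conf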

Make the chrooted environment look like the real one ;-)
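For example, as in the handbook:

mount -t proc none /mnt/gentoo/proc
mount --rbind /dev /mnt/gentoo/dev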

Continue the Install
Continue with the Gentoo Handbook starting with the section entitled "Installing the Gentoo Installation Files". Use /dev/md1 for the boot partition, /dev/md3 for the root partition and /dev/md4 for the home partition.

When you're configuring your kernel, make sure you have the appropriate RAID support compiled into your kernel, not as a module.

When installing extra tools, emerge sys-fs/mdadm as well.
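Also add the service to a runlevel (the service name and runlevel are assumptions; on newer OpenRC setups the boot-time assembly service is called mdraid instead):

rc-update add mdadm boot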

otherwise mdadm will not be loaded at boot time.

When configuring your bootloader, make sure it gets installed in the MBR of both disks if you use mirroring (RAID 1).

Installing Grub onto both MBRs
Since the /boot partition is a RAID, grub cannot read it to get the bootloader; it can only access physical drives. Thus, you still use (hd0,0) in this step.

Run grub:
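grub --no-floppy

(--no-floppy merely skips probing the floppy drive and speeds startup.)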

You should see the GRUB prompt:

grub>

If you are using a RAID 1 mirror disk system, you will want to install grub on all the disks in the system, so that when one disk fails, you are still able to boot. Running find /boot/grub/stage1 will list the disks, e.g.

grub> find /boot/grub/stage1
 (hd0,0)
 (hd1,0)
grub>

Now, if your disks are /dev/sda and /dev/sdb, do the following to install GRUB on /dev/sda's MBR:

grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)

This will install grub into /dev/sdb's MBR:

grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)

The device command tells grub to assume the drive is (hd0), i.e. the first disk in the system, even though that is not necessarily the case. If your first disk fails, however, your second disk will then be the first disk in the system, and so its MBR will be correct.

The grub.conf does change from the normal install. The difference is in the specified root drive: it is now a RAID device and no longer a physical drive. For example it would look like:
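A sketch (the kernel file name is a placeholder; md3 is the root array from this HOWTO):

default 0
timeout 10

title Gentoo Linux
root (hd0,0)
kernel /kernel-<version> root=/dev/md3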

For genkernel:
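A sketch (file names are placeholders; real_root and domdadm are the parts that matter):

title Gentoo Linux (genkernel)
root (hd0,0)
kernel /kernel-genkernel-<arch>-<version> root=/dev/ram0 real_root=/dev/md3 domdadm
initrd /initramfs-genkernel-<arch>-<version>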

Where md3 is your /. However, md3, or any other md, might be renamed by the system. When I rebooted after completing the install, my md1, md2 and md3 were renamed to md125, md126 and md127. If, when you restart your system without the install disk, it tells you your / file system is broken and cannot be repaired, give your password, cd to /dev and run ls md* to see whether your system has renamed your md devices. If so, rename your /etc/fstab entries and your real_root entry here accordingly. You might also want to double-check /etc/mtab to be sure that the entries share the same names you found in /dev.

Another fix for the issue where the arrays are renamed to /dev/md125, /dev/md126, and /dev/md127 (/dev/md1, /dev/md2, and /dev/md3 respectively) is to boot up off of the live CD, stop each array, and re-assemble it with the correct device ID:

mdadm --stop /dev/md125
mdadm --assemble /dev/md1 /dev/sda1 /dev/sdb1
mdadm --stop /dev/md126
mdadm --assemble /dev/md2 /dev/sda2 /dev/sdb2
mdadm --stop /dev/md127
mdadm --assemble /dev/md3 /dev/sda3 /dev/sdb3

Then update your configuration with:

mdadm --detail --scan >> /etc/mdadm.conf

Installing Grub2 onto both disks
If you chose the RAID metadata style 1.x, you will most certainly be working with Grub2 now. First things first: the configuration files have changed! There is no longer a /boot/grub/grub.conf! Grub2 has its configuration scripts in /etc/grub.d. There are also the autodetection scripts, which will fail in this case, as they won't include the necessary flags for genkernel and your raid device won't be activated. So you need to edit the script in /etc/grub.d that generates your boot entries.
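A hedged sketch of the idea (which script to touch is an assumption, e.g. /etc/grub.d/10_linux; the point is to append the flags to the generated kernel command line):

GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} dolvm domdadm"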

Here we added dolvm and domdadm. They tell genkernel's initramfs to search for LVM volumes and mdadm software raids and activate them. Otherwise, even if you chose the options in /etc/genkernel.conf, they won't get automatically activated. You choose the capabilities of your genkernel with /etc/genkernel.conf, but the choice of what gets activated has to be passed along with the boot arguments.

The method above is out-dated. You don't need to hack the grub scripts; just edit /etc/default/grub:
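For example (dolvm is only needed if you actually use LVM):

GRUB_CMDLINE_LINUX="dolvm domdadm"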

Now you need to run genkernel and get your kernel in the /boot directory. Running grub-mkconfig will create your configuration according to your files in /etc/grub.d. Most likely, the result will be to your liking and you can write the output to your grub configuration file /boot/grub/grub.cfg:
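grub-mkconfig -o /boot/grub/grub.cfg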

A quick grub-install /dev/sdX (you have to substitute /dev/sdX with your actual raid members, one command for each!)

should install Grub2 into your disks.

Misc RAID stuff
To see if RAID is functioning properly after reboot do:
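cat /proc/mdstat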

There should be one entry per RAID drive. The RAID 1 drives should have a "[UU]" in the entry, letting you know that the two hard drives are "up, up". If one goes down you will see "[U_]". If this ever happens your system will still run fine, but you should replace that hard drive as soon as possible.

To rebuild a RAID 1:
 * 1) Power down the system
 * 2) Replace the failed disk
 * 3) Power up the system once again
 * 4) Create identical partitions on the new disk - i.e. copy the partition scheme from the drive that is still online.
 * 5) Remove the old partition from the array and add the new partition back

You can copy a partition map from one disk to another with sfdisk. Additionally, since the target drive is not in use, we can rewrite its partition map with fdisk to force the kernel to re-read it:
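A sketch, assuming the surviving disk is /dev/sda and the new disk is /dev/sdb (be sure to get the direction right!):

sfdisk -d /dev/sda | sfdisk /dev/sdb
fdisk /dev/sdb        # just press 'w' to rewrite the table so the kernel re-reads it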

To remove the failed partition and add the new partition:
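A sketch for a failed /dev/sdb1 in /dev/md1 (adjust the devices to your situation):

mdadm /dev/md1 --fail /dev/sdb1 --remove /dev/sdb1
mdadm /dev/md1 --add /dev/sdb1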

Watch the automatic reconstruction run with:
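watch cat /proc/mdstat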

If one of the partitions is a boot partition, don't forget to re-run grub on that partition so that grub boots from the new disk, not from the disk you copied the partition from using sfdisk. This is important if you ever have to replace that disk!

Notification
If you want to receive e-mail alerts about your RAID system, mdadm must be configured with your e-mail address.

Make sure you can send mail from your machine. If all you need is basic SMTP support, you may wish to consider installing nail. This is a version of mail that can be compiled with SMTP support.

Make sure that the next line is in /etc/mdadm.conf with the correct To e-mail address:
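The address below is a placeholder:

MAILADDR root@example.com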

To verify that e-mail notification works, use this test command:
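For instance (--test sends a TestMessage alert for every array found, --oneshot exits after one pass):

mdadm --monitor --scan --oneshot --test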

Finally add the mdadm script to your default RC, and start it to begin monitoring:
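rc-update add mdadm default
/etc/init.d/mdadm start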

You may also want to specify a dependency on net for mdadm, otherwise it may try to send mail before the network is available, resulting in no mail. Now if one of your disks fails you will be notified at the address supplied.
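With OpenRC this can be done via the service's conf.d file (a sketch):

# /etc/conf.d/mdadm
rc_need="net"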

Write-intent bitmap
A write-intent bitmap is used to record which areas of a RAID component have been modified since the RAID array was last in sync. Basically, the RAID driver periodically writes out a small table recording which portions of a RAID component have changed. Therefore, if you lose power before all drives are in sync, when the array starts up a full re-sync is not needed. Only the changed portions need to be re-synced.

To turn on write-intent bitmapping
 * Install a modern mdadm: >=sys-fs/mdadm-2.4.1
 * Install a modern kernel: >=2.6.16

Your RAID volume must be configured with a persistent superblock and has to be fully synchronized. Use the following command to verify whether these conditions have been met:
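For example (substitute the array you want the bitmap on):

mdadm --detail /dev/md1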

Make sure it says:

          State : active
    Persistence : Superblock is persistent

Add a bitmap with the following command:
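Again using /dev/md1 as the example array:

mdadm --grow --bitmap=internal /dev/md1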

You can monitor the status of the bitmap as you write to your array with:
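watch cat /proc/mdstat

(The bitmap: line in the output shows how many pages are currently dirty.)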

To turn off write-intent bitmapping
Remove the bitmap with the following command:
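mdadm --grow --bitmap=none /dev/md1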

Data Scrubbing
In short: especially if you run a RAID5 array, trigger an active bad-block check on a regular basis; otherwise there is a high chance of hidden bad blocks making your RAID unusable during reconstruction.

Normally, RAID passively detects bad blocks. If a read error occurs, the data is reconstructed from the rest of the array, and the bad block is rewritten. If the block cannot be rewritten, the defective disk is kicked out of the active array.

Once the defective drive is replaced, reconstruction will cause all blocks of the remaining drives to be read. If this process runs across a previously undetected bad block on the remaining drives, another drive will be marked as failed, making the RAID5 unusable. The larger the disks, the higher the odds that passive bad block detection will be inadequate. Therefore, with today's large disks it is important to actively perform data scrubbing on your array.

With a modern (>=2.6.16) kernel, this command will initiate a data consistency and bad block check, reading all blocks, checking them for consistency, and attempting to rewrite inconsistent blocks and bad blocks.
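A sketch (md0 is a placeholder for your array; writing "repair" instead of "check" will additionally rewrite any inconsistencies found immediately):

echo check > /sys/block/md0/md/sync_action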

You can monitor the progress of the check with:
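watch -n 0.5 cat /proc/mdstat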

(The “0.5” is localized, and you may have to write it as “0,5”. Also, polling too quickly will reduce the sync speed.)

You should have your array checked daily or weekly by adding the appropriate command to /etc/crontab.
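A hypothetical weekly /etc/crontab entry (Sunday at 04:00; md0 is again a placeholder):

0 4 * * 0  root  echo check > /sys/block/md0/md/sync_action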

If you find yourself needlessly checking your array (like I was) and want to stop it safely, you can either stop the entire array, or:
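echo idle > /sys/block/md0/md/sync_action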