RAID5 + LVM2 + recovery + resize HOWTO

I wanted to build a big fileserver that could recover from a disk crash. LVM2 with reiserfs partitions couldn’t do the trick for me. I had three 200 GB disks “united” under one logical volume, formatted with reiserfs, and I wanted to test what would happen if one disk “crashed”. So I faked a crash: I shut the machine down, pulled the plugs of one disk and rebooted. I managed to see the logical volume using the latest lvm2 sources and the latest version of the device mapper:

# lvm version
version LVM version: 2.00.24 (2004-09-16)
version Library version: 1.00.19-ioctl (2004-07-03)
version Driver version: 4.1.0

Unfortunately I had no luck reading the reiserfs partition. The superblock was corrupted and reiserfsck --rebuild-sb /device did not work… Salvation was impossible.
While googling for possible solutions I came across the wonderful idea of building a software raid5 array out of the 3 disks and putting LVM2 on top of the raid. I would lose 1 disk’s worth of space…but I would gain the ability to recover after a disk failure and to add more disks later if necessary.
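To make the space trade-off concrete: raid5 stores one disk's worth of parity, so the usable capacity is (number of disks - 1) times the size of the smallest member. A minimal sketch of that arithmetic; the function name raid5_usable_gb is made up for illustration:

```shell
# Usable capacity of a raid5 array, in the same unit as the member size.
# usage: raid5_usable_gb <number_of_disks> <size_of_smallest_member>
raid5_usable_gb() {
    disks=$1
    size=$2
    # one member's worth of space goes to parity
    echo $(( (disks - 1) * size ))
}

raid5_usable_gb 3 200    # three 200 GB disks, prints 400
```

So the three 200 GB disks give about 400 GB to hand over to LVM, and a fourth identical disk would raise that to about 600 GB.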

Before we continue I must say that you need to HAVE worked with raid and lvm before, so that the commands are familiar to you. This is NOT a step by step guide…but more like a draft of how things are done. I am not going to explain every little detail…man pages and google are always around if you have any questions.

Enough of this…let’s start.

  • Initialization
  • First of all let’s say that we got our 3 disks on /dev/hde, /dev/hdg, /dev/hdi
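    1) We create 1 partition on each one covering the total space using our favorite disk management software (fdisk, cfdisk, etc). (btw, the drives should be IDENTICAL, or at least the partitions the same size, because the array only uses as much of each member as the smallest one offers.)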
    2) Then it’s time to create the /etc/raidtab file. Our contents should look like:

    raiddev /dev/md0
    raid-level 5
    nr-raid-disks 3
    nr-spare-disks 0
    persistent-superblock 1
    chunk-size 32
    parity-algorithm right-symmetric
    device /dev/hde1
    raid-disk 0
    device /dev/hdg1
    raid-disk 1
    device /dev/hdi1
    raid-disk 2

    3) Now let’s create our array:

    mkraid /dev/md0
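On systems where the raidtools package (mkraid, raidstart, raidstop) is not available, mdadm is the usual replacement. As a rough sketch, and assuming mdadm is installed, the raidtab from step 2 maps onto a single mdadm --create call. The helper below only prints the command so you can review it before running it as root; the function name mdadm_create_cmd is made up for illustration.

```shell
# Print (do not run) an mdadm command roughly equivalent to the raidtab above.
# Assumption: mdadm is used instead of raidtools; this was not part of the original setup.
# usage: mdadm_create_cmd <md_device> <member> [<member> ...]
mdadm_create_cmd() {
    md=$1
    shift
    # --chunk is given in kilobytes, matching "chunk-size 32";
    # --layout matches "parity-algorithm right-symmetric";
    # $# is the number of member devices left after the shift.
    echo "mdadm --create $md --level=5 --raid-devices=$# --chunk=32 --layout=right-symmetric $*"
}

mdadm_create_cmd /dev/md0 /dev/hde1 /dev/hdg1 /dev/hdi1
```

Like mkraid, the printed command writes new raid superblocks on the listed partitions, so double-check the device names before running it.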

    4) It’s time for LVM2 now…let’s edit the /etc/lvm/lvm.conf so that we add support for raid devices. My filter line looks like this:

    filter = [ "a|loop|", "a|/dev/md0|", "r|.*|" ]
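LVM tries the filter patterns in order and the first one that matches a device name decides: a accepts the device, r rejects it. A device that matches nothing is accepted, which is why the list ends with r|.*|. The following sketch imitates that decision in shell for the three patterns above; it is an illustration of the matching order, not LVM's actual code, and filter_decision is a made-up name.

```shell
# Imitate how an lvm.conf filter list decides about one device name.
# usage: filter_decision <device_path>
filter_decision() {
    dev=$1
    for rule in 'a|loop|' 'a|/dev/md0|' 'r|.*|'; do
        action=${rule%%|*}      # "a" (accept) or "r" (reject)
        pattern=${rule#?|}      # drop the action letter and first delimiter
        pattern=${pattern%|}    # drop the trailing delimiter
        # patterns are unanchored regular expressions
        if printf '%s\n' "$dev" | grep -Eq -- "$pattern"; then
            [ "$action" = a ] && echo accept || echo reject
            return 0
        fi
    done
    echo accept                 # nothing matched: accepted by default
}

filter_decision /dev/md0     # accept
filter_decision /dev/hde1    # reject
```

With this filter LVM only looks at loop devices and /dev/md0, so it never picks up the raw member partitions underneath the array.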

    5) Start initializing the LVM:

    pvcreate /dev/md0 (you can issue a pvdisplay to see if all things are correct)
    vgcreate test /dev/md0 (you can issue a vgdisplay to see if all things are correct)

    6) Time to create a small logical volume just for testing:

    lvcreate -L15000 -nbig test

    (you can issue a lvdisplay to see if all things are correct)
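A size given to -L without a unit suffix is taken as megabytes, so -L15000 asks for roughly 15 GB. LVM allocates space in physical extents, and the size is rounded up to a whole number of them. Assuming the default extent size of 4 MB (check the PE Size line of vgdisplay for the real value), the number of extents can be worked out like this; lv_extents is a made-up helper:

```shell
# Number of physical extents needed for a logical volume.
# usage: lv_extents <size_in_mb> [extent_size_in_mb]
# The extent size defaults to 4 MB, which is an assumption about your volume group.
lv_extents() {
    size_mb=$1
    pe_mb=${2:-4}
    # round up to a whole number of extents
    echo $(( (size_mb + pe_mb - 1) / pe_mb ))
}

lv_extents 15000    # prints 3750
```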
    7) Now there’s something that’s distro-specific. “Usually” the init scripts start lvm before software raid. But in our case, when a reboot occurs, we want to a) start the raid and then b) start the lvm. I am using gentoo, and gentoo had these things the other way round: it first started the lvm and then the raid, which resulted in errors during the boot process. This is easily solved in gentoo by editing /etc/init.d/checkfs and moving the part about the LVM below the part about the software raid. The script is really easy to read so I don’t think anyone will have a problem with that…
    8) Let’s test what we’ve done so far…Let’s format that logical volume we’ve created with ext3.

    mke2fs -j /dev/test/big

    9) Make an entry inside your /etc/fstab to point to a place you want to mount that logical volume…and then issue a:

    mount /dev/test/big
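For step 9, an fstab entry has six fields: device, mount point, filesystem type, options, dump flag and fsck order. A small sketch that prints such a line; the mount point /mnt/big is only an example, pick your own, and fstab_line is a made-up helper:

```shell
# Print an /etc/fstab line for an ext3 volume.
# usage: fstab_line <device> <mount_point>
fstab_line() {
    # fields: device, mount point, type, options, dump, fsck pass
    printf '%s\t%s\text3\tdefaults\t0\t2\n' "$1" "$2"
}

fstab_line /dev/test/big /mnt/big
```

Append the printed line to /etc/fstab as root and create the mount point directory first; after that, mount /dev/test/big picks up the rest from fstab.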

    10) You are now ready to start copying data onto that volume…I’d suggest that you copy 5-10 GB out of the 15 GB that we’ve created (remember that -L15000 ?).

  • Now it’s time to simulate a crash! 🙂
  • 11) We first unmount the filesystem and deactivate the logical volume (lvchange -a n /dev/test/big), and then stop the raid device:

    raidstop /dev/md0

    12) Let’s destroy one disk. Open up your favorite disk management tool again and pick one disk to destroy…let’s say /dev/hdi. Delete the partition it already has…and create a new one. All previous data is now lost!
    13) If you want to make sure that you are on the right path of destroying everything…reboot your machine. Upon reboot you should get errors on the software raid and on the LVM not being able to activate the volume group “test”.
    14) Upon the root prompt issue:

    raidstart /dev/md0

    and then do a: cat /proc/mdstat
    You should probably see something similar to this:

    cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid5] [multipath]
    md0 : active raid5 hdi1[2] hdg1[1] hde1[0]
    390716672 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]
    [========>............] resync = 43.9% (85854144/195358336) finish=115.9min speed=15722K/sec

    15) When that is finished, raid5 has rebuilt the array onto the “new” drive that replaced the “faulty” disk we created. (Both the destruction and the placement of the new disk were done in step 12.)
    16) Issue: vgscan
    It will find the volume group again. If the volume group is still inactive afterwards, vgchange -a y test activates it.
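While the array is resyncing, you may want to check the progress from a script instead of reading /proc/mdstat by eye. A minimal sketch that pulls the percentage out of mdstat-style text; mdstat_progress is a made-up helper, and it assumes the progress line looks like the one shown in step 14:

```shell
# Print the resync/recovery percentage found in mdstat-style text on stdin.
# Prints nothing when no resync or recovery is in progress.
mdstat_progress() {
    sed -nE 's/.*(resync|recovery) *= *([0-9.]+)%.*/\2/p'
}

# Example with the progress line from step 14; on a live system use:
#   mdstat_progress < /proc/mdstat
echo 'resync = 43.9% (85854144/195358336) finish=115.9min speed=15722K/sec' | mdstat_progress
```

The numbers in that output also agree with the raid5 arithmetic: the resync counts up to 195358336 blocks per member, and the array reports 390716672 blocks, exactly twice that for three disks.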

  • Resizing the Logical Volume
  • 17) Say that you need more space on that logical volume you created…15 GB is not that much after all…

    lvextend -L100G /dev/test/big

    We’ve now turned that 15 GB logical volume into a 100 GB one…already feels much better…doesn’t it ?
    18) But that’s not all, we now need to extend the ext3 filesystem to cover all that “new space”. With the volume unmounted:

    e2fsck -f /dev/test/big ; resize2fs /dev/test/big

    We first check that the filesystem is ok…and then resize it to the full extent of the logical volume.
    19) We are set! We just need to mount our filesystem again…and we now have 100 GB of space! You can now extend it even further or create more logical volumes to satisfy your needs.
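Note that lvextend -L100G sets the new total size, while a leading plus sign (for example -L+10G) grows the volume by that amount. A small sketch to work out how much was actually added when going from the original 15000 MB to 100 GB, treating 1 GB as 1024 MB; lv_growth_mb is a made-up helper:

```shell
# Difference in MB between the current size (in MB) and a target size (in GB).
# usage: lv_growth_mb <current_mb> <target_gb>
lv_growth_mb() {
    current_mb=$1
    target_mb=$(( $2 * 1024 ))
    if [ "$target_mb" -le "$current_mb" ]; then
        echo "target is not larger than the current size" >&2
        return 1
    fi
    echo $(( target_mb - current_mb ))
}

lv_growth_mb 15000 100    # prints 87400
```

So lvextend -L+87400M /dev/test/big should have had the same effect as -L100G in step 17.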

  • Extend the raid5 array
  • This section is to come in a few days…stay tuned.

    I hope that all of the above helped you to create a better and more secure fileserver. Comments are much appreciated.