Friday, November 28, 2008

Setting up LVM on an already-setup box

I have this box at work that someone else was nice enough to set up with Debian Lenny and a great big honkin' RAID5 array. We've already got the basic filesystem structure set up on the box, but we'd like to add the RAID as well.

I'm going to set up the RAID as its own volume group under LVM. This allows us, should the OS drive fail, to slap another drive with an OS on it into the machine, boot, and remount the RAID. It also allows us to dynamically add storage, say some sort of SAN, to the volume group as another physical volume, then resize the logical volumes on the fly. Finally, it simplifies things a little by keeping the OS volume group and the file storage volume group separate.
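To sketch that recovery scenario (the commands are stock LVM2; the mount point is hypothetical, and the volume group and logical volume names are the ones I create below): after booting off the replacement OS drive, reactivating and remounting should be roughly

vgscan
vgchange -ay file-storage
mount /dev/file-storage/files /mnt/files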

Being an LVM newbie, I'll be referencing A simple introduction to working with LVM and The LVM HOWTO. You can assume that most any LVM-specific command syntax here was pulled from one of those two sources.

Now, the first thing to do is set up a partition on the RAID array, since I'll be running LVM on top of a physical partition. I do this with fdisk, because that's the way I learned it. :) If I were a bit more clever, or if I felt like it, I'd do this with a single call to sfdisk.
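For the curious, the sfdisk version would look something like this (a sketch, assuming the array shows up as /dev/sda; partition type 8e is "Linux LVM"):

echo ',,8e' | sfdisk /dev/sda

That writes a single partition spanning the whole array, which is what I did interactively in fdisk.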

Next I create a physical volume for the RAID: pvcreate /dev/sda1, and then a volume group: vgcreate file-storage /dev/sda1. Checking my work with vgscan, I see:

Reading all physical volumes. This may take a while...
Found volume group "os" using metadata type lvm2
Found volume group "file-storage" using metadata type lvm2


Now I want to create a logical volume (LV) that encompasses the entire volume group. I do this by first examining the output of vgdisplay, where I see the line "Free PE / Size 357375 / 1.36 TB". (I told you it was a big honkin' RAID. Note here that "PE" means Physical Extent, the size of one quantum of storage in LVM. One PE is exactly the same size as one Logical Extent (LE), so here one LE is about 4 MB.) I will thus create an LV of 357375 extents: lvcreate -n files file-storage -l 357375. With this done, it's time to format the LV.
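As an aside: newer versions of lvcreate accept the extent count as a percentage, which saves copying the number out of vgdisplay by hand. I haven't checked exactly when this was added, so consult your man page, but it should be roughly:

lvcreate -n files -l 100%FREE file-storage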

After consulting with my colleagues, I've decided to use ext4 for the filesystem on the RAID. I like what I read about ext4, both on Wikipedia and from IBM, and as this box is slated to become a backup server, it seems like a good place to play with it. Before I begin, though, I'll update the kernel to the latest version in (Debian) testing, 2.6.26-1, so as to have the latest ext4 fixes that have been included in Debian kernels. Even with that, though, I'll want to add "nodelalloc" to the line in my fstab for the RAID:

It should be noted that the stock 2.6.26 ext4 has problems with delayed allocation and with filesystems with non-extent based files. So until Debian starts shipping a 2.6.27 based kernel or a 2.6.26 kernel with at least the 2.6.26-ext4-7 patchset, you should mount ext4dev filesystems using -o nodelalloc and only use freshly created filesystems using "mke2fs -t ext4dev". (Without these fixes, if you try to use an ext3 filesystem which was converted using tune2fs -E test_fs -o extents /dev/DEV, you will probably hit a kernel BUG the moment you try to delete or truncate an old non-extent based file.)


At any rate, per the ext4 HOWTO, I'll create an ext4 filesystem on the "files" LV with: mke2fs -t ext4dev /dev/file-storage/files. And wait. And wait. And wait some more, because 1.36 TiB is a lot of space.
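Once mke2fs finally finishes, a quick sanity check (plain e2fsprogs, not from the HOWTO) confirms the ext4 features are actually enabled on the new filesystem:

dumpe2fs -h /dev/file-storage/files | grep -i features

You should see extents in the feature list, among others.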

From here, the new filesystem is like any other. Pick a mount point, make sure to mount it "-o nodelalloc", and off you go.
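For reference, my fstab line looks something like this (the mount point /srv/files is hypothetical; substitute your own):

/dev/file-storage/files  /srv/files  ext4dev  defaults,nodelalloc  0  2

Then mkdir -p /srv/files && mount /srv/files, and you're in business.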

CORRECTION: Debian kernel 2.6.26-1 does not support the "nodelalloc" mount option. I ended up installing kernel 2.6.27.7 from http://kernel.org/. As the "nodelalloc" option was only recommended for 2.6.26-based kernels, I am no longer mounting the ext4 filesystem with the "nodelalloc" option.
