2008.08.03 - 17:36:34 PDT
There is very little documentation out there regarding mirroring with Linux LVM. Mostly it's just not necessary, since mirroring can be done at the device level instead. There are definitely some situations where it can be useful, however. Here's an example situation (this is why I was interested):
VG: export
PV: /dev/xvdb1
PV: /dev/xvdc1 (this is the one that failed)
I have a volume group (named "export") with a number of (large) logical volumes. The volume group used to sit on two physical volumes (the underlying devices are RAID arrays), but one of those arrays failed. I was able to recover the RAID, move all the data off that physical volume, and reduce the volume group by that device. So now I have my volume group on a single PV (xvdb1), and the recovered RAID is back up and running (xvdc1). Before putting that physical volume back into normal use, I want to make sure the RAID is stable. So what I'd like to do is:
1) Make sure that any new allocation takes place on the current raid device. I don't want to allocate on the new one until I know that it will stay up. It was a hairy recovery process the first time and I don't want to do it again.
2) Add the new raid back into the volume group, but use it only to create mirrors of the current LVs. Let it run like this for a couple weeks to see if the rebuilt raid stays ok.
3) If all goes well, break the mirror(s) leaving the data on the new raid, so I can drop the old one and re-create it as a hardware raid.
4) After that raid has been rebuilt, I want to add it back into the VG, and if possible reallocate the LVs so they stripe decently across the PVs.
I'll take care of the first two now, and the other two...hopefully all works out ok and I'll come back to post details about those.
1) making sure the current LVs stay on the current (only) PV -- To do so, I'll be mucking with the allocation policy. Long story short, the default policy (for a VG) is "normal", which applies common-sense rules. For example, the pieces of a mirror or the stripes of an LV will not be allocated on the same PV. Makes sense. What I want to do is change the allocation policy from "normal" to "cling". Cling means "stick to the physical volume that you were first allocated on." That's pretty simple to change with the lvchange(8) command. I could change the VG allocation policy to cling with vgchange(8), and since each LV has the (default) allocation policy "inherit", they would all start using the cling policy. I like to be difficult though, so I'll change it for each LV individually:
cd /dev/export
for i in * ; do lvchange --alloc cling "export/$i" ; done
To just change the VG and let the LVs inherit it:
vgchange --alloc cling export
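To double-check that the policy change took, the attribute string printed by lvs can be decoded. This is only a rough sketch: it assumes the allocation policy is the third character of lv_attr (as described in the lvs man page), and the function names alloc_policy_from_attr and show_lv_policies are made up for this post, not part of LVM.

```shell
#!/bin/sh
# Decode the allocation policy from an lv_attr string such as "-wl-a-".
# Assumption: the policy is the 3rd character; LVM capitalises it when the
# volume is locked against allocation changes, so we lowercase it first.
alloc_policy_from_attr() {
    c=$(printf '%s' "$1" | cut -c3 | tr 'A-Z' 'a-z')
    case "$c" in
        a) echo anywhere ;;
        c) echo contiguous ;;
        i) echo inherit ;;
        l) echo cling ;;
        n) echo normal ;;
        *) echo unknown ;;
    esac
}

# Print "name<TAB>policy" for every LV in the volume group given as $1.
# Needs root and a real VG, so it is only defined here, not called.
show_lv_policies() {
    lvs --noheadings -o lv_name,lv_attr "$1" | while read -r name attr; do
        printf '%s\t%s\n' "$name" "$(alloc_policy_from_attr "$attr")"
    done
}
```

Running show_lv_policies export as root should list "cling" next to each LV if the per-LV change was applied, or "inherit" if only the VG policy was changed.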
2) Add the new raid back in, and mirror existing logical volumes -- Not so tough; first add the PV back in (we know no extents will be allocated after the last step):
pvcreate /dev/xvdc1
vgextend export /dev/xvdc1
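Rather than just trusting that nothing landed on the new PV, the allocated extent count can be checked directly. A minimal sketch, assuming the pvs field pv_pe_alloc_count reports the number of extents in use on the PV; pv_is_unallocated and check_pv are hypothetical helper names.

```shell
#!/bin/sh
# Succeeds if the given allocated-extent count is zero.
# pvs pads its output with spaces, so strip them before comparing.
pv_is_unallocated() {
    [ "$(printf '%s' "$1" | tr -d ' ')" = "0" ]
}

# Report whether the PV given as $1 has any extents in use.
# Needs root and a real PV, so it is only defined here, not called.
check_pv() {
    count=$(pvs --noheadings -o pv_pe_alloc_count "$1") || return 2
    if pv_is_unallocated "$count"; then
        echo "$1: no extents allocated"
    else
        echo "$1: extents in use:$count"
    fi
}
```

Right after the vgextend, check_pv /dev/xvdc1 should report no extents allocated; once the mirrors exist it should report a non-zero count.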
Now to use lvconvert to mirror the LVs. There are two important things to note. First, LVM wants to have 3 devices in order to do mirroring: one device for each side of the mirror, and a third for the mirror log. The mirror log is an on-disk record of which extents are in sync between the mirror halves. That way, if the server loses power or has some other failure, the mirror can pick right back up where it was instead of resyncing from scratch. There is another option: the mirror log can be kept in memory, but then if the device is brought down and back up, the mirror has to be re-synced from the primary side. For my purposes, I don't really care if the mirror breaks and never comes back; I just want to run it this way temporarily to see if one side fails. So I'll use the "--mirrorlog core" option.
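For reference, the conversion described above would look roughly like the following. This is an untested sketch, not the commands I actually ran: mirror_cmd and mirror_all are made-up helper names, and listing /dev/xvdc1 at the end of the lvconvert line is my assumption for steering the new mirror half onto the rebuilt raid. On newer LVM releases, where mirrors default to the raid1 type, "--type mirror" would probably also be needed for --mirrorlog to apply.

```shell
#!/bin/sh
# Print the lvconvert command that adds one in-memory-logged mirror copy
# of LV $2 in VG $1, with the new copy restricted to PV $3.
mirror_cmd() {
    printf 'lvconvert -m 1 --mirrorlog core %s/%s %s\n' "$1" "$2" "$3"
}

# Print one such command per LV in VG $1, targeting PV $2.
# Dry run only: nothing is executed, so the list can be reviewed first.
# Needs root and a real VG, so it is only defined here, not called.
mirror_all() {
    vg=$1
    pv=$2
    lvs --noheadings -o lv_name "$vg" | while read -r lv; do
        mirror_cmd "$vg" "$lv" "$pv"
    done
}
```

mirror_all export /dev/xvdc1 prints the commands; once they look right they can be pasted in or piped to sh. While the mirrors sync, something like lvs -a -o lv_name,copy_percent,devices export should show the progress.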
Going home from work -- to be continued...


