
ClearFoundation

Software RAID or RAID problem
#37503
Software RAID or RAID problem 2 Years, 7 Months ago  
Hi All,
After recently creating a new initrd to solve one problem, I created another. I had a server that would not bring up its WAN adapter after a restart or via 'service network restart'. I decided it was an issue with the Realtek network card and found a few guides that talked about rebuilding the initrd. That was not the issue, and it left me with a broken system: a kernel panic after the chroot point (the sysroot remount). This happened with every initrd across the RAID.
Once my panic had died down (it's a production box, after all, though the network WOULD come back if you unplugged the WAN cable and restarted the network), I looked into it.
I remembered the client asking for RAID. The client supplied the box, and it had fake RAID, some BIOS trick that crashed HDDs. After the hardware had failed the RAID on us twice, we went to software RAID. The ClearOS installer fails with software RAID, so I had found a guide for setting up software RAID by hand.
Here is the gist of it.
1. Install ClearOS normally onto one HDD.
2. Change the partition type IDs to RAID.
3. Boot the ClearOS CD, run rescue, then load the existing install (it will likely fail because of the change above).
If it does fail, manually chroot into the /mnt/sysimage folder. The link below is a guide; I can say that most of it can be skipped. The important part is that you mount your jail environment correctly, e.g. /proc onto /mnt/sysimage/proc and /dev/sda1 onto /mnt/sysimage/boot (plus /dev and /tmp).
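
For anyone who wants the jail mounts spelled out, here is a rough sketch. It is a dry run that only prints the commands, it does not run them. ROOT_DEV, BOOT_DEV and JAIL are my own variable names, and the defaults (/dev/sda3 for root, /dev/sda1 for /boot) are assumptions based on the partition layout described in this post, so check yours before copying anything.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for a manual chroot from the rescue CD.
# ROOT_DEV and BOOT_DEV are assumptions; confirm them with "fdisk -l" first.
ROOT_DEV="${ROOT_DEV:-/dev/sda3}"
BOOT_DEV="${BOOT_DEV:-/dev/sda1}"
JAIL="${JAIL:-/mnt/sysimage}"

plan_chroot() {
    echo "mkdir -p $JAIL"
    echo "mount $ROOT_DEV $JAIL"             # root filesystem first
    echo "mount $BOOT_DEV $JAIL/boot"        # then /boot on top of it
    echo "mount --bind /proc $JAIL/proc"     # kernel interfaces for the jail
    echo "mount --bind /dev $JAIL/dev"       # device nodes for the jail
    echo "chroot $JAIL /bin/bash"
}

plan_chroot
```

Once the printed list looks right for your box, the commands can be typed by hand at the rescue prompt.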

Now on to the important part: why is my running (RAID) system no longer working?
Well, it turns out that a file in the initrd tries to mount the root filesystem at /sysroot, but with our funky software RAID setup (and with some LVM systems) the root filesystem is not where it looks.
The culprit is in your system's initrd file, normally located in /boot; mine is /boot/initrd-2.6.18-194.8.1.v5.img
The initrd location (/boot) is actually a mount of /dev/sda1 (the first partition holds the boot files, GRUB etc.; the second is swap; the third is /sysroot, i.e. the rest of the filesystem).

Open Say's Me

Right, let's make a new initrd (as a backup in case of loss or damage) and then correct the mount location.

You may need some dev files and kernel sources to create these, but AFAIK I built mine using the ClearOS 5.2 CD.

First create a working directory; I used /boot/recovery

cd /boot
mkdir recovery
cd recovery
mkinitrd initrd-`uname -r`.img `uname -r`

The ` symbol used above is not ' (the apostrophe on the right of the keyboard); it is the backtick on the key next to 1, shared with ~. `uname -r` returns the kernel release and inserts it into the filename, so you end up with

initrd-2.6.18-194.8.1.v5.img
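
If the backticks are hard to read or type, $(...) does the same job in any POSIX shell. A tiny sketch to show the two forms give the same filename (old_style and new_style are just illustrative variable names):

```shell
#!/bin/sh
# $(...) is the modern spelling of `...`; both run the command and
# substitute its output into the surrounding word.
old_style=initrd-`uname -r`.img
new_style="initrd-$(uname -r).img"
echo "$old_style"
echo "$new_style"
```

Both lines print the same name, built from whatever kernel release the machine is currently running.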

Now we still have to get into the .img file; it's a gzip-compressed cpio archive.
More info on that here: www.ibm.com/developerworks/linux/library/l-initrd/index.html
Right, into the file. First create another directory, then we unzip and un-cpio.

mkdir initrd
cd initrd
gunzip -c ../initrd-`uname -r`.img | cpio -idmv

You should be left with the following structure:

drwxr-xr-x 9 root root 1.0K Feb 10 17:22 .
drwxr-xr-x 3 root root 1.0K Feb 10 17:21 ..
drwx------ 2 root root 1.0K Feb 10 17:22 bin
drwx------ 3 root root 1.0K Feb 10 17:22 dev
drwx------ 2 root root 1.0K Feb 10 17:21 etc
-rwx------ 1 root root 2.3K Feb 10 17:21 init
drwx------ 3 root root 1.0K Feb 10 17:22 lib
drwx------ 2 root root 1.0K Feb 10 17:21 proc
lrwxrwxrwx 1 root root 3 Feb 10 17:22 sbin -> bin
drwx------ 2 root root 1.0K Feb 10 17:21 sys
drwx------ 2 root root 1.0K Feb 10 17:21 sysroot

The file init is the one we will edit.

nano init
Now in nano press Ctrl+W, then type sysroot.
Normally you will find only one match:

echo Mounting root filesystem.
mount /sysroot

Now for the software RAID we have to change the mount line to

mount -o defaults --ro -t ext3 /dev/md1 /sysroot

(For LVM I think it's mount -o defaults --ro -t ext3 /dev/VolGroup00/LogVol00 /sysroot )
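
The same edit can be scripted with sed instead of nano. This is only a sketch under assumptions: patch_init, ROOT_DEV and FS_TYPE are names I made up, the defaults (/dev/md1, ext3) are the values from this post, and it assumes the init script contains the bare line "mount /sysroot" exactly as shown above. The demo runs on a throwaway file, not on a real initrd.

```shell
#!/bin/sh
# Sketch: patch the bare "mount /sysroot" line in an unpacked initrd's init.
# ROOT_DEV and FS_TYPE are assumptions; /dev/md1 and ext3 match this post.
ROOT_DEV="${ROOT_DEV:-/dev/md1}"
FS_TYPE="${FS_TYPE:-ext3}"

patch_init() {
    # $1 = path to the init script; the untouched copy is kept as $1.orig.
    # Writing back through a redirect keeps the file's existing permissions.
    cp "$1" "$1.orig" &&
    sed "s|^mount /sysroot\$|mount -o defaults --ro -t $FS_TYPE $ROOT_DEV /sysroot|" \
        "$1.orig" > "$1"
}

# Demo on a throwaway file.
demo_dir="$(mktemp -d)"
printf '%s\n' 'echo Mounting root filesystem.' 'mount /sysroot' > "$demo_dir/init"
patch_init "$demo_dir/init"
cat "$demo_dir/init"
```

The | delimiter is used in the sed expression because the replacement contains slashes, which would otherwise need escaping.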

Then press Ctrl+O to save and Ctrl+X to exit.
Now recreate the cpio'ed and gzipped file:

find . | cpio -o -H newc | gzip -9 > ../initrd-`uname -r`.img.new

We now have the file

initrd-2.6.18-194.8.1.v5.img.new
in /boot/recovery/

Now we copy it into /boot:

cd ..
cp initrd-2.6.18-194.8.1.v5.img.new /boot
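
Before rebooting it may be worth a quick sanity check that the new image is at least a non-empty, valid gzip file. A minimal sketch follows; check_initrd is a hypothetical helper name, and the demo uses throwaway files. It only tests the gzip layer, it does not prove the cpio contents are right.

```shell
#!/bin/sh
# Sketch: refuse to trust an initrd image that is empty or not valid gzip.
check_initrd() {
    [ -s "$1" ] || { echo "missing or empty: $1"; return 1; }
    # Read via stdin so gzip does not care about the .new suffix.
    gzip -t < "$1" 2>/dev/null || { echo "not a valid gzip file: $1"; return 1; }
    echo "looks OK: $1"
}

# Demo on throwaway files; on the real box you would pass
# /boot/initrd-2.6.18-194.8.1.v5.img.new instead.
tmp="$(mktemp -d)"
echo "demo payload" | gzip -9 > "$tmp/good.img.new"
echo "plain text" > "$tmp/bad.img.new"
check_initrd "$tmp/good.img.new"
check_initrd "$tmp/bad.img.new" || true
```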

That's it for file making. Now to test your new backup version of the initrd:
reboot
When you get the green boot screen, press any key to go to the boot selection, press e to edit, highlight the initrd line, press e again, and add .new to the end of the line.

You can now boot your test/backup initrd.
To fill in some other blanks:

my raid setup was

Personalities : [raid1]
md0 : active raid1 sdb1[0] sda1[1]
80192 blocks [2/2] [UU]

md1 : active raid1 sdb3[0] sda3[1]
974559040 blocks [2/2] [UU]

this is found by using

cat /proc/mdstat
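
If you want to check the arrays from a script rather than by eye, something like the sketch below could work. degraded_arrays is a name I made up; it assumes the mdstat layout shown above, where each array's status line ends in a member map such as [UU]. The md9 entry in the demo is invented purely to show a degraded array.

```shell
#!/bin/sh
# Sketch: print md arrays whose member map is not all "U" (up).
# Reads /proc/mdstat by default; pass another file for testing.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        # status lines end in a map such as [UU] or [_U]
        /\[[U_]+\]$/ && name != "" {
            if ($NF ~ /_/) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Demo: the output quoted in this post, plus a made-up degraded array (md9).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[0] sda1[1]
      80192 blocks [2/2] [UU]

md1 : active raid1 sdb3[0] sda3[1]
      974559040 blocks [2/2] [UU]

md9 : active raid1 sda5[1]
      1024 blocks [2/1] [_U]
EOF
degraded_arrays "$sample"
```

No output means every array has all of its members up.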

My drives are
2 x 1 TB,
partitioned on both as follows (default Clear install):

Device     Boot  Start     End      Blocks  Id  System
/dev/sda1  *         1      10      80293+  83  Linux
/dev/sda2           11     141    1052257+  82  Linux swap / Solaris
/dev/sda3          142   30401   243063450  83  Linux

But under the software RAID I changed the partition IDs after the install to the following:


Device     Boot  Start     End      Blocks  Id  System
/dev/sda1  *         1      10      80293+  fd  Linux raid autodetect
/dev/sda2           11     274     2120580  82  Linux swap / Solaris
/dev/sda3          275  121601  974559127+  fd  Linux raid autodetect

Note: the default IDs are 83/82/83, and afterwards they are fd/82/fd. This is because I only use RAID for /boot and root (/): sda1/sdb1 for /boot and sda3/sdb3 for /, which the system sees as /dev/md0 for /boot and /dev/md1 for /.
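
To confirm the IDs after changing them, the Id column can be pulled out of fdisk -l style output. A small sketch: partition_ids is a hypothetical helper, and it assumes the column layout shown in the tables in this post, where the Boot column is either * or absent.

```shell
#!/bin/sh
# Sketch: print "device id" pairs from "fdisk -l" style output on stdin.
# The Boot column ("*") is optional, so the Id column moves by one field.
partition_ids() {
    awk '/^\/dev\// {
        id = ($2 == "*") ? $6 : $5
        print $1, id
    }'
}

# Demo using the table from this post; on a live system you would run
#   fdisk -l /dev/sda | partition_ids
partition_ids <<'EOF'
Device Boot Start End Blocks Id System
/dev/sda1 * 1 10 80293+ fd Linux raid autodetect
/dev/sda2 11 274 2120580 82 Linux swap / Solaris
/dev/sda3 275 121601 974559127+ fd Linux raid autodetect
EOF
```

For the layout described here you would expect fd, 82, fd on both drives once the conversion is finished.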
So once you have installed, created the initrd backup/test, changed the partition IDs and booted into the new system, you can copy all the data to the second drive (attached to the system but not yet in the RAID array), then change the second drive's IDs to fd and add it to the RAID array.
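
The last step, adding the second drive's partitions to the arrays, might look roughly like the sketch below. It is a dry run that only prints the mdadm commands. plan_add_members is a made-up name, and the mapping (partition 1 into md0, partition 3 into md1) is an assumption taken from the /proc/mdstat output in this post; which disk is the one being added depends on how your arrays were created.

```shell
#!/bin/sh
# Dry-run sketch: print the mdadm commands that would add one disk's
# partitions to the arrays. Nothing is executed.
# Assumed mapping (from this post): md0 = /boot (partition 1), md1 = / (partition 3).
plan_add_members() {
    disk="$1"    # e.g. /dev/sdb
    echo "mdadm /dev/md0 --add ${disk}1"
    echo "mdadm /dev/md1 --add ${disk}3"
    echo "cat /proc/mdstat"    # watch the resync progress afterwards
}

plan_add_members /dev/sdb
```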

Further Reading
wiki.centos.org/HowTos/SoftwareRAIDonCentOS5
pbraun.nethence.com/doc/sysutils_linux/mdadm.html
wiki.openvz.org/Modifying_initrd_image

Attached is an unfinished script that does most of the work. It fails at the sed section and I never got round to finishing it. If anyone else can finish it and repost, go ahead.


Hope this helps

Regards

Colin

Oh, the original problem with the WAN: I traced it to ifup-eth, located in /etc/sysconfig/network-scripts.
When it's called via 'service network restart' or ifup, it runs an arping command that checks whether the static IP is already in use. The ISP recently changed radio brand, and the new radio messes with that check, so the system thinks the HOST IP ADDRESS IS ALREADY IN USE. The fix was to edit the ifup-eth file, find the line that does the arping, and hash out the exit 1 command as below.

original :-

if ! arping -q -c 2 -w 3 -D -I ${REALDEVICE} ${IPADDR} ; then
    echo $"Error, some other host already uses address ${IPADDR}."
    exit 1
fi

Modded version

if ! arping -q -c 2 -w 3 -D -I ${REALDEVICE} ${IPADDR} ; then
    echo $"Error, some other host already uses address ${IPADDR}."
    # exit 1
fi

The hash skips the exit and my server's WAN starts up.
Dirty fix, I know, but it worked.
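
For anyone who would rather script that change than edit by hand, here is a sketch using sed. disable_arping_exit is a made-up helper name; it writes to a new file rather than editing in place, and it only touches an exit 1 that sits between the arping line and the next fi. The demo runs on a throwaway copy with an extra trailing exit 1 to show that lines outside the block are left alone.

```shell
#!/bin/sh
# Sketch: comment out "exit 1" only inside the arping duplicate-address check.
disable_arping_exit() {
    # $1 = input file, $2 = output file (the input is left untouched)
    sed '/arping -q/,/^[[:space:]]*fi/ s/^\([[:space:]]*\)exit 1/\1# exit 1/' "$1" > "$2"
}

# Demo on a throwaway file, not the real ifup-eth.
demo="$(mktemp -d)"
cat > "$demo/ifup-eth" <<'EOF'
if ! arping -q -c 2 -w 3 -D -I ${REALDEVICE} ${IPADDR} ; then
echo $"Error, some other host already uses address ${IPADDR}."
exit 1
fi
exit 1
EOF
disable_arping_exit "$demo/ifup-eth" "$demo/ifup-eth.new"
cat "$demo/ifup-eth.new"
```

Keep in mind that a package update may replace ifup-eth and silently undo the change.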
Colin