linux.debian.user debian-user@lists.debian.org
#1
Hi guys, I've encountered a problem with my xen/raid setup.

My etch box has / on raid 1. When booting either 2.6.18-3 or -4 I get an
error when /scripts/local-top/mdadm runs (paraphrasing):

    Failure: failed to load Module 0 no such module
    Failure: failed to load Module 1 no such module
    Failure: failed to load Module 5 no such module
    Waiting for root filesystem...

and that's it. I dug into the script in question and here are the
offending lines from /usr/share/initramfs-tools/scripts/local-top/mdadm:

    MD_DEVS=all
    MD_MODULES='linear multipath raid0 raid1 raid456 raid5 raid6 raid10'
    [ -s /conf/md.conf ] && . /conf/md.conf

    verbose && log_begin_msg Loading MD modules
    for module in ${MD_MODULES:-}; do
      if modprobe --syslog "$module"; then
        verbose && log_success_msg "loaded module ${module}."
      else
        log_failure_msg "failed to load module ${module}."
      fi
    done
    log_end_msg

It looks like it should just iterate through the list and load the
modules. I have confirmed that it works the way I expect in bash, but it
doesn't work properly when booting. For some reason the module names
seem to get replaced with just the numbers "0", "1" and "5".

I have hacked the script and rebuilt my initrds by commenting out the
above section and just putting in a bunch of modprobes, and it works.
But clearly something wacky is going on here.

3 questions:

1. Has anyone else seen this and have any insight?
2. How do I unpack my initrd to actually look at the script that is in
   the initrd (maybe it gets changed somehow?) so I can check that out
   directly?
3. Is this a bug?

thanks

A
#2
thus spoke Andrew Sackville-West <andrew@farwestbilliards.com> [2007.03.30.1920 +0200]:
> Failure: failed to load Module 0 no such module
> Failure: failed to load Module 1 no such module
> Failure: failed to load Module 5 no such module

I don't even know what creates those messages.

> looks like it should just iterate through the list and load the
> modules. I have confirmed that it works the way I expect in bash,
> but it doesn't work properly when booting. for some reason the
> module names seem to get replaced with just the numbers "0" "1"
> and "5".

No, I don't think this is what's happening, but I also don't know
what is going on.

> I have hacked the script and rebuilt my initrds by commenting out
> the above section and just putting in a bunch of modprobes and it
> works. But clearly something wacky is going on here.

Can you edit the script, add set -x at the top and post the output?

> 2. How do I unpack my initrd to actually look at the script that is in
> the initrd (maybe it gets changed somehow?) so I can check that out
> directly.

    zcat initrd.img | cpio -i

> 3. is this a bug?

We'll see.

-- 
Please do not send copies of list mail to me; I read the list!

 .''`.   martin f. krafft <madduck@debian.org>
: :'  :  proud Debian developer, author, administrator, and user
`. `'`   http://people.debian.org/~madduck - http://debiansystem.info
  `-  Debian - when you have better things to do than fixing systems

"two monologues that keep disruptively interrupting each other
again and again are called a discussion."
    -- charles tschopp
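The one-line unpack command above extracts into the current directory, so it is usually more comfortable to run it inside a scratch directory. A minimal sketch follows, assuming the image is a gzip-compressed cpio archive as that command implies; `unpack_initrd` and both of its arguments are illustrative names, not part of initramfs-tools.

```shell
#!/bin/sh
# Sketch: unpack a gzip-compressed cpio initramfs into a scratch directory.
# unpack_initrd is a hypothetical helper name, not an initramfs-tools command.
unpack_initrd() {
    image=$1    # path to the initrd image
    dest=$2     # directory to unpack into (created if missing)

    # Make the image path absolute, because we change directory below.
    case $image in
        /*) ;;
        *)  image=$PWD/$image ;;
    esac

    mkdir -p "$dest" || return 1
    # cpio -i extracts relative to the current directory; -d creates
    # leading directories. gzip -dc is what zcat does on Linux.
    ( cd "$dest" && gzip -dc "$image" | cpio -id )
}
```

After something like `unpack_initrd initrd.img /tmp/initrd-unpacked`, the files of interest in this thread would be `conf/md.conf` and `scripts/local-top/mdadm` under the destination directory.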
#3
On Sat, Mar 31, 2007 at 11:45:38AM +0200, martin f krafft wrote:
> thus spoke Andrew Sackville-West <andrew@farwestbilliards.com> [2007.03.30.1920 +0200]:
> > Failure: failed to load Module 0 no such module
> > Failure: failed to load Module 1 no such module
> > Failure: failed to load Module 5 no such module
>
> I don't even know what creates those messages.

I know, it's weird, but it definitely looks like it's happening in that
script.

> > looks like it should just iterate through the list and load the
> > modules. I have confirmed that it works the way I expect in bash,
> > but it doesn't work properly when booting. for some reason the
> > module names seem to get replaced with just the numbers "0" "1"
> > and "5".
>
> No, I don't think this is what's happening, but I also don't know
> what is going on.
>
> > I have hacked the script and rebuilt my initrds by commenting out
> > the above section and just putting in a bunch of modprobes and it
> > works. But clearly something wacky is going on here.
>
> Can you edit the script, add set -x at the top and post the output?

Will do. I'll post that up shortly. Sorry for the delay, btw, had to
run out of town.

> > 2. How do I unpack my initrd to actually look at the script that is in
> > the initrd (maybe it gets changed somehow?) so I can check that out
> > directly.
>
> zcat initrd.img | cpio -i

thanks

A
#4
martin, et al, sorry for taking so long to get back to this. I was
finally in a position to work on it a bit. So long, sweet uptime.

On Thu, Apr 05, 2007 at 12:07:21PM -0700, Andrew Sackville-West wrote:
> On Sat, Mar 31, 2007 at 11:45:38AM +0200, martin f krafft wrote:
> > Can you edit the script, add set -x at the top and post the output?
>
> will do. will post that up shortly. sorry for the delay, btw, had to
> run out of town.

So I returned the script to its original state, and the same problem is
still there, running up-to-date etch. I put in set -x as you requested,
and it looks like MD_MODULES gets set twice. Initially, it's set
appropriately as "linear ..." but later it just gets set to "0 1 5",
which of course causes the modprobe to fail.

Unfortunately, it won't boot in this scenario (maybe... I'll try
init=/bin/bash), so I've got to rig up a serial console to capture the
messages. I'll report back with more.

A
#5
reviving an old thread as I have more to add...
ping martin krafft...

On Sat, Mar 31, 2007 at 11:45:38AM +0200, martin f krafft wrote:
> thus spoke Andrew Sackville-West <andrew@farwestbilliards.com> [2007.03.30.1920 +0200]:
> > Failure: failed to load Module 0 no such module
> > Failure: failed to load Module 1 no such module
> > Failure: failed to load Module 5 no such module
>
> I don't even know what creates those messages.

These messages are created by the initramfs script local-top/mdadm.

> > looks like it should just iterate through the list and load the
> > modules. I have confirmed that it works the way I expect in bash,
> > but it doesn't work properly when booting. for some reason the
> > module names seem to get replaced with just the numbers "0" "1"
> > and "5".
>
> No, I don't think this is what's happening, but I also don't know
> what is going on.

It is indeed what is happening. The local-top/mdadm script sources
conf/md.conf. That file resets the value of MD_MODULES from stuff like
"raid0 raid1..." to "0 1 5" and thus it bombs out on the boot.

> > I have hacked the script and rebuilt my initrds by commenting out
> > the above section and just putting in a bunch of modprobes and it
> > works. But clearly something wacky is going on here.
>
> Can you edit the script, add set -x at the top and post the output?

I've done that. Here is the relevant output from one of my DomU's when
booting:

    Begin: Running /scripts/local-top ...
    + PREREQ=udev_er
    + prereqs
    + echo udev_er
    + exit 0
    + PREREQ=udev_er
    + . /scripts/functions
    + [ -e /scripts/local-top/md ]
    + MDADM=/sbin/mdadm
    + [ -x /sbin/mdadm ]
    + MD_DEVS=all
    + MD_MODULES=linear multipath raid0 raid1 raid456 raid5 raid6 raid10

Note that MD_MODULES is correctly set here, but...

    + [ -s /conf/md.conf ]
    + . /conf/md.conf

... /conf/md.conf gets sourced here and ...

    + MD_HOMEHOST=bigmomma
    + MD_DEVPAIRS=/dev/md1:5 /dev/md0:1 /dev/md11:1 /dev/md12:1 /dev/md10:0 /dev/md2:5
    + MD_LEVELS=5 1 1 1 0 5
    + MD_DEVS=all
    + MD_MODULES=0 1 5

... that resets MD_MODULES here, before the modprobes begin below...

    + verbose
    + return 0
    + log_begin_msg Loading MD modules
    + [ -x /sbin/usplash_write ]
    + _log_msg Begin: Loading MD modules ...
    + [ n = y ]
    + echo Begin: Loading MD modules ...
    Begin: Loading MD modules ...
    + modprobe --syslog -v 0
    modprobe: FATAL: Module 0 not found.
    + log_failure_msg failed to load module 0.
    + _log_msg Failure: failed to load module 0.
    + [ n = y ]
    + echo Failure: failed to load module 0.
    Failure: failed to load module 0.
    + modprobe --syslog -v 1
    modprobe: FATAL: Module 1 not found.
    + log_failure_msg failed to load module 1.
    + _log_msg Failure: failed to load module 1.
    + [ n = y ]
    + echo Failure: failed to load module 1.
    Failure: failed to load module 1.
    + modprobe --syslog -v 5
    modprobe: FATAL: Module 5 not found.
    + log_failure_msg failed to load module 5.
    + _log_msg Failure: failed to load module 5.
    + [ n = y ]
    + echo Failure: failed to load module 5.
    Failure: failed to load module 5.
    + log_end_msg
    + [ -x /sbin/usplash_write ]
    + _log_msg Done.
    + [ n = y ]
    + echo Done.
    Done.
    + update_progress
    + [ -d /dev/.initramfs ]
    + [ -z 2 ]
    + PROGRESS_STATE=3
    + echo PROGRESS_STATE=3
    + [ -x /sbin/usplash_write ]
    + [ ! -f /proc/mdstat ]
    + verbose
    + return 0
    + panic cannot initialise MD subsystem (/proc/mdstat missing)
    + [ -x /sbin/usplash_write ]
    + [ = 0 ]

Clearly there is a conflict in the way md.conf is set up, and I frankly
don't know where that happens.

> > 2. How do I unpack my initrd to actually look at the script that is in
> > the initrd (maybe it gets changed somehow?) so I can check that out
> > directly.
>
> zcat initrd.img | cpio -i
>
> > 3. is this a bug?
>
> We'll see.

I'm not sure what really brought this all about, but I suspect that
perhaps this machine got something messed up in its raid configuration
at the end of the etch release cycle. Honestly, I haven't spent much
time digging for this problem; my uptimes are so long that booting just
doesn't happen that much.

It came up recently due to some minor problems in one of my DomU's, and
I took it offline for a fsck. Upon reboot, I noticed the set -x output
streaming by and went back to take a look. My DomU's use the same initrd
as Dom0 (I should probably fix that) and end up providing a great test
bed for this problem, as I don't have to take the whole system down to
troubleshoot it.

I haven't had a kernel upgrade since the problem surfaced, and so no
opportunity to see if the problem has been resolved yet either. But I
now have multiple initrds floating around for this system and so can
test it a little more easily. If there is more I can do to determine
whether this is a real problem or just some local anomaly I'm
experiencing, please let me know.

A
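The behaviour in that trace, where a default assigned before the config file is sourced gets silently replaced by it, can be reproduced outside the initramfs. The sketch below is only an illustration: the temporary file stands in for /conf/md.conf, and the loop prints the modprobe command instead of running it.

```shell
#!/bin/sh
# Sketch: a default set before sourcing a config file is overwritten by it.
# The temp file is a stand-in for /conf/md.conf; its single line mirrors
# the value seen in the trace above.
conf=$(mktemp)
printf '%s\n' "MD_MODULES='0 1 5'" > "$conf"

# Same order of operations as the quoted local-top/mdadm fragment.
MD_MODULES='linear multipath raid0 raid1 raid456 raid5 raid6 raid10'
[ -s "$conf" ] && . "$conf"

# Print instead of calling modprobe, so this is safe to run anywhere.
for module in ${MD_MODULES:-}; do
    echo "would run: modprobe --syslog $module"
done

rm -f "$conf"
```

Run as written, the loop only ever sees the three values from the sourced file, which matches the three modprobe failures in the boot log.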
#6
thus spoke Andrew Sackville-West <andrew@farwestbilliards.com> [2007.07.10.0750 +0200]:
> it is indeed what is happening. the local-top/mdadm script sources
> conf/md.conf. That file resets the value of MD_MODULES from stuff
> like "raid0 raid1..." to "0 1 5" and thus it bombs out on the
> boot.

This must be the trail of the bug, because on my systems, MD_MODULES in
conf/md.conf has the 'raid' prefix:

    MD_MODULES='raid1 raid10'

Can I please see your mdadm.conf file and the generated initramfs?
Please put them somewhere where I can download them instead of
attaching them to an email. If it turns out to be what I think it is,
it should be a trivial bug to fix.
#7
thus spoke martin f krafft <madduck@debian.org> [2007.07.10.0834 +0200]:
> Can I please see your mdadm.conf file and the generated initramfs?
> Please put them somewhere where I can download them instead of
> attaching them to an email. If it turns out to be what I think it
> is, it should be a trivial bug to fix.

Try the following patch against
/usr/share/initramfs-tools/hooks/mdadm. It ensures that RAID levels
include the word 'raid'. I think your mdadm.conf may say stuff like
level=5, when it should say level=raid5 or not specify level= at
all.

Index: hook
===================================================================
--- hook (revision 351)
+++ hook (working copy)
@@ -198,7 +198,7 @@
     [ -n "${dev:-}" ] || continue
     echo -n "${dev}:"
     if [ -n "${level:-}" ]; then
-      echo -n "$level"
+      echo -n "raid${level#raid}"
     else
       echo -n "$($MDADM --detail $dev | sed -rne 's,[[:space:]]+Raid Level : ,,p')"
     fi
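The key part of the patch is the `${level#raid}` expansion, which strips a leading `raid` if there is one, so that prepending `raid` again gives the same result for both spellings. A small sketch of just that expansion follows; `normalize_level` is a hypothetical name used here for illustration and does not appear in the hook.

```shell
#!/bin/sh
# Sketch of the parameter expansion used in the patch above.
# normalize_level is an illustrative helper, not part of the mdadm hook.
normalize_level() {
    level=$1
    # ${level#raid} removes a leading "raid" if present; the literal
    # "raid" in front then restores it exactly once.
    printf 'raid%s\n' "${level#raid}"
}

for l in 5 raid5 1 raid10; do
    normalize_level "$l"
done
# prints: raid5, raid5, raid1, raid10 (one per line)
```

One thing this sketch makes visible: the expansion only handles numeric and `raidN` spellings. A value such as `linear` would come out as `raidlinear`, so whether that case matters for the real hook would need checking against the hook itself.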
#8
On Tue, Jul 10, 2007 at 08:40:46AM +0200, martin f krafft wrote:
> thus spoke martin f krafft <madduck@debian.org> [2007.07.10.0834 +0200]:
> > Can I please see your mdadm.conf file and the generated initramfs?
> > Please put them somewhere where I can download them instead of
> > attaching them to an email. If it turns out to be what I think it
> > is, it should be a trivial bug to fix.
>
> Try the following patch against
> /usr/share/initramfs-tools/hooks/mdadm. It ensures that RAID levels
> include the word 'raid'. I think your mdadm.conf may say stuff like
> level=5, when it should say level=raid5 or not specify level= at
> all.

Bingo! You are correct about my mdadm.conf. I haven't tried the patch
yet, but that must indeed be the problem, as I have specified level=5
and similar instead of level=raid5. I will fix up my mdadm.conf.

I'll leave my two files up for a day or so in case you want to look at
them, and I'll try the patch before I fix up the mdadm.conf... report
back soon.

A
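To check a configuration for the same mistake, one option is to search for ARRAY lines whose level= value is a bare number. This is a rough sketch under stated assumptions: `find_bare_levels` is a made-up helper name, the pattern assumes the keyword is written as upper-case `ARRAY`, and the sample lines in the usage note are invented for illustration.

```shell
#!/bin/sh
# Sketch: list ARRAY lines in an mdadm.conf whose level= is a bare number
# (e.g. level=5) rather than level=raid5.
# find_bare_levels is a hypothetical helper, not an mdadm command.
find_bare_levels() {
    # -n prints line numbers; the pattern needs at least one digit right
    # after "level=", followed by whitespace or end of line.
    grep -nE '^[[:space:]]*ARRAY[[:space:]].*level=[0-9]+([[:space:]]|$)' "$1"
}
```

Given a file containing the invented lines `ARRAY /dev/md0 level=1 num-devices=2` and `ARRAY /dev/md1 level=raid5 num-devices=3`, only the first would be reported. On Debian the file to check would typically be /etc/mdadm/mdadm.conf.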