Saturday, October 20, 2012

Grub conks out, Lilo to the rescue

It would seem that I say goodbye to an ever-increasing number of established wisdoms and state-of-the-arts of the Linux world these days. After getting rid of both KDE and Gnome some time back, now it's Grub, the GRand Unified Bootloader, that has drawn my displeasure. Here's the story why.

Actually, I never paid much attention to boot loaders until about two weeks ago. To me, they were simply "that magic that makes Linux boot". Well, there's a time for everything, and it usually comes when something stops working. When I found out that Grub would conk out with a cryptic error message the minute I stuck my shiny new RAID controller into its assigned slot, I decided that now was the time to get into boot loaders.

Searching the web for that particular error didn't yield any result whatsoever, so I did what I always do in such cases: I logged on to IRC and asked the guys who develop the gizmo itself if they knew what the trouble was.

Tinkerer's Tips No. 6:
If you're in trouble, the Web is your friend. However, your best friends are mailing lists, forums and IRC, because there you will get something you won't find on the net: custom-fabricated answers.

As a matter of fact, they did know what that error meant: some thingamabob named int12 had conked out, likely because of some weird memory-mapping issue. Their solution was: use Grub 2.00.

It turns out, however, that no major distro offers 2.00 except in its most bleeding-edge release, which I sure as hell didn't want to install over my smooth-running Linux.

Tinkerer's Tips No. 7:
Never ever try to solve a problem in an otherwise stable system by using experimental stuff. It just might solve the original problem, but it sure as hell will create a truckload of new ones - and often enough there's no way of telling beforehand if you can safely revert to the original configuration.

Not that I didn't try to shoe-horn in grub-pc from Debian Sid, but after a cursory glance at the dependency tree, I discarded that solution rather quickly. Not that I didn't try to compile Grub 2.00 from source either (nobody shall say I shy away from a little compiling), but that broke off with a few cryptic, very likely library-version-related errors, see above.

Well, what was to be done? Actually I had exactly three options: sell my shiny new RAID controller, shelve it until Debian shipped Grub 2.00 (which likely wouldn't be until the next version after Wheezy) or try another boot loader.

Since Options 1 and 2 would leave me without the data security I craved (See New toys: LSI MegaRAID 9240-8i for why) at least for the foreseeable future, and since I didn't even know if Grub 2.00 would fix the problem - I couldn't find any live distro shipping with it either - it was pretty much a foregone conclusion. Writing this, I must say that now I see why several professional distros still use LiLo and not the vaunted grub, VMware's ESXi for instance, which I'm using at the office, and which doesn't have a problem with the very same controller card.

Just to be fair: Others report that they have that specific card - the LSI MegaRAID 9240-8i to be exact - running fine with Grub, so there's probably another factor involved. Likely it has something to do with the chipset, which, watercooling junkie that I am, I can't change any easier than I can my bootloader, and certainly not as cheaply.

So I got into Lilo. If I'm telling you that I'm writing this on a computer containing said RAID controller, will it tell you how I fared?
So: How about we go hands on and I tell you how I did it?

OK, here's my lilo.conf for starters: (don't worry, I'll go over it line by line later.)

boot=/dev/sda
map=/boot/map
install=/boot/boot.b
prompt
timeout=100
compact
lba32
vga=795
default=Linux
image=/boot/vmlinuz-2.6.32-5-amd64
        label=Linux
        initrd=/boot/initrd.img-2.6.32-5-amd64
        root=/dev/mapper/system-root
        read-only
other=/dev/sdb
        label="Chainload Grub"
Looks simple compared to a grub.cfg, doesn't it? Well, you know my views: KISS, and it does the job, or I wouldn't be here. So here's the promised line-by-line breakdown:

boot=/dev/sda
This is the device where Lilo will install the boot loader. Just like always, you can install either into an MBR or into a partition. If you install into a partition, you're going to need another boot loader in an MBR somewhere chainloading into Lilo. More about that later. Note that in many howtos on the net, there's still a hda or something there, because many tutorials on Lilo haven't been updated since Grub sucked its thumb. Rule of same: what you see if you run

ls -1 /dev/ | egrep '.d[a-z][0-9]*$'
will fit there fine, as will all symlinks in /dev/disk. Basically, this can be any device you can imagine, including but not limited to multi-devices like Linux software RAIDs and LVM volumes. How much use it is to install into some logical volume that the BIOS never sees is another question, but Lilo definitely can be installed into traditional Linux multi-devices. There are a few specific configuration options for that, but that's beyond the scope of this post.
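
To narrow down which names are candidates for boot=, something along these lines may help. It is only a sketch: the function name is made up for illustration, and the pattern assumes the classic hdX/sdX/vdX naming, so anything named differently (device-mapper volumes, for instance) will not show up.

```shell
# Hypothetical helper: from a list of /dev entries on stdin, keep only
# names that look like whole disks (sda) or their partitions (sda1).
# Assumes classic hdX/sdX/vdX naming.
list_boot_candidates() {
    grep -E '^[hsv]d[a-z][0-9]*$'
}

# Typical use:
#   ls -1 /dev/ | list_boot_candidates
#   ls -l /dev/disk/by-id/     # the persistent symlinks live under /dev/disk
```

Names without a trailing number are whole disks (an MBR install), names with one are partitions.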

map=/boot/map
install=/boot/boot.b
Internal stuff for Lilo: map is the file it writes during install, and install points to the boot sector file it works from. If you have more than one installation of Lilo sharing the same /boot, you're going to have to modify those; otherwise you can just leave them alone.
prompt
timeout=100
Orders Lilo to display a prompt or OS chooser the way grub does. The timeout value is in tenths of seconds, so in this case the timeout is ten seconds.

compact
lba32
lba32 informs Lilo that it's going to be installed on a large device, so it addresses the disk by 32-bit logical block numbers instead of cylinder/head/sector values. compact orders it to merge read operations for adjacent sectors into one where possible, to improve performance. Doesn't do a lot these days, but is still standard, in case somebody wants to install on a disk from the mid nineties.

vga=795
Passes the video mode to the kernel on boot, in my case 1280x1024@24bit.  If you're not sure what to use, stick with vga=normal and put up with those ugly, two-feet-high characters. If you don't want to, look up the video settings for your monitor and graphics card on the interwebz. Hint: it's the exact same for grub, so don't bother looking for anything special for lilo.
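
For the common resolutions, the number can be worked out rather than looked up: the kernel's vga= value is the VESA mode number plus 0x200. The sketch below hard-codes the usual VESA modes; the function name is invented for this example, and whether a given mode actually works still depends on your graphics card's video BIOS.

```shell
# Hypothetical helper: print the vga= value for a resolution and colour depth.
# Kernel vga= value = VESA mode number + 0x200.
# Only the common VESA modes are listed here.
vga_mode() {
    res=$1      # e.g. 1280x1024
    depth=$2    # 8, 15, 16 or 24 bits per pixel
    case "$res:$depth" in
        640x480:8)    vesa=0x101 ;;
        640x480:15)   vesa=0x110 ;;
        640x480:16)   vesa=0x111 ;;
        640x480:24)   vesa=0x112 ;;
        800x600:8)    vesa=0x103 ;;
        800x600:15)   vesa=0x113 ;;
        800x600:16)   vesa=0x114 ;;
        800x600:24)   vesa=0x115 ;;
        1024x768:8)   vesa=0x105 ;;
        1024x768:15)  vesa=0x116 ;;
        1024x768:16)  vesa=0x117 ;;
        1024x768:24)  vesa=0x118 ;;
        1280x1024:8)  vesa=0x107 ;;
        1280x1024:15) vesa=0x119 ;;
        1280x1024:16) vesa=0x11A ;;
        1280x1024:24) vesa=0x11B ;;
        *) echo "unknown mode: $res at $depth bit" >&2; return 1 ;;
    esac
    echo $(( $vesa + 0x200 ))
}

# vga_mode 1280x1024 24   # prints 795, the value used in this lilo.conf
```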

default=Linux
Sets the default entry to boot after the timeout runs out. Use the same value you use for label (see below).


image=/boot/vmlinuz-2.6.32-5-amd64
        label=Linux
        initrd=/boot/initrd.img-2.6.32-5-amd64
        root=/dev/mapper/system-root
        read-only
Every entry in lilo.conf starts with an image option, giving the local path to the kernel image (the vmlinuz file) to use. Don't worry about what Lilo does with that; you'll only give yourself worry lines. It has to be there, it has to be correct, and it starts a new entry in lilo.conf.

label=Linux
Next comes the label. That's important for two reasons: because it identifies the entry in the Lilo boot menu (so you can find it) and because it goes into the default entry mentioned above.
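
Since default has to match one of the labels exactly, a quick sanity check before installing can't hurt. This is a rough sketch, not a real lilo.conf parser: check_default is a made-up name, and it assumes one option per line and ignores anything fancier.

```shell
# Hypothetical helper: verify that default= names an existing label=.
# Assumes one option per line; quotes around values are stripped.
check_default() {
    conf=$1
    default=$(sed -n 's/^[[:space:]]*default[[:space:]]*=[[:space:]]*//p' "$conf" \
        | tr -d '"' | head -n 1)
    if [ -z "$default" ]; then
        echo "no default= line in $conf" >&2
        return 1
    fi
    # Collect all label values and look for an exact, whole-line match
    if sed -n 's/^[[:space:]]*label[[:space:]]*=[[:space:]]*//p' "$conf" \
        | tr -d '"' | grep -qxF -- "$default"; then
        echo "default '$default' matches a label"
    else
        echo "default '$default' has no matching label" >&2
        return 1
    fi
}

# check_default /etc/lilo.conf
```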

initrd=/boot/initrd.img-2.6.32-5-amd64
The initrd option is pretty much self-explanatory too: there usually is one for every kernel image you have, except on very exotic systems. If you don't have one each, or don't use one at all, you probably won't get a lot of new info out of this howto anyhow, and those cases are beyond the scope of this article (and me, for that matter).
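
If you're not sure whether every kernel in /boot has its initrd, a small loop can tell you. The sketch assumes the Debian-style file names used in this post (vmlinuz-VERSION and initrd.img-VERSION); check_initrd_pairs is a name invented for the example.

```shell
# Hypothetical helper: report kernel images that lack a matching initrd.
# Assumes Debian-style names: vmlinuz-VERSION and initrd.img-VERSION.
check_initrd_pairs() {
    bootdir=${1:-/boot}
    missing=0
    for kernel in "$bootdir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue          # glob matched nothing
        version=${kernel##*/vmlinuz-}         # strip path and prefix
        if [ -e "$bootdir/initrd.img-$version" ]; then
            echo "ok:      $version"
        else
            echo "missing: initrd.img-$version"
            missing=1
        fi
    done
    return "$missing"                         # non-zero if any initrd is missing
}

# check_initrd_pairs /boot
```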

root=/dev/mapper/system-root
The root option specifies what you want to have mounted as your rootfs. Again, use whatever you can find in your /dev/disk, /dev/mapper or any other block device. Choose wisely though, because if your root filesystem isn't what it is supposed to be, you're lucky if you have a good initrd that can present you with an emergency shell to find out where the problem lies.

read-only
This option specifies that the root filesystem should be mounted read-only on boot, because the boot process will automatically re-mount it read-write after fsck-ing it.

other=/dev/sdb
        label="Chainload Grub"
Ok, I admit it: I was lie-to-childrening when I said every entry begins with an image option - here's one notable exception. Blocks beginning with the other option are chainloading entries. As their value, they take the device where the other (sic!) boot loader is located.
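
One step this walkthrough doesn't spell out: lilo.conf only takes effect once the lilo command has been run to write the boot sector and map file, and that has to be repeated after every change. A cautious way to do it might look like the sketch below. The wrapper name is hypothetical; the -t (test only), -v (verbose) and -C (config file) switches are standard lilo options. It needs root and an installed lilo.

```shell
# Hypothetical wrapper: dry-run Lilo first, and write the boot sector
# only if the dry run succeeds. Needs root and an installed lilo binary.
apply_lilo_conf() {
    conf=${1:-/etc/lilo.conf}
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # -t: test only, nothing is written; -v: verbose; -C: config file to use
    lilo -t -v -C "$conf" || return 1
    # Same run without -t, this time for real
    lilo -v -C "$conf"
}

# apply_lilo_conf /etc/lilo.conf
```

Running the test pass first means a typo in the file gets caught before anything on disk is touched.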

Well, that's all for today. I hope what I wrote helps somebody who faces the same problem I did to switch over to Lilo without all the research and hassle I had to endure to scrape together the info I needed.

Have fun!
