i2o, adaptec, raidutils, and 2.6.x

November 10th, 2004

I’m getting hit with Debian bug #278239. I upgraded the kernel on the server to 2.6.x (which gave a significant performance boost), and in that series the standard dpt_i2o module has been subsumed by i2o_block and its associated modules. I guess the kernel headers changed too, along with the device names and numbers used for managing and mounting the array. This breaks the standard build of raidutils, so I get the error:

File /dev/dpti17 Could Not Be Opened

I haven’t recompiled my raidutils yet, hoping for a swift resolution of the bug.

new machine, and x86_64 joy

November 10th, 2004

So, I transplanted the brains of one of my workstations today, taking out an aging K7 and stuffing in a new Athlon 64. This required some re-jiggering of my previous availability calculations, since the bogomips are low on these new Athlons. I decided to run a quick benchmark on all the machines and use that value instead. But which benchmark to use?

I settled on whetstones, a benchmark with gravitas. It also measures double-precision floating-point arithmetic, which is the kind of operation most of our computing actually does. My new machine has 31% fewer bogomips than my previous champ, but can do 50% more whetstones.
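To show why the choice of benchmark matters, here is a rough sketch of how per-machine weights fall out of a score. It assumes the availability calculation simply weights each machine by its benchmark score relative to the slowest one. The function name and the scores are made up for illustration; only the ratios (31% fewer bogomips, 50% more whetstones) come from my measurements.

```python
def relative_weights(scores):
    """Scale each machine's score so the slowest machine gets weight 1.0."""
    slowest = min(scores.values())
    return {name: score / slowest for name, score in scores.items()}


# Hypothetical scores, chosen only to match the ratios above:
# the new machine has 31% fewer bogomips but 50% more whetstones.
bogomips = {"old_champ": 4000.0, "new_athlon64": 2760.0}
whetstones = {"old_champ": 1000.0, "new_athlon64": 1500.0}

by_bogomips = relative_weights(bogomips)
by_whetstones = relative_weights(whetstones)

for name in bogomips:
    print(f"{name}: bogomips weight {by_bogomips[name]:.2f}, "
          f"whetstone weight {by_whetstones[name]:.2f}")
```

Weighted by bogomips, the old machine looks like the faster one. Weighted by whetstones, the ranking flips.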

I’m just running the old Debian i386 on this machine, since the speed boost is substantial, even without re-compiling everything over to 64 bits. I’ve installed debian-amd64 on a spare partition, and I’ll play with it a little when I get a chance. Rebuilding everything and setting up a special distribution system for the x86_64 stuff is a bit of a pain, though. I might wait until I’ve got another 64-bit machine before I go too far with it. I don’t think the man-hours saved by moving up to 64-bit exceed the number of hairs I’d pull out of my head doing it, figuratively speaking.

curse you, heisenbugs!

November 5th, 2004

So.

The Dell Precision Workstation 350 has a problem with the keyboard and busmice when running Linux kernels in the 2.6.x series. If you boot it up without touching a key (say in GRUB, for instance), you’ll get a message like i8042.c: Can't read CTR while initializing i8042 in your dmesg output.

This makes the keyboard and mouse not respond. If, while trying to debug this problem, you touch a key sometime between POST and when the kernel tries to initialize the i8042, you won’t have this problem.

One workaround is to touch a key between POST and initialization every time the machine boots. Another workaround is to add acpi=off to your kernel boot arguments, say in GRUB or LILO. According to the linux-kernel mailing list, you could also pass i8042.noacpi=1, but I haven’t tried this.
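If you have a roomful of these machines, it may help to check for the failure automatically rather than waiting for someone to notice a dead keyboard. A minimal sketch, assuming the kernel log text has already been captured as a string; the function name is my own invention, and the message it looks for is the one quoted above.

```python
I8042_ERROR = "i8042.c: Can't read CTR while initializing i8042"


def i8042_init_failed(dmesg_text):
    """Return True if any line of the kernel log contains the i8042 error."""
    return any(I8042_ERROR in line for line in dmesg_text.splitlines())


# Hypothetical log excerpts for illustration.
sample_bad = "ACPI: Power Button (FF) [PWRF]\n" + I8042_ERROR + "\n"
sample_good = "ACPI: Power Button (FF) [PWRF]\nserio: i8042 KBD port at 0x60,0x64 irq 1\n"

print(i8042_init_failed(sample_bad))
print(i8042_init_failed(sample_good))
```

On a live machine the text could come from something like subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which needs a reasonably recent Python and permission to read the kernel log.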

This is very annoying.

Also, this does not seem to affect the Dell Precision Workstation 360.

going gentle into that good night

August 23rd, 2004

Yet another drive in yet another RAID array has failed on me. This one is less important, though, so we’re not replacing it. Also, $300 for a 78 GB drive is just crazy. We’re going to let the hot spare rebuild the array, and the plan is that when the next disk fails, we’ll shrink the container. The analogy is that instead of replacing the flat tires on this automobile, we’re going to turn it into a tricycle, then a bicycle, then a unicycle, and finally a no-cycle.

still fighting the good fight

August 19th, 2004

Yesterday, I swapped around yet another spare drive in my RAID enclosure, hoping, on the advice of the Adaptec techs, that this would stop my irritating runs-for-24-hours-and-then-dies problem.

It did not. Now I’m left with the conclusion that I’m facing a situation like the one described in Adaptec knowledge base article #2328, which tells me I need to back up my RAID, format all the drives, rebuild the array, and restore the data from backup. Does anyone have 1.4 terabytes of storage they can lend me? Next year, this will not be a problem, because we’ll all be running around with USB keychains with that much space. Right now, though, it’s problematic.