Yesterday I got an emergency call to fix a RHEL5 system that had taken a dive. The system was not booting and the software RAID1 containing the root filesystem had split.
I added the failed disk back into the RAID to see if it was really broken, or if it was just a gremlin. (This all happened after a massive storm.)
Once the mirror started rebuilding, the kernel began to spew out a continuous stream of these:
Mar 11 13:18:07 sparkus kernel: hdc: status timeout: status=0xd0 { Busy }
Mar 11 13:18:07 sparkus kernel: ide: failed opcode was: unknown
Mar 11 13:18:07 sparkus kernel: hdc: no DRQ after issuing MULTWRITE
Mar 11 13:18:07 sparkus kernel: ide1: reset: success
I thought “OK, that disk really is dead”. But I was wrong.
A bit of Googling suggested that “MULTWRITE” is a command unique to IDE disks, but this disk was SATA. It was also suspicious that the disk was mapped as “hdc”, not “sdc” where I would expect a SATA disk to appear on a modern kernel.
It turns out that the BIOS was set to present the SATA disks in “legacy mode”. Translation: “lie to the operating system and tell it the disks are IDE, even though they aren’t”.
I flipped the BIOS to show the disks as native SATA and brought the system back up. The disks mapped to “/dev/sdx”, the mirror started rebuilding with better throughput than before, and the kernel errors ceased.
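If you want to check another box for the same symptom, here is a rough Python sketch that scans kernel log lines for disks showing up under the old “hdX” names. The function name `legacy_ide_disks` and the regular expressions are my own assumptions for illustration, not part of any standard tool, and the sample input is just the four log lines quoted above.

```python
import re

# Captures the token between "kernel: " and the next colon,
# e.g. "hdc", "ide1" or "sda". (Pattern is an assumption based on
# the syslog lines quoted in this post.)
DEVICE_RE = re.compile(r"kernel: (\w+):")

# Disks handled by the legacy IDE driver are named hda, hdb, hdc, ...
LEGACY_DISK_RE = re.compile(r"hd[a-z]+")


def legacy_ide_disks(log_lines):
    """Return a sorted list of hdX disk names found in kernel log lines."""
    found = set()
    for line in log_lines:
        match = DEVICE_RE.search(line)
        if match and LEGACY_DISK_RE.fullmatch(match.group(1)):
            found.add(match.group(1))
    return sorted(found)


sample = [
    "Mar 11 13:18:07 sparkus kernel: hdc: status timeout: status=0xd0 { Busy }",
    "Mar 11 13:18:07 sparkus kernel: ide: failed opcode was: unknown",
    "Mar 11 13:18:07 sparkus kernel: hdc: no DRQ after issuing MULTWRITE",
    "Mar 11 13:18:07 sparkus kernel: ide1: reset: success",
]

print(legacy_ide_disks(sample))  # prints ['hdc']
```

An “hdX” name on a machine you know has SATA disks is only a hint, not proof, but it is a good reason to go and look at the BIOS setting before pulling the drive.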
So folks, don’t let your BIOS lie to your kernel… unless you really need to.