Intel Matrix Storage Manager (RAID) drivers are perfect!

I am using Intel’s ICH9R RAID controller, integrated on my server’s motherboard, for a RAID5 configuration of three disks. Since I don’t need huge performance, this hardware is enough for me, assuming the driver is bug-free. Now, one would think that Intel is a serious company, that their RAID controller is good and that their support is great, right? Wrong.

First, let me mention that this isn’t my first problem with this combination, although last time I was using it for RAID 0+1 with four disks. It corrupted the array when I was using Apple’s software such as iTunes. Intel’s support denied any problem with their drivers. They were perfect, they said. After a while Intel silently acknowledged the problem and fixed the drivers.

Fast forward to the present: my Windows 2003 R2 server went BSOD (it wasn’t clear why at the time). I restarted it, and immediately after login the RAID software kicked in and started rebuilding the array. That happens sometimes when the OS is reset like this. After the rebuild finished, it notified me that one of the drives in the array had failed and that the array was running in degraded mode. In other words, if another disk fails, say goodbye to the data. I also checked the minidump produced by the BSOD and, go figure:

CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
BUGCHECK_STR: 0x8086
PROCESS_NAME: Idle
CURRENT_IRQL: 2
LAST_CONTROL_TRANSFER: from f7b4a79a to 80827c3e
STACK_TEXT:
8089a444 f7b4a79a 00008086 8b26b1c0 8acf5340 nt!KeBugCheck+0x14
WARNING: Stack unwind information not available. Following frames may be wrong.
8089a45c f7b4adcd 0000001b 8089a500 f7b13c6f iaStor+0x3b79a
8089a468 f7b13c6f 8b266000 808722e0 8b26b1c0 iaStor+0x3bdcd
8089a500 f7b14905 8b26b1c0 8b392878 ffdffa40 iaStor+0x4c6f
8089a598 f7b511e3 8b26b1c0 00000000 8089a600 iaStor+0x5905
8089a5a8 808320f0 8b266728 8b266000 00000000 iaStor+0x421e3
8089a600 8088de1f 00000000 0000000e 00000000 nt!KiRetireDpcList+0xca
8089a604 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x37

I am not a great kernel debugging guy, but it looks obvious to me that the BSOD was actually caused by iaStor, which is the Intel Matrix Storage Manager driver: the frames directly beneath nt!KeBugCheck all belong to it. It looks like a drive failed and the RAID driver managed to crash my OS into a BSOD instead of just marking the drive as failed (as it did later). So I contacted Intel’s support once again. The conversation went something like this (short version):
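
That reading of the stack can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a substitute for a real debugger: the helper names parse_frames and bugcheck_caller are made up for this example, the frames are copied from the dump above, and the dump itself warns that the frames after the first one may be wrong because unwind information is missing.

```python
import re

# Frames copied from the STACK_TEXT section of the minidump above.
STACK_TEXT = """\
8089a444 f7b4a79a 00008086 8b26b1c0 8acf5340 nt!KeBugCheck+0x14
8089a45c f7b4adcd 0000001b 8089a500 f7b13c6f iaStor+0x3b79a
8089a468 f7b13c6f 8b266000 808722e0 8b26b1c0 iaStor+0x3bdcd
8089a500 f7b14905 8b26b1c0 8b392878 ffdffa40 iaStor+0x4c6f
8089a598 f7b511e3 8b26b1c0 00000000 8089a600 iaStor+0x5905
8089a5a8 808320f0 8b266728 8b266000 00000000 iaStor+0x421e3
8089a600 8088de1f 00000000 0000000e 00000000 nt!KiRetireDpcList+0xca
8089a604 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x37
"""

# Matches the trailing "module!symbol+0xoffset" or "module+0xoffset" part
# of a frame line. The symbol is absent when no symbols were loaded.
FRAME_RE = re.compile(r"(?P<module>\w+)(?:!(?P<symbol>\w+))?\+0x[0-9a-fA-F]+$")


def parse_frames(stack_text):
    """Return a list of (module, symbol_or_None) tuples, top frame first."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line.strip())
        if match:
            frames.append((match.group("module"), match.group("symbol")))
    return frames


def bugcheck_caller(frames):
    """Return the module of the frame directly below nt!KeBugCheck, if any."""
    for index, (module, symbol) in enumerate(frames):
        if module == "nt" and symbol == "KeBugCheck" and index + 1 < len(frames):
            return frames[index + 1][0]
    return None


if __name__ == "__main__":
    frames = parse_frames(STACK_TEXT)
    print("module that called KeBugCheck:", bugcheck_caller(frames))
```

The sketch only looks at which module sits directly under the bug check call. It says nothing about why the driver decided to bug check, which is exactly the question the minidump should have raised with an engineer.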

Me: “Your driver BSODed my OS because of a drive failure.”
Intel support: “I am glad that you’ve found the cause of the problem, just replace the drive and the array will be restored.”
Me: “Yes, but what about the BSOD? Your driver really shouldn’t crash my OS even if a drive fails.”
IS: “These drivers have been rigorously tested and we have not experienced such an error. If the issue continues or reoccurs, please contact your motherboard manufacturer.”
Me: “Please, it wouldn’t be the first time that these ‘rigorously tested’ drivers failed miserably instead of protecting the disk content (providing the link to the earlier problem with iTunes). Now, please, send my minidump information to a serious engineer at Intel.”
IS: “Then again, we have tested the Intel(R) Matrix Storage Manager and we have not seen this. You need to keep in mind as well that we have not developed or manufactured your motherboard. Even though your board is based on our chipset and RAID controller, the integration of these varies from manufacturer to manufacturer. At this point, we can only recommend that you contact Gigabyte (author’s note: my motherboard manufacturer) for further technical assistance.”

Conclusion? It is Groundhog Day all over again. The story repeats. Intel’s drivers and hardware are perfect; not a chance that there is an error in there. And if you see an error, it can’t be theirs, since they haven’t seen it. Yet. They have practically tested all the possibilities and their stuff won’t misfire. Great job, Intel. This is support at its best – deny the problems.
Granted, there is a very slim chance that the chipset is badly integrated. But then, there is also a chance that we are not alone. If you ask me, Intel should treat a BSOD report very seriously and analyze it “rigorously”. Burying their heads in the sand won’t make the bug disappear. It just won’t.
