Intel Matrix Storage Manager (RAID) drivers are perfect!

I am using Intel's ICH9R RAID controller, integrated on my server's motherboard, for a RAID 5 configuration of three disks. Since I don't need huge performance, this hardware is enough for me, assuming the driver is bug-free. Now, one would think that Intel is a serious company, their RAID controller is good and their support is great, right? Wrong.

First, let me mention that this isn't my first problem with this combination, although last time I was using it for RAID 0+1 with four disks. It corrupted the array when I was using Apple's software such as iTunes. Intel's support denied any problem with their drivers. They were perfect, they said. After a while Intel silently acknowledged the problem and fixed the drivers.

Fast forward to the present: my Windows 2003 R2 server went BSOD (it wasn't clear why at the time). I restarted it, and immediately after login the RAID software kicked in and started rebuilding the array. This happens sometimes when the OS is reset like this. After the rebuild finished, it notified me that one of the drives in the array had failed and that the array was running in degraded mode. IOW, if another disk fails, say goodbye to the data. I also checked the minidump produced by the BSOD and, go figure:

CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
BUGCHECK_STR: 0x8086
PROCESS_NAME: Idle
CURRENT_IRQL: 2
LAST_CONTROL_TRANSFER: from f7b4a79a to 80827c3e
STACK_TEXT:
8089a444 f7b4a79a 00008086 8b26b1c0 8acf5340 nt!KeBugCheck+0x14
WARNING: Stack unwind information not available. Following frames may be wrong.
8089a45c f7b4adcd 0000001b 8089a500 f7b13c6f iaStor+0x3b79a
8089a468 f7b13c6f 8b266000 808722e0 8b26b1c0 iaStor+0x3bdcd
8089a500 f7b14905 8b26b1c0 8b392878 ffdffa40 iaStor+0x4c6f
8089a598 f7b511e3 8b26b1c0 00000000 8089a600 iaStor+0x5905
8089a5a8 808320f0 8b266728 8b266000 00000000 iaStor+0x421e3
8089a600 8088de1f 00000000 0000000e 00000000 nt!KiRetireDpcList+0xca
8089a604 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x37
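
For what it's worth, the stack text can also be read mechanically. Below is a minimal Python sketch, not a real debugging tool: the helper names `modules_in_stack` and `first_non_kernel_module` are made up for this example, and it assumes every frame line consists of five hex columns followed by a `module+offset` or `module!symbol+offset` token, as in the dump above. It simply reports which module owns the frames sitting right below nt!KeBugCheck.

```python
import re

# Stack frames copied from the minidump above (the WARNING line is omitted).
STACK_TEXT = """\
8089a444 f7b4a79a 00008086 8b26b1c0 8acf5340 nt!KeBugCheck+0x14
8089a45c f7b4adcd 0000001b 8089a500 f7b13c6f iaStor+0x3b79a
8089a468 f7b13c6f 8b266000 808722e0 8b26b1c0 iaStor+0x3bdcd
8089a500 f7b14905 8b26b1c0 8b392878 ffdffa40 iaStor+0x4c6f
8089a598 f7b511e3 8b26b1c0 00000000 8089a600 iaStor+0x5905
8089a5a8 808320f0 8b266728 8b266000 00000000 iaStor+0x421e3
8089a600 8088de1f 00000000 0000000e 00000000 nt!KiRetireDpcList+0xca
8089a604 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x37
"""

# Assumed frame layout: five 8-digit hex columns, then module[!symbol]+0xOFFSET.
FRAME = re.compile(
    r"(?:[0-9a-f]{8}\s+){5}(\w+)(?:!\w+)?\+0x[0-9a-f]+$", re.IGNORECASE
)


def modules_in_stack(text):
    """Return the module name of every frame, top of the stack first."""
    modules = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            modules.append(match.group(1))
    return modules


def first_non_kernel_module(text, kernel="nt"):
    """Return the first module on the stack that is not the kernel itself."""
    for module in modules_in_stack(text):
        if module != kernel:
            return module
    return None


print(first_non_kernel_module(STACK_TEXT))
```

Keep in mind the debugger's own warning that the frames after KeBugCheck may be wrong, so this is a hint about where to look rather than proof.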

I am not a great kernel debugging guy, but it looks obvious to me that the BSOD was actually caused by iaStor, which is the Intel Matrix Storage Manager driver. It looks like a drive failed and the RAID driver managed to trash my OS into a BSOD instead of just marking the drive as failed (as it later did). So I contacted Intel's support once again. The conversation went something like this (short version):

Me: "Your driver BSODed my OS due to a drive failure."
Intel support: "I am glad that you've found the cause of the problem, just replace the drive and the array will be restored."
Me: "Yes, but what about the BSOD? Your driver really shouldn't trash my OS even if a drive fails."
IS: "These drivers have been rigorously tested and we have not experienced such an error. If the issue continues or reoccurs, please contact your motherboard manufacturer."
Me: "Please, it wouldn't be the first time that these 'rigorously tested' drivers failed miserably instead of protecting the disk content (providing the link to the earlier problem with iTunes). Now, please, send my minidump information to a serious Intel engineer."
IS: "Then again, we have tested the Intel(R) Matrix Storage Manager and we have not seen this. You need to keep in mind as well that we have not developed or manufactured your motherboard. Even though your board is based on our chipset and RAID controller, the integration of these varies from manufacturer to manufacturer. At this point, we can only recommend that you contact Gigabyte (author's note: my motherboard manufacturer) for further technical assistance."

Conclusion? It is Groundhog Day all over again. The story repeats. Intel's drivers and hardware are perfect, not a chance that there is an error in there. And if you see an error, it can't be theirs, since they haven't seen it. Yet. They have practically tested all the possibilities and their stuff won't misfire. Great job, Intel. This is support at its best - deny the problems.
Granted, there is a very slim chance that the chipset is badly integrated. Then again, there is also a chance that we are not alone. If you ask me, Intel should treat a BSOD report very seriously and analyze it "rigorously". Sticking their head in the sand won't make the bug disappear. It just won't.

My article about missing HTC TyTN II graphics drivers published on Moj Mikro

A while ago I wrote an article about the HTC TyTN II and its missing graphics drivers for the Slovene computer magazine Moj Mikro. Now the article has been published online, so check it out if you are interested in why the TyTN II's graphics performance is dismal, or in how HTC treats its loyal customers. The article is in Slovene.

Read the article here.

Update since the article was written: a new ROM has been delivered with Windows Mobile 6.1 and still no proper graphics drivers.

Windows Home Server just saved my day

[VMWare] Virtual Server 1.0.5 ... free

[MS] Windows Home Server (running under VS) ... ~$170

750GB disk used for storing backups ... ~120€

Recovery from WHS backup ... priceless

I've been running Windows Home Server for quite some time now. I use it exclusively for nightly disk image backups of all my computers. WHS' storage mechanism, which minimizes the disk space required for backups, and its speed are just amazing. Briefly: it doesn't store duplicate data - IOW, if you have two computers with the same OS, only one copy of each file will be stored for both of them, as long as the files are exactly the same.
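
The idea can be sketched in a few lines of Python. This is only an illustration of single-instance storage, not how WHS actually implements it: the `BackupStore` class and its methods are made up for the example, and it deduplicates whole files by their SHA-256 hash.

```python
import hashlib


class BackupStore:
    """Toy single-instance store: identical content is kept only once."""

    def __init__(self):
        self.blobs = {}    # content hash -> file bytes (stored once)
        self.catalog = {}  # (computer, path) -> content hash

    def backup(self, computer, path, data):
        digest = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this exact content has not been seen before.
        self.blobs.setdefault(digest, data)
        self.catalog[(computer, path)] = digest

    def restore(self, computer, path):
        return self.blobs[self.catalog[(computer, path)]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = BackupStore()
os_file = b"same OS file" * 1000

# Two computers with the same OS file, plus one file unique to the laptop.
store.backup("desktop", r"C:\Windows\system32\ntoskrnl.exe", os_file)
store.backup("laptop", r"C:\Windows\system32\ntoskrnl.exe", os_file)
store.backup("laptop", r"C:\Users\me\notes.txt", b"only on the laptop")

print(len(store.catalog), "files backed up,", len(store.blobs), "blobs stored")
```

Three files are cataloged but only two blobs take up space, and either computer can still restore its own copy of the shared file.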

Like my bicycle helmet, I hoped that I would never actually have to use it for real (to restore a computer). However, I had to restore my production machine yesterday. An odd thing happened: my Vista x86 just froze, or at least so it seemed, but the music from Winamp was still playing, so I deduced that the problem was with graphics and the OS was still running non-graphical operations. I tried to connect using Remote Desktop, without success. The remote shutdown command didn't help either. The only action remaining at that point was the hardware reset switch, which should be used as a last resort. I crossed my fingers and reset the computer. Then, during boot (which was kind of slow), Vista started checking NTFS integrity on my disk. Ouch, not a good sign. After a while I was able to log on, just to find that there was no network connection anymore and that there was a problem with my event log service: "There is a problem with Event Log service. Check event log for more details" - funny, isn't it. A Catch-22 by all means. At this point I was left with three options:

  1. Format the disk and reinstall
  2. Try to repair Vista
  3. Restore from most recent nightly backup

Option 1 is not as bad as it seems. Reinstalling from scratch from time to time isn't a bad idea after all - a lot of mess gets cleaned up. The downside is the amount of time required - a day or two at least, plus constant attention.

Option 2 is trickier. With problems like the ones I was experiencing, a repair is doomed to fail.

So I opted for the most appealing one, option 3. After a few clicks on the "Next" button, 10 hours and 400GB of files restored, my computer was working just as it had been on the night the backup was taken. The restore was straightforward and relatively quick - copying 400GB takes time regardless of how you do it (my restore was doing something like >11MB/s). So, big kudos to Windows Home Server - it spared me a lot of work and time.
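
The throughput figure is simple arithmetic. Here is a quick check in Python, assuming 1 GB = 1024 MB and that the whole 10 hours was spent copying (the function name is just for this example):

```python
def average_throughput_mb_s(gigabytes, hours, mb_per_gb=1024):
    """Average transfer rate in MB/s for a given amount of data and duration."""
    return gigabytes * mb_per_gb / (hours * 3600)


rate = average_throughput_mb_s(400, 10)
print(f"{rate:.1f} MB/s")  # prints "11.4 MB/s"
```

With decimal gigabytes (1 GB = 1000 MB) the result is about 11.1 MB/s, so either way it comes out a bit above 11 MB/s.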

One mystery remains though. My BIOS was somehow downgraded (sometime before the restore) - it was version F10 (I upgrade it from time to time) and after the computer crashed it was F3. The truth is that I am not sure when it was downgraded; this is the first time I noticed it (I noticed because F3 misspells the word RAID as RIAD).

Bottom line: always have a backup handy.