We have experienced an issue with one of our servers.
This IBM x3650 M2, installed with Windows Server 2008 R2 x64, runs Hyper-V with multiple virtual machines. It is also connected to an EMC NS-960 FC SAN, from which it mounts the volumes where the virtual hard disks are located.
1- We noticed that when one of the virtual machines is under a certain load, the entire IBM server crashes with a blue screen showing “IRQL_NOT_LESS_OR_EQUAL” and STOP error “0x0000000A”.
Figure 1: Stop Error
2- That is not the only strange behavior. When the memory dump that follows the error finishes and the server restarts, it spends a long time trying to load Windows, in vain. We have to physically go to the datacenter, power off the server, unplug the FC cables, and start it up again for the OS to load correctly. We then plug the FC cables back in once the OS is completely loaded.
You can imagine the inconvenience, especially when important services are running on the virtual machines!
1- Looking through different knowledge bases, I found that Microsoft discusses the issue of Windows Server 2008 R2 experiencing stop error 0x0000000A and has published a hotfix.
The following article may be interesting for you to read:
Article ID: 979903 – "STOP 0x0000000A" Stop error when there is a request to allocate a large amount of contiguous physical memory in Windows Server 2008 R2 or Windows 7.
The article explains that the issue is due to the Memory Manager struggling with database scans.
2- I also found another article on the second behavior, and Microsoft has published a hotfix for that one too.
Article ID: 979374 – The system becomes unbootable after you add raw disks to a Windows Server 2008 R2-based computer that has EFI enabled
The article explains that because our server runs UEFI and mounts external storage volumes from the SAN, the system can become unbootable. I just wonder why this issue did not appear before, as we had been running that server for almost a year without experiencing it.
I would also recommend this link, which discusses the same behavior on IBM servers with external storage: http://sites.google.com/site/virtualkb/accueil/hardware/localoperatingsystembootfailswhenexternalstorageisattached-ibmsystemx3550m2x3650m2andbladecenterhs22
The issue may occur on the following series:
BladeCenter HS22, Type 1936, any model
BladeCenter HS22, Type 7870, any model
System x3550 M2, Type 4198, any model
System x3550 M2, Type 7946, any model
System x3650 M2, Type 4199, any model
System x3650 M2, Type 7947, any model
So I understood that we were simply facing two different issues at the same time.
Did the hotfixes…fix the issues?
We successfully downloaded and applied the two hotfixes one month ago. Since then, we haven’t experienced any of the previous behaviors!
Note that KB 979374 proposes two methods: one for a server that is already running with the OS installed, and another to be followed when installing the server (it involves integrating the hotfixes into your install media, and you also have to add one more hotfix, 975535).
The lesson: if you have a server built on EFI and may have to add external storage volumes, I would recommend you follow the second method before you install your OS!
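If you want to double-check that both hotfixes are really present on a running server, here is a minimal Python sketch. It is my own illustration, not something taken from the KB articles: it assumes Python 3.7 or later is available on the host and that the hotfixes show up under the IDs KB979903 and KB979374 in the output of `wmic qfe get HotFixID`. The function names are hypothetical helpers.

```python
import subprocess

# Hotfix IDs taken from the two KB articles discussed in this post.
# Assumption: they are listed under these exact IDs once installed.
REQUIRED_HOTFIXES = ("KB979903", "KB979374")


def parse_hotfix_ids(wmic_output):
    """Return the set of KB identifiers found in `wmic qfe get HotFixID` output."""
    ids = set()
    for line in wmic_output.splitlines():
        token = line.strip().upper()
        # Skip the "HotFixID" header line and any blank lines.
        if token.startswith("KB"):
            ids.add(token)
    return ids


def missing_hotfixes(installed, required=REQUIRED_HOTFIXES):
    """Return the required hotfixes absent from the installed set, in order."""
    return [kb for kb in required if kb not in installed]


def query_installed_hotfixes():
    """Ask Windows for the installed updates (needs a Windows host with wmic)."""
    result = subprocess.run(
        ["wmic", "qfe", "get", "HotFixID"],
        capture_output=True, text=True, check=True,
    )
    return parse_hotfix_ids(result.stdout)
```

On the server itself, `missing_hotfixes(query_installed_hotfixes())` should return an empty list once both hotfixes are applied. The parsing is kept separate from the `wmic` call so it can be tried against saved output on any machine.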