We met with our NetApp team today and received the technical explanation we needed to move forward with the hardware replacement option.

As one reply already mentioned, there is a real hardware issue identified with the 32xx/62xx series, and NetApp is now working to proactively replace the parts with suspect PCM (DRAM), SAS, and IOxM chips.

Our clusters operate in active:standby mode, so we won't need downtime or risk a production failover for this fix.

Thanks

On Jan 10, 2013, at 12:29 PM, Patrick Giagnocavo <xemacs5@gmail.com> wrote:

I am only a newbie with NetApps, however have some experience with
rackmount servers as I have 2 racks' worth of them :)
A machine check exception is generated by the CPU, usually.
This Wikipedia page tells you in general what is going on:
http://en.wikipedia.org/wiki/Machine_Check_Exception
So in the original post on this thread, the 2nd core (CPU1, not CPU0)
reported the problem.
The problem was not correctable and seems to have been on the PCI
Express bus (either on the bridge chip itself, or a device connected
to it).
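For anyone who wants to see what "not correctable" means at the register level, here is a minimal Python sketch that decodes the flag bits of an x86 machine check status register (IA32_MCi_STATUS). The bit positions follow the architectural layout as I understand it; the function name and the sample value are made up for illustration and are not taken from the original post or from any NetApp log, whose format may well differ.

```python
# Architectural flag bits of an x86 IA32_MCi_STATUS register.
FLAGS = {
    63: "VAL",    # register holds a valid error record
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was NOT corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the companion MISC register is valid
    58: "ADDRV",  # the companion ADDR register is valid
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status):
    """Split a 64-bit status value into flags and error codes.

    Hypothetical helper, for illustration only.
    """
    flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "uncorrected": "UC" in flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


# Invented sample value, NOT from the original post.
sample = 0xB200000000000005
result = decode_mci_status(sample)
print(result["flags"], hex(result["mca_error_code"]))
```

The flags only say how severe the error was. Which device is actually at fault is encoded in the error codes, and those are vendor and model specific.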
You are not the only person to experience this (found via google):
https://twitter.com/nerdicwalker/status/110360608121167873
These errors require diagnosis because the error message is not specific
enough to figure out what is going on.
The only times I have seen this in my systems (non-NA) were 1) bad or
slightly incompatible RAM, easily fixed, and 2) a bad motherboard, which I
stopped using. So there is quite a range as to what can be
going on.
Hope this helps,
Patrick
PS am looking for FAS250 or so on the cheap for testing / dev work if
anyone has one.
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters