Greetings,
Has anyone who upgraded from any version of 7.3.3 to a 7.3.5 variant seen any behavioral differences? I also upgraded the diagnostics firmware from 5.5 to 5.6.1.
I upgraded two clusters from 7.3.3P5 to 7.3.5.1P4 at the first of the year. One cluster is a 6040 and the other is a 6080. Both have similar layouts of disk space and usage: CIFS, NFS, PAM II cards, 10G networking, etc.
After the upgrade, the 6040 cluster is behaving normally. For the 6080 cluster, however, our NMS monitoring server immediately started reporting SNMP polling timeouts: over 130 errors for each head within the first two days, versus zero for the entire month of December. The 6080 cluster serves user home directories via NFS, and I see periodic "hiccups" in response on my workstation, as have other users. One of the filer heads also panicked due to a failed SAS controller a couple of weeks after the upgrade. During that failover period there were many more complaints about unacceptable performance, until I got the replacement part and failed things back.
I was able to pull a performance graph for both filers covering the first of December through the end of January. The data came from SNMP queries of the filers. It clearly shows that baseline CPU utilization was ~40% before the upgrade and ~60% after. That would also account for the poor performance during the cluster failover: with each head running at 60%, a single head can't carry the load for both without degrading performance.
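For anyone wanting to put a number on that baseline shift from their own NMS poll history, something like the following works (a hedged sketch: the log filename, its format, and the OID mentioned in the comment are assumptions, not details from Jeff's setup):

```shell
# Hedged sketch: average the CPU values the NMS polled in December
# (pre-upgrade) vs January (post-upgrade). Assumed log format is
# one "YYYY-MM-DD <cpu%>" pair per line in cpu_poll.log.
# The raw values could come from something like
#   snmpget -v2c -c public filer 1.3.6.1.4.1.789.1.2.1.3.0
# (cpuBusyTimePerCent in the NetApp MIB -- OID from memory, verify
# against your copy of netapp.mib).
awk '$1 ~ /^2011-12/ { dsum += $2; dn++ }
     $1 ~ /^2012-01/ { jsum += $2; jn++ }
     END { printf "Dec baseline: %.1f%%\nJan baseline: %.1f%%\n",
           dsum / dn, jsum / jn }' cpu_poll.log
```

A persistent gap like 40% vs 60% in those two averages, with no workload change, is the kind of evidence worth attaching to a support case.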
Through all of this, nothing changed but the OS and the diagnostics firmware. I reinstalled the OS again this weekend just to make sure the installation was clean; there has been no change. I have a case open with NetApp, but they are not seeing anything wrong and are unable to explain what is driving the CPU higher, since everything looks acceptable to them. Unfortunately I did not do a perfstat collection prior to the upgrade, so there isn't anything to compare against.
I did find that the nfs.mountd.trace option became very verbose after the upgrade (prior post on that) and had to be turned off. I'm wondering if there are any other issues anyone else may have found after this type of upgrade?
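For anyone hitting the same mountd noise, flipping the option back off is done with the standard 7-mode options command on the filer console (nothing version-specific here, just the usual syntax):

```shell
# On the filer console (7-mode): check the current value, then
# disable the verbose NFS mount tracing.
options nfs.mountd.trace        # prints the current setting
options nfs.mountd.trace off    # turn it off
```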
Thanks,
Jeff
Are your filers configured to send weekly performance stats? If so, the content of those autosupport e-mails is kept under /etc/log/autosupport for some time, and you may still find them there. Even if they have already been purged, NetApp keeps them for months if not years. Also, depending on the size of the root volume, detailed performance archives may still be available (although those are purged much more frequently).
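As a concrete starting point, something like this could turn up any retained copies (a hedged sketch: the mount point and the filename pattern are assumptions, and autosupport payload names vary by ONTAP version):

```shell
# Hedged sketch: with the filer's root volume NFS-mounted on an
# admin host, list anything performance-related that autosupport
# has kept around. FILER_ROOT and the "perf" pattern are guesses;
# adjust both for your environment.
root=${FILER_ROOT:-/mnt/filer_root}
ls -1 "$root/etc/log/autosupport" 2>/dev/null | grep -i perf
```

If nothing turns up locally, asking the NetApp case owner to pull the archived weekly asups from their side is the fallback.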
________________________________________
From: toasters-bounces@teaparty.net [toasters-bounces@teaparty.net] On Behalf Of Jeff Cleverley [jeff.cleverley@avagotech.com]
Sent: Tuesday, February 07, 2012 01:28
To: toasters@teaparty.net
Subject: Filer behavior differences between 7.3.3 and 7.3.5.
--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters