Hello All,
We're rolling out a pair of CDOT clusters and setting up Nagios monitoring.
On our pre-CDOT clusters we check the global health of the filers using SNMP and the miscGlobalStatusMessage OID. It returns along the lines of: NETAPP-MIB::miscGlobalStatusMessage.0 = STRING: The system's global status is normal.
In 8.3, even though the MIB that NetApp publishes for CDOT includes this OID, an SNMP query returns: NETAPP-MIB::miscGlobalStatusMessage = No Such Object available on this agent at this OID
Anybody got any guidance?
And if we're out of luck via SNMP, any ideas for an equivalent shell command I can pass via SSH?
Hope to hear from you,
Randy in Seattle
Hi,
I couldn't get the 7-Mode SNMP-based checks to run against cDOT filers; they simply don't work. So I turned to the NetApp SDK. I found some ready-to-use scripts on GitHub and was particularly inspired by the following two:
https://github.com/aleex42/netapp-cdot-nagios
https://github.com/willemdh/check_netapp_ontap
Take a look, a lot of aspects are already covered.
However, I took it further and wrote my own scripts (I'm not a Perl dev by any means, so pardon my Perl-fu) and started publishing them on GitHub:
https://github.com/sl0n/netapp
It turned out that cDOT 8.2 and 8.3 (and maybe earlier versions as well) don't report a broken PSU on a *disk shelf* (as opposed to the controller itself) at all; that's how my first script came about. Stay tuned, I'm about to publish more scripts that I've already tested in a production environment. Patches are welcome =)
Cheers,
On Tue, Jun 9, 2015 at 7:07 PM, Rue, Randy rrue@fredhutch.org wrote:
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Folks,
This may not be a direct, or even helpful, response to this thread, but OCUM (free) offers quite a bit of functionality and speaks SNMP, or can relay events through other protocols should you so wish.
Many Thanks
Anders Ljungberg Sr.Director Enterprise Transformation and Operations NetApp +14084821148
+447730437939 anders@netapp.com
-----Original Message----- From: Momonth momonth@gmail.com Reply-To: Momonth momonth@gmail.com Date: Tuesday, 23 June 2015 14:45 To: "Rue, Randy" rrue@fredhutch.org Cc: "toasters@teaparty.net" toasters@teaparty.net Subject: Re: check Global Status of a CDOT cluster using SNMP?
The only issue with OCUM is that events are only collected every 15 minutes. As for SNMP on cDOT, I tested a few of them and they worked (failovers, reboots, etc.). I'll try to unplug something on one of the new boxes and make sure that works too.
Original Message From: Ljungberg, Anders Sent: Tuesday, June 23, 2015 7:14 PM To: Momonth; Rue, Randy Cc: toasters@teaparty.net Subject: Re: check Global Status of a CDOT cluster using SNMP?
Attaching the OCUM Install and Setup guide if anyone wants to give it a go...
Many Thanks
Anders Ljungberg Sr.Director Enterprise Transformation and Operations NetApp +14084821148
+447730437939 anders@netapp.com
-----Original Message----- From: "basilberntsen@gmail.com" basilberntsen@gmail.com Date: Tuesday, 23 June 2015 16:28 To: Anders Ljungberg anders.ljungberg@netapp.com, Momonth momonth@gmail.com, "Rue, Randy" rrue@fredhutch.org Cc: "toasters@teaparty.net" toasters@teaparty.net Subject: Re: check Global Status of a CDOT cluster using SNMP?
When I went back to see how we did this for our 7-Mode filers, I was reminded that we had many of the same problems there, especially when setting up monitoring via Nagios and feeding performance metrics to our roll-your-own dashboards. There were and are some Perl-based Nagios plug-ins available that use the (Perl-based) NetApp API, but we ended up writing some basic Python scripts that use SNMP if a metric is available or a direct SSH call if not. I confess they're rough, with embedded community strings, host names and the like. If I can get time to clean them up, I'll see about putting them up somewhere.
For global health, we ended up with a Nagios plugin that just calls the CLI command "system health status show" and squawks if the output is NOT "ok". Sadly, for CDOT 8.3 that's all the output you get.
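A check along those lines could be sketched in Python roughly as follows. This is not the plugin described above, only an illustration. The `nagios` SSH user, the `check_cdot_health.py` script name and the assumed output layout (a "Status" header, a dashed separator, then the status word on its own line) are all assumptions to verify against your own cluster.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_health(output):
    """Map the CLI output to an (exit_code, message) pair.

    Assumption: the status word sits on its own line below a "Status"
    header and a dashed separator. Adjust if your release prints more.
    """
    words = []
    for line in output.splitlines():
        text = line.strip().lower()
        # Skip blank lines, the header and the dashed separator.
        if not text or text.startswith("status") or set(text) == {"-"}:
            continue
        words.append(text)
    if not words:
        return UNKNOWN, "UNKNOWN: no status in command output"
    status = words[-1]
    if status == "ok":
        return OK, "OK: system health status is ok"
    return CRITICAL, "CRITICAL: system health status is " + status


def check(host, user="nagios"):
    """Run the command over SSH (hypothetical user, key-based login assumed)."""
    command = ["ssh", "-o", "BatchMode=yes", f"{user}@{host}",
               "system health status show"]
    try:
        result = subprocess.run(command, capture_output=True,
                                text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, f"UNKNOWN: ssh failed: {exc}"
    if result.returncode != 0:
        return UNKNOWN, f"UNKNOWN: ssh exited with {result.returncode}"
    return parse_health(result.stdout)


def main(argv):
    """Entry point: print one status line and return the Nagios exit code."""
    if len(argv) != 2:
        print("usage: check_cdot_health.py <cluster-mgmt-host>")
        return UNKNOWN
    code, message = check(argv[1])
    print(message)
    return code
```

To use it as a plugin, end the script with `sys.exit(main(sys.argv))`. Anything other than "ok" is treated as CRITICAL here, since the command gives no further detail to grade on.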
Right now I'm trying to track an overall metric for latency/service time. I'm calling the CLI command "qos statistics latency show -iterations 1", but so far I only get 0, even if I measure in microseconds. I suspect that the cluster as a whole is always going to show a low average.
On the old filer we tracked latency for all volumes using the stats show command, specifically the volume:avg_latency counter. The graph on our dashboard showed the average latency across all volumes as well as the worst single volume at any time. Does anybody have any guidance on getting this information scriptomatically for 8.3?
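The dashboard numbers themselves (average across volumes plus the worst single volume) are simple to derive once per-volume latencies are in hand. The sketch below covers only that aggregation step. How to collect the per-volume values on 8.3 is the open question, so the input mapping and its microsecond unit are assumptions.

```python
def summarize_latency(latency_us):
    """Return (mean, worst_volume, worst_value) for a {volume: latency} mapping.

    The values are assumed to be per-volume average latencies in
    microseconds, gathered by whatever means works on your release.
    """
    if not latency_us:
        raise ValueError("no volumes to summarize")
    mean = sum(latency_us.values()) / len(latency_us)
    # Pick the volume whose latency is highest.
    worst = max(latency_us, key=latency_us.get)
    return mean, worst, latency_us[worst]
```

Tracking the worst volume next to the mean matters because a single slow volume barely moves an average taken over many idle ones.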
Randy
-----Original Message----- From: Ljungberg, Anders [mailto:Anders.Ljungberg@netapp.com] Sent: Tuesday, June 23, 2015 4:39 PM To: basilberntsen@gmail.com; Momonth; Rue, Randy Cc: toasters@teaparty.net Subject: Re: check Global Status of a CDOT cluster using SNMP?
This is from a 7-Mode Zenoss plugin I created. Did you try these in C-mode?
# System status
## Global status of 3 is normal
GlobalStatusoid=.1.3.6.1.4.1.789.1.2.2.4.0
## Interconnect status of 4 is up
InterconnectStatusoid=.1.3.6.1.4.1.789.1.2.3.8.0
# Partner status of 2 is okay
PartnerStatusoid=.1.3.6.1.4.1.789.1.2.3.4.0
# Failover state of 2 is good. Means ready to take over if needed, but active-active
FailoverStateoid=1.3.6.1.4.1.789.1.2.3.2.0
# NVRAM status of 1 means good
NvramBatteryStatusoid=.1.3.6.1.4.1.789.1.2.5.1.0
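One quick way to try these against a cluster is to fetch each OID and compare it with the "good" value from the comments above. The sketch below assumes net-snmp's `snmpget` is installed and that SNMP v2c with a community string is in use. The expected values are the 7-Mode ones listed above, and whether a cDOT agent answers these OIDs at all is exactly what is in question.

```python
import subprocess

# OID -> (label, expected value), taken from the list above. The leading
# dot on the failover OID is added here only for consistency.
EXPECTED = {
    ".1.3.6.1.4.1.789.1.2.2.4.0": ("global status", "3"),
    ".1.3.6.1.4.1.789.1.2.3.8.0": ("interconnect status", "4"),
    ".1.3.6.1.4.1.789.1.2.3.4.0": ("partner status", "2"),
    ".1.3.6.1.4.1.789.1.2.3.2.0": ("failover state", "2"),
    ".1.3.6.1.4.1.789.1.2.5.1.0": ("NVRAM battery status", "1"),
}


def snmp_get(host, community, oid):
    """Fetch one value with snmpget.

    -Oqv prints only the value; the extra "e" keeps enumerations numeric
    even when the NetApp MIB is loaded.
    """
    result = subprocess.run(
        ["snmpget", "-v2c", "-c", community, "-Oqve", host, oid],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout.strip()


def evaluate(values):
    """Return a list of problem strings for a {oid: value} mapping."""
    problems = []
    for oid, (label, expected) in EXPECTED.items():
        actual = values.get(oid)
        if actual != expected:
            problems.append(f"{label}: expected {expected}, got {actual!r}")
    return problems


def check_host(host, community):
    """Query every OID on one host and report the mismatches."""
    values = {oid: snmp_get(host, community, oid) for oid in EXPECTED}
    return evaluate(values)
```

An answer such as "No Such Object available on this agent at this OID" comes back as the value text, so an unsupported OID shows up as a mismatch and is not silently passed.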
-----Original Message----- From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Rue, Randy Sent: Monday, June 29, 2015 5:29 PM To: 'toasters@teaparty.net' Subject: RE: check Global Status of a CDOT cluster using SNMP?