You have a couple of options, which may or may not fit
your needs.
1. While the DSM does not have any alerting built in, SnapDrive
does. It may not send you an email alert when a path fails, but it will
send you other storage alerts that have more impact than a single path
failing. I do like your idea of pulling from the command line, but it
sounds pretty heavy for any environment larger than 20 hosts. What I
would do is capture the command line output, parse it for failures, and email
out only on failures, a few times a day. Don’t forget to share the script
;)
2. With Brocades you can configure an email address for the
administrator, which will get an email when a link goes down. While it won’t
show you a zoning problem that causes a path to fail, it will show you when a
link / GBIC fails.
3. With NetApp Operations Manager / DFM you can configure custom
alerts on various events on the filer. These include: HBA Port:
Offline, HBA Port: Port Error, HBA Port: Traffic High. Keep
in mind that if you already have alerting set up on all events “Critical or
worse”, NetApp has some items missing from that list. I am pretty
sure the HBA errors are not included in “critical”, just as “lun
offline” is not a critical alert either.
I think to get exactly what you want you should start with #1.
It sounds like alerting is really important in your environment (I suppose it
is to everyone), so you may want to do #2 since it is free, and I would push for
Ops Mgr as well. To Stetson’s point, the Host Attach
Kit also includes other command line utilities that may be helpful in your scripting
adventures, and it is required to be installed for any supportable NetApp
config.
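For #1, here is a rough sketch in Python of what such a script might look
like. It runs the "dsmcli path list" command Jon mentioned, keeps any output
lines that look unhealthy, and emails only when it finds some. I don't know
the exact output format of dsmcli, so the keywords in BAD_WORDS are a guess,
and the SMTP host and addresses are placeholders. Treat all of those as
assumptions to adjust after looking at real output.

```python
import smtplib
import socket
import subprocess
from email.message import EmailMessage

# Command path as quoted in this thread; adjust to the real install location.
DSMCLI = r"c:\Program Files\NetApp\mpio\dsmcli"
# ASSUMPTION: words that mark an unhealthy path. The real dsmcli output
# format is not shown in this thread, so check these against actual output.
BAD_WORDS = ("fail", "dead", "offline", "error")
# ASSUMPTION: placeholder mail settings, not real servers or addresses.
SMTP_HOST = "smtp.example.com"
MAIL_FROM = "mpio-check@example.com"
MAIL_TO = "storage-admins@example.com"


def find_bad_lines(output, bad_words=BAD_WORDS):
    """Return the output lines that mention any of the bad keywords."""
    bad = []
    for line in output.splitlines():
        lowered = line.lower()
        if any(word in lowered for word in bad_words):
            bad.append(line.strip())
    return bad


def list_paths():
    """Run 'dsmcli path list' and return everything it printed."""
    result = subprocess.run(
        [DSMCLI, "path", "list"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout + result.stderr


def send_alert(host, bad_lines):
    """Email the suspicious lines; stays silent unless it is called."""
    msg = EmailMessage()
    msg["Subject"] = f"MPIO path problem on {host}"
    msg["From"] = MAIL_FROM
    msg["To"] = MAIL_TO
    msg.set_content("\n".join(bad_lines))
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)


def main():
    """Check the paths once and send mail only if something looks wrong."""
    bad = find_bad_lines(list_paths())
    if bad:
        send_alert(socket.gethostname(), bad)
```

Calling main() from a scheduled task a few times a day on each host would
give you the "email only on failures" behavior. A keyword match like this
will not notice a path that has disappeared from the list entirely, so
comparing the number of paths against what you expect may be a worthwhile
addition.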
HTH,
Hadrian
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Webster, Stetson
Sent: Tuesday, April 08, 2008 7:29 AM
To: Jon Hill; toasters@mathworks.com
Subject: RE: MPIO and path failure
For NetApp, be sure to review the NetApp Host Utilities software
downloadable from the NOW site. If your host OS supports it, I recommend
using ALUA.
Cheers ...........
Stetson M. Webster
Onsite Professional Services Engineer
PS - North Amer. - East
NetApp
919.250.0052 Mobile
Stetson.Webster@netapp.com
www.netapp.com
From: Jon Hill [mailto:JHill@jennison.com]
Sent: Tuesday, April 08, 2008 10:23 AM
To: Webster, Stetson; toasters@mathworks.com
Subject: RE: MPIO and path failure
I should clarify that in this case the storage array was an old HP
XP-512 that we are migrating away from. But it made us realize that we
didn't have any mechanism in place to alert us when a host (on this array or on
NTAP) loses one of its paths.
The latest revs of HP's client-side MPIO application for the XP
(it's called auto path and I doubt very strongly that it would work with an
NTAP array) do have their own SNMP MIB and even an e-mail alerting
feature. I was hoping that the NTAP MPIO DSM might have something
similar.
To answer your question, we do believe the head is to blame because
we've swapped out every other component - HBAs, fibre cables, Brocade
switches. And we don't think it's a software issue because the HBA that
connects to the bad path has no comm even when configuring the boot BIOS during
the POST. However, the diagnostics dumps that the backline support
engineers generated on the head show everything as fine, so this is one of
those fairly subtle issues where the client can't communicate over one path but
every component is reporting a status of OK. Hence the interest in a
client-side utility.
I did find that I could run
"c:\Program Files\NetApp\mpio\dsmcli" path list
to generate a (barely) human-readable list of paths, but it would take some
effort to automate and I figured if the DSM is already gathering this data
then it may have another interface for retrieving it.
From: Webster, Stetson [mailto:Stetson.Webster@netapp.com]
Sent: Tuesday, April 08, 2008 10:08 AM
To: Jon Hill; toasters@mathworks.com
Subject: RE: MPIO and path failure
Paths are merely a route **TO** the filer, rather than
something **ON** the filer.
Therefore, that's an environmental issue external to the filer
(rather than something actually on the filer). So you would need to
investigate this with your switch vendor, etc.
Do you perhaps have some indication that something on the filer
failed and caused this?
Stetson
From: Jon Hill [mailto:JHill@jennison.com]
Sent: Tuesday, April 08, 2008 8:06 AM
To: toasters@mathworks.com
Subject: MPIO and path failure
We had an incident the other day where we lost one of our two paths to
storage. We noticed the path failure quite by accident, and some time after
it actually happened.
Does anyone know whether there's a way that the NTAP fibre DSM can be
configured to generate an alert (by SNMP or e-mail) whenever a path fails?