I had an interesting question posed by a student yesterday - "aside from interfaces being up, and volumes being online, what do you typically monitor?" Some of my initial thoughts were:
- inodes being available
- ifstat on multimode vifs to make sure interfaces are being used
- io counts on luns/qtrees/vols/
- client side nfsstat or the equivalent for that protocol from the client side
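As a rough illustration of the inode check from the client side, here is a minimal Python sketch. It assumes the filer volume is mounted on a Unix client and that the counts reported by `os.statvfs` reflect the volume's inodes; the function names and the 90% threshold are made up for the example.

```python
import os


def inode_usage_percent(total_inodes, free_inodes):
    """Return the percentage of inodes in use, or None if no inode count is reported."""
    if total_inodes <= 0:
        return None
    return 100.0 * (total_inodes - free_inodes) / total_inodes


def check_mount_inodes(path, warn_percent=90.0):
    """Return (used_percent, is_ok) for the filesystem mounted at path.

    warn_percent is an arbitrary example threshold, not a recommendation.
    """
    st = os.statvfs(path)  # f_files = total inodes, f_ffree = free inodes
    used = inode_usage_percent(st.f_files, st.f_ffree)
    return used, (used is None or used < warn_percent)
```

Calling `check_mount_inodes("/mnt/vol1")` (a hypothetical mount point) would give the used percentage and whether it is under the threshold.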
Anyway, I was trying to think of some of the "non-standard" things to monitor, and thought I'd put it out to the list and see what other people are typically doing.
Cheers, Nick
Hi Nicholas,
Some of the stuff we monitor includes:
- Replication lag of SnapMirror and SnapVault relationships
- Status of HBAs and cluster interconnects
- Redundancy of paths to disks
- The presence of temporary/cloned LUNs (e.g. during an SME/SMSQL verification they should be there, but they should get disconnected and disappear after a while unless something has gone wrong with the verify)
- Volume/LUN read/write/total latencies
- The presence of certain snapshots (e.g. sv_nightly.0) on an SV destination volume, and that the snapshot is less than 24 hours old
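The lag and snapshot-age checks above could be sketched roughly like this in Python. It assumes the lag has already been collected as an "HH:MM:SS" string and the snapshot creation time as a `datetime`; how you get those from the filer is left out, and all function names here are hypothetical.

```python
from datetime import datetime, timedelta


def parse_lag(lag):
    """Convert an 'HH:MM:SS' lag string (hours may exceed 24) to a timedelta."""
    hours, minutes, seconds = (int(part) for part in lag.strip().split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def lag_exceeded(lag, max_lag):
    """True if the replication lag is greater than the allowed maximum."""
    return parse_lag(lag) > max_lag


def snapshot_is_fresh(created, now, max_age=timedelta(hours=24)):
    """True if the snapshot was created no more than max_age before now."""
    return timedelta(0) <= now - created <= max_age
```

For example, `lag_exceeded("26:30:00", timedelta(hours=24))` flags a relationship that is more than a day behind, and `snapshot_is_fresh` covers the "sv_nightly.0 is less than 24 hours old" case.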
Best regards, Filip
On Thu, Sep 30, 2010 at 5:38 PM, Nicholas Bernstein nick@nicholasbernstein.com wrote:
[...]
Per-volume counters like read/write_latency, read/write_data, and read/write_ops are typically good to watch, as well as CPU load. Total transactions per aggregate are useful too. Oh, and free space on the aggregate and volume.
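If you poll those counters yourself, one way to turn them into an average latency per interval is sketched below. This assumes the latency and ops counters are cumulative, so the average over an interval is the latency delta divided by the ops delta; check how your collection tool reports them before relying on it. The function name is made up for the example.

```python
def average_latency(prev_latency, prev_ops, curr_latency, curr_ops):
    """Average latency per operation between two samples of cumulative counters.

    The result is in whatever unit the latency counter uses.
    Returns 0.0 when no operations completed in the interval.
    """
    ops = curr_ops - prev_ops
    if ops <= 0:
        return 0.0
    return (curr_latency - prev_latency) / ops
```

Sampling read_latency and read_ops a minute apart and feeding both pairs in gives the average read latency for that minute rather than since boot.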
The latest NFS client package includes nfs-iostat, which can report average round-trip times. That can be helpful for diagnosing what's slow from the client's point of view.
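For a sense of where an average round-trip number can come from, here is a small Python sketch that works on one per-operation line in the style of Linux's /proc/self/mountstats. The field layout (operation count first, cumulative RTT in milliseconds as the seventh number) is an assumption on my part and may differ between kernel versions, and the sample line in the usage note is invented.

```python
def average_rtt_ms(op_line):
    """Return (operation_name, average_rtt_ms) for one per-op statistics line.

    Assumed layout after the name: ops, transmissions, timeouts, bytes sent,
    bytes received, cumulative queue ms, cumulative RTT ms, cumulative execute ms.
    """
    name, _, rest = op_line.strip().partition(":")
    fields = [int(value) for value in rest.split()]
    ops, rtt_total_ms = fields[0], fields[6]
    if ops == 0:
        return name.strip(), 0.0
    return name.strip(), rtt_total_ms / ops
```

With a made-up line such as `"READ: 200 200 0 27200 6553600 40 500 560"`, this reports an average RTT of 2.5 ms for READ.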
-Blake