Hi Nicholas,
Some of the things we monitor:
- Replication lag of SnapMirror and SnapVault relationships
- Status of HBAs and cluster interconnects
- Redundancy of paths to disks
- The presence of temporary/cloned LUNs (e.g. during an SME/SMSQL verification they should be there, but they should get disconnected and disappear after a while unless something has gone wrong with the verify)
- Volume/LUN read, write, and total latencies
- The presence of certain snapshots (e.g. sv_nightly.0) on an SV destination volume, plus a check that the snapshot is less than 24 hours old
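As a rough sketch of the lag and snapshot-age checks above, something like the following Python would do. The function names, the HH:MM:SS lag format, and the thresholds are assumptions for illustration only; collecting the values from the filer is left out entirely.

```python
from datetime import datetime, timedelta


def parse_lag(lag: str) -> timedelta:
    """Parse a lag string assumed to look like 'HH:MM:SS' (hours may exceed 24)."""
    hours, minutes, seconds = (int(part) for part in lag.strip().split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def lag_ok(lag: str, max_lag: timedelta) -> bool:
    """Return True if the replication lag is within the allowed threshold."""
    return parse_lag(lag) <= max_lag


def snapshot_fresh(created: datetime, now: datetime,
                   max_age: timedelta = timedelta(hours=24)) -> bool:
    """Return True if the snapshot was created less than max_age before now."""
    age = now - created
    # A negative age means clock skew or bad input, so treat it as a failure.
    return timedelta(0) <= age < max_age


if __name__ == "__main__":
    # Hypothetical values; in practice these would come from the filer.
    print(lag_ok("00:45:10", timedelta(hours=1)))
    print(snapshot_fresh(datetime(2010, 9, 30, 1, 0), datetime(2010, 9, 30, 17, 38)))
```

Keeping the checks as pure functions means they can be wired into whatever monitoring system is already in place, with the data collection handled separately.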
Best regards,
Filip
On Thu, Sep 30, 2010 at 5:38 PM, Nicholas Bernstein nick@nicholasbernstein.com wrote:
I had an interesting question posed by a student yesterday - "aside from interfaces being up, and volumes being online, what do you typically monitor?" Some of my initial thoughts were:
- inodes being available
- ifstat on multimode vifs, to make sure all interfaces are actually being used
- I/O counts on LUNs/qtrees/volumes
- client-side nfsstat, or the equivalent for whichever protocol is in use
Anyway, I was trying to think of some of the "non-standard" things to monitor, and thought I'd put it out to the list and see what other people are typically doing.
Cheers,
Nick