On 06 Jan 16:17, John Stoffel wrote:
"John" == John Constable <jc18@sanger.ac.uk> writes:
John> We've had some success using their API - I have Python scripts that pull out network info and latency metrics into our Graphite system. We are probably going to switch to this for our Nagios systems as we have seen SNMP time out under similar conditions.
John> I think you can get what you need from the diagnosis API, more specifically the diagnosis-alert-info, but I haven't tested it.
John> I do find that the API is much quicker than SNMP for bulk queries (all io stats for every volume, for example), and much more reliable. It's an area that rewards some time spent, IMHO.
Care to share your work with the rest of us, so we can all benefit? Or give pointers to the docs we need to read? I've got a cDOT setup which is crying out to be monitored and since our new Nagios instance might not cut it from what you say, having other options would be ideal.
Sorry for the delay - I'm trying to work out whether the combination of SDK, API and open source licences for the code means I can post it (sigh). It also needs a little documenting, but I do plan to send out details if I can.
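
For readers wondering what the Graphite end of such a script might look like, here is a minimal Python sketch. It is not the code discussed above and makes no NetApp API calls; it only formats values that have already been collected into Graphite's plaintext format (`path value timestamp`, conventionally sent to TCP port 2003). The function names, the metric prefix and the sample values are all hypothetical.

```python
import socket
import time


def graphite_lines(prefix, metrics, timestamp=None):
    """Format a {name: value} dict as Graphite plaintext lines.

    Each line has the form "<path> <value> <unix timestamp>".
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # Sort by name so the output order is stable.
    return [f"{prefix}.{name} {value} {ts}"
            for name, value in sorted(metrics.items())]


def send_to_graphite(lines, host, port=2003):
    """Send pre-formatted lines to a Graphite plaintext listener.

    host is whatever your carbon receiver is called; 2003 is the
    conventional plaintext port, but check your own setup.
    """
    payload = "\n".join(lines) + "\n"
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical values; in practice these would come from the filer's API.
    sample = {"read_latency": 1.5, "avg_latency": 2}
    for line in graphite_lines("netapp.vol1", sample, timestamp=100):
        print(line)
```

Keeping the formatting separate from the socket call means the same lines can be printed for debugging or batched up, which suits the bulk-query pattern mentioned earlier in the thread.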