Edward Rolison wrote:
Thanks for all the answers thus far. They've been helpful.
I'm bottlenecked on CPU primarily. I've checked the drives, and there's
Sorry, but I have to ask: which ONTAP version is this on your 6280? Before or after 8.2.1? 7-mode I presume. When you say you're at 90% "CPU", you're referring to the *first* column of the output of sysstat -x, under a File Sharing ("NAS") type workload, yes?
You're basically stating that you're at 850/1200 = 71% CPU. That's a respectable load on one head; you'd better be below ~30% on the partner head, or else your s**t will stop short at an HA failover.
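To make that arithmetic concrete, here is a minimal Python sketch. It assumes the 'sysstat -M' sum is simply the per-core percentages added together (12 cores, so 1200% is the ceiling) and that the load on a surviving head after a takeover is roughly the two heads' averages added up. That additive model is a simplification of mine, not how ONTAP actually accounts for CPU, and the function names are made up for illustration.

```python
def avg_core_utilization(sum_percent: float, cores: int) -> float:
    """Average utilization (0.0-1.0) from a summed per-core percentage."""
    return sum_percent / (cores * 100.0)


def failover_load(local: float, partner: float) -> float:
    """Naive estimate of the surviving head's load: both workloads added.

    Assumption: load adds linearly, which real controllers do not guarantee.
    """
    return local + partner


# Figures from this thread: a sum of about 850% across 12 CPUs.
local = avg_core_utilization(850, 12)
print(f"local head: {local:.0%}")

# Try a few hypothetical partner loads to see where the headroom runs out.
for partner in (0.20, 0.30, 0.40):
    combined = failover_load(local, partner)
    status = "over capacity" if combined > 1.0 else "fits"
    print(f"partner at {partner:.0%}: combined {combined:.0%} ({status})")
```

Under those assumptions a partner at 30% already lands right at the limit, which is where the ~30% figure comes from.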
The interesting part here is what the full output of sysstat -M looks like in detail, as rolling stats with, say, a 2.5-minute interval:
'priv set -q diag; sysstat -M 150'
no particular hotspots there - I have nice big aggregates with ~200 spindles, and they're comfortably less than 30%. Nor network - trunked 10G cards, and none are particularly 'hot'. (Nor anywhere else I'm looking).
Very good. It sounds to me like you're hitting some processing bottleneck. I have quite a bit of experience with this kind of thing, on several large 6290s that seem rather similar to yours, with the same "class" of File Sharing (NFS-dominated) workload as well. You might be badly fragmented for free space, especially if the W latency is bad but R is OK and the metadata ops (GETATTR, LOOKUP & ACCESS) are all good.
BTW the answer to your Q at the start of this thread is, of course, "it depends". It's always that answer. Always. Always. (It's not possible to give even a rough rule of thumb.)
/M
But the 'sysstat -x' cpu load is in the mid-high 90s, and the 'sysstat -M' is giving a sum of around 850% (12CPUs).
I do have particular volumes 'running hot' that correlate with latency spikes on another volume (and is getting complaints from another user group). By 'running hot' I mean '10K IOPs' and 200MB/sec read, 200MB/sec write. Whilst it's doing that, if my 'other customer' tries to use their share, they get pretty persistent 20ms+ latency (from 'stats show volume') at about 1000 iops/20MB read/sec.
I'm pretty sure this is my root cause, but my 'gut feeling' is that it shouldn't be. Hence the question.