In sysstat you can't say definitively, but you can see evidence of read/write amplification by comparing network bytes in/out to disk writes/reads. There are other reasons for these not to match, though.
Instead, go to the misaligned read stats. You could look at specific counters, but to be honest it's a pain, and NetApp nicely added it to nfsstat for us (presuming 7.3.5+ or 8.0.2+).
nfsstat
Anything not in bin0 on the misaligned read/write stats is misaligned.
nfsstat -d
Will give you the names of the top files (vmdks).
What do you see there?
Colin Bieberstein
How can you tell this from sysstat output?
From: Jeff Mohler [mailto:speedtoys.racing@gmail.com]
Sent: Thursday, 26 September 2013 3:39 PM
To: Andreev, Nikita
Cc: Colin Bieberstein; toasters@teaparty.net
Subject: Re: FAS2050 high CPU usage
Yup, misalignment is your problem.
Some try to talk it away as minor, it's not.
Tons of overhead in that output...all misaligned IO.
On Wed, Sep 25, 2013 at 10:16 PM, Andreev, Nikita <Nikita.Andreev@visionstream.com.au> wrote:
Each controller has one aggregate of FC 15k drives. Controller1 is using 20 internal drives and Controller2 has a shelf of 24 drives connected to it. But the problem is not with the disks, because disk busy is between 30% and 50%.
We do use deduplication on all of the VMware volumes. That was my first idea. But I tried to disable all deduplication at once and didn’t observe any significant difference in CPU usage.
Compression is not used.
SnapMirror is done overnight and doesn’t impact production during business hours. We don’t use SnapVault.
Here is an excerpt from sysstat -x 5 output:
CPU    NFS  CIFS  HTTP  Total    Net   kB/s   Disk   kB/s  Tape  kB/s Cache Cache    CP  CP  Disk  FCP iSCSI   FCP  kB/s iSCSI  kB/s
                                  in    out   read  write  read write   age   hit  time  ty  util               in   out    in   out
97%   1392     0     0   1392  22597   6044  18172  32302     0     0    5s   97%   50%  Fs   23%    0     0     0     0     0     0
94%   2024     0     0   2024  14507   8039  20251  18952     0     0    4s   95%   61%  Ff   13%    0     0     0     0     0     0
98%   2125     0     0   2125  20336  10821  18652  31241     0     0    5s   98%   85%  Ff   19%    0     0     0     0     0     0
99%   1801     0     0   1801  31365  14119  20186  31352     0     0    5s   96%   60%  Ff   13%    0     0     0     0     0     0
97%   1396     0     0   1396  27938  14741  24378  29044     0     0    3s   97%   70%   F   20%    0     0     0     0     0     0
99%   1644     0     0   1644  24698  20634  26610  29229     0     0     1   95%   70%  Fn   17%    0     0     0     0     0     0
100%  1337     0     0   1337  28669  19982  20661  42895     0     0    3s   96%   91%  Ff   21%    0     0     0     0     0     0
95%   1082     0     0   1082  21795  11085  25374  34816     0     0    2s   95%   77%  Ff   17%    0     0     0     0     0     0
99%    982     0     0    982  31760  15161  25116  42265     0     0    2s   97%   73%  Ff   19%    0     0     0     0     0     0
96%   1096     0     0   1096  21200   5391  15923  28497     0     0    2s   98%   58%  Ff   27%    0     0     0     0     0     0
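As a rough way to read the excerpt above, the disk kB/s columns can be set against the net kB/s columns, which is the comparison suggested elsewhere in this thread. The Python sketch below does that over the ten samples shown. The tuple layout and the function name `amplification` are only illustrative, and a ratio above 1 is a hint rather than proof, since other back-end activity can also make disk and network traffic diverge.

```python
# (net_in, net_out, disk_read, disk_write) in kB/s, copied from the
# sysstat -x 5 excerpt above. The tuple layout is an assumption made
# for this sketch, not something sysstat produces.
SAMPLES = [
    (22597, 6044, 18172, 32302),
    (14507, 8039, 20251, 18952),
    (20336, 10821, 18652, 31241),
    (31365, 14119, 20186, 31352),
    (27938, 14741, 24378, 29044),
    (24698, 20634, 26610, 29229),
    (28669, 19982, 20661, 42895),
    (21795, 11085, 25374, 34816),
    (31760, 15161, 25116, 42265),
    (21200, 5391, 15923, 28497),
]


def amplification(samples):
    """Return (disk_read / net_out, disk_write / net_in) over all samples."""
    net_in = sum(s[0] for s in samples)
    net_out = sum(s[1] for s in samples)
    disk_read = sum(s[2] for s in samples)
    disk_write = sum(s[3] for s in samples)
    return disk_read / net_out, disk_write / net_in


read_ratio, write_ratio = amplification(SAMPLES)
print(f"disk read / net out : {read_ratio:.2f}")
print(f"disk write / net in : {write_ratio:.2f}")
```

Summing over the whole window rather than dividing row by row smooths out consistency-point bursts, where disk writes land in a different interval from the network traffic that caused them.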
sysstat -m is not supported on the FAS2050, because it’s single core.
Aggregates are more than 70% free.
We do have misaligned VMs. But I don’t think that the amount of misaligned operations is more than 10-20%. I’ll collect detailed statistics tomorrow and report back.
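For turning the nfsstat misaligned read/write bins into a percentage like the 10-20% estimate above, a minimal Python sketch follows. It relies only on the rule stated in this thread that anything not in bin0 is misaligned. The function name `misaligned_percent` and the bin counts are made up for illustration and are not real output from this filer.

```python
def misaligned_percent(bins):
    """Percentage of operations that fall outside bin0.

    `bins` is a list of per-bin operation counts with bin0 first.
    """
    total = sum(bins)
    if total == 0:
        return 0.0
    return 100.0 * (total - bins[0]) / total


# Hypothetical counts, NOT taken from this system: bin0 first.
example_bins = [800, 50, 30, 20, 40, 10, 30, 20]
print(f"misaligned: {misaligned_percent(example_bins):.1f}%")
```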
Regards,
Nikita Andreev | Systems Engineer (Contract)
Visionstream IT Infrastructure Team
236 East Boundary Road, 2 North Drive
Virginia Park, Bentleigh East VIC 3165
From: Colin Bieberstein [mailto:colin@bieberstein.ca]
Sent: Thursday, 26 September 2013 12:09 PM
To: Andreev, Nikita
Cc: toasters@teaparty.net
Subject: Re: FAS2050 high CPU usage
What sort of aggregates do you have behind your datastores? (aggr status -r or sysconfig -v) Is it, for instance, using only the internal SATA disks? A minimum config was 10 drives iirc. That would easily limit your performance down to where you describe.
Otherwise the common CPU culprits are: deduplication jobs, compression jobs, and inline compression. Then SnapVault and SnapMirror activity. Given that it's a branch office... How many replication jobs have you got running? These won't show up in your protocol ops/second, but they will absolutely drag performance down with their I/O.
If not, what are you seeing with a sysstat -x 1 and a sysstat -m 1?
Have you filled the aggregates and/or volumes up past 90%?
Do you have misaligned VMs?
That's where I'd start looking... Without specifics it's hard to point you at a cause, but your 2050 can deliver a lot more than the 1000 IOPS you see with a couple shelves... I believe that you max out with 1 loop of six shelves on that controller but it might have been 4 shelves.
Colin Bieberstein
On Sep 25, 2013, at 6:42 PM, "Andreev, Nikita" <Nikita.Andreev@visionstream.com.au> wrote:
Hi All,
We’re using a FAS2050 in one of our branch offices to run a VMware cluster over NFS. It turns out that this box runs at 100% CPU usage even when servicing 40 MB/s at 1000 IOPS (the max I’ve seen was 70 MB/s), which results in ridiculous latencies.
I do realise that it’s a Celeron CPU. I just want to double check with you guys, that it’s something you’d expect from this box. Because these days 40MB/s seems to be too little even as a CIFS file server for a small team in a branch office.
Regards,
Nikita
This email is intended for the named recipient only. The information contained in this message may be confidential, or commercially sensitive. If you are not the intended recipient you must not reproduce or distribute any part of the email, disclose its contents to any other party, or take any action in reliance on it. If you have received this email in error, please contact the sender immediately and please delete this message completely from any systems. Confidentiality and legal privilege are not waived or lost by reason of mistaken delivery to you.
______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
_____________________________________________________________________________________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
--
---
Gustatus Similis Pullus