Yup, misalignment is your problem.

Some try to talk it away as minor, but it's not.

Tons of overhead in that output...all misaligned IO.




On Wed, Sep 25, 2013 at 10:16 PM, Andreev, Nikita <Nikita.Andreev@visionstream.com.au> wrote:

Each controller has one aggregate of FC 15k drives. Controller1 is using 20 internal drives and Controller2 has a shelf of 24 drives connected to it. But the problem is not with the disks, because disk busy is between 30% and 50%.

 

We do use deduplication on all of the VMware volumes. That was my first idea. But I tried to disable all deduplication at once and didn’t observe any significant difference in CPU usage.

 

Compression is not used.

 

SnapMirror is done overnight and doesn’t impact production during business hours. We don’t use SnapVault.

 

Here is an excerpt from sysstat -x 5 output:

 

CPU    NFS  CIFS  HTTP  Total    Net   kB/s   Disk   kB/s  Tape  kB/s Cache Cache    CP  CP  Disk   FCP iSCSI   FCP  kB/s iSCSI  kB/s
                                  in    out   read  write  read write   age   hit  time  ty  util                 in   out    in   out
97%   1392     0     0   1392  22597   6044  18172  32302     0     0    5s   97%   50%  Fs   23%     0     0     0     0     0     0
94%   2024     0     0   2024  14507   8039  20251  18952     0     0    4s   95%   61%  Ff   13%     0     0     0     0     0     0
98%   2125     0     0   2125  20336  10821  18652  31241     0     0    5s   98%   85%  Ff   19%     0     0     0     0     0     0
99%   1801     0     0   1801  31365  14119  20186  31352     0     0    5s   96%   60%  Ff   13%     0     0     0     0     0     0
97%   1396     0     0   1396  27938  14741  24378  29044     0     0    3s   97%   70%   F   20%     0     0     0     0     0     0
99%   1644     0     0   1644  24698  20634  26610  29229     0     0     1   95%   70%  Fn   17%     0     0     0     0     0     0
100%  1337     0     0   1337  28669  19982  20661  42895     0     0    3s   96%   91%  Ff   21%     0     0     0     0     0     0
95%   1082     0     0   1082  21795  11085  25374  34816     0     0    2s   95%   77%  Ff   17%     0     0     0     0     0     0
99%    982     0     0    982  31760  15161  25116  42265     0     0    2s   97%   73%  Ff   19%     0     0     0     0     0     0
96%   1096     0     0   1096  21200   5391  15923  28497     0     0    2s   98%   58%  Ff   27%     0     0     0     0     0     0
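
For anyone who wants to sanity-check the excerpt, here is a minimal Python sketch that averages the ten samples. It is not something that was run on the filer; the names (SAMPLES, parse_samples, summarize) are made up for illustration, and only the first nine columns of each row are copied in. The kb_per_op figure is just network kB/s divided by NFS ops, so it includes protocol overhead and is only a rough indicator of I/O size.

```python
# First nine columns of each sample, copied from the sysstat -x 5 excerpt:
# CPU, NFS, CIFS, HTTP, Total, Net in, Net out, Disk read, Disk write (kB/s)
SAMPLES = """\
97%  1392 0 0 1392 22597  6044 18172 32302
94%  2024 0 0 2024 14507  8039 20251 18952
98%  2125 0 0 2125 20336 10821 18652 31241
99%  1801 0 0 1801 31365 14119 20186 31352
97%  1396 0 0 1396 27938 14741 24378 29044
99%  1644 0 0 1644 24698 20634 26610 29229
100% 1337 0 0 1337 28669 19982 20661 42895
95%  1082 0 0 1082 21795 11085 25374 34816
99%   982 0 0  982 31760 15161 25116 42265
96%  1096 0 0 1096 21200  5391 15923 28497
"""


def parse_samples(text):
    """Turn whitespace-separated sysstat rows into a list of dicts."""
    samples = []
    for line in text.strip().splitlines():
        fields = line.split()
        samples.append({
            "cpu_pct": int(fields[0].rstrip("%")),
            "nfs_ops": int(fields[1]),
            "net_in_kb": int(fields[5]),
            "net_out_kb": int(fields[6]),
        })
    return samples


def summarize(samples):
    """Average CPU, NFS ops/s and combined network kB/s over all samples."""
    n = len(samples)
    avg_cpu = sum(s["cpu_pct"] for s in samples) / n
    avg_ops = sum(s["nfs_ops"] for s in samples) / n
    avg_net = sum(s["net_in_kb"] + s["net_out_kb"] for s in samples) / n
    return {
        "avg_cpu_pct": avg_cpu,
        "avg_nfs_ops": avg_ops,
        "avg_net_kb_s": avg_net,
        # Rough ratio only: network payload per NFS op, overhead included.
        "kb_per_op": avg_net / avg_ops,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_samples(SAMPLES)).items():
        print(f"{key}: {value:.1f}")
```

Over these ten samples the averages come out at roughly 97% CPU, about 1,490 NFS ops/s and about 37,000 kB/s of combined network traffic, which is in line with the figures quoted at the start of the thread.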

 

sysstat -m is not supported on the FAS2050, because it’s single core.

 

Aggregates are more than 70% free.

 

We do have misaligned VMs. But I don’t think that the amount of misaligned operations is more than 10-20%. I’ll collect detailed statistics tomorrow and report back.
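
To illustrate why misalignment costs extra work, here is a small Python sketch, under stated assumptions. WAFL uses 4 KB blocks; the 512-byte sector size and the two partition start sectors (63, the legacy MBR default in older guests, and 2048, the default in newer ones) are assumptions for the example, not values taken from the VMs in this thread. The function names are hypothetical.

```python
WAFL_BLOCK = 4096  # bytes; WAFL block size
SECTOR = 512       # bytes; assumed logical sector size inside the guest


def is_aligned(start_sector, sector_size=SECTOR, block=WAFL_BLOCK):
    """True if a partition starting at start_sector sits on a 4 KB boundary."""
    return (start_sector * sector_size) % block == 0


def blocks_touched(offset, length, block=WAFL_BLOCK):
    """Number of 4 KB blocks covered by an I/O of `length` bytes at `offset`."""
    first = offset // block
    last = (offset + length - 1) // block
    return last - first + 1


if __name__ == "__main__":
    for start in (63, 2048):  # assumed example start sectors
        offset = start * SECTOR  # byte offset of the guest's first 4 KB block
        print(start, is_aligned(start), blocks_touched(offset, 4096))
```

With a start sector of 63, the partition begins at byte 32256, which is not a multiple of 4096, so every 4 KB guest I/O straddles two storage blocks. With a start sector of 2048, each 4 KB guest I/O maps to exactly one block.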

 

Regards,

 

Nikita Andreev | Systems Engineer (Contract)

Visionstream IT Infrastructure Team

236 East Boundary Road, 2 North Drive

Virginia Park, Bentleigh East VIC 3165

E: Nikita.Andreev@visionstream.com.au

W: www.visionstream.com.au

 

From: Colin Bieberstein [mailto:colin@bieberstein.ca]
Sent: Thursday, 26 September 2013 12:09 PM


To: Andreev, Nikita
Cc: toasters@teaparty.net
Subject: Re: FAS2050 high CPU usage

 

What sort of aggregates do you have behind your datastores (aggr status -r or sysconfig -v)? Is it, for instance, using only the internal SATA disks? A minimum config was 10 drives, IIRC. That would easily limit your performance down to where you describe.

 

Otherwise the common CPU culprits are deduplication jobs, compression jobs, and inline compression, then SnapVault and SnapMirror activity. Given that it's a branch office, how many replication jobs have you got running? These won't show up in your protocol ops per second, but they will absolutely drag performance down with their I/O.

 

If not, what are you seeing with a sysstat -x 1 and a sysstat -m 1?  

 

Have you filled the aggregates and/or volumes up past 90%?

 

Do you have misaligned VMs?

 

That's where I'd start looking...   Without specifics it's hard to point you at a cause, but your 2050 can deliver a lot more than the 1000 IOPS you see with a couple shelves... I believe that you max out with 1 loop of six shelves on that controller but it might have been 4 shelves.  

 

Colin Bieberstein 

 


On Sep 25, 2013, at 6:42 PM, "Andreev, Nikita" <Nikita.Andreev@visionstream.com.au> wrote:

Hi All,

 

We’re using a FAS2050 in one of our branch offices to run a VMware cluster over NFS. It turns out that this box runs at 100% CPU usage even when servicing 40 MB/s and 1000 IOPS (the max I’ve seen was 70 MB/s), which results in ridiculous latencies.

 

I do realise that it’s a Celeron CPU. I just want to double check with you guys that this is something you’d expect from this box, because these days 40 MB/s seems to be too little even for a CIFS file server for a small team in a branch office.

 

Regards,

Nikita

 

 



This email is intended for the named recipient only. The information contained in this message may be confidential, or commercially sensitive. If you are not the intended recipient you must not reproduce or distribute any part of the email, disclose its contents to any other party, or take any action in reliance on it. If you have received this email in error, please contact the sender immediately and please delete this message completely from any systems. Confidentiality and legal privilege are not waived or lost by reason of mistaken delivery to you.

______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit
http://www.symanteccloud.com
______________________________________________________________________

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters








--
---
Gustatus Similis Pullus