You might just be seeing the unavoidable performance of a system stretched as far as it'll go. You could improve system health by using QoS to throttle incoming writes, but that would increase host-observed latency. You could also, as mentioned, stagger your IO, if the Solr side supports that kind of thing.

Sent from my BlackBerry 10 smartphone on the Bell network.
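If you do go the QoS route, a minimal sketch from the cluster shell (the vserver, volume, and policy-group names and the 100MB/s ceiling below are placeholders to adapt):

filer::> qos policy-group create -policy-group solr_repl_limit -vserver vs_nfs -max-throughput 100MB/s
filer::> volume modify -vserver vs_nfs -volume solr_datastore -qos-policy-group solr_repl_limit

The ceiling is a hard cap, so as noted it trades filer health for higher host-observed latency during the replication bursts; it can be backed out later with -qos-policy-group none.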
From: Philip Gardner, Jr.
Sent: Thursday, December 3, 2015 3:57 PM
To: Basil
Cc: Toasters
Subject: Re: FAS8040 getting crushed by Solr replication

This is actually 8.3. I will open up a case with NetApp eventually, just figured I would ask here first to see if anyone had any quick ideas.
filer::> version
NetApp Release 8.3: Mon Mar 09 23:01:28 PDT 2015

On Thu, Dec 3, 2015 at 3:09 PM, Basil <basilberntsen@gmail.com> wrote:

I've heard of issues with the performance of Flash Pool in versions of cDOT under 8.3 when background cleanup tasks like snapshot deletes are done. That said, I'd open a ticket with NetApp and have them analyze it.

On Thu, Dec 3, 2015 at 2:47 PM, Philip Gardner, Jr. <phil.gardnerjr@gmail.com> wrote:

Hi all -

I have an 8040 with 88 x 900G 10k disks, all assigned to a single aggregate on one of the controllers. There are a few volumes on here, all vSphere NFS datastores. This aggregate also has a slice of Flash Pool assigned to it, currently about 900GB usable.

We recently deployed some CentOS 6 VMs on these datastores that are running Solr, an application used for distributed indexing. The replication is done in a typical master/slave relationship. My understanding of Solr's replication is that it runs periodically: the slaves download any new index files that exist on the master but not on the slaves into a temp location, and then replace their existing index files with the new files from the master. So it appears to be a mostly sequential write process.

During the replication events, we are seeing the controller hosting this particular datastore basically getting crushed and issuing B and b CPs. Here is some output of sysstat during one of the replication events:
CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk
ops/s in out read write read write age hit time ty util
7% 854 60795 38643 14108 107216 0 0 20 96% 100% : 9%
7% 991 61950 41930 6542 89350 0 0 20 95% 100% : 9%
4% 977 62900 38820 1244 2932 0 0 20 93% 9% : 1%
4% 811 52853 35658 76 12 0 0 20 96% 0% - 1%
5% 961 67428 43600 60 12 0 0 20 97% 0% - 1%
4% 875 57204 41222 66 4 0 0 20 97% 0% - 1%
5% 1211 78933 59481 110 12 0 0 20 97% 0% - 1%
16% 1024 55549 31785 33306 84626 0 0 20 97% 89% T 14%
7% 1164 56356 36122 14830 102808 0 0 20 96% 100% : 8%
49% 13991 909816 56134 3926 62136 0 0 24 82% 100% B 7%
78% 13154 842333 55302 53011 868408 0 0 24 83% 100% : 51%
83% 12758 818914 59706 44897 742156 0 0 23 89% 97% F 45%
84% 11997 765669 53760 64084 958309 0 0 26 89% 100% B 59%
80% 11823 725972 46004 73227 867704 0 0 26 88% 100% B 51%
83% 15125 957531 46144 42439 614295 0 0 23 87% 100% B 36%
74% 9584 612985 42404 67147 839408 0 0 24 93% 100% B 48%
78% 11367 751672 64071 49881 770340 0 0 24 88% 100% B 46%
79% 12468 822736 53757 38995 595721 0 0 24 87% 100% # 34%
56% 6315 396022 48623 42597 601630 0 0 24 94% 100% B 35%
67% 7923 554797 56459 26309 715759 0 0 24 87% 100% # 43%
69% 13719 879990 37401 41532 333768 0 0 22 87% 100% B 22%
45% 24 52946 42826 33186 736345 0 0 22 98% 100% # 41%
72% 13909 888007 46266 29109 485422 0 0 22 87% 100% B 28%
59% 8036 523206 53199 41719 646767 0 0 22 90% 100% B 37%
68% 7336 505544 63590 46602 870744 0 0 22 91% 100% B 49%
71% 12673 809175 29070 21208 556669 0 0 6 89% 100% # 38%
70% 12097 726574 49381 36251 588939 0 0 24 90% 100% B 35%

And here is some iostat output from one of the Solr slaves during the same timeframe:
12/03/2015 06:48:36 PM
avg-cpu: %user %nice %system %iowait %steal %idle
7.54 0.00 7.42 44.12 0.00 40.92
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 4.50 0.00 0.00 0.00 0.00 0.00 5.46 0.00 0.00 62.65
sdb 0.00 26670.00 0.00 190.50 0.00 95.25 1024.00 162.75 214.87 5.25 100.00
dm-0 0.00 0.00 1.00 11.50 0.00 0.04 8.00 5.59 0.00 50.12 62.65
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 3.00 0.00 0.01 8.00 2.44 0.00 135.33 40.60
dm-3 0.00 0.00 0.00 26880.00 0.00 105.00 8.00 20828.90 194.77 0.04 100.00
12/03/2015 06:48:38 PM
avg-cpu: %user %nice %system %iowait %steal %idle
9.23 0.00 16.03 24.23 0.00 50.51
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 177.00 1.00 19.50 0.00 0.79 78.83 7.91 651.90 16.59 34.00
sdb 0.00 73729.00 0.00 599.50 0.00 299.52 1023.23 142.51 389.81 1.67 100.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.56 0.00 0.00 27.55
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 186.50 0.00 0.73 8.00 87.75 483.59 1.82 34.00
dm-3 0.00 0.00 0.00 74310.00 0.00 290.27 8.00 18224.54 402.32 0.01 100.00
12/03/2015 06:48:40 PM
avg-cpu: %user %nice %system %iowait %steal %idle
9.27 0.00 10.04 22.91 0.00 57.79
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 24955.50 0.00 202.00 0.00 101.00 1024.00 142.07 866.56 4.95 100.05
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 25151.50 0.00 98.25 8.00 18181.29 890.67 0.04 100.05
12/03/2015 06:48:42 PM
avg-cpu: %user %nice %system %iowait %steal %idle
9.09 0.00 12.08 21.95 0.00 56.88
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 2.50 0.00 1.50 0.00 0.01 18.67 0.46 36.33 295.33 44.30
sdb 0.00 59880.50 0.00 461.50 0.00 230.75 1024.00 144.82 173.12 2.17 99.95
dm-0 0.00 0.00 0.00 1.00 0.00 0.00 8.00 0.81 0.00 407.50 40.75
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 3.50 0.00 0.01 8.00 0.13 37.29 10.14 3.55
dm-3 0.00 0.00 0.00 60352.50 0.00 235.75 8.00 18538.70 169.30 0.02 100.00

As you can see, we are getting some decent throughput, but it causes the latency to spike on the filer. I have heard that the avgrq-sz in iostat is related to the block size; can anyone verify that? Is a 1MB block size too much for the filer? I am still researching whether there is a way to modify this in Solr, but I haven't come up with much yet.

Note: the old Solr slaves were physical DL360p's with only a local 2-disk 10k RAID1. The new slaves and relay-master are currently all connected at 10Gb, which removed the 1Gb network bottleneck for the replication and could be uncorking the bottle, so to speak. I'm still at a loss as to why this is hurting the filer so much, though.

Any ideas?
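For reference: iostat reports avgrq-sz in 512-byte sectors, so the ~1024 seen on sdb works out to roughly 512KB per request rather than 1MB. On the Solr side, recent releases let the master's ReplicationHandler throttle how fast slaves pull index files via a maxWriteMBPerSec invariant (worth verifying against the Solr version in use); a rough solrconfig.xml sketch, with the 16MB/s value purely illustrative:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
  <!-- cap replication bandwidth so slaves can't pull at full 10Gb line rate -->
  <lst name="invariants">
    <str name="maxWriteMBPerSec">16</str>
  </lst>
</requestHandler>

Slowing the pull stretches the same write burst over a longer window instead of letting the 10Gb link deliver it all at once.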
--
GPG keyID: 0xFECC890C
Phil Gardner
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters