toasters

toasters@lists.teaparty.net

1 participants
13516 discussions

list meta information
by Tom Yates 09 Dec '15

09 Dec '15

On Tue, 24 Nov 2015, Tom Yates wrote: > While I'm posting meta-stuff, I've had some difficulty in getting the > CAPTCHA to work in the list subscription page (as anyone who's tried to > subscribe recently will have noticed). If you're a python hacker who > doesn't mind helping me debug the problem, I'd appreciate an email > **off-list**, as I'm definitely not one. Vast thanks to John Constable <jc18(a)sanger.ac.uk>, who got this working. Web subscriptions are now re-enabled. I should also mention that it was pointed out to me that the "digest" function of the list was almost useless, because the volume threshold was set so low that a new digest was created nearly every time a message was sent. I have increased it considerably, so people subscribed to the digest should have been getting one a day, except when traffic is exceptionally high. We now return you to your regularly-scheduled programming. -- Tom Yates - http://www.teaparty.net

1 0

Netapp disk manufacture
by Klise, Steve 09 Dec '15

09 Dec '15

Does anyone know where I can find part numbers for disk size, manufacture? I looked on the NOW site but couldn’t find anything. I need to know if they are seagate, WD, etc.. I need them for all SAS, SATA, FC, SSD, etc. I have a customer that insists on a certain manufacturer. Steve

10 17

1gb sfp support in fas2500
by Mike Gossett 09 Dec '15

09 Dec '15

I was reading the tech specs on the fas2552/4 platforms and there is verbiage indicating that with 8.3.1 and onward, there is support for 1gb sfp modules in the uta2 ports. I have deployed many of these units in the past 18 months and have always forced 10gb switching to support it, even though 4x1g may have been adequate in some circumstances. My NetApp sales team seems to be unfamiliar with this and is researching. Can anyone shed some light on this for me? Are there 1g copper modules or only 1g fiber sfp? Has anyone quoted / sold / deployed these yet? Any caveats? Thanks Mike Sent from my mobile phone

2 2

SP Firmware 2.3.2 for FAS2552
by Alexander Griesser 04 Dec '15

04 Dec '15

Hey, does anyone have the download link for SP firmware 2.3.2 at hand? The website is still showing 2.3.1 as the latest and greatest and we got two FAS2552 delivered recently and one of them came with 2.3.2 already, so I'd like to update the second one tot hat release too :) ::*> sp show IP Firmware Node Type Status Configured Version IP Address ------------- ---- ----------- ------------ --------- ------------------------- nodeA SP online true 2.3.2 10.100.55.112 nodeB SP online true 2.3.2 10.100.55.111 Thanks, Alexander Griesser Head of Systems Operations ANEXIA Internetdienstleistungs GmbH E-Mail: AGriesser(a)anexia-it.com<mailto:AGriesser@anexia-it.com> Web: http://www.anexia-it.com<http://www.anexia-it.com/> Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt Geschäftsführer: Alexander Windbichler Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601

1 0

FAS8040 getting crushed by Solr replication
by Philip Gardner, Jr. 04 Dec '15

04 Dec '15

Hi all - I have an 8040 with 88 x 900G 10k disks, all assigned to a single aggregate on one of the controllers. There are a few volumes on here, all vSphere NFS datastores. This aggregate also has a slice of flash pool assigned to it, currently about 900GB usable. We recently deployed some CentOS 6 VMs on these datastores that are running Solr, which is an applciation that is used for distributed indexing. The replication is done in typical a master/slave relationship. My understanding of Solr's replication is that it is done periodically where the slaves download any new index files that exist on the master but not on the slaves, into a temp location, and then the slaves replace their existing index files with the new index files from the master. So it appears to be a mostly sequential write process. During the replication events, we are seeing the controller hosting this particular datastore basically getting crushed and issuing B and b CPs. Here is some output of sysstat during one of the replication events: CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 7% 854 60795 38643 14108 107216 0 0 20 96% 100% : 9% 7% 991 61950 41930 6542 89350 0 0 20 95% 100% : 9% 4% 977 62900 38820 1244 2932 0 0 20 93% 9% : 1% 4% 811 52853 35658 76 12 0 0 20 96% 0% - 1% 5% 961 67428 43600 60 12 0 0 20 97% 0% - 1% 4% 875 57204 41222 66 4 0 0 20 97% 0% - 1% 5% 1211 78933 59481 110 12 0 0 20 97% 0% - 1% 16% 1024 55549 31785 33306 84626 0 0 20 97% 89% T 14% 7% 1164 56356 36122 14830 102808 0 0 20 96% 100% : 8% 49% 13991 909816 56134 3926 62136 0 0 24 82% 100% B 7% 78% 13154 842333 55302 53011 868408 0 0 24 83% 100% : 51% 83% 12758 818914 59706 44897 742156 0 0 23 89% 97% F 45% 84% 11997 765669 53760 64084 958309 0 0 26 89% 100% B 59% 80% 11823 725972 46004 73227 867704 0 0 26 88% 100% B 51% 83% 15125 957531 46144 42439 614295 0 0 23 87% 100% B 36% 74% 9584 612985 42404 67147 839408 0 0 24 93% 100% B 48% 78% 11367 751672 64071 49881 770340 0 0 24 88% 100% B 46% 79% 12468 822736 53757 38995 595721 0 0 24 87% 100% # 34% 56% 6315 396022 48623 42597 601630 0 0 24 94% 100% B 35% 67% 7923 554797 56459 26309 715759 0 0 24 87% 100% # 43% 69% 13719 879990 37401 41532 333768 0 0 22 87% 100% B 22% 45% 24 52946 42826 33186 736345 0 0 22 98% 100% # 41% 72% 13909 888007 46266 29109 485422 0 0 22 87% 100% B 28% 59% 8036 523206 53199 41719 646767 0 0 22 90% 100% B 37% 68% 7336 505544 63590 46602 870744 0 0 22 91% 100% B 49% 71% 12673 809175 29070 21208 556669 0 0 6 89% 100% # 38% 70% 12097 726574 49381 36251 588939 0 0 24 90% 100% B 35% And here is some iostat output from one of the Solr slaves during the same timeframe: 12/03/2015 06:48:36 PM avg-cpu: %user %nice %system %iowait %steal %idle 7.54 0.00 7.42 44.12 0.00 40.92 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 4.50 0.00 0.00 0.00 0.00 0.00 5.46 0.00 0.00 62.65 sdb 0.00 26670.00 0.00 190.50 0.00 95.25 1024.00 162.75 214.87 5.25 100.00 dm-0 0.00 0.00 1.00 11.50 0.00 0.04 8.00 5.59 0.00 50.12 62.65 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-2 0.00 0.00 0.00 3.00 0.00 0.01 8.00 2.44 0.00 135.33 40.60 dm-3 0.00 0.00 0.00 26880.00 0.00 105.00 8.00 20828.90 194.77 0.04 100.00 12/03/2015 06:48:38 PM avg-cpu: %user %nice %system %iowait %steal %idle 9.23 0.00 16.03 24.23 0.00 50.51 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 177.00 1.00 19.50 0.00 0.79 78.83 7.91 651.90 16.59 34.00 sdb 0.00 73729.00 0.00 599.50 0.00 299.52 1023.23 142.51 389.81 1.67 100.00 dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.56 0.00 0.00 27.55 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-2 0.00 0.00 0.00 186.50 0.00 0.73 8.00 87.75 483.59 1.82 34.00 dm-3 0.00 0.00 0.00 74310.00 0.00 290.27 8.00 18224.54 402.32 0.01 100.00 12/03/2015 06:48:40 PM avg-cpu: %user %nice %system %iowait %steal %idle 9.27 0.00 10.04 22.91 0.00 57.79 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 24955.50 0.00 202.00 0.00 101.00 1024.00 142.07 866.56 4.95 100.05 dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-3 0.00 0.00 0.00 25151.50 0.00 98.25 8.00 18181.29 890.67 0.04 100.05 12/03/2015 06:48:42 PM avg-cpu: %user %nice %system %iowait %steal %idle 9.09 0.00 12.08 21.95 0.00 56.88 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 2.50 0.00 1.50 0.00 0.01 18.67 0.46 36.33 295.33 44.30 sdb 0.00 59880.50 0.00 461.50 0.00 230.75 1024.00 144.82 173.12 2.17 99.95 dm-0 0.00 0.00 0.00 1.00 0.00 0.00 8.00 0.81 0.00 407.50 40.75 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-2 0.00 0.00 0.00 3.50 0.00 0.01 8.00 0.13 37.29 10.14 3.55 dm-3 0.00 0.00 0.00 60352.50 0.00 235.75 8.00 18538.70 169.30 0.02 100.00 As you can see, we are getting some decent throughput, but it causes the latency to spike on the filer. I have heard that the avgrq-sz in iostat is related to the block size, can anyone verify that? Is a 1MB block size too much for the filer? I am still researching if there is a way to modify this in Solr, but I haven't come up with much yet. Note, the old Solr slaves were made up of physcal DL360p's with only a local 2-disk 10k RAID1. The new slaves and relay-master are currently all connected with 10Gb, which removed the network 1Gb bottleneck for the replication, which could be uncorking the bottle so-to-speak. I'm still at a loss why this is hurting the filer so much though. Any ideas? -- GPG keyID: 0xFECC890C Phil Gardner

7 12

Re: Shutting down a controller pair in a 4-node cluster.
by Peter Choi 03 Dec '15

03 Dec '15

Yes manually moving epsilon is feasible in a planned downtime scenario. What about in a disaster situation - if a pair goes down unexpectedly, will epsilon me auto migrated to either of the surviving nodes? > On Dec 3, 2015, at 4:39 PM, Gelb, Scott <scott(a)red8.com> wrote: > > I tested on 8.3.1 and when an ha-pair is degraded, epsilon moves to the other ha-pair when one of the nodes goes down, however I would move epsilon manually to a node in the other ha-pair for this type of maintenance and not assume it will move. Then you have 2 nodes plus epsilon and continue to serve data for the aggregates/volumes on those nodes…the data on the other ha-pair shutdown will be inaccessible and the cluster will not be in a healthy state. > > From: toasters-bounces(a)teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Peter Choi > Sent: Thursday, December 3, 2015 1:35 PM > To: toasters(a)teaparty.net > Subject: Shutting down a controller pair in a 4-node cluster. > > Hi, > > Please consider the following scenario, as we are conducting testing. I've done my research and received varying information from support , so I thought to get this audience's take on this. > > Basically we have a flexpod. > > - 2 cabinets; > - Each cabinet consists of: CN1610, Nexus, FI, 2 x FAS8020 (cDOT 8.3.1P1), UCS blades; > - Controllers are joined together in a 4-node cluster configuration; > - Cabled in such a manner that we have controller HBA and switch port redundancy. > > If we were to shutdown a controller pair (2 nodes that are HA partners to each other), effectively shutting down 1 of the 2 cabinets. > > - Will the data on the disks attached to the other two controller nodes (the surviving HA pair) still be served and accessible? > - What happens if epsilon resides on one of the nodes that are shut down. Will it be automatically shifted to a surviving member in the cluster? > > Thanks.

2 1

Re: FAS8040 getting crushed by Solr replication
by Philip Gardner, Jr. 03 Dec '15

03 Dec '15

On Thu, Dec 3, 2015 at 4:33 PM, John Stoffel <john(a)stoffel.org> wrote: > >>>>> "Jr" == Jr Gardner <phil.gardnerjr(a)gmail.com> writes: > > So that means you're indexing 300Gb worth of data, or generating 300Gb > worth of Solr indexes? How big are the generated Indexes? The reason > I ask is because maybe if they are that big, it means you're writing > 1.2Tb of indexes, which is the same data... > > So if you could maybe split the indexers into two pools, and have each > pair have only one system doing the indexing... that might big a big > savings. > Not all of the index files get written at the same time. That would only happen for a new/fresh slave with no existing index. The index is split into many files, and only the new ones that the master creates are pulled down by the slaves and written to disk. We are talking about 1-2GB for an update, but only if there are updates to the master. There may not be updates for every replication check interval. > How was the performance on the handful of DL360s? And how is the > perforance of the slaves compared to the old setup? Even if you're > beating the Netapp to death for a bit, maybe it's a net win? > Performance is "decent". Those are writing about as fast as the 8040 is before slowing things down with the B CPs, 2-300MB/s. > So this I think answers my question, the master trawls the dataset > building the index, then the slaves copy those new indexes to their > storage area. > > And it really is too bad that you can't just use the Netapp for the > replication with either a snapshot or a snapmirror to replicate the > new files to the six slaves. If they're read only, you should be > working hard to keep them from reading the same file six times from > the master and then writing six copies back to the Netapp. > > Now hopefully the clients aren't re-writing all 300Gb each time, and > the write numbers you show are simply huge! You're seeing 10x the > writes compared to reads, which implies that these slaves aren't setup > right. They should be read/mostly! > > Does the index really need to be updated every three minutes? That's > a pretty darn short time. > > And is there other load on the 8040 cluster as well? > Yeah, It looks like this is something we are going to have to redesign before we go the virtualized instances. I agree there are a few different ways to do it, but the snapmirror option seems like a good one. I also agree that its ineffecient to have the same copy of data getting written numerous times. These are really write heavy at the moment because they are not in production yet. We just recently got these new VMs set up and going, and I was just watching performance take a hit across this controller and wanted to investigate. The index does need to be updated relatively frequently. We use this particular index to store vehicle/inventory data for the frontend web site to query, so when a client makes changes on the backend, they want the changes to take affect relatively quickly; otherwise its a "bad experience" :) > What's the load on the Netapp when no nodes are writing at all? Are > you getting hit by lots of writes then? If so... you need more > spindles. And how full is the aggregate? And how busy/full are the > other volumes? > Here is a snapshot of sysstat when those solr slaves are not writing to disk: CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 3% 705 18201 2460 26 0 0 0 3s 97% 0% - 1% 14% 1128 36607 12876 29246 59480 0 0 3s 99% 62% T 10% 6% 968 33432 9258 15874 98826 0 0 3s 97% 100% : 8% 6% 619 29781 20739 5605 95763 0 0 3s 99% 100% : 9% 4% 1055 43136 15750 84 18108 0 0 3s 97% 25% : 3% 13% 1041 38311 5779 52 12 0 0 3s 98% 0% - 1% 4% 1089 33113 7183 44 0 0 0 3s 99% 0% - 1% 4% 1277 43362 14837 86 16 0 0 3s 98% 0% - 1% 5% 1843 48354 24844 26 12 0 0 3s 100% 0% - 1% 16% 1849 57845 21218 29590 65774 0 0 3s 99% 74% T 10% 6% 772 39316 24262 17096 97466 0 0 3s 97% 100% : 8% 7% 1019 57397 32988 19028 86126 0 0 3s 99% 100% : 8% 13% 843 29941 9331 882 37964 0 0 3s 96% 62% : 2% 3% 759 31928 12799 40 12 0 0 3s 97% 0% - 2% 4% 1216 56116 26869 88 16 0 0 3s 98% 0% - 1% 3% 904 49644 25957 38 0 0 0 3s 97% 0% - 1% 5% 1300 60471 36002 96 12 0 0 3s 94% 0% - 1% 16% 2127 45971 6880 23822 29086 0 0 3s 99% 36% T 10% Pretty quiet. The timer CP is much nicer to see... Aggregate is not full at all: filer::> df -A -h -x aggr_sas900 Aggregate total used avail capacity aggr_sas900 58TB 30TB 28TB 52% And this particular volume is only at 65% capacity. Other volumes aren't over 80% either. -- GPG keyID: 0xFECC890C Phil Gardner

1 0

Shutting down a controller pair in a 4-node cluster.
by Peter Choi 03 Dec '15

03 Dec '15

Hi, Please consider the following scenario, as we are conducting testing. I've done my research and received varying information from support , so I thought to get this audience's take on this. Basically we have a flexpod. - 2 cabinets; - Each cabinet consists of: CN1610, Nexus, FI, 2 x FAS8020 (cDOT 8.3.1P1), UCS blades; - Controllers are joined together in a 4-node cluster configuration; - Cabled in such a manner that we have controller HBA and switch port redundancy. If we were to shutdown a controller pair (2 nodes that are HA partners to each other), effectively shutting down 1 of the 2 cabinets. - Will the data on the disks attached to the other two controller nodes (the surviving HA pair) still be served and accessible? - What happens if epsilon resides on one of the nodes that are shut down. Will it be automatically shifted to a surviving member in the cluster? Thanks.

1 0

Quota not work
by Uwe Schenkel 03 Dec '15

03 Dec '15

Hi, i have trouble with NFS on a Volumen with quota on 8.1.4P4 7-Mode. Diskquota not work correct. /etc/quota testuser user@/vol/vol_test 1M When testuser ist copy files 2x 512k (as sample) and will then copy one more file, quota works. The new file is 0k. BUT when testuser copy a 5MB single file (volume is empty), ontap give me a error: wafl.quota.user.exceeded:notice: uid 1004: disk quota on volume vol_test. What is the trick???? Uwe

4 4

NETAPP TOASTERS FACEBOOK GROUP HAS BEEN CREATED
by Milazzo Giacomo 24 Nov '15

24 Nov '15

Hi everybody, don't you think that time for mail newsgroup have passed? I think so. https://www.facebook.com/groups/netapptoasters/ I've just created a new group on Facebook and named it "NetApp Toasters". I've also mutuated from the LinkedIn logo I've designed years ago a new one. And I please NetApp to let me use their own as header. I know that this could be another YAST where s stands for support :) ... but considering that NetApp mailing list is active since years maybe this group could be. Many advantages can be useful on using a FB group to share infos and comments and, moreover, to enclose same discussions in just one thread. I'm alone with a few guys managing a group that I hope will grow on a daily base, so please be patient if your admission request could take one hour such one day. Of course we all are professionals and any kind of content unpolite or inappropriate will be removed by the incontestable judgement of administrators. Thank you in advance if you'll appreciate it. My best regards, PS) netapptoasters(a)groups.facebook.com<mailto:netapptoasters@groups.facebook.com> mail sent to this address will be addressed as post...but not sure about this Dott. Giacomo Milazzo Senior Consultant & Technical Account Manager mobile: +39 340.6001045 @-mail: g.milazzo(a)sinergy.it<mailto:g.milazzo@sinergy.it> Web: http://www.sinergy.it<http://www.sinergy.it/> [cid:4DB1B876-8A76-4829-9B40-9B136E344F73@fastwebnet.it] SINERGY SpA Viale dei Santi Pietro e Paolo 50 00144 - Roma RM Tel. +39 06 44243674 Fax +39 06 44245272

9 11

Jump to page:

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

toasters