Classification: UNCLASSIFIED Caveats: NONE
If NVRAM has no place in a NetApp storage system's I/O operations, why would the size of NVRAM correlate with the I/O capabilities of a NetApp storage system?
I think the point is that I/O does not pass through NVRAM, i.e., NVRAM is not used as a disk cache. It is used as a transaction log and in the event of an outage, the NVRAM log is replayed. When a write request arrives, the request is logged to NVRAM and the filer ACKs the request. "At its leisure" the filer commits a batch of write requests to disk by writing a CP (consistency point). Then the corresponding NVRAM log storage is reused to log new requests. When NVRAM approaches capacity, the filer MUST write a CP. Otherwise there is no room in NVRAM to log any more incoming write requests, forcing the filer to reject them.
So the size of NVRAM determines to some degree how much of a backlog of uncommitted write requests that a filer can tolerate.
Jim Blackburn US Army james.m.blackburn33.civ@mail.mil Note: I am not authorized to obligate, commit, or execute United States government funds. No part of this message should be construed as a request or directive to obligate, commit, or execute United States government funds.
-----Original Message----- From: Jeff Mohler [mailto:speedtoys.racing@gmail.com] Sent: Thursday, April 11, 2013 3:12 PM To: Blackburn, James M (Jim) Jr. CIV USARMY PEO EIS (US) Cc: Toasters@teaparty.net Subject: Re: NVRAM weirdness (UNCLASSIFIED)
Sure. NVRAM only holds a copy of data to be written..if all goes well, it's discarded.
On Thu, Apr 11, 2013 at 8:00 PM, Blackburn, James M (Jim) Jr. CIV USARMY PEO EIS (US) james.m.blackburn33.civ@mail.mil wrote:
Classification: UNCLASSIFIED Caveats: NONE
Would "capabilities" happen to be a reference to "I/O capabilities?"
-- Jim Blackburn US Army james.m.blackburn33.civ@mail.mil Note: I am not authorized to obligate, commit, or execute United States government funds. No part of this message should be construed as a request or directive to obligate, commit, or execute United States government funds.
-----Original Message-----
From: Jeff Mohler [mailto:speedtoys.racing@gmail.com] Sent: Thursday, April 11, 2013 2:56 PM To: Blackburn, James M (Jim) Jr. CIV USARMY PEO EIS (US) Cc: Toasters@teaparty.net
Subject: Re: NVRAM weirdness (UNCLASSIFIED)
It is sized appropriately to the capabilities it [controller] was designed for.
Hence..adding more, would not provide an improvement for any given system.
On Thu, Apr 11, 2013 at 7:31 PM, Blackburn, James M (Jim) Jr. CIV USARMY PEO EIS (US) james.m.blackburn33.civ@mail.mil wrote:
Classification: UNCLASSIFIED Caveats: NONE If "NVRAM has no place in the IO operation of a filer", why doesn't every NetApp storage system have the same amount of NVRAM/NVMEM? -- Jim Blackburn US Army james.m.blackburn33.civ@mail.mil Note: I am not authorized to obligate, commit, or execute United States government funds. No part of this message should be construed as a request or directive to obligate, commit, or execute United States government funds. -----Original Message----- From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Mohler Sent: Thursday, April 11, 2013 2:10 PM To: Scott Eno Cc: Toasters@teaparty.net Subject: Re: NVRAM weirdness For the OP and others..NVRAM has no place in the IO operation of a filer. It holds copies of data that are in main RAM, just in -case- there is a problem with site power/etc. There has been a continued misunderstanding of how Netapp NVRAM works over time..when in fact, it does no work at all. :) In sysstat..a view of "disks" being 100% busy is more a view of "storage" as a whole. Not 'a' disk. If storage is stuck..then CP time will also be likewise stuck at 100%... Incorrect interpretation of sysstat and how Ontap works...can lead to a lot of confusion, and continued incorrect phraseology. On Thu, Apr 11, 2013 at 12:31 PM, Scott Eno <s.eno@me.com> wrote: Looks like you're hanging on a consistency point operation, but the sysstat output starts after that last CP was logged. This looks very similar to the ":s" CP bug, but can't say for sure. You have a long running CP operation, you NVRAM has filled up waiting for the operation to complete, but with no "s" after the "#", then it's not processing special files. It's doing something else. Could be one of these bugs: 651573 Inefficient pre-fetching of metadata blocks delays WAFL consistency point 651597 Inefficient pre-fetching of metadata blocks delays WAFL consistency point 651926 Inefficient pre-fetching of metadata blocks delays WAFL consistency point 651929 Inefficient pre-fetching of metadata blocks delays WAFL consistency point 655724 Inefficient pre-fetching of metadata blocks delays WAFL consistency point 657018 Inefficient pre-fetching of metadata blocks delays WAFL consistency point 650554 Inefficient pre-fetching of metadata blocks delays WAFL consistency point On Apr 10, 2013, at 6:46 PM, Peta Thames <petathames@gmail.com> wrote: Hi all, Has anyone seen this pattern in a sysstat before? TOASTERa> sysstat -u 1 CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 29% 36942 1 0 264 0 0 0 7s 91% 100% # 100% 29% 36896 1 0 1616 0 0 0 7s 90% 100% # 100% 28% 36658 1 0 32 24 0 0 7s 100% 100% # 100% 28% 36625 1 0 108 0 0 0 7s 97% 100% # 100% 27% 36318 1 0 136 8 0 0 7s 100% 100% # 100% 28% 36391 1 0 364 144 0 0 7s 97% 100% # 100% 25% 30387 1 0 88 0 0 0 7s 98% 100% # 100% 21% 24450 1 0 8 0 0 0 7s 100% 100% # 100% 20% 22042 1 0 24 672 0 0 7s 100% 100% # 100% 18% 19781 1 0 284 8 0 0 7s 69% 100% # 100% 18% 19636 1 0 0 0 0 0 7s 100% 100% # 100% 16% 18066 1 0 24 24 0 0 7s 100% 100% # 100% 15% 16999 1 0 56 0 0 0 7s 56% 100% # 100% 15% 16622 1 0 11040 6496 0 0 7s 46% 100% # 1824% 14% 15947 1 0 24 32 0 0 7s 100% 100% # 100% 14% 15448 0 0 0 0 0 0 7s 100% 100% # 100% 14% 15902 0 0 3304 4532 0 0 7s 88% 100% # 100% 14% 15800 0 0 32 24 0 0 7s 100% 100% # 100% 15% 16651 0 0 0 0 0 0 7s 100% 100% # 100% CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 14% 15778 0 0 1520 2932 0 0 7s 100% 100% # 100% 13% 14311 0 1 28 24 0 0 7s 100% 100% # 100% 12% 12153 0 0 0 0 0 0 7s 100% 100% # 100% 12% 11993 0 0 8 0 0 0 7s 100% 100% # 100% 12% 12134 0 0 7368 5996 0 0 7s 18% 100% # 100% 13% 11981 0 0 20356 14956 <tel:20356%C2%A0%2014956> 0 0 7s 100% 100% # 100% 12% 11978 0 0 0 0 0 0 7s 100% 100% # 100% 11% 11617 0 0 24 24 0 0 7s 100% 100% # 100% 12% 11708 0 0 8 0 0 0 7s 100% 100% # 100% 11% 11630 0 0 11936 17872 0 0 7s 100% 100% # 100% 11% 11639 0 0 24 32 0 0 7s 100% 100% # 100% 12% 11791 0 0 12 0 0 0 7s 87% 100% # 100% 22% 11874 0 0 64452 92188 0 0 7s 100% 100% # 100% 12% 12015 0 0 32 24 0 0 7s 100% 100% # 100% 12% 11954 0 0 556 0 0 0 7s 85% 100% # 100% 12% 11899 0 0 0 8 0 0 7s 100% 100% # 100% 13% 13408 0 0 23816 35600 0 0 7s 91% 100% # 100% 13% 14610 0 0 0 0 0 0 7s 100% 100% # 100% 14% 15355 0 0 1248 0 0 0 7s 72% 100% # 100% 14% 15584 0 0 18652 15224 0 0 7s 98% 100% # 100% CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 14% 15397 0 0 0 8 0 0 7s 100% 100% # 100% 15% 15859 0 1 0 0 0 0 7s 100% 100% # 100% 14% 15557 0 0 4944 4568 0 0 7s 100% 100% # 100% 14% 15559 0 0 8 0 0 0 7s 100% 100% # 100% 14% 15603 0 0 0 0 0 0 7s 100% 100% # 100% 14% 15507 0 0 3652 4724 0 0 7s 100% 100% # 100% 14% 15593 0 0 20 0 0 0 7s 75% 100% # 100% 12% 12571 0 0 268 0 0 0 7s 100% 100% # 100% 10% 9962 0 0 92 2072 0 0 7s 96% 100% # 100% 9% 8521 0 0 0 0 0 0 7s 100% 100% # 100% 8% 7680 0 0 4 8 0 0 7s 100% 100% # 100% 8% 7459 0 0 24 24 0 0 7s 100% 100% # 100% 8% 7050 0 0 2580 1600 0 0 7s 100% 100% # 100% 8% 6961 0 0 8 0 0 0 7s 100% 100% # 100% 8% 6954 0 0 24 24 0 0 7s 100% 100% # 100% 7% 6936 0 0 584 2400 0 0 7s 100% 100% # 100% 8% 7043 0 0 248 0 0 0 7s 0% 100% # 100% 7% 6946 0 0 24 24 0 0 7s 100% 100% # 100% 8% 7320 0 0 1860 7252 0 0 7s 100% 100% # 100% 55% 19882 2 2 89920 29733 0 0 1 86% 99% B 45% CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 88% 17409 31 187 267319 196942 0 0 6s 83% 100% : 89% 87% 16064 11 249 304646 192912 0 0 1 78% 100% : 85% 85% 6274 70 86 322234 233245 0 0 1 81% 100% # 87% 85% 12486 18 340 236474 102383 0 0 3s 82% 100% B 73% 98% 13842 3 29 247937 255188 0 0 2s 85% 100% : 80% 99% 17878 1 1 348355 223367 0 0 3s 84% 100% # 86% 99% 4117 1 1 359628 269212 0 0 3s 78% 100% # 92% 83% 12622 10 31 252272 170064 0 0 2s 83% 100% B 73% 94% 19321 4 10 248864 249292 0 0 2s 88% 100% : 96% 99% 15893 0 0 318375 261482 0 0 2s 85% 100% : 89% 88% 13326 34 76 284960 194032 0 0 2s 82% 100% B 90% 86% 11106 73 104 267760 231524 0 0 2s 81% 100% : 100% 72% 11160 0 0 206927 128869 0 0 3s 75% 100% : 100% 63% 11304 0 0 220736 25276 0 0 3s 67% 100% : 100% 68% 11216 0 0 193324 15184 0 0 3s 66% 100% : 100% 65% 11343 0 0 175120 12588 0 0 3s 60% 100% : 100% 52% 7384 0 0 205780 104564 0 0 3s 61% 100% # 100% 83% 7205 0 0 261108 178256 0 0 3s 74% 100% B 90% 75% 11256 0 0 225164 96332 0 0 3s 76% 100% : 93% 99% 6279 0 0 296056 251191 0 0 2s 87% 100% : 90% CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk ops/s in out read write read write age hit time ty util 65% 8616 0 0 220068 165749 0 0 2s 76% 73% : 71% 89% 5443 0 1 338684 239932 0 0 3s 80% 94% F 87% 95% 6726 0 0 379242 240750 0 0 3s 79% 100% : 91% 83% 8721 1 0 342940 113394 0 0 2s 82% 61% : 89% 77% 11971 1 0 314356 24 0 0 2s 85% 0% - 92% ... and back to normal. Looking in Perf Advisor, it was during an extremely short, sharp latency spike that impacted every volume on every aggregate at exact the same time. It was the fourth of four spikes that had happened a few minutes apart during the previous 20 mins. It looks like the NVRAM flush got stuck somehow but I have no idea what would cause that. We're running ONTAP 8.0.2 7-mode on a very busy FAS3170. Thanks, Peta _______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters _______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters -- --- Gustatus Similis Pullus Classification: UNCLASSIFIED Caveats: NONE _______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
--
Gustatus Similis Pullus
Classification: UNCLASSIFIED Caveats: NONE
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
--
Gustatus Similis Pullus
Classification: UNCLASSIFIED Caveats: NONE
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support