Does it need to be corrected?

Sent from my iPhone

On Jan 17, 2014, at 1:51 PM, Jordan Slingerland <Jordan.Slingerland@independenthealth.com> wrote:

Did you guys ever figure out what was causing these, or correct this issue?

 

I am seeing what sounds like the same thing on a 3240, however...

 

When I enable the counters

 

Wafl_memory_used

Wafl_memory_free

Wafl_bufs_available

Wafl_bufs_available_for_cp

 

I do not see anything that indicates low bufs. 

 

--Jordan

 

 

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Mohler
Sent: Tuesday, December 03, 2013 11:02 AM
To: Alexander Griesser
Cc: toasters@teaparty.net
Subject: Re: AW: Consistency Point Type M

 

Sure.  Had to be it.  



On Dec 3, 2013, at 10:57 AM, Alexander Griesser <ag@anexia.at> wrote:

Well, actually I was seeing high CPU usage and degraded performance, so this type of workload must have triggered a bottleneck somewhere…

 

 

 

Alexander Griesser
System Administrator

ANEXIA Internetdienstleistungs GmbH

Phone: +43-5-0556-320
Fax: +43-5-0556-500

E-Mail: ag@anexia.at
Web: http://www.anexia.at

Headquarters address (Klagenfurt): Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Company register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601

 

From: Jeff Mohler [mailto:speedtoys.racing@gmail.com]
Sent: Tuesday, December 03, 2013 4:12 PM
To: Alexander Griesser
Cc: Maarten Lippmann; toasters@teaparty.net
Subject: Re: Consistency Point Type M

 

Not sure there is anything to 'blame'.  It's just a CP event that fires earlier than one triggered via timer or volume.

If you had M's with B's, yeah... bottleneck.

 

On Mon, Dec 2, 2013 at 11:14 PM, Alexander Griesser <ag@anexia.at> wrote:

Hi all,

 

thanks for your valuable insights and suggestions.

I've read several times now that the 3240s with Flash Cache, or the 3240s themselves, are very low on RAM, but I doubt there's a way to add more RAM to these boxes to mitigate this potential bottleneck, or am I wrong?

I think it would be the cheapest option if lack of RAM is to blame for these M CP types. But then, it's all about how the storage components interact with each other, so I think adding more RAM would possibly not help at all on this hardware, or would cause other problems, right?

 

Actually, this is not a "new workload" on this system; it was a one-time event. I already talked to the customer about it, and he said he was doing something unusual due to lack of time: to save time, he restored multiple databases in parallel for test purposes instead of restoring them sequentially, which is where the big write workload came from.

 

So for now, I think I will not open a case, since this was an unusual workload and should not happen again. If the customer lets it happen again, he's aware that storage performance might degrade during that kind of workload, which he's apparently fine with right now.

 

Thanks,

 

Alexander Griesser


 

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Maarten Lippmann
Sent: Tuesday, December 03, 2013 3:44 AM
To: toasters@teaparty.net
Subject: Re: Consistency Point Type M

 

+1 on not touching minra after 8.0.

At this point it's a deprecated option that no longer does any good and can do harm (not much, but a measurable performance hit).

As a matter of fact, 7MTT nowadays will remove it from volumes migrated to cDOT from 7G for this very reason.

3240s running 8.0+ with PAM cards are very low on RAM available to WAFL, so open a case as suggested; it's the most likely bottleneck in your new workload triggering these Ms.

WAFL can eat into mbuf-reserved RAM if need be, but not vice versa (similar to snapshots being able to use space outside of snap reserve, but not vice versa). So if WAFL needs more RAM than normal, the available mbuf RAM is going to shrink, increasing the likelihood of running out of mbufs.

Maarten

On 12/1/2013 12:57 PM, aprilogi wrote:

Hello All:

 

Hey wait, unless you are running something older than 7.3.x, minra isn't likely to do much for you.

 

NetApp changed and added many new memory algorithms starting in Data ONTAP 7.3. All these algorithms work under the hood and aren't documented, but they mostly make any use of minra obsolete.  By 8.0, if minra does anything, it is likely your imagination.

Minra with 7.3 might help in cases where you have very large random reads, but otherwise it shouldn't help at all.

And yes, minra means minimum read ahead.  Early versions of Data ONTAP would assume that when you needed a block of data, you would soon need other blocks that were likely adjacent, so it would read a large set of blocks into memory.  That changed drastically with Data ONTAP 7.3.

 

For backup jobs, the pollution cache control algorithm should detect the behavior and prevent long reads, which would alleviate the need for minra. I would be very suspicious if minra actually worked.  It is more likely that Alexander's customer thought it worked.

 

Looking at KB 3012539 (kb.netapp.com)

M CP caused by low mbufs; writes data to the disk in order to prevent an out-of-memory buffer situation

f after the M means flushing modified data to disk

n after the M means flushing normal files
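
As a rough illustration of how the letters quoted from the KB map onto the "CP ty" values in the sysstat output further down the thread, here is a minimal Python sketch. The describe_cp helper and both lookup tables are hypothetical names made up for this example. Only M, f and n come from this thread; the meaning given for s and the "-" placeholder for an interval without a CP are assumptions from memory of the sysstat man page and should be checked against your release.

```python
# CP type letter described in this thread (KB 3012539 as quoted above).
CP_TYPES = {
    "M": "low mbufs",
}

# Phase letters. 'f' and 'n' are described in this thread; 's' is an
# assumption from memory of the sysstat man page and should be verified.
CP_PHASES = {
    "f": "flushing modified data to disk",
    "n": "flushing normal files",
    "s": "processing special files (assumption, not from this thread)",
}


def describe_cp(code):
    """Return a readable description for a 'CP ty' value such as 'Mf'."""
    # Assumption: an interval without a CP is shown as '-' or left empty.
    if not code or code == "-":
        return "no CP in this interval"
    cp_type = CP_TYPES.get(code[0], "type not covered by this sketch")
    if len(code) == 1:
        return cp_type
    phase = CP_PHASES.get(code[1], "phase not covered by this sketch")
    return f"{cp_type}; {phase}"


for code in ("M", "Mf", "Mn", "Ms"):
    print(code, "->", describe_cp(code))
```

Anything other than M falls through to a "not covered" message on purpose, so the sketch never guesses at codes nobody in this thread explained.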

 

Since you have a Flash card and since your memory is sized with your system, it is likely that you are simply running into the limitations of this storage system.   Adding to this, you were alerted that your CPU usage was high.  It would take a lot more analysis to know for sure.

 

 

—April

 

 

On Dec 1, 2013, at 11:33 AM, Alexander Griesser <ag@anexia.at> wrote:

 

Hi Sebastian,

 

sorry for my late reply, got distracted.

 

Thanks for the useful information, will look into it and work out some possible changes on the volume settings with the customer, maybe that helps in this regard.

 

Bye,

 

Alexander Griesser


 

From: Sebastian Goetze [mailto:spgoetze@gmail.com]
Sent: Wednesday, November 27, 2013 11:17 AM
To: Alexander Griesser; toasters@teaparty.net
Subject: Re: Consistency Point Type M

 

So I see that this was not the typical workload...

I suggested minra because it solved a similar problem at one of my students' companies (I'm a NetApp instructor):
The volume was used as a backup target for TSM: lots of buffer-related CPs, bad performance.
Since it didn't make any sense to try to 'read ahead', they turned it off on this volume, as NetApp had suggested. That solved the performance problem (and the buffer-related CPs)...

Extents are good for certain databases, 'space_optimized' is a setting that balances space usage and performance...

Since the 3240 doesn't have a lot of memory, FlexShare settings (e.g. cache policies) that give ONTAP hints on how to use its memory most efficiently (and where not to waste it) might also prove helpful.
E.g. minimizing FlashCache for volumes where it's not needed (e.g. Log-Volumes). Hint: you can dynamically set and re-set these policies.

Hope that helps

Sebastian


On 11/27/2013 10:21 AM, Alexander Griesser wrote:

Hi Sebastian,

 

well, in this particular case, the client was doing some heavy database restore operations, so it was mostly write.

I guess "minra" stands for minimal read ahead, right? I didn't try that yet. no_atime_update is set to off and I haven't changed that, but it probably makes sense to set it on volumes containing only LUNs, so I guess at least activating no_atime_update can't hurt in this case, right?

 

Maybe it would help if I knew what the M stands for, to better understand what's going on and what's causing it... Right now I'm just guessing, and changing options on the volumes without knowing the possible impacts can turn out badly :-/

 

Can you explain why you would suggest to turn on minra and change extent to space_optimized?

 

Thanks,

 

Alexander Griesser


 

From: Sebastian Goetze [mailto:spgoetze@gmail.com]
Sent: Wednesday, November 27, 2013 10:15 AM
To: Alexander Griesser; toasters@teaparty.net
Subject: Re: Consistency Point Type M

 

Hi Alexander,

hmmmm. Only iSCSI. Lots of writing, hardly any reading.

Did you try (or evaluate) setting 'minra on' on the volumes that are active?
And 'no_atime_update on'?
Maybe 'extent space_optimized', if applicable?  (random write followed by sequential read)

Just some ideas...

Sebastian



On 11/27/2013 9:38 AM, Alexander Griesser wrote:

Good morning,

 

does anyone know what consistency point type M is and what’s causing it? I was getting them yesterday on one of my 3240s, 8.1.3P1, it has a Flash Cache Card installed if that’s relevant for this problem.

I was alerted by OnCommand Core about high CPU usage on this filer.

 

*> sysstat -x 1

CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s

                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out

94%      0      0      0    1122  282622   5880   24406 398932       0      0     5s    99%   99%  Mf   45%       0      0   1122       0      0  271544      0

93%      0      0      0    1225  286280   5973   21928 366135       0      0     5s    99%   96%  Ms   44%       0      0   1225       0      0  275276      0

87%      0      0      0    1422  316299   6638   24652 243511       0      0     5s    99%   85%  Mn   31%       5      0   1417       0      0  303576      0

93%      0      0      0    1018  228505   4797   27948 376695       0      0     5s    99%   87%  Mf   41%       0      0   1018       0      0  219500      0

94%      0      0      0    1271  280539   5900   24036 267711       0      0     5s    99%   76%  Mn   27%       0      0   1271       0      0  269419      0

91%      0      0      0    1177  271879   5717   28680 312704       0      0     5s    99%   77%  M    38%       0      0   1177       0      0  261226      0

94%      0      0      0    1186  264471   5563   29846 352275       0      0     5s    99%   82%  Ms   38%       0      0   1186       0      0  254111      0

92%      0      0      0    1369  297761   6271   22447 233291       0      0     5s    99%   79%  Mn   30%       5      0   1364       0      0  285898      0

92%      0      0      0    1123  255828   5368   23300 367620       0      0     5s    99%   83%  Mf   40%       0      0   1123       0      0  246068      0

91%      0      0      0    1201  272685   5722   26273 292545       0      0     5s    99%   77%  Mn   37%       0      0   1201       0      0  261885      0

92%      0      0      0    1355  293194   6145   27210 273654       0      0     5s    99%   76%  Mn   33%      83      0   1272       0      0  282260      0

94%      0      0      0    1011  234096   4915   24933 352155       0      0     5s    99%   81%  Mn   37%       0      0   1011       0      0  224053      0

90%      0      0      0    1211  275554   5789   36624 285552       0      0     5s    98%   80%  Mn   33%       5      0   1206       0      0  264635      0

89%      0      0      0    1325  294916   6197   21227 309395       0      0     5s    99%   78%  M    30%       0      0   1325       0      0  283356      0

90%      0      0      0    1134  258937   5429   26731 379776       0      0     5s    98%   84%  Mf   42%       0      0   1134       0      0  248736      0

91%      0      0      0    1271  283024   5960   29953 241203       0      0     5s    99%   65%  Mn   28%       0      0   1271       0      0  272128      0

90%      0      0      0    1357  287545   6030   29769 329507       0      0     5s    99%   84%  Mf   31%       0      0   1357       0      0  275694      0

92%      0      0      0    1265  271400   5695   18026 349410       0      0     9s    99%   84%  Ms   39%       6      0   1259       0      0  260907      0

90%      0      0      0    1228  285091   5995   27944 256361       0      0     9s    99%   72%  Mn   30%       0      0   1228       0      0  273977      0
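
To see at a glance how the CP types are distributed in a capture like the one above, the rows can be tallied with a few lines of Python. This is only a sketch: tally_cp_types, CP_TY_INDEX and EXPECTED_FIELDS are hypothetical names, and the field position is an assumption counted from the sample rows in this mail (23 whitespace-separated fields, "CP ty" being the 15th). Other ONTAP releases or sysstat options may lay the columns out differently.

```python
from collections import Counter

# Assumption: 0-based position of the "CP ty" field once a data row is split
# on whitespace, counted from the sample rows in this mail (23 fields per row).
CP_TY_INDEX = 14
EXPECTED_FIELDS = 23


def tally_cp_types(lines):
    """Count 'CP ty' values in sysstat -x data rows; skip headers and blanks."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Data rows start with a CPU percentage such as "94%"; header rows do not.
        if len(fields) != EXPECTED_FIELDS or not fields[0].endswith("%"):
            continue
        counts[fields[CP_TY_INDEX]] += 1
    return counts


# First header line and first three data rows, copied from the output above.
sample = """\
CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
94%      0      0      0    1122  282622   5880   24406 398932       0      0     5s    99%   99%  Mf   45%       0      0   1122       0      0  271544      0
93%      0      0      0    1225  286280   5973   21928 366135       0      0     5s    99%   96%  Ms   44%       0      0   1225       0      0  275276      0
87%      0      0      0    1422  316299   6638   24652 243511       0      0     5s    99%   85%  Mn   31%       5      0   1417       0      0  303576      0
"""

counts = tally_cp_types(sample.splitlines())
for cp_type, n in counts.most_common():
    print(cp_type, n)
```

Rows with an unexpected number of fields are skipped rather than guessed at, so a changed column layout shows up as missing counts instead of wrong ones.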

 

*> sysstat -M 1

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

  97%   86%   65%   39%  74%  73%  76%  75%  71%     31%       0%      0%     18%  25%    18%    17%    121%( 68%)         12%        0%   0%    41%   5%   7%   8717  50%

  94%   79%   54%   25%  66%  68%  68%  69%  59%     38%       0%      0%     16%  20%    20%    12%    104%( 64%)          2%        0%   0%    37%   5%  10%   8336  67%

  98%   89%   72%   44%  78%  80%  81%  80%  70%     31%       0%      0%     17%  30%    17%    12%    127%( 73%)         15%        0%   0%    51%   5%   6%   8099  67%

  96%   83%   65%   41%  75%  77%  77%  79%  67%     30%       0%      0%     17%  28%    16%    11%    120%( 71%)         14%        0%   0%    46%   4%  14%   7461  49%

  94%   78%   52%   24%  65%  65%  66%  65%  62%     37%       0%      0%     15%  15%    20%    13%    117%( 65%)          0%        0%   0%    27%   6%   8%   9042  33%

  96%   84%   67%   42%  74%  76%  78%  78%  64%     31%       0%      0%     17%  29%    16%     9%    112%( 66%)         13%        0%   0%    58%   4%   6%   7642  83%

  97%   84%   61%   39%  72%  74%  72%  74%  68%     39%       0%      0%     16%  25%    16%    14%    116%( 70%)         13%        0%   0%    40%   4%   6%   7811  34%

  94%   78%   55%   28%  67%  69%  70%  70%  60%     36%       0%      0%     16%  19%    20%    12%    113%( 63%)          2%        0%   0%    33%   5%  12%   9018  51%

  96%   85%   66%   40%  73%  76%  77%  78%  62%     34%       0%      0%     17%  28%    17%     7%    123%( 72%)         14%        0%   0%    44%   5%   6%   7926  78%

  96%   84%   64%   37%  72%  73%  74%  75%  67%     34%       0%      0%     16%  25%    17%    14%    118%( 68%)         14%        0%   0%    41%   5%   6%   7682  69%

  95%   83%   63%   38%  72%  72%  73%  73%  69%     34%       0%      0%     15%  21%    16%    10%    130%( 73%)          8%        0%   0%    41%   5%   6%   7491  87%

  96%   84%   66%   40%  74%  75%  77%  77%  68%     38%       0%      0%     15%  24%    18%    13%    126%( 70%)         11%        0%   0%    38%   5%   7%   8309  61%

  97%   85%   66%   42%  74%  75%  75%  77%  70%     35%       0%      0%     17%  27%    17%    13%    122%( 71%)         10%        0%   0%    46%   5%   6%   7431  63%

  95%   85%   65%   42%  73%  75%  75%  76%  69%     38%       0%      0%     14%  24%    15%     6%    133%( 77%)         13%        0%   0%    40%   4%   5%   6856  96%

  95%   86%   70%   47%  76%  79%  77%  80%  69%     36%       0%      0%     15%  29%    14%     9%    127%( 73%)         16%        0%   0%    49%   4%   5%   6005 100%

  98%   88%   72%   49%  80%  79%  79%  81%  80%     34%       0%      0%     15%  28%    14%    17%    132%( 75%)         16%        0%   0%    50%   4%  11%   5859  82%

  96%   86%   67%   41%  75%  77%  77%  78%  68%     38%       0%      0%     15%  26%    16%     9%    126%( 74%)         14%        0%   0%    43%   5%   6%   6800  65%

  98%   89%   73%   49%  79%  79%  80%  80%  77%     35%       0%      0%     16%  28%    16%    15%    132%( 75%)         15%        0%   0%    50%   4%   6%   7030  54%

  95%   84%   68%   45%  75%  75%  77%  77%  71%     33%       0%      0%     14%  26%    14%    11%    135%( 75%)         15%        0%   0%    42%   4%   6%   6613  62%
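
The per-domain figures above are easier to compare once they are averaged over the capture. The following Python sketch does that under stated assumptions: parse_row, column_average and the COLUMNS list are hypothetical, the column order is taken from the header line in this mail, and the two values printed under WAFL_Ex(Kahu) are split into two made-up names (WAFL_Ex and WAFL_Ex_Kahu). It is not an official parser for sysstat -M output.

```python
import re

# Column names in the order of the header above. WAFL_Ex(Kahu) carries two
# values per row, so it is split into two hypothetical names here.
COLUMNS = [
    "ANY1+", "ANY2+", "ANY3+", "ANY4+", "AVG", "CPU0", "CPU1", "CPU2", "CPU3",
    "Network", "Protocol", "Cluster", "Storage", "Raid", "Target", "Kahuna",
    "WAFL_Ex", "WAFL_Ex_Kahu", "WAFL_XClean", "SM_Exempt", "Cifs", "Exempt",
    "Intr", "Host", "Ops/s", "CP",
]


def parse_row(line):
    """Turn one sysstat -M data row into {column: int}, or None if it does not fit."""
    # Extracting every integer copes with "121%( 68%)" without special cases.
    numbers = [int(n) for n in re.findall(r"\d+", line)]
    if len(numbers) != len(COLUMNS):
        return None
    return dict(zip(COLUMNS, numbers))


def column_average(rows, name):
    """Arithmetic mean of one column over the parsed rows."""
    return sum(row[name] for row in rows) / len(rows)


# Header line and first two data rows, copied from the output above.
sample = """\
ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP
  97%   86%   65%   39%  74%  73%  76%  75%  71%     31%       0%      0%     18%  25%    18%    17%    121%( 68%)         12%        0%   0%    41%   5%   7%   8717  50%
  94%   79%   54%   25%  66%  68%  68%  69%  59%     38%       0%      0%     16%  20%    20%    12%    104%( 64%)          2%        0%   0%    37%   5%  10%   8336  67%
"""

rows = [row for row in map(parse_row, sample.splitlines()) if row]
for name in ("AVG", "WAFL_Ex", "Exempt", "CP"):
    print(name, column_average(rows, name))
```

The header line is dropped automatically because it contains only eight digits, so it never matches the expected 26 values.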

 

The only thing I was able to find so far is that it has something to do with low mbufs, but that didn't really help me.

At the time I took this sysstat, wafl scan status showed some bitmap rearrangements, but they’re always running, sometimes more, sometimes less.

No Snapmirror, Snapvault, Deduplication or anything like this running on this filer – it just has aggregate snapshots turned on, but no volume snapshots.

 

Any idea?

 

Thanks,

 

Alexander Griesser


 






_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters

 

 





--
---
Gustatus Similis Pullus