As you wrote, FlashCache will not be used by data residing on an SSD or FlashPool aggregate.
In my experience, the difference between FlashCache and FlashPool is almost never described in terms of performance. It can happen, but it usually seems to come up only with really obscure workloads, such as a system that is being absolutely crushed with random IO, where the lower overall overhead of FlashCache helps a bit. It's rare, though.
Here are my main thoughts:
1) FlashPool will never go cold due to a power failure or similar. That's my #1 reason for preferring it to FlashCache.
2) FlashPool can capture random overwrites, which can be really, really helpful with certain database workloads that have a lot of such IO.
3) FlashCache can be shared among multiple aggregates according to their needs, whereas FlashPool is fixed to one. Sometimes that helps address unknown or dynamic caching needs.
4) The fact that some IO was cacheable doesn't mean anyone cares that it was cacheable. AWA does a pretty good job, but it's not definitive. For example, let's say you have a workload that could be 2X faster with 1TB of FlashPool SSD, but it runs at midnight and nobody cares how fast it runs. Why waste the SSD?
I'd probably just take it slow. Add a few SSDs to each aggregate and dole them out slowly. Reevaluate every so often. Remember - once an SSD is added to an aggregate, you can't get rid of it.
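To put a rough number on the "take it slow" advice, here is a minimal Python sketch that reads the "Cache Hit Rate vs. Cache Size" tables from the AWA output in the quoted message below. The figures are copied from that output. Treating each percentage column as a fraction of the referenced cache size is my assumption about how to read the table, and the names `AWA` and `hit_rate_steps` are made up for illustration.

```python
# Figures copied from the AWA "Cache Hit Rate vs. Cache Size" tables in the
# quoted message. Reading each column as a fraction of the referenced cache
# size is an assumption, not something the AWA output states.
AWA = {
    "aggr_sas_600g_c1n1": {
        "referenced_gib": 714.650,
        "read_hit": {20: 25, 40: 30, 60: 30, 80: 30, 100: 33},
    },
    "aggr_sas_1200g_c1n1": {
        "referenced_gib": 2142.380,
        "read_hit": {20: 34, 40: 38, 60: 38, 80: 38, 100: 41},
    },
}


def hit_rate_steps(referenced_gib, read_hit):
    """Return (cache GiB, read hit %, share of the 100% column) per table column."""
    full = read_hit[100]
    return [
        (referenced_gib * pct / 100, hit, hit / full)
        for pct, hit in sorted(read_hit.items())
    ]


for name, stats in AWA.items():
    print(name)
    for gib, hit, share in hit_rate_steps(stats["referenced_gib"], stats["read_hit"]):
        print(f"  {gib:8.1f} GiB -> {hit:2d}% read hit ({share:.0%} of the full-size hit rate)")
```

On these estimates, the first 20% of the referenced cache size already accounts for roughly 76% and 83% of the projected full-size read hit rate on the two aggregates, which is one more argument for starting with a few SSDs and reevaluating.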
-----Original Message-----
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Ehrenwald, Ian
Sent: Tuesday, November 22, 2016 11:16 PM
To: toasters@teaparty.net
Subject: Flash Cache vs Flash Pool
Hello Toasters
I'm thinking about if/how to implement Flash Pool in our new cDOT clusters, and was wondering if anyone could provide a bit of real world guidance for me.
We have two SAS aggregates, aggr_sas_600g_c1n1 with 8x24 600g and
aggr_sas_1200g_c1n1 with 4x24 1.2t.
I've been doing a bunch of reading about Flash Pool vs Flash Cache and am trying to better understand their strengths and weaknesses. Flash Pool accelerates writes as well as reads (Flash Cache is read-only). However, with Flash Pool there seems to be the potential for slower cache access/throughput than with Flash Cache, since the data needs to travel the SAS path, whereas Flash Cache is probably DMA through PCIe. Maybe that's not a concern at all, I don't know. Additionally, it appears that using Flash Pool disables the Flash Cache functionality for the aggregates which are in hybrid mode (makes sense), but then we have expensive add-in cards doing nothing.
Our theoretical Flash Pool would be 2x24 200g, giving us about 5.5t of usable caching space to sprinkle into these aggregates.
I've been running AWA on the cluster against those two SAS aggregates for
~24 hours and have come up with these stats:
### FP AWA Stats ###
Host mrk_c1n1 Memory 61054 MB
ONTAP Version NetApp Release 8.3.2P5: Tue Aug 23 01:27:00 PDT 2016
Basic Information
Aggregate aggr_sas_600g_c1n1
Current-time Tue Nov 22 17:06:53 EST 2016
Start-time Mon Nov 21 12:48:17 EST 2016
Total runtime (sec) 101918
Interval length (sec) 600
Total intervals 157
In-core Intervals 1024
Summary of the past 157 intervals
                                         max
                                ------------
Read Throughput (MB/s):              339.039
Write Throughput (MB/s):             123.536
Cacheable Read (%):                       56
Cacheable Write (%):                      66
Max Projected Cache Size (GiB):      787.463
Summary Cache Hit Rate vs. Cache Size
Referenced Cache Size (GiB): 714.650
Referenced Interval: ID 132 starting at Tue Nov 22 12:33:05 EST 2016
Size             20%    40%    60%    80%   100%
Read Hit (%)      25     30     30     30     33
Write Hit (%)      1      2      2      2      2
The entire results and output of Automated Workload Analyzer (AWA) are estimates. The format, syntax, CLI, results and output of AWA may change in future Data ONTAP releases. AWA reports the projected cache size in capacity. It does not make recommendations regarding the number of data SSDs required. Please follow the guidelines for configuring and deploying Flash Pool; that are provided in tools and collateral documents. These include verifying the platform cache size maximums and minimum number and maximum number of data SSDs.
Basic Information
Aggregate aggr_sas_1200g_c1n1
Current-time Tue Nov 22 17:06:53 EST 2016
Start-time Mon Nov 21 12:40:57 EST 2016
Total runtime (sec) 102357
Interval length (sec) 600
Total intervals 158
In-core Intervals 1024
Summary of the past 158 intervals
                                         max
                                ------------
Read Throughput (MB/s):              914.247
Write Throughput (MB/s):             257.318
Cacheable Read (%):                       41
Cacheable Write (%):                      26
Max Projected Cache Size (GiB):     2412.178
Summary Cache Hit Rate vs. Cache Size
Referenced Cache Size (GiB): 2142.380
Referenced Interval: ID 113 starting at Tue Nov 22 09:04:41 EST 2016
Size             20%    40%    60%    80%   100%
Read Hit (%)      34     38     38     38     41
Write Hit (%)      7      7      7      7      9
### FP AWA Stats End ###
Aggregate aggr_sas_600g_c1n1 has a lot of random overwrites (66%) that could have been cached. The volumes in that aggregate are pretty much exclusively Oracle databases. The other aggregate, aggr_sas_1200g_c1n1, doesn't seem to be hit as hard.
Given those statistics, what would you do if your options were ~5.5t of Flash Pool vs buying another 2t Flash Cache card per node in this HA pair?
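As a quick sanity check on that question, the sketch below compares the two "Max Projected Cache Size" figures from the AWA output above with the ~5.5t of usable Flash Pool capacity. It is only arithmetic on AWA's own estimates. The names `PROJECTED_GIB`, `USABLE_BYTES` and `headroom_gib` are made up for illustration, and because "5.5t" does not say whether it is decimal or binary, both readings are tried.

```python
GIB = 2**30

# Max Projected Cache Size (GiB) per aggregate, copied from the AWA output above.
PROJECTED_GIB = {
    "aggr_sas_600g_c1n1": 787.463,
    "aggr_sas_1200g_c1n1": 2412.178,
}

# Assumption: "5.5t" could mean decimal TB or binary TiB, so try both.
USABLE_BYTES = {
    "5.5 TB (decimal)": 5.5e12,
    "5.5 TiB (binary)": 5.5 * 2**40,
}


def headroom_gib(usable_bytes, projected_gib):
    """Usable SSD capacity minus the summed projected cache size, in GiB."""
    return usable_bytes / GIB - sum(projected_gib.values())


total = sum(PROJECTED_GIB.values())
print(f"Summed max projected cache size: {total:.3f} GiB")
for label, usable in USABLE_BYTES.items():
    print(f"{label}: {usable / GIB:.1f} GiB usable, "
          f"{headroom_gib(usable, PROJECTED_GIB):.1f} GiB headroom")
```

Under either reading, the summed projection (about 3.2 TiB) fits within the proposed SSD capacity. That says nothing about how the SSDs should be split between the two aggregates, and the platform cache size maximums mentioned in the AWA disclaimer still need to be checked.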
I seem to be missing the 'Projected Read Offload' and 'Projected Write Offload' statistics, which would have been very useful. They are mentioned in the Flash Pool documentation at
https://library.netapp.com/ecmdocs/ECMP1368404/html/GUID-2C3EC0DF-FEFE-4871-A161-4A3BAC87DB69.html
Thanks for any insight you all can provide.
--
Ian Ehrenwald
Senior Infrastructure Engineer
Hachette Book Group, Inc.
1.617.263.1948 / ian.ehrenwald@hbgusa.com
This may contain confidential material. If you are not an intended recipient, please notify the sender, delete immediately, and understand that no disclosure or reliance on the information herein is permitted. Hachette Book Group may monitor email to and from our network.
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters