Ok so this is a minimal deployment: just one (1) HA-pair FAS8300. This "archive" storage, is it for pure compliance reasons? (You mention writing it out to tape even...)
I thought that, to speed up the NL-SAS aggregates a little, we would also use some SSDs as flash-pool, like we have now on our old dev NetApp cluster.
Sure, it is definitely advisable to have Flash as a cache in such a system. But you won't need much at all, that's my tip. Run the simulations with AWA (Automated Workload Analyzer) over at least 4 weeks (like Johan Gislén wrote) and see for yourself.
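For reference, AWA runs from the nodeshell against an existing HDD aggregate. A rough sketch from memory (node and aggregate names are made up, so double-check the exact syntax for your ONTAP release):

  # start the analyzer on the NL-SAS aggregate (hypothetical node/aggr names)
  system node run -node fas8300-01 wafl awa start aggr_nlsas_01
  # let it run for the whole observation period, checking the report as you go
  system node run -node fas8300-01 wafl awa print
  # stop it once you have enough data
  system node run -node fas8300-01 wafl awa stop aggr_nlsas_01

The report gives you an estimate of how much Flash Pool cache would actually be useful for that workload, which is exactly the number you want before buying SSDs.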
Now, if you know your application and use cases well, you will know whether the data written to all these NL-SAS drives will be read back now and then, and whether those reads will be random (cache will help, and be necessary) or sequential. If the latter, the SSDs won't really do much for you; the data will be sucked in from spinning disk in that scenario and 7.2K rpm drives are VERY VERY slow... you risk spending $$$ on SSD for almost no benefit if your scenario is like that.
Again: if you know your workload, and it is indeed very light -- it's "archive" type data in the true sense and it pretty much just sits there on NL-SAS once it's been written -- you could just as well skip FlashPool and use an adequate amount of FlashCache. FlashCache won't cache writes of course, but for archive type use cases that's unlikely to matter.
It is possible to tune FlashPool a bit (there are quite a few parameters you can change), but it's hard to make a real difference IRL (I tried it once with our heavy aggregated NFS workload and just gave up, not worth the effort). If you know you have a large portion of random overwrites in your workload (>> 10%), then FlashPool will "win" over FlashCache. For reads they're pretty much the same and I can't believe you'd ever notice any difference.
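For completeness, turning an NL-SAS aggregate into a FlashPool is only a handful of commands. A hedged sketch with made-up names (SSD storage pools are one way to share the SSDs between both nodes' aggregates; the per-volume caching policy is where the limited tuning lives, and it may need advanced privilege depending on release):

  # carve a few SSDs into a shared storage pool (hypothetical names)
  storage pool create -storage-pool sp_fp01 -disk-count 4
  # make the NL-SAS aggregate hybrid and give it allocation units from the pool
  storage aggregate modify -aggregate aggr_nlsas_01 -hybrid-enabled true
  storage aggregate add-disks -aggregate aggr_nlsas_01 -storage-pool sp_fp01 -allocation-units 1
  # per-volume caching policy is about the only knob worth touching
  volume modify -vserver svm_archive -volume vol_archive01 -caching-policy auto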
This is too little info for me to understand:
The other SSDs will be used for backup and DR purposes.
Do you perhaps mean that the 100% SSD Aggrs you plan to put in this FAS8300 node pair are for DR purposes? DR of what? You perhaps plan to sync-mirror data from your AFF based production cluster to this FAS8300? That's fine if the workload is small enough for the FAS8300 to handle it in your DR situation, but if I were you I would think long and hard about how to recover from such a potential state where [part of] your production workload goes to the FAS8300's SSD Aggr... I.e. how do you get back to your normal production state once this has happened? If you cannot do that in any way that makes sense, the cure might be worse than the disease, so to speak.
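If what you mean is plain SnapMirror replication from the AFF cluster, the basic shape is something like the sketch below (async XDP shown; cluster/SVM/volume names are made up and the clusters must already be peered). The point stands either way: getting data over there is the easy part, failing back is what you have to design for.

  # on the FAS8300 DR cluster: a DP destination volume on the SSD aggregate (hypothetical names)
  volume create -vserver svm_dr -volume vol_app_dr -aggregate aggr_ssd_dr -size 2TB -type DP
  # the mirror relationship, created from the destination side
  snapmirror create -source-path svm_prod:vol_app -destination-path svm_dr:vol_app_dr -type XDP -policy MirrorAllSnapshots -schedule hourly
  snapmirror initialize -destination-path svm_dr:vol_app_dr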
I would also think through very thoroughly what your definition of "disaster" is (in your specific situation) and exactly which ones the DR you're referring to will protect against. It's always a complex optimisation problem.
We are thinking about moving some data to the 8300 cluster in the long term,
So this data would be "low pressure" production data on your AFF cluster now, I take it. It's not very intense but still not "archive" type data. Putting such data on very slow spinning is risky, in that you can easily end up with performance issues. And the FlashPool might not help as much as you would wish, even if you have lots of it. This is the kind of scenario that will give you headaches in the long run; moving data back and forth between different clusters isn't even non-disruptive. How can you be sure that data you've moved to this slow FAS8300 doesn't "pick up speed" again later and the application/data owners start to complain? How can you know that you have adequate space at that point in time to migrate it back to your AFF based production cluster? If you know this, then no prob!
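One hedged way to keep an eye on exactly that, i.e. whether a volume you parked on the FAS8300 is picking up speed again and whether the AFF still has room to take it back (names are made up, and double-check the field names on your release):

  # watch IOPS/latency on the volume you moved to the FAS8300
  qos statistics volume performance show -vserver svm_archive -volume vol_parked
  # check whether the AFF aggregates still have headroom to take it back
  storage aggregate show -fields availsize, percent-used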
The very good thing about AFF (Cx00 and Ax00) is that you don't have to care. You can throw anything and everything at it and all the workloads will just be absorbed w.r.t. the back end -- it's a gift from Flash Land. The limiter will be the CPU utilisation in the node itself. For this type of scenario I strongly recommend you leverage FabricPool (you need an S3 back end). The AFF Ax00 or Cx00 will have all Storage Efficiency running all the time, and this is preserved when the cold blocks are sent out into S3 buckets. You can't run the full Storage Efficiency chain on your FAS8300 with slow NL-SAS and FlashPool. (It's supported AFAIK, but it will inevitably bite you.)
I didn't look deeper into it [FabricPool], as we are not using S3 at all at the moment. That was some years ago. As we have always had only one all-flash cluster, I hadn't thought about it.
Well, if you happen to have NetApp gear (older FAS) incl. lots of NL-SAS shelves, then you should definitely start running FabricPool on this one AFF based production cluster you have. You still have to have some sort of backup (SnapMirror/-Vault) just as you have now (I assume). If you already have lots of NL-SAS shelves but lack controllers, you can buy some for a small sum of money. FP will automagically move all the "cold" WAFL blocks out to S3 based storage, and ONTAP S3 is *fast*. No problem there ever; the (only) challenge for you is to make sure the network connection between the two clusters is rock solid.
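Roughly, the setup looks like the sketch below: an ONTAP S3 bucket on the FAS side becomes the capacity tier for the AFF aggregates. All names/IPs are made up and parameter names vary a bit between ONTAP releases, so treat it as a sketch only:

  # on the FAS8300 (ONTAP S3 side): S3 server, user and bucket (hypothetical names)
  vserver object-store-server create -vserver svm_s3 -object-store-server s3.fas.example.com
  vserver object-store-server user create -vserver svm_s3 -user fabricpool_user
  vserver object-store-server bucket create -vserver svm_s3 -bucket fp-capacity-tier

  # on the AFF production cluster: point FabricPool at that bucket and attach it
  storage aggregate object-store config create -object-store-name fas_capacity -provider-type ONTAP_S3 -server s3.fas.example.com -container-name fp-capacity-tier -access-key <key> -secret-password <secret>
  storage aggregate object-store attach -aggregate aggr_aff_01 -object-store-name fas_capacity
  # then decide per volume how aggressively cold blocks get tiered
  volume modify -vserver svm_prod -volume vol_app -tiering-policy auto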
Should FabricPool not also work on a 2-node cluster? So instead of using some SSDs for FlashPool, we could create one aggregate on SSD and one on NL-SAS, and use the NL-SAS one for S3 storage and then for local FabricPool?
Yes, this way of doing things (FabricPool internally inside the same cluster) should work. Not sure if you can do it within the same *node* though; it may be that the S3 bucket (the FabricPool back end) has to live on a different node than the aggregate that tiers to it (the S3 client).
Please, anyone, correct me if I'm wrong about this.
I agree that if this type of FP setup you describe is supported with a 2-node FAS8300, it's not a bad idea at all.
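If it does turn out to be supported on your config, the commands are essentially the same as in the sketch above, just kept inside the one FAS8300 cluster. Again a hedged sketch with made-up names (and mind the possible bucket-on-the-other-node constraint mentioned above):

  # S3 bucket on an NL-SAS backed SVM inside the same FAS8300 cluster (hypothetical names)
  vserver object-store-server bucket create -vserver svm_s3_local -bucket local-capacity-tier
  # attach it as the capacity tier of the SSD aggregate
  storage aggregate object-store config create -object-store-name local_capacity -provider-type ONTAP_S3 -server <data LIF of svm_s3_local> -container-name local-capacity-tier -access-key <key> -secret-password <secret>
  storage aggregate object-store attach -aggregate aggr_ssd_01 -object-store-name local_capacity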
/M
-------- Original Message --------
Subject: Re: Question about flash pool maximum SSD size and local tiering
Date: Tue, 17 Oct 2023 14:12:28 +0000 (UTC)
From: Florian Schmid fschmid@ubimet.com
To: Michael Bergman michael.bergman@norsborg.net
CC: Toasters toasters@teaparty.net
Hi Michael,
wow, thank you very much for taking the time to write this very detailed explanation!
It will be one 2-node 8300 cluster, switchless. The cluster will mainly be used for long-term archive storage until the data goes to tape, and for tape restores. For this, we want to use a large number of NL-SAS drives.
I thought that, to speed up the NL-SAS aggregates a little, we would also use some SSDs as flash-pool, like we have now on our old dev NetApp cluster.
The other SSDs will be used for backup and DR purposes. We have a full production all-flash cluster for our normal workloads.
We are thinking about moving some data to the 8300 cluster in the long term, because not all of the volumes we now have on SSD need to be on flash, and they might consume too much "expensive" space there.
I will also take a deeper look at FabricPool. I had a look at it in the past, but as soon as I read "S3 storage" I didn't look deeper into it, as we are not using S3 at all at the moment. That was some years ago. As we have always had only one all-flash cluster, I hadn't thought about it.
Should FabricPool not also work on a 2-node cluster? So instead of using some SSDs for FlashPool, we could create one aggregate on SSD and one on NL-SAS, and use the NL-SAS one for S3 storage and then for local FabricPool?
Best regards, Florian