Hello,
We have a V3170 cluster with a 512GB Flash Cache module installed in each controller. Each module is currently configured to cache normal data (the default setting):
  flexscale.enable             on
  flexscale.lopri_blocks       off
  flexscale.normal_data_blocks on
We have a vfiler running on one controller that does very little normal I/O but generates a very heavy metadata load due to poor application design. This vfiler has poor performance, and its function is critical. We are thinking about switching to metadata-only caching mode (flexscale.lopri_blocks off, flexscale.normal_data_blocks off; see the sketch at the end of this message) to improve its performance, but we have a couple of questions:
The Flash Cache best practices guide has the following verbiage about enabling metadata-only mode:
"Because of the much larger size of Flash Cache, this mode is more applicable to PAM I, the original 16GB DRAM–based Performance Acceleration Module, than Flash Cache."
Does that mean this setting doesn't apply to (isn't recommended or supported for) Flash Cache but does apply to PAM I? Is using metadata-only mode a bad idea with a large Flash Cache module? If so, why?
The best practices guide also recommends a symmetrical number and size of modules between the controllers in a cluster, but it doesn't say anything about symmetrical cache mode settings. Is it OK to have one controller's Flash Cache set to cache normal data, but the other's set to metadata only? During a failover the cache of the failed controller is essentially lost (it doesn't follow to the surviving controller), so it doesn't seem like it would matter as long as the cluster didn't stay in the failover state for a long period of time. What are your thoughts on this?
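For reference, I assume the switch itself would just be the usual options commands run on the controller that hosts this vfiler (please correct me if there is more to it than this):

  options flexscale.enable on
  options flexscale.normal_data_blocks off
  options flexscale.lopri_blocks off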
Thanks in advance,
-Robert
IMHO, that means that in the much smaller cards, going ONE way or the other was kinda the way to go, but with much larger cards, there should be plenty of room for both.
That'd be a LOT of metadata (disk) blocks to cache on top of main system hashed (processed) data.
Agreed. On the 512GB modules we didn't see any decrease in metadata hits when enabling normal data blocks.
I don't know if this indicated a preference for metadata over normal blocks, or was just a natural function of cache usage.
Efficacy of lopri remains very workload dependent, but I haven't seen a case where disabling normal was beneficial with the larger caches.
If you have different caching requirements for different volumes, you are better off using FlexShare than changing the caching mode globally. In that case the settings are per volume and will also remain in effect during takeover. See TR-3832 for a detailed description.
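As a rough sketch (untested, and the volume names here are just placeholders; pick the levels and cache policies that suit your workloads), the per-volume settings look like this:

  priority on
  priority set volume metadata_heavy_vol level=high cache=keep
  priority set volume bulk_data_vol level=low cache=reuse

If I recall correctly, the cache=keep/reuse hint is what steers how aggressively a volume's blocks are retained in the cache, which is the per-volume control TR-3832 describes; priority show volume -v will show what actually took effect.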
Has anyone used FlexShare and had extremely positive results? It seems that the improvement would be minor, and that misconfiguration could lead to no improvement for the intended volumes and degradation for others. From my reading, it seems that you need to assign priorities to each volume; otherwise the volumes in the default queue will experience degradation.
Would anyone like to share details of how they implemented FlexShare and the impact it had on the performance issue they were working to resolve?
I would also like to know if anyone has tried to schedule FlexShare settings using rsh or the API, e.g. we want to give higher priority to certain volumes from 6pm to 11pm nightly.
Thanks Jack
Hi Jack,
I have done some small testing with it, but I shied away from implementing it when a NetApp instructor said that to get the best results you have to apply it to ALL the volumes on the filer hosting the particular volume(s) you are looking to improve performance on. I'm not sure whether that's true, but it was his suggestion. I figured that if it were the case, I would be spending a lot of time tweaking every volume on a filer just to possibly squeeze a bit more out of one or two.
--- Scott Eno (First time poster, long time reader)
I am interested in this too. From the docs it really looks like the prioritization is mostly handled by queuing on the CPU. Is the overhead associated with managing the priorities worth paying if you have some volumes with very latency-sensitive I/O?
Jeremy Page | Senior Technical Architect | Gilbarco Veeder-Root, A Danaher Company
Office: 336-547-5399 | Cell: 336-601-7274 | 24x7 Emergency: 336-430-8151
We've had some positive results, but I wouldn't call them extreme. The FlexShare scheduler only kicks in when the system runs short of resources (CPU, etc.), so most of the time it's not actively doing anything. We have had good results on systems that we use for FlexCache. Without FlexShare enabled, the FlexCache volumes could end up slowing the system down, especially if the remote filer was slow to respond. We set all of our FlexCache volumes for caching remote content to a very low priority. Other than that, it helps a little, but we still find that one volume (or qtree) can just about monopolize the filer, especially if it's a read workload that resides in cache.
If you do enable FlexShare, it is important that you set priorities for all volumes. We have had cases where volumes were missed, and it leads to strange performance behavior. We actually run a script to check that all of our filers have priority settings on all of their volumes.
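The gist of it is something along these lines (a rough sketch only, not our actual script; it assumes rsh access from an admin host, and the "vol status" parsing is approximate, so adjust it for your ONTAP version's output):

  #!/bin/sh
  # Sketch: flag volumes on a 7-mode filer that don't appear in the
  # FlexShare priority list. Takes the filer name as its only argument.
  FILER=$1
  # Volume names are in column 1 of "vol status"; keep only real volume
  # rows by checking the State column.
  vols=$(rsh "$FILER" vol status | awk '$2 == "online" || $2 == "offline" || $2 == "restricted" {print $1}')
  prio=$(rsh "$FILER" priority show volume)
  for v in $vols; do
      echo "$prio" | grep -qw "$v" || echo "$FILER: no FlexShare priority set on $v"
  done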
I had thought of using cron to change priorities at night so that NDMP backups would get more resources, but we ended up doing away with NDMP backups from the primary filers by using SnapVault, so it became a non-issue.
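For what it's worth, the approach I had in mind was just a pair of cron entries on an admin host (a sketch only; the filer and volume names are placeholders, and it assumes rsh access is set up on the filer):

  # Raise the volume's priority at 6pm and drop it back at 11pm, nightly
  0 18 * * * rsh filer1 priority set volume app_vol level=high
  0 23 * * * rsh filer1 priority set volume app_vol level=medium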
--rdp
Rich Payne AMD
I've used FlexShare in two ways, with success. First, I use it the way Rich does: to reduce the priority of volumes on a high-latency back end (in my case, SATA drives) that might otherwise create a drag on the rest of the filer. At the time I had a V3140 cluster with some back-end SATA. When the write load to the volumes on those drives was high, I was getting a lot of cache overload with back-to-back CPs, high latency on other volumes, etc. Setting those volumes to low priority eliminated a lot of those issues.
I also used FlexShare to get lower latency on a volume that was fairly unused. Under load, the NetApp won't give a lot of attention to low-use volumes, and I/O to such a volume will be delayed. That may not be what you want (in my case this was an Oracle OCFS volume, and it was causing the Oracle cluster problems by not having its writes serviced quickly); giving that volume a higher priority addressed it.
Fred