Hello Toasters,
We've unfortunately had to reduce the frequency of our volume snapmirror updates in order to allow our destination aggregate to deswizzle. We would much prefer hourly volume snapmirror updates, but it turns out our source volumes are large enough and/or have enough snapshots that the deswizzle process never completes on the destination aggregate. Our volume snapmirror destination aggregate is a single tray of SATA. Prior to reducing the frequency of snapmirror updates, the SATA aggregate was running at 90-100% utilization 24x7 with little to no IO to the filer from active clients. Needless to say, serving data from that aggregate was VERY SLOW despite the light IO (<300 IOPS) required by the clients sourcing their primary data from it.
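For anyone else chasing the same thing: on 7-mode you can confirm whether the deswizzler is what's keeping the disks busy by listing the WAFL scanners from advanced privilege (the volume name below is just a placeholder):

    filer> priv set advanced
    filer*> wafl scan status dest_vol
    filer*> priv set admin

If a "volume deswizzling" scan is listed for the volume, it's still working; once it no longer appears, that volume has finished. The exact output format varies by ONTAP release, so treat this as a rough sketch rather than gospel.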
We've done what we can to reduce the impact of deswizzling, namely cutting down on snapshots and reducing the volume size. I understand that reducing the volume size doesn't reduce the maxfiles setting, which I believe ultimately impacts the amount of deswizzling necessary on the destination. I'm still digging into other options we can try, but reducing the frequency of snapmirror updates seems to have the most impact so far.
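For reference, the current maxfiles value (and how much of it is in use) can be checked per volume, and advanced mode also has a global scanner speed setting that should, in principle, throttle deswizzling at the cost of it taking even longer. I haven't tried changing the scan speed here, so treat that part as an untested option (volume name is a placeholder):

    filer> maxfiles dest_vol
    filer> priv set advanced
    filer*> wafl scan speed        (with no argument, displays the current scan speed)
    filer*> priv set admin

Note that maxfiles can only ever be raised, never lowered, which is why shrinking the volume doesn't help on that front.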
How does one plan for the IOPS or disk utilization resulting from the deswizzle process? If I recall correctly, during our planning sessions with NetApp, our NetApp SE never touched on the IOPS or number of spindles required to handle deswizzling while serving data from the same aggregate. In fact, I think our aggregates were sized purely based on the amount of IO generated by active clients (not active clients + deswizzle).
Thanks, Phil
My experience is that write performance to a single-tray snapmirror destination with SATA drives (2 raid groups built from 24 drives) will be a problem. You need at least a second tray and another raid group (16/16/15) to get decent performance with VSM or QSM.
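If you want to sanity-check how the destination aggregate's raid groups are laid out before and after adding the second tray, the raid-level view is (aggregate name is a placeholder):

    filer> aggr status -r dest_aggr

which lists each raid group and the disks in it.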
Joel
That's just about the situation we're in right now. We VSM a lot of data to a single SATA aggregate (single tray, single 20-disk raid group) on the remote side. The source volumes range from very small to large (up to 5 TB). There is also a wide range of snapshot counts across volumes; many volumes have 130+ snapshots associated with them.
Glad to hear the second tray helps. You mentioned problematic write performance when using a single tray of SATA - were you also seeing high disk utilization as a result of the deswizzle process?
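In case it's useful to anyone following along: sysstat only reports a single utilization figure, but per-disk utilization can be sampled with statit from advanced privilege, which makes it easy to see whether the whole tray is pegged or only part of it:

    filer> priv set advanced
    filer*> statit -b        (start collecting)
            ... let it run for 30-60 seconds under load ...
    filer*> statit -e        (stop and print, including per-disk ut%)
    filer*> priv set admin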
Yes. sysstat -u showed disk utilization close to 100% most of the time while VSM or QSM was running.
It's much better now. Source and destination volumes were all converted to 64-bit aggregates, and sysstat -u on the destination filer now averages about 70%.
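For anyone reproducing that measurement, something like:

    filer> sysstat -u 5

works well; the 5-second interval is just a habit of mine, not gospel. One caveat worth remembering: the Disk util column reports the busiest disk in the system, not an average, so a single hot raid group can make it read high even when the rest of the spindles are idle.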
Joel