Hey Tony,

thanks for your efforts, greatly appreciated.

We’re using a switchless cluster here with 2x10G links which are not saturated as far as I can see, but how could I tell?

Sysstat on the destination controller clearly shows 100% disk utilization when I’m running the vol move, once I stop it, disk utilization goes down to 20% again, so it’s reproducible – the longest I tried to have it running was about 35 minutes since I was in the hope that it would settle down and would stop firing so hard on the destination aggregate, but after this period of time I had to stop it since the lag experienced was too much and I was getting latency issues on some clients.

I did not specify the –foreground flag while running the vol move, instead I was monitoring the progress using vol move show periodically and there I could see that it was trying to replicate within a range of 150-200MBps which was clearly above the set QoS policy – but I’ve read somewhere else during the last days that the QoS policies do only apply to client initiated workloads, not to system initiated workloads. That said, I was checking on the system defined QoS policy groups and found some interesting policies there, but am not sure if I could just create a new system-defined QoS policy group which would apply to the vol move (read snapmirror) operation here then and I’m not keen enough to modify existing system-defined QoS policy groups J

The sad thing is, that I thought I could take the burden of the storage tier migration off the client by simply moving the volume for him, but if there’s no way to throttle that process, I will have to present new LUNs on the new destination aggregate to the client and ask him to replicate the data on his own; therefore I can limit the available bandwidth with preset QoS policies on the volumes.

Best,

Alexander Griesser

Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: ag@anexia.at

Web: http://www.anexia.at

Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt

Geschäftsführer: Alexander Windbichler

Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601

Von: Tony Bar [mailto:tbar@BERKCOM.com]
Gesendet: Freitag, 25. September 2015 22:16
An: Alexander Griesser <AGriesser@anexia-it.com>
Cc: toasters@teaparty.net
Betreff: Re: AW: Vol Move Throttling in cDOT

Alexander -

I've been in touch with NetApp about this and they're telling me that this shouldn't be happening unless you're using the -foreground flag on the command, and that it should never interfere with a workload that's already running on the destination aggregate as far as disk utilization. What can happen though is if you don't have enough links for the cluster management network that the interfaces can get bogged down. I guess the question would be whether what you're seeing is a network issue or a disk issue? Can I ask how many cluster management connections you're currently using and do you possibly have free ports on the filer and your cluster management switches to add more links if the network links are getting saturated?

I'm a little suspicious of their explanation and like you, I believe there should be a way to set a throttle on vol move. What they're saying however is that the flag to ignore throttling doesn't refer to a user definable setting, but instead an internal mechanism that's supposed to intelligently manage the process and automatically throttle. The use case then for the ignore option is to ignore that internal mechanism and give vol move as much IO as possible.

It's interesting that you brought this up though, and if NetApp takes notice perhaps they will consider implementing throttling as a user land variable so tuning becomes possible.

Anthony Bar
tbar@berkcom.com
Berkeley Communications

www.berkcom.com

On Sep 24, 2015, at 12:37 AM, Alexander Griesser <AGriesser@anexia-it.com> wrote:

Well, that obviously doesn’t work then – the vol move causes the disk utilization on the destination aggregate to go up to 100% constantly and OCPM is sending me tons of noticiations of slow data processing nodes due to replication, etc. – so I definitely need to be able to throttle this process but still haven’t found a way to do that :-/

Best,

Alexander Griesser

Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: ag@anexia.at

Web: http://www.anexia.at

Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt

Geschäftsführer: Alexander Windbichler

Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601

Von: andrei.borzenkov@ts.fujitsu.com [mailto:andrei.borzenkov@ts.fujitsu.com]
Gesendet: Donnerstag, 24. September 2015 09:02
An: Alexander Griesser <AGriesser@anexia-it.com>; Tony Bar <tbar@BERKCOM.com>
Cc: toasters@teaparty.net
Betreff: RE: Vol Move Throttling in cDOT

Reading documentation, it looks like -bypass-throttling applies to internal throttling performed by Data ONTAP:

A volume move operation might take more time than expected because moves are designed to

occur nondisruptively in the background in a manner that preserves client access and overall

system performance.

For example, Data ONTAP throttles the resources available to the volume move operation.

IOW volume move is expected to not impact normal client activity. Do you observe any slowdown during volume move?

---

With best regards

Andrei Borzenkov

Senior system engineer

FTS WEMEAI RUC RU SC TMS FOS

<image001.gif>

FUJITSU

Zemlyanoy Val Street, 9, 105 064 Moscow, Russian Federation

Tel.: +7 495 730 62 20 ( reception)

Mob.: +7 916 678 7208

Fax: +7 495 730 62 14

E-mail: Andrei.Borzenkov@ts.fujitsu.com

Web: ru.fujitsu.com

Company details: ts.fujitsu.com/imprint

This communication contains information that is confidential, proprietary in nature and/or privileged. It is for the exclusive use of the intended recipient(s). If you are not the intended recipient(s) or the person responsible for delivering it to the intended recipient(s), please note that any form of dissemination, distribution or copying of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please immediately notify the sender and delete the original communication. Thank you for your cooperation.

Please be advised that neither Fujitsu, its affiliates, its employees or agents accept liability for any errors, omissions or damages caused by delays of receipt or by any virus infection in this message or its attachments, or which may otherwise arise as a result of this e-mail transmission.

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Alexander Griesser
Sent: Wednesday, September 23, 2015 3:13 PM
To: Tony Bar
Cc: toasters@teaparty.net
Subject: AW: Vol Move Throttling in cDOT

Tony,

thanks, QoS Policy on the volume does not seem to work – I just set a QoS Policy down to 10MBps but the transfer was still running with 200MBps+, so I’ve aborted it again.

I found a few websites talking about `option replication.throttle.enable` et al, but that doesn’t seem to apply to cDOT systems anymore.

The vol move uses Snapmirror in the background, AFAIK, so I was also checking snapmirror policies (only Default Policies available if you haven’t done anything with Snapmirror) and in the default policies, the only thing I can configure there is the transfer priority:

*> snapmirror policy show -instance

                   Vserver: Cluster

    SnapMirror Policy Name: DPDefault

              Policy Owner: cluster-admin

               Tries Limit: 8

         Transfer Priority: normal

Ignore accesstime Enabled: false

   Transfer Restartability: always

                   Comment: Default policy for DP relationship.

     Total Number of Rules: 0

                Total Keep: 0

                     Rules: Snapmirror-label                 Keep Preserve Warn

                            -------------------------------- ---- -------- ----

                            -                                   - -           -

                   Vserver: Cluster

    SnapMirror Policy Name: XDPDefault

              Policy Owner: cluster-admin

               Tries Limit: 8

         Transfer Priority: normal

Ignore accesstime Enabled: false

   Transfer Restartability: always

                   Comment: Default policy for XDP relationship with daily and weekly rules.

     Total Number of Rules: 2

                Total Keep: 59

                     Rules: Snapmirror-label                 Keep Preserve Warn

                            -------------------------------- ---- -------- ----

                            daily                               7 false       0

                            weekly                             52 false       0

2 entries were displayed.

Doesn’t seem to be the right place either…

Alexander Griesser

Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: ag@anexia.at

Web: http://www.anexia.at

Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt

Geschäftsführer: Alexander Windbichler

Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601

Von: Tony Bar [mailto:tbar@BERKCOM.com]
Gesendet: Mittwoch, 23. September 2015 13:43
An: Alexander Griesser <AGriesser@anexia-it.com>
Cc: toasters@teaparty.net
Betreff: Re: Vol Move Throttling in cDOT

Alexander -

I believe this is accomplished with the volume QoS policy tool, which is why you see the option to bypass throttling but not set an option on the operation itself.

I would have to test this in my lab environment to be sure 100% but I am pretty sure that is where you should be looking next.

Regards,
Anthony Bar
tbar@berkcom.com
Berkeley Communications

www.berkcom.com

On Sep 23, 2015, at 4:26 AM, Alexander Griesser <AGriesser@anexia-it.com> wrote:

Hey there,

I did some research already but wasn’t able to find what I was looking for, so I’m trying a quick shot here:

Does anyone know if it’s actually possible to throttle a vol move on cDOT?

vol move start does not really list an option for that and once the move is running, there’s also no vol move modify or anything like that.

*> vol move start ?

(volume move start)

    -vserver <vserver name>                                           Vserver Name

   [-volume] <volume name>                                            Volume Name

   [-destination-aggregate] <aggregate name>                          Destination Aggregate

[[-cutover-window] {30..300}]                                       Cutover time window in seconds (default: 45)

[ -cutover-attempts {1..25} ]                                       Number of Cutover attempts (default: 3)

[ -cutover-action {abort_on_failure|defer_on_failure|force|wait} ] Action for Cutover (default: defer_on_failure)

[ -perform-validation-only [true] ]                                 Performs validation checks only (default: false)

[ -foreground {true|false} ]                                        Foreground Process

[ -bypass-throttling {true|false} ]                                 *Bypass Replication Engine Throttling

[ -skip-delta-calculation {true|false} ]                            *Skip the Delta Calculation

I’m currently migrating quite some big volumes from SAS to SATA across heads and the SATA aggregate is of course experiencing some lag now, so I’d love to throttle that a bit if possible.

Any idea?

Would a QoS policy on the souce volume help here or does NetApp internal stuff (like a vol move) override QoS quotas?

Best,

Alexander Griesser

Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: ag@anexia.at

Web: http://www.anexia.at

Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt

Geschäftsführer: Alexander Windbichler

Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters