Hi All,
I have a customer with a fabric MetroCluster. It consists of two FAS8200 systems, each with dual controllers, plus the associated ATTO 7500N bridges and Brocade 6510 switches. These are currently all installed and in production in a single location (all running ONTAP 9.7P4).
They plan to move one node of the MetroCluster to a new DC around 40 kilometres away, i.e. one of the 8200 systems and its associated shelves, bridges and switches will be relocated.
As I understand it, the plan is to keep one MetroCluster node running while the other is being moved. Once the relocated node is up and running, then the data will be re-synced.
I haven't been involved in relocating half of a MetroCluster before and naturally I am wondering what there is to watch out for. For example:
- Is there a documented procedure for doing something like this? (I mean, for example, something similar to what Upgrade Advisor might generate.)
- How best to split / stop / start / reconnect the MetroCluster nodes?
- What's the best way to manage the (re-)distribution of the SVMs (if that is even necessary to do manually)?
- The aggregates of the remaining, running node will need to store more data during the move, right?
  - What's the best way to assess how much space will be required?
  - Should the aggregate % reserves also be increased during the move?
- Anything to stop or to avoid prior to / during the move?
- Any other advice gratefully received :-)
Thanks in advance!
Yours, Waflherder
Hi Waflherder,
Seems straightforward to me:
* Planned switchover, before the move, to the cluster that stays (best to run it with "-simulate" first; see the command sketch below)
* Verify the switchover was successful
* Shut down the switches, controllers, bridges and shelves that go
* Move the HW
* Connect switches, bridges, shelves and controllers
* Switch on shelves, bridges, switches and controllers
* Heal aggregates (might take some time, depending on how long the move took and the rate of change)
* Heal root aggregates
* Switchback
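A minimal sketch of that sequence in the cluster CLI (command names as I know them from current ONTAP; double-check against your 9.7 docs, and note "-simulate" needs advanced privilege):

  # Pre-checks and dry run, on the site that stays:
  metrocluster check run
  metrocluster check show
  set -privilege advanced
  metrocluster switchover -simulate true
  set -privilege admin

  # The real thing, then verify:
  metrocluster switchover
  metrocluster operation show

  # After the moved site is cabled up, powered on and visible again:
  metrocluster heal -phase aggregates
  metrocluster heal -phase root-aggregates
  metrocluster operation show

  # Once both heal phases report successful:
  metrocluster switchback
  metrocluster operation show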
See the MCC guide links below (from the ONTAP documentation) for details, e.g. being nice and sending an AutoSupport message when you start (p. 28ff):
https://docs.netapp.com/ontap-9/topic/com.netapp.doc.onc-sm-help-960/GUID-8B...
https://docs.netapp.com/us-en/ontap-metrocluster/manage/task_perform_switcho...
https://docs.netapp.com/us-en/ontap-metrocluster/pdfs/sidebar/Performing_swi...
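The usual way of "being nice" is a maintenance AutoSupport before you start and an all-clear afterwards, roughly like this (the 8h window is just an example, size it to your expected move time):

  # Before the switchover, on both clusters:
  system node autosupport invoke -node * -type all -message MAINT=8h

  # After the switchback has completed:
  system node autosupport invoke -node * -type all -message MAINT=end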
Other things:
* No relocating of SVMs is necessary; the switchover/switchback does that automatically.
* No *extra* space on the aggregates is necessary, since everything written during the move would have been mirrored to the remaining plex anyway.
  o BUT, to be safe: the aggregate snap reserve will usually grab half of the free space in the aggregate, so
  o if you have a very full aggregate and/or a high rate of change and a long move time, you might want to take precautions (e.g. configure FabricPool(-Mirror) for both sides; then any overflow can go to AWS/Azure/StorageGRID/...), and
  o assess the space needed by checking the aggregate Snapshot space used, inferring the rate of change from it, and estimating the time the move will take (commands below).
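To put numbers on that last point, something like the following shows free space and what the aggregate Snapshot copies currently hold; watch the Snapshot usage over a typical day to infer the rate of change (field names as I recall them, check them against your version):

  # Per-aggregate free space, usage and snapshot reserve:
  storage aggregate show -fields availsize,percent-used,percent-snapshot-space

  # Detailed per-aggregate breakdown, incl. space used by the Snapshot reserve:
  storage aggregate show-space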
That's roughly it, but maybe someone can pitch in with their own DC-move experience. I just teach MCC and have had students who moved MCCs...
All the best
Sebastian
Toasters mailing list
Toasters@teaparty.net
https://www.teaparty.net/mailman/listinfo/toasters