Guys,

We're getting ready to move our 4-node FAS8060 cluster to a new data center. As part of our due diligence, we're thinking we would SnapMirror the most critical business volumes between the two HA pairs.

The idea is that if the truck holding pair A+B doesn't make it for some reason, we can still bring up the cluster with nodes C+D and have those SnapMirrored volumes available to keep working.
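Roughly what I have in mind, per critical volume (the volume, SVM, and aggregate names below are made up; the point is just that the destination DP volume sits on an aggregate owned by node C or D):

    ::> volume create -vserver svm1 -volume critvol_dr -aggregate aggr_nodeC_sas01 -type DP -size 10t
    ::> snapmirror create -source-path svm1:critvol -destination-path svm1:critvol_dr -type XDP -policy MirrorAllSnapshots -schedule hourly
    ::> snapmirror initialize -destination-path svm1:critvol_dr

If pair A+B then goes missing, a "snapmirror break" on the destination should make the copy writable on C+D.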
So my questions are:
1. Can I boot a cluster with half the nodes missing? I'm sure I can...
2. Has anyone else had to do this half-assed method of shipping DR?
Cheers, John
Hi John
I'm not sure if this helps… I am also sure you can get the cluster running with half the nodes. I am actually migrating between two HA pairs as we speak. We chose to set up a link between the nodes with two cluster switches, so the two HA pairs are about 500 m apart… (I'm not sure when latency becomes an issue.)

What we did was add the two new nodes (temporary nodes) to the existing cluster, and we are then able to use the "vol move" operation to move volumes from one HA pair to the other, and of course also the LIFs. Works like a charm so far. Huge NFS datastores have been moved without a hitch.

We did a "vol move start" with the "-cutover-action wait" option, which does the mirroring but waits until we tell it to do the cutover (in a service window). There are, however, some dedupe processes that make the cutover very slow on larger volumes… it keeps telling us it is waiting for a dedupe process to complete (both systems are AFFs). But after 10-30 minutes it completes OK.
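For reference, the basic sequence looks something like this (volume and aggregate names are placeholders):

    ::> volume move start -vserver svm1 -volume bigvol -destination-aggregate aggr_new_01 -cutover-action wait
    ::> volume move show -vserver svm1 -volume bigvol
    ::> volume move trigger-cutover -vserver svm1 -volume bigvol

The move replicates in the background, "volume move show" tells you when it is ready for cutover, and the trigger-cutover is what we run inside the service window.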
Once we have emptied the source HA pair, we will move it to the new DC and do it all over again, back onto the original system…

So far no downtime at all, which is nice 😊

I realize that you may not be as lucky with where you have to move the systems 😉 So drive safely, and if you are running spinning disks, be prepared to replace a few when you start up the system 😉
/Heino
Just to add to Heino: I had a student in one of my courses recently who moved his data center some 4-5 km with exactly this setup (four cluster switches and good redundant connectivity) and some swing gear, if I remember correctly, because he didn't have enough capacity to completely evacuate a pair before the move. Completely non-disruptive; nobody noticed anything...
Sebastian
sent from my mobile, spellchecker might have messed up...
Another idea would be to use SVM-DR.

Set up a cluster peer, then create the destination SVM and a vserver peer between the source and destination. If you do not wish to mirror all volumes, I believe there is a per-volume option you can modify to keep a volume out of the SVM-DR SnapMirror relationship. If the networking is the same between both locations, you can even allow the SVM-DR to copy the identity (LIFs, addresses, etc.).
You could then "cut over" to the new location and use it.
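Something like this, roughly, assuming the cluster peer is already in place (cluster and SVM names are made up; check the exact options against your ONTAP version):

    clusterB::> vserver create -vserver svm1_dr -subtype dp-destination
    clusterB::> vserver peer create -vserver svm1_dr -peer-vserver svm1 -peer-cluster clusterA -applications snapmirror
    clusterA::> volume modify -vserver svm1 -volume scratchvol -vserver-dr-protection unprotected
    clusterB::> snapmirror create -source-path svm1: -destination-path svm1_dr: -identity-preserve true
    clusterB::> snapmirror initialize -destination-path svm1_dr:

The "-vserver-dr-protection unprotected" setting is the per-volume exclusion I was thinking of, and "-identity-preserve true" is what carries the LIFs and the rest of the identity over. Cutover would then be a "snapmirror break" followed by "vserver start" on the destination.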
Make sense?
--tmac
Tim McCarthy, Principal Consultant
Proud Member of the #NetAppATeam https://twitter.com/NetAppATeam
I Blog at TMACsRack https://tmacsrack.wordpress.com/
"tmac" == tmac tmacmd@gmail.com writes:
tmac> Another idea would be to use SVM-DR
I don't think that will work for us...
tmac> Set up a cluster peer, then create the destination SVM and a vserver peer between the source and destination. If you do not wish to mirror all volumes, I believe there is a per-volume option you can modify to keep a volume out of the SVM-DR SnapMirror relationship. If the networking is the same between both locations, you can even allow the SVM-DR to copy the identity (LIFs, addresses, etc.).
So I'd first have to move all the volumes onto one pair, split off the other pair, set it up as a standalone cluster... then do the SVM-DR setup... swing things over. Hmm... tempting.

Except a bunch of networking needs to change, and we can't run our current networks in two separate DCs. Blech.
Sebastian> ... moved his datacenter some 4-5 km with exactly this setup (4 cluster switches and good redundant connectivity) and some swing gear, if I remember correctly, because he didn't have enough capacity to completely evacuate a pair before the move.

We're doing a full move, an "everything must go!" type of event. And I don't think I'll have the space to replicate everything between pairs. We're a smaller org now, so it's hard to justify the cost of swing gear. But it's also hard to get people to clean up. :-)
John
"Heino" == Heino Walther hw@beardmann.dk writes:
Heino> We did a "vol move start" with the "-cutover-action wait" option, which does the mirroring but waits until we tell it to do the cutover (in a service window).

Heino> There are, however, some dedupe processes that make the cutover very slow on larger volumes… it keeps telling us it is waiting for a dedupe process to complete (both systems are AFFs). But after 10-30 minutes it completes OK.
Maybe the answer is to turn off dedupe job(s) while you're in the pending cutover state? Or even before you do the initial copy? Not sure... it's an interesting question for sure.
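Something like this on the big volumes before kicking off the moves, I guess (volume name made up):

    ::> volume efficiency show -vserver svm1 -volume bigvol
    ::> volume efficiency stop -vserver svm1 -volume bigvol
    ::> volume efficiency off -vserver svm1 -volume bigvol

and then "volume efficiency on" again once the cutover is done. No idea whether vol move still waits on background efficiency work even when it's turned off, though.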
Heino> So drive safely, and if you are running spinning disks, be prepared to replace a few when you start up the system 😉
Yeah, it's almost all spinning disks, so I'm expecting to replace a bunch, but luckily I have a ton of spares from shelves not making the move to the new DC.
*grin*
Why not vol move everything to one pair, then remove the other pair from the cluster? Make a new 2-node cluster, SnapMirror everything, then ship. Then re-make the 4-node cluster once you safely arrive at the destination.

Or ship the temporary 2-node cluster first, *then* SnapMirror everything to it (requires a decent WAN) so the data is known to be safe at the destination before you move the original cluster.
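roughly, something like this (cluster, SVM, volume names and addresses are made up, and check whether your ONTAP version wants "cluster unjoin" or "cluster remove-node" for dropping the pair):

    newclus::> cluster peer create -peer-addrs 10.1.1.10,10.1.1.11
    newclus::> vserver peer create -vserver svm1_tmp -peer-vserver svm1 -peer-cluster oldclus -applications snapmirror
    newclus::> snapmirror create -source-path svm1:critvol -destination-path svm1_tmp:critvol -type XDP -policy MirrorAllSnapshots
    newclus::> snapmirror initialize -destination-path svm1_tmp:critvol
    newclus::> snapmirror update -destination-path svm1_tmp:critvol

the cluster peer create has to be run from both sides, and you'd do a final snapmirror update (plus quiesce/break if you want it writable) right before the old cluster powers down for the truck ride.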
choices
-skottie
On Tue, Mar 2, 2021 at 10:25 AM John Stoffel john@stoffel.org wrote:
Guys, We're getting ready to move our 4 node FAS8060 cluster to a new data center. As part of our due diligence, we're thinking that we would snapmirror the most critical business volumes between the two pairs.
The idea would be that if the truck holding pair A+B doesn't make it for some reason, we can still bring up the cluster with nodes C+D and still have those snapmirrored volumes available to continue working.
So my questions are:
Can I boot a cluster with half the nodes missing? I'm sure I can...
Has anyone else had to do this half assed method of shipping DR?
Cheers, John _______________________________________________ Toasters mailing list Toasters@teaparty.net https://www.teaparty.net/mailman/listinfo/toasters
"Scott" == Scott Miller scott.miller@dreamworks.com writes:
Scott> Why not vol move everything to one pair, then remove the other pair from the cluster? Make a new 2-node cluster, SnapMirror everything, then ship. Then re-make the 4-node cluster once you safely arrive at the destination.

That might be an option, if we have enough space on the cluster. But we might only SnapMirror the critical stuff and pray for the rest.

Scott> Or ship the temporary 2-node cluster first, *then* SnapMirror everything to it (requires a decent WAN) so the data is known to be safe at the destination before you move the original cluster.
It's certainly an idea.