I have used both DP and LS over the years and am back to using LS more often, for the reasons Justin gave and also as an NAE/NVE workaround: with DP, make-vsroot had some hoops to jump through to re-create the mirrors after a failover test. With LS, promoting the mirror and recreating it afterwards had no issues with NAE/NVE in my testing. In all the years of doing this I've never had to recover SVM root, but I still implement the mirrors to follow NAS best practices. I don't create mirrors on every node; I use 1-2 copies depending on cluster size.

One interesting result from testing mirror activation is that the activated mirror picks up the SVM's existing junctions, regardless of the state of the SVM root mirror. For example:

1) An SVM has 4 junction paths
2) An SVM root mirror (LS or DP) protects SVM root
3) Unmount 3 of the junction paths, leaving 1 junction path
4) Fail over to the root mirror (promote for LS, or break/make-vsroot for DP)
5) SVM root running on the failed-over volume has the 1 junction path, not the 4 that existed at the time of the mirror. There was no real failure, and running the procedure with the SVM up keeps the current state. In a real disaster I would expect recovery to what was in the mirror, but I have never had to recover SVM root.
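
The comparison in step 5 can be sketched as a simple set difference. This is only an illustration: the junction paths are hypothetical, and the two lists are assumed to be copied by hand from the cluster (at mirror time and after the failover test), not fetched through any ONTAP API.

```python
def junction_diff(before, after):
    """Return (missing, added) junction paths between two captures."""
    before_set, after_set = set(before), set(after)
    missing = sorted(before_set - after_set)
    added = sorted(after_set - before_set)
    return missing, added


# Hypothetical junction paths present when the mirror was last updated.
at_mirror_time = ["/vol1", "/vol2", "/vol3", "/vol4"]
# Hypothetical junction paths seen after failing over to the mirror.
after_failover = ["/vol1"]

missing, added = junction_diff(at_mirror_time, after_failover)
print("missing after failover:", missing)
print("added after failover:", added)
```

In the test described above, the three unmounted paths would show up as missing, which matches the current state of the SVM rather than the contents of the mirror.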

An RFE on my wish list is to have the SVM root virtualized in the RDB, so that we don't need to manage, replicate, or ever move SVM root. I know this isn't an easy task, that it would use mroot/vol0, and that it would cause more cluster traffic, but I would still welcome the change if feasible. It's not a showstopper or a requirement, nor high priority.

On Wednesday, April 28, 2021, 11:24:12 AM PDT, John Stoffel <john@stoffel.org> wrote:



Justin> Another pretty major difference between LS and DP methods;
Justin> DP method requires manual intervention when a failover/restore is needed.

This is fine in my case, because I'm really trying to protect against
a shipping failure, though it's tempting to do more to protect against
root volume failures as well.  Though I've honestly never had one, nor
had a NetApp fail so badly in 22+ years of using them that I lost data
from hardware failures.

Closest I came was on an F740 (I think) using the DEC StorageWorks
canisters and shelves.  I had a two disk failure in an aggregate.  One
disk you could hear scraping the heads on the platter, the other was
a controller board failure.  Since I had nothing to lose, I took the
good controller board off the head crash drive and put it onto the
other disk.  System came up and found the data and started
rebuilding.  Whew!  RAID-DP is a good thing today for sure.


Justin> LS Mirrors are running in parallel and incoming reads/access
Justin> requests (other than NFSv4) hit the LS mirrors rather than the
Justin> source volume, so if one fails, you don’t have to do anything
Justin> right away; you’d just need to resolve the issue at some
Justin> point, but no interruption to service.

That's a decent reason to use them.

Justin> LS mirrors can also have a schedule to run to avoid needing to
Justin> update them regularly. And, if you need to write to the SVM
Justin> root for some reason, you’d need to access the .admin path in
Justin> the vsroot; LS mirrors are readonly (like DP mirrors).

The default for 9.3 seems to be 1 hour, but I bumped it to every 5
minutes, because I have NetBackup backups which use snapshots and 'vol
clone ...' to mount Oracle volumes for backups.  I had to hack my
backuppolicy.sh script to put in a 'sleep 305' to make it work
properly.

Trying to make it work generically with 'snapmirror update-ls-set
<vserver>:<source>' wasn't working for some reason, so the quick hack
of a sleep got me working.
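
A fixed sleep like that can be swapped for a bounded poll that returns as
soon as a condition holds.  The following is a minimal sketch, not what
backuppolicy.sh does: the `mirror_is_current` check is a hypothetical
stand-in that would have to be written against the real cluster, and the
305 second timeout is only carried over from the sleep value above.

```python
import time


def wait_until(check, timeout=305.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Hypothetical check: a real one would ask the cluster whether the LS
# mirror has caught up.  This stub succeeds on the third poll.
_polls = {"count": 0}


def mirror_is_current():
    _polls["count"] += 1
    return _polls["count"] >= 3


if wait_until(mirror_is_current, timeout=10.0, interval=0.1):
    print("mirror is current, continuing")
else:
    print("timed out waiting for mirror")
```

The worst case is still about as long as the fixed sleep, but the common
case returns earlier, and a timeout is reported instead of silently
continuing.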


But I am thinking of dropping the LS mirrors and just going with DP
mirrors of all my rootvols instead, just because of this issue.


But let's do a survey: how many people on here are using LS mirrors of
your rootvols on your clusters?  I certainly wasn't across multiple
clusters.

John


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
https://www.teaparty.net/mailman/listinfo/toasters