"does not work between cluster pairs" - do you mean "between heads in a cluster pair"? Maybe these are subtleties of the English language that a non-native speaker misses.
________________________________________
From: toasters-bounces@teaparty.net [toasters-bounces@teaparty.net] On Behalf Of Gelb, Scott [sgelb@insightinvestments.com]
Sent: Wednesday, February 08, 2012 23:59
To: Randy Rue; toasters@teaparty.net
Subject: RE: vfiler migrate: overview and thoughts
Really good discussion here. You are using the -m nocopy method which is a method with disk reassign. This does not work from NMC. NMC is data motion which guarantees a 120 second failover but does not work between cluster pairs. Regular vfiler migrate uses snapmirror similar to this but doesn’t guarantee failover in 120 seconds. The -m nocopy method was formerly called snapmover and is between cluster pairs or v-series neighborhoods. Vfiler migrate works between cluster pairs and other clusters, but for -m nocopy, all nodes must see the disks (cluster pair or vseries neighborhood).
The cifs share from vfiler0 would be a hangup so good to test that first like you did.
I really like the NMC method with data motion, but unless you have another cluster pair to migrate to, that isn’t an option. Migrate -m nocopy should be faster than that, since no data is moved (there is still no guarantee on timing, but with no copy it should always be faster).
Also, when you recreate a vfiler there is no network configuration and the interfaces are unconfigured. You have to ifconfig them and then run "vfiler run <vfiler name> route add default <gateway>" as well. I wouldn’t use FilerView or the setup -e wizard, since that whacks exports, hosts.equiv, hosts, etc. Running ifconfig is much easier, since it just configures and binds the unconfigured IP that matches the one given to ifconfig, without modifying files the way setup does. Then confirm the entry in /etc/rc of vfiler0 so the configuration persists across a reboot. Ideally there would be more automation, but it is quick to reconcile when you check vfiler status -a and /etc/rc.
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Randy Rue
Sent: Wednesday, February 08, 2012 11:12 AM
To: toasters@teaparty.net
Subject: RE: vfiler migrate: overview and thoughts
I've been working via the CLI, good to know the NMC will give better options…
From: Fletcher Cocquyt [mailto:fcocquyt@stanford.edu]
Sent: Wednesday, February 08, 2012 10:56 AM
To: Randy Rue
Cc: toasters@teaparty.net
Subject: Re: vfiler migrate: overview and thoughts
Hi Randy, Are you managing vfiler migrations via the NMC, or are you initiating them from the command line with 'vfiler migrate'? In my experience the NMC is more robust in terms of error checking and cleanup if the migration fails.
There are bugs, but by using the NMC I found the risks are mitigated - I wrote up our experiences here
http://www.vmadmin.info/2011/02/vfiler-non-disruptive-migration.html
We just upgraded both our clusters to 8.1RC2 and plan to use vFiler migration (now datamotion) to evacuate the prod cluster to upgrade its disk trays non-disruptively.
good luck,
Fletcher
On Feb 8, 2012, at 10:26 AM, Randy Rue wrote:
vfiler migrate: overview and thoughts
When I first read there was a way to move a vFiler from one node of a NetApp cluster to another I was excited. I was imagining something akin to VMWare's vMotion, a transparent movement of services. Digging a little deeper showed that NetApp's "vfiler migrate" functionality isn't nearly as automagic as I'd hoped.
Here are some observations.
* Disk ownership of the resources must be software based, whether your filer is using actual disks or array LUNs (on a V-Series filer like our 3170). We had some concerns that the feature might not work well with array LUNs, but it appears that Data ONTAP doesn't see any difference between an array LUN and an actual disk in this context.
* The vfiler migrate command effectively moves complete aggregates from one filer head to another. This means that all volumes on the aggregate(s) involved must be tied only to the vfiler being moved, with no LUNs, exports or shares presented from the context of the root filer or any other vFiler (in our environment we already had a standard of creating separate aggregates for each vFiler so this wasn't a problem). For example, after one failed attempt to migrate a filer, I had added a CIFS share to the root volume of the vFiler via the root filer, to gain access to the etc folder of the vFiler. I forgot to remove that share, and broke later migration attempts for a new reason.
* We've tested the vfiler migrate command dozens of times now on three different vFilers, in preparation for the migration of a production vFiler later this week. Two of those vFilers have migrated flawlessly every time, and one seems to fail about 30% of the time for various reasons which we can sometimes identify and sometimes not.
* Reasons for failure include:
  - A CIFS share from the root filer head to the vFiler's root volume. My bad.
  - Possible FC noise between the root filer and the SAN behind it.
  - Possible SCSI reservations issues between the root filer and the SAN.
  - Invalid credentials (fat-fingered a password, I think) for the "source" remote root filer. Oddly, the migrate command still stopped the vFiler, offlined its volumes and aggregates, and removed the vFiler from the source root filer before the process failed.
  - Poor alignment of the planets? Bad karma?
* In general, it seems that the vfiler migrate just fails sometimes. In every failure, however, recovery has been straightforward. The "vfiler create <vFiler name> -r <path to vFiler root volume>" command recovers the vFiler every time, albeit without a proper network configuration. The vFiler comes back up but with its virtual NIC having no subnet mask or assignment to a physical interface. Makes sense, I guess, as the migrate command never got to the part where it normally asks what mask and interface to use. This needs to be reassigned, either from the CLI using ifconfig or the "manage vFiler" wizard in FilerView (note that this will also overwrite etc/exports with a default but will save a backup first).
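The recovery sequence described above can be written down as a small checklist generator. This is only a sketch: the function name, the example values and the trailing route metric of 1 are assumptions for illustration, not something confirmed in this thread, and the default-route step is the one suggested in Scott's reply. It just builds the console command strings so they can be reviewed before being typed on the filer.

```python
from ipaddress import IPv4Address, IPv4Network


def recovery_commands(vfiler, root_vol_path, interface, ip, netmask, gateway):
    """Build the console commands for bringing a vFiler back after a failed migrate.

    Hypothetical helper: it only formats strings, it does not talk to a filer.
    """
    # Catch fat-fingered addresses before they reach the console.
    IPv4Address(ip)
    IPv4Address(gateway)
    IPv4Network(f"0.0.0.0/{netmask}")  # raises ValueError for an invalid mask

    return [
        # Recreate the vFiler from its existing root volume (command from the post).
        f"vfiler create {vfiler} -r {root_vol_path}",
        # Run from vfiler0: binds the vFiler's unconfigured IP to a physical interface.
        f"ifconfig {interface} {ip} netmask {netmask}",
        # Default route inside the vFiler; the metric "1" is an assumption.
        f"vfiler run {vfiler} route add default {gateway} 1",
        # Verify the result, then compare against /etc/rc on vfiler0.
        "vfiler status -a",
    ]


if __name__ == "__main__":
    # Example values are placeholders, not taken from any real environment.
    for line in recovery_commands(
        "vf1", "/vol/vf1_root", "e0a", "192.0.2.10", "255.255.255.0", "192.0.2.1"
    ):
        print(line)
```

Whatever ifconfig line is used should also be checked against /etc/rc on vfiler0 so the binding survives a reboot.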
Given that there are arguably a slew more shops out there running HA VMWare clusters than there are running HA NetApp clusters, it's probably not fair to expect that vfiler migrate is going to be as slick or even as well understood/documented in the wild as vMotion. Also, inherent limitations in things like the CIFS protocol make it necessary that some services will have to be interrupted. But overall I'd claim that the feature is useful and even when it doesn't work as hoped recovery is straightforward and reliable. We're planning on proceeding with our production move later this week.
Hope this helps anyone in the same situation,
Randy
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters