"Will easily queryable stats about misaligned VMDKs on NFS ever be available?"
It already is actually. Download the latest version of the ESX Host Utility Kit and use the mbralign utility that's included with it. It can be run from either the VMware hosts themselves or from a Linux/Unix server with mount access. If you take a NetApp snapshot of the datastore then you can run the mbralign scan against the snapshotted -flat.vmdk(s) without having to power off the virtual machines.
Also, some VMware align/misalignment statistics are included in the output of nfsstat -d in version 7.3.5.1 or later. I may be slightly off on the OnTap release version...
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Eugene Vilensky Sent: Friday, August 26, 2011 2:39 PM Cc: toasters@teaparty.net Subject: Re: Sources of unaligned IO other that Vmware? - pw.over_limit persists
On Fri, Aug 26, 2011 at 11:34 AM, Jeff Mohler <speedtoys.racing@gmail.com> wrote:
"By the way, from ONTAP 8.0.1+, you can directly check alignment with the "lun show -v" command:" ---
Gotta be careful here, this only works on the -direct- lun.
This will not tell you how healthy any underlying virtual filesystems within the LUN may or may not be, only the high level direct LUN itself.
Will easily queryable stats about misaligned VMDKs on NFS ever be available? Without mounting the datastores on Linux and checking with fdisk...
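For readers unfamiliar with what the fdisk check mentioned above actually looks at, here is a minimal sketch in Python. It is not how mbrscan or mbralign are implemented. It assumes the -flat.vmdk is a raw disk image with a classic MBR in its first sector, 512-byte sectors, and a 4 KiB block size for the alignment test. The function name partition_offsets and both constants are made up for this illustration.

```python
import struct

SECTOR_SIZE = 512   # assumption: 512-byte logical sectors
BLOCK_SIZE = 4096   # assumption: 4 KiB storage blocks


def partition_offsets(image_path):
    """Return (index, start_lba, byte_offset, aligned) for each used MBR entry."""
    with open(image_path, "rb") as f:
        mbr = f.read(SECTOR_SIZE)
    # A valid MBR ends with the boot signature 0x55 0xAA.
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")

    results = []
    for i in range(4):
        # The four 16-byte partition entries start at byte 446.
        entry = mbr[446 + 16 * i:446 + 16 * (i + 1)]
        partition_type = entry[4]
        if partition_type == 0:
            continue  # unused slot
        # Starting LBA is a little-endian 32-bit value at offset 8 of the entry.
        start_lba = struct.unpack_from("<I", entry, 8)[0]
        offset = start_lba * SECTOR_SIZE
        results.append((i + 1, start_lba, offset, offset % BLOCK_SIZE == 0))
    return results
```

A partition starting at the old default of sector 63 begins at byte 32256, which is not a multiple of 4096, so it is reported as misaligned. A partition starting at sector 2048 is reported as aligned. Logical partitions inside an extended partition and GPT disks are not handled by this sketch.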
That's right. I blogged on it a bit here:
http://communities.netapp.com/community/netapp-blogs/getvirtical/blog/2011/05/13/new-vmdk-misalignment-detection-tools-in-data-ontap-735-and-802
Share and enjoy!
Peter
From: Chris Muellner [mailto:chris@northlandusa.com] Sent: Friday, August 26, 2011 1:05 PM To: Eugene Vilensky Cc: toasters@teaparty.net Subject: RE: Sources of unaligned IO other that Vmware? - pw.over_limit persists
"Will easily queryable stats about misaligned VMDKs on NFS ever be available?"
It already is actually. Download the latest version of the ESX Host Utility Kit and use the mbralign utility that's included with it. It can be run from either the VMware hosts themselves or from a Linux/Unix server with mount access. If you take a NetApp snapshot of the datastore then you can run the mbralign scan against the snapshotted -flat.vmdk(s) without having to power off the virtual machines.
Also, some VMware align/misalignment statistics are included in the output of nfsstat -d in version 7.3.5.1 or later. I may be slightly off on the OnTap release version...
On Fri, Aug 26, 2011 at 3:15 PM, Learmonth, Peter Peter.Learmonth@netapp.com wrote:
That's right. I blogged on it a bit here:
http://communities.netapp.com/community/netapp-blogs/getvirtical/blog/2011/0...
Very exciting. Can't wait to get up to 8.0.2!
What does it mean when nfsstat -d and mbrscan don't agree? Also, why do the same files appear with different counter values in the output of nfsstat -d? (Is the VMware guest OS type mismatch mentioned earlier a possibility?)
I posted about it here:
http://communities.netapp.com/thread/15458
thanks
On 8/26/11 1:15 PM, "Learmonth, Peter" Peter.Learmonth@netapp.com wrote:
That's right. I blogged on it a bit here: http://communities.netapp.com/community/netapp-blogs/getvirtical/blog/2011/0... 13/new-vmdk-misalignment-detection-tools-in-data-ontap-735-and-802
Share and enjoy!
Peter
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
As mentioned in the getvirtical blog: Every CPU in your filer has its own counter, that's why the counters are different.
Bye,
Alex
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Fletcher Cocquyt Sent: Saturday, August 27, 2011 01:53 To: Learmonth, Peter; Chris Muellner; Eugene Vilensky Cc: toasters@teaparty.net Subject: Re: Sources of unaligned IO other that Vmware? - nfsstat -d and mbrscan don't agree
What does it mean when nfsstat -d and mbrscan don't agree? Also, why do the same files appear with different counter values in the output of nfsstat -d? (Is the VMware guest OS type mismatch mentioned earlier a possibility?)
I posted about it here:
http://communities.netapp.com/thread/15458
thanks
the best thing about mbralign is that it will fix misaligned vmdk's too (with certain caveats) on any vendor's storage array.
it has been awhile since I ran the utility, but it works better if you run it on the esx host that "owns" the VM
Jack
On 8/26/2011 4:05 PM, Chris Muellner wrote:
"Will easily queryable stats about misaligned VMDKs on NFS ever be available?"
It already is actually. Download the latest version of the ESX Host Utility Kit and use the mbralign utility that's included with it. It can be run from either the VMware hosts themselves or from a Linux/Unix server with mount access. If you take a NetApp snapshot of the datastore then you can run the mbralign scan against the snapshotted -flat.vmdk(s) without having to power off the virtual machines.
Also, some VMware align/misalignment statistics are included in the output of nfsstat -d in version 7.3.5.1 or later. I may be slightly off on the OnTap release version...
Which is not possible anymore now that ESX is dying and the whole world migrates to ESXi.
Another thing to consider when using mbralign is that it kills most Linux bootloaders, so be careful. I never had a broken Windows VM after running mbralign on its vmdk, but almost always managed to break the Linux bootloader with it, so prepare for breakage and have a rescue CD at hand with which you can reinstall the bootloader afterwards.
Bye,
Alex
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jack Lyons Sent: Saturday, August 27, 2011 01:26 To: toasters@teaparty.net Subject: Re: Sources of unaligned IO other that Vmware? - pw.over_limit persists
the best thing about mbralign is that it will fix misaligned vmdk's too (with certain caveats) on any vendor's storage array.
it has been awhile since I ran the utility, but it works better if you run it on the esx host that "owns" the VM
Jack
Holy Shnikies - does this actually work? And is it supported? (running mbralign with a snapshot, then (presumably) deleting the snapshot to merge the delta files back into the newly aligned vmdk - brilliant!) We took downtime on all VMs to align them - the big ones took hours...even with 10gig networking - can't believe I had not seen mention of this before. How do you deal with the linux VMs which need boot file modification post alignment to (re)boot successfully?
FWIW, all our VMs are aligned now - we run a daily mbrscan report to pickup any stray misaligned vmdk files (usually they are virtual appliances)
On 8/26/11 1:05 PM, "Chris Muellner" chris@northlandusa.com wrote:
If you take a NetApp snapshot of the datastore then you can run the mbralign scan against the snapshotted -flat.vmdk(s) without having to power off the virtual machines.
I don't think that this will work - but I'm going to try that out on Monday unless the OP clarifies that he didn't mean it like that :-)
Running mbralign is a long-lasting process, especially for really big vmdks, but we always worked around that by having small "system" or "boot" vmdks and added additional disk space by adding additional vmdks which we initialized with GPT right from the start.
Later on, we created default templates with already aligned but empty disks which also works a treat.
Bye,
Alex
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Fletcher Cocquyt Sent: Saturday, August 27, 2011 01:47 To: Chris Muellner; Eugene Vilensky Cc: toasters@teaparty.net Subject: Re: Sources of unaligned IO other that Vmware? - pw.over_limit persists
Holy Shnikies - does this actually work? And is it supported? (running mbralign with a snapshot, then (presumably) deleting the snapshot to merge the delta files back into the newly aligned vmdk - brilliant!) We took downtime on all VMs to align them - the big ones took hours...even with 10gig networking - can't believe I had not seen mention of this before. How do you deal with the linux VMs which need boot file modification post alignment to (re)boot successfully?
FWIW, all our VMs are aligned now - we run a daily mbrscan report to pickup any stray misaligned vmdk files (usually they are virtual appliances)
You can run the scan on a snapshot to determine if you need to run an mbralign. I don't think you can run the align while the VM is up - I think at best you would only get a crash-consistent copy that was valid as of the beginning of the align, and you would have to take an outage to reboot the VM off the aligned vmdk.
Jack
On 8/26/2011 8:10 PM, Alexander Griesser wrote:
I don't think that this will work - but I'm going to try that out on Monday unless the OP clarifies that he didn't mean it like that :-)
Running mbralign is a long-lasting process, especially for really big vmdks, but we always worked around that by having small "system" or "boot" vmdks and added additional disk space by adding additional vmdks which we initialized with GPT right from the start.
Later on, we created default templates with already aligned but empty disks which also works a treat.
Bye,
Alex
This is correct. You can run the scan against the snapshot, but not the alignment process. The VM still has to come down for that. Also, for those who want to speed up the alignment process but aren't up to running the tools on a Linux host (for NFS only), storage controller offload has been added in the most recent HUK and can be used if you're running ONTAP 7.3.5+ or 8.0.1+. This bypasses the memory limitation of the ESX host's Service Console and lets the controllers handle the heavy lifting. Previously, the only way around this was to use a Linux VM or physical box with 2GB+ RAM.
Also, many people aren't aware but there's an alignment best practices and procedures guide available here: http://media.netapp.com/documents/tr-3747.pdf
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jack Lyons Sent: Friday, August 26, 2011 9:45 PM To: toasters@teaparty.net Subject: Re: AW: Sources of unaligned IO other that Vmware? - pw.over_limit persists
you can run the scan on a snapshot to determine if you need to run an mbralign. I don't think you run the align while the vm is up - I think at best you would only get a crash consistent copy that was valid as of the time of the beginning of the align and have to take an outage to reboot the vm off the aligned vmdk.
Jack
He's talking about scanning, not aligning. When a VM is running, the VMDK is locked to the process that actually executes the VM. Nothing else can read or write. So, you either power off the VM, or take a VMware or NetApp snapshot. The VMware snapshot makes the VMDK read-only and readable by any process with sufficient permissions, and VM writes go into the delta file. So, you can scan the VMDK just fine. You can also read it to align. The problem is, while you fix the VMDK, the snapshot delta file also has blocks with addresses (LBA), and those LBAs are relative to the original alignment. If you halt, swap the VMDK, and start it back up, then whether you consolidate (a.k.a. delete) the VMware snapshot before or after you boot, the VMDK and delta are not aligned the same, and the guest panics, detects corruption, or shows some other unpredictable behaviour. I've tested this a few times.
We've also looked at making mbralign able to align both the base VMDK and the snapshot delta files, and while technically possible, it's very tricky.
As for virtual appliances, yeah, they are usually misaligned - including the brand-new vSphere 5 vCenter Appliance. :-/
Peter
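The LBA mismatch described above can be made concrete with a small Python sketch. It is a deliberately simplified model, not a description of how mbralign or VMware delta files work internally. It assumes alignment simply moves the partition start from one sector to another and shifts everything in the partition by the same amount, while sectors before the partition stay where they are. The function names are made up for this illustration.

```python
def lba_after_align(lba, old_start, new_start):
    """Sector address of data that lived at `lba` once the partition start
    has moved from old_start to new_start (simplified model)."""
    if lba < old_start:
        return lba  # e.g. the MBR itself is assumed not to move
    return lba + (new_start - old_start)


def stale_delta_entries(delta_lbas, old_start, new_start):
    """Return (recorded_lba, correct_lba) pairs for delta blocks whose
    recorded address no longer matches the aligned base disk."""
    stale = []
    for lba in delta_lbas:
        correct = lba_after_align(lba, old_start, new_start)
        if correct != lba:
            stale.append((lba, correct))
    return stale
```

With the partition start moved from sector 63 to sector 64, a delta block recorded at LBA 1000 belongs at LBA 1001 in the aligned base disk. Consolidating the snapshot would still write it at 1000, which is the kind of mismatch that leads to the corruption described above.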
From: Fletcher Cocquyt [mailto:fcocquyt@stanford.edu] Sent: Friday, August 26, 2011 4:47 PM To: Chris Muellner; Eugene Vilensky Cc: toasters@teaparty.net Subject: Re: Sources of unaligned IO other that Vmware? - pw.over_limit persists
Holy Shnikies - does this actually work? And is it supported? (running mbralign with a snapshot, then (presumably) deleting the snapshot to merge the delta files back into the newly aligned vmdk - brilliant!) We took downtime on all VMs to align them - the big ones took hours...even with 10gig networking - can't believe I had not seen mention of this before. How do you deal with the linux VMs which need boot file modification post alignment to (re)boot successfully?
FWIW, all our VMs are aligned now - we run a daily mbrscan report to pickup any stray misaligned vmdk files (usually they are virtual appliances)