Hi,
This article:
https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb24492
talks about troubleshooting LUN alignment issues. It mentions some statistics one can get from the stats command. Does anyone know how to interpret these, e.g.:
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.0:5%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.1:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.2:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.3:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.4:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.5:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.6:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.7:81%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.0:2%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.1:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.2:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.3:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.4:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.5:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.6:0%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.7:70%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_partial_blocks:13%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_partial_blocks:24%
How does one interpret the various *_align_histo.* counters?
Regards, Filip
Filip,
The alignment histogram for reads and writes illustrates the relative percentage of operations with respect to the 512-byte boundaries within a 4k WAFL block. What we expect to see, for properly partitioned host file systems, is all reads and writes falling into "bucket" zero: :read_align_histo.0 ... :write_align_histo.0
In addition to that, an individual I/O is either counted in one of the eight buckets OR it gets counted as partial: :read_partial_blocks: :write_partial_blocks:
In other words, a single I/O into the LUN is either working on full 4k blocks and getting accounted for in the 0-7 buckets ... or it's doing I/O less than 4k and getting accounted for as "partial".
Given that you are seeing most of your writes and reads falling into bucket 7 ... it's probably due to an incorrect starting offset ... or possibly an extended partition?
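To make the bucket arithmetic concrete, here is a minimal Python sketch. It assumes that the bucket number is simply the 512-byte sector offset of an I/O's starting address within its 4k WAFL block; that is one reading of the explanation above, not a documented NetApp formula. The function names are made up for illustration.

```python
SECTOR_BYTES = 512          # host sector size
WAFL_BLOCK_BYTES = 4096     # 4k WAFL block
SECTORS_PER_BLOCK = WAFL_BLOCK_BYTES // SECTOR_BYTES  # 8, matching buckets 0-7


def alignment_bucket(start_sector: int) -> int:
    """Return the assumed histogram bucket (0-7) for a starting sector.

    Assumption: bucket N means the I/O starts N sectors past a 4k boundary.
    """
    return start_sector % SECTORS_PER_BLOCK


def is_aligned(start_sector: int) -> bool:
    """A partition is aligned when it starts exactly on a 4k boundary."""
    return alignment_bucket(start_sector) == 0


if __name__ == "__main__":
    # 63 is a common legacy MBR starting sector; 64 and 2048 are 4k-aligned.
    for start in (63, 64, 2048):
        print(start, alignment_bucket(start), is_aligned(start))
```

Under that assumption, a partition starting at sector 63 puts its I/O in bucket 7, which would be consistent with the counters shown in the original question.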
-- errol
-----Original Message----- From: Filip Sneppe [mailto:filip.sneppe@gmail.com] Sent: Wednesday, February 04, 2009 9:52 AM To: NetApp list Subject: read_align_histo.XX/write_align_histo.XX output from stats command
Errol,
Thank you for explaining this. I had to read it twice, but the stats make a lot of sense now. So basically, you want to have as many IO operations as possible in the read/write_align_histo.0 bucket, and as few IO operations as possible in the read/write_align_histo.1->7 buckets.
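The rule of thumb above can be checked mechanically. The following Python sketch parses counter lines in the format shown in the original question and reports which bucket dominates; the helper names and the regular expression are illustrative assumptions, not part of any NetApp tool.

```python
import re

# Matches e.g. "...:read_align_histo.7:81%" and tolerates stray whitespace
# introduced by mail line-wrapping.
COUNTER_RE = re.compile(r"(read|write)_align_histo\.\s*([0-7])\s*:\s*(\d+)%")


def parse_histograms(text: str) -> dict:
    """Return {'read': [8 percentages], 'write': [8 percentages]}."""
    histos = {"read": [0] * 8, "write": [0] * 8}
    for op, bucket, pct in COUNTER_RE.findall(text):
        histos[op][int(bucket)] = int(pct)
    return histos


def dominant_bucket(histo: list) -> int:
    """Index of the bucket holding the largest share of operations."""
    return max(range(len(histo)), key=histo.__getitem__)


SAMPLE = """
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.0:5%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:read_align_histo.7:81%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.0:2%
lun:/vol/esx_lun_data1/luns/esx_lun1.lun-P3NsiJNHOSbN:write_align_histo.7:70%
"""

if __name__ == "__main__":
    for op, histo in parse_histograms(SAMPLE).items():
        bucket = dominant_bucket(histo)
        status = "aligned" if bucket == 0 else "likely misaligned"
        print(f"{op}: dominant bucket {bucket} ({histo[bucket]}%), {status}")
```

Buckets absent from the input are treated as 0%, so a shortened sample like the one above still parses.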
Best regards, Filip
PS. In my example, the stats came from a VMware ESX LUN hosting a lot of P2V'd machines, so to the best of my knowledge, there isn't any real way to fix this apart from reinstalling the machines from scratch on properly aligned disks...
On Wed, Feb 4, 2009 at 5:58 PM, Fouquet, Errol Errol.Fouquet@netapp.com wrote:
Filip,
Nick's blog at http://blogs.netapp.com/storage_nuts_n_bolts/2009/01/mbrscanmbralign.html explains the issue at length.
He states that version 5.0 of the Host Utilities Kit should contain the MBRScan/MBRAlign tool, which should be able to help align already-created VMs.
-Tim Hollingworth- -ePlus Technology Inc.- -678.462.6698-
-----Original Message----- From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Filip Sneppe Sent: Wednesday, February 04, 2009 7:11 PM To: Fouquet, Errol Cc: NetApp list Subject: Re: read_align_histo.XX/write_align_histo.XX output from stats command
Hi,
Thank you all for explaining these counters to me, and thanks to all the people pointing me to the mbralign tool, in both on-list and off-list replies.
Best regards, Filip
On Fri, Feb 6, 2009 at 4:20 AM, Timothy Hollingworth thollingsworth@eplus.com wrote: