I have an aggregate with two volumes on it. One is a 3.5 TB CIFS/NFS share that is reasonably fast to SnapVault; the other is a 1 TB NFS share (ESX VMs) that is exceptionally slow: its initial copy has been running for over a week and still has not finished. NDMP backups of this volume are also quite slow. Does anyone know why it would be so much slower than the other volume on the same spindles? The filer is not under extreme load, although occasionally it's pretty busy. Here is a "normal" sysstat:
CPU Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk
ops/s in out read write read write age hit time ty util
12% 1007 2970 8996 15769 9117 0 0 8 93% 49% : 41%
18% 920 2792 6510 11715 6924 0 0 8 99% 84% T 39%
15% 1276 3580 10469 15942 8041 0 0 10 92% 33% T 36%
13% 1487 3416 11347 15632 4907 0 0 11 89% 42% : 43%
17% 1417 3180 9890 14000 9444 0 0 9 98% 79% T 41%
13% 972 3704 9705 15427 9934 0 0 7 92% 46% T 51%
18% 1087 2947 11911 17717 4640 0 0 9 98% 33% T 47%
11% 1204 3358 11219 14090 5159 0 0 7 88% 100% : 50%
12% 1161 2808 9085 12640 5936 0 0 9 90% 33% T 44%
13% 981 4735 11919 16125 7097 0 0 9 92% 45% : 43%
15% 1158 5780 12480 17565 8266 0 0 10 92% 88% T 41%
I'm just having difficulty working out why two volumes on the same spindles would differ so much in the time it takes to do their initial transfer. Also, the VMs do not seem any slower than those hosted on other aggregates (this aggregate is 3 RAID groups of 11 disks each, ONTAP 7.2.4 on an IBM-rebranded 3070A).
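For scale (a rough back-of-envelope figure, assuming the baseline could run anywhere near the disk read rates above): 1 TB is about 10^9 kB, so at ~10,000 kB/s the whole volume could be read in roughly 10^5 seconds, or about 28 hours. A baseline that is still running after a week is six times that and counting, which is why it feels like something specific to this volume rather than raw spindle throughput.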
What's the inode count on each of those volumes?
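On 7-mode you can read that straight off the console, e.g. (the volume names below are only placeholders):

filer> df -i /vol/cifs_share
filer> df -i /vol/vm_datastore
filer> maxfiles vm_datastore

df -i reports inodes used and free per volume; maxfiles with no second argument just prints the volume's current file limit.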
________________________________ From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Page, Jeremy Sent: Monday, October 13, 2008 5:15 AM To: toasters@mathworks.com Subject: Snapvault slow on one specific volume?
The ESX inode count should be really low, i.e. ~10 files per VM and at most a few hundred VMs.
________________________________
From: owner-toasters@mathworks.com on behalf of George T Chen Sent: Mon 10/13/2008 17:43 To: Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?
Jeremy/All,
Following on from our conversation offline:
It would seem you (and I) have been suffering from the bug described here: http://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=281669
We saw it on template volumes. I'm planning to disable ASIS on that volume to attempt to speed up access.
Obviously, that solution may be less useful in your environment, where it's live data volumes that benefit from ASIS.
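For reference, on 7-mode that's just (the volume name below is a placeholder):

filer> sis status /vol/vm_templates
filer> sis off /vol/vm_templates

Note that sis off only stops further deduplication runs; blocks that are already shared stay shared until they're rewritten or the volume is un-deduped, so switching it off on its own may not bring read performance straight back.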
Darren
________________________________
From: owner-toasters@mathworks.com on behalf of Page, Jeremy Sent: Mon 10/13/2008 13:14 To: toasters@mathworks.com Subject: Snapvault slow on one specific volume?
I was under the impression that ESX over NFS used thin-provisioned VMDKs by default (that's how it is in our environment, and all of the files appear to be thin-provisioned). Would this then not be the same bug? With thin-provisioned VMDKs, the portion of the VMDK not yet allocated to the guest is treated as a sparse file, not a file filled with zeros (unless someone decided to perform a full format on the 'disk', perhaps?).
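A quick way to check on the datastore itself, from the ESX service console or any NFS client that mounts the same export (paths and names below are placeholders): compare a VMDK's logical size with the blocks actually allocated to it. A thin disk shows far less space allocated than its logical size; a fat one shows the two roughly equal.

ls -lh /vmfs/volumes/datastore1/myvm/myvm-flat.vmdk    (logical size)
du -h /vmfs/volumes/datastore1/myvm/myvm-flat.vmdk     (blocks actually allocated)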
________________________________
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Darren Sykes Sent: Monday, October 13, 2008 3:51 PM To: Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?
Glenn,
That's true; by default all new VMs created on an NFS volume are thin provisioned. I'm not sure that's the case for templates, though (I thought they were deliberately created fat for performance reasons when deploying them).
Also, we migrated from FC and iSCSI LUNs (which is basically a file copy), so most of our VMs are fat anyway. From what I understand, using SVMotion also results in a fat VMDK, though that's not officially supported on NFS in ESX 3.5.
So, in summary, there are a few reasons why you might end up with non-thin-provisioned VMs on NFS and may therefore hit this bug.
Darren
________________________________
From: Glenn Walker [mailto:ggwalker@mindspring.com] Sent: Tue 10/14/2008 03:23 To: Darren Sykes; Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?
FC and iSCSI do mean fat VMDKs, unless you create them manually and specify thin provisioning (not typical). The Storage VMotion info is good to know - I hope they get that fixed soon.
Thanks for the additional info - it's something for us to watch out for. We went NFS from the start (and performed P2V and V2V into the NFS-based datastores), but I know that SVMotion has been used and templates as well. I'll try to check our use of templates a bit later today...
Glenn
________________________________
From: Darren Sykes [mailto:Darren.Sykes@csr.com] Sent: Tuesday, October 14, 2008 3:00 AM To: Glenn Walker; Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?
SVMotion - you'd hope (without breaking any NDAs) that they would address that in the next version, and possibly give you the option to specify thin or fat disks explicitly.
Out of interest - I removed dedup on our templates volume and a VM provisioning job that took 16 mins yesterday took less than 5 mins today.
Darren.
________________________________
From: Glenn Walker [mailto:ggwalker@mindspring.com] Sent: 14 October 2008 13:43 To: Darren Sykes; Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?
Guessing that it sped up access due to the bug (and/or regular performance degradation from hitting the same blocks multiple times due to de-dupe)?
With some of the dedupe improvements rumored in 7.3, I'd expect that to potentially improve.
________________________________
From: Darren Sykes [mailto:Darren.Sykes@csr.com] Sent: Tuesday, October 14, 2008 9:45 AM To: Glenn Walker; Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?
I'd say it was probably the bug; in theory dedup should increase performance in that situation, as the block would be stored in the cache and therefore we wouldn't need to go to disk to get that data. The degradation is also really pronounced - doing a plain old rsync file copy of the VMDKs shows great performance right until it hits the sequence of dedup'd blank blocks, then the transfer almost literally stops.
From speaking to a couple of people, I wouldn't expect 7.2.5.1 or 7.3 to provide many improvements over the latest P releases of 7.2.4. The bug is more likely to be fixed in 7.3.1, so it's a good 4 months away.
However, as the NOW case mentioned, this won't affect the normal operation of a VM, assuming you're not attempting to do anything horrible like a block-for-block disk image. Someone also suggested that a disk format might cause problems, but I'd query that: writes are unaffected, it's just reading back those blank blocks that hurts. In normal operation a guest OS would never do that; as far as it's concerned (thanks to the information stored within the guest's file system) they're just unallocated blocks on the disk that will never be read.
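If anyone wants to reproduce the rsync test, something like this from a Linux NFS client is enough (filer, export and file names are placeholders):

mkdir -p /mnt/ds
mount -o ro filer:/vol/vm_datastore /mnt/ds
rsync -av --progress /mnt/ds/myvm/myvm-flat.vmdk /tmp/

The --progress output shows the transfer rate, and it collapses once the copy reaches the deduplicated zero-filled region of a fat VMDK.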
Darren
Reading about this, it looks like the same issue we see with Exchange LUNs on WAFL when the volume does NOT have "vol options extent on".
You probably know that if you have an old Data ONTAP without that option (it arrived in 7.2, if I remember correctly), or if you leave it off and you have Exchange LUNs, you can get terribly bad performance (and fragmentation nightmares) like the one reported here.
That is how I have it set while waiting for a fix - why not set the extent volume option to on and try again? :)
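On 7-mode that is just (the volume name is a placeholder):

filer> vol options vm_datastore extent on
filer> vol options vm_datastore

The second command lists the volume's options so you can confirm the setting took.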
Regards
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Darren Sykes Sent: Wednesday, 15 October 2008 10:10 To: Glenn Walker; Page, Jeremy; toasters@mathworks.com Subject: Re: Snapvault slow on one specific volume?
On Wed, 15 Oct 2008, Darren Sykes wrote:
I'd say it was probably the bug; in theory dedup should increase performance in that situation, as the block would be stored in the cache and therefore we wouldn't need to go to disk to get that data.
Is this true?
Is memory "de-dup'ed" as well? That is, when you access a file through the cache, if it ends up at a de-dup'ed block, is it really the same block in *memory*, too? Or, is the path through memory/cache unique and only the final disk store de-dup'ed?
Until next time...
The MathWorks, Inc. 508-647-7000 x7792 3 Apple Hill Drive, Natick, MA 01760-2098 508-647-7001 FAX tmerrill@mathworks.com http://www.mathworks.com ---
I don't know :)
In theory it should work. When you request data you're not requesting it from the disk, you're asking for a specific block. If that block is in cache it should not matter whether it's deduped (and pretending to be several blocks) or not.
-----Original Message----- From: Todd C. Merrill [mailto:tmerrill@mathworks.com] Sent: Thursday, November 13, 2008 12:04 PM To: Darren Sykes Cc: Glenn Walker; Page, Jeremy; toasters@mathworks.com Subject: de-dup of memory? (was Re: Snapvault slow on one specific volume?)
No, not true (yet).
I was misinformed, though I do believe it's a planned feature. At the moment the filer's not bright enough to store the block once in cache.
________________________________
From: Page, Jeremy [mailto:jeremy.page@gilbarco.com] Sent: Thu 11/13/2008 17:05 To: Todd C. Merrill; Darren Sykes Cc: Glenn Walker; toasters@mathworks.com Subject: RE: de-dup of memory? (was Re: Snapvault slow on one specific volume?)
From what I understand, the filer isn't smart enough to recognize that multiple reads are coming in against the same deduplicated block, so it goes to disk for every read request. This can currently be alleviated with PAM cards in the filer; otherwise I believe it is mostly "fixed" in 7.2.6 (before it was pulled) and the upcoming 7.3.1.
It's really pronounced in VMware environments if you test it with a boot storm (fire up a bunch of deduplicated VMs at once) or deploy VMs from a template.
-----Original Message----- From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Todd C. Merrill Sent: Thursday, November 13, 2008 11:04 AM To: Darren Sykes Cc: Glenn Walker; Page, Jeremy; toasters@mathworks.com Subject: de-dup of memory? (was Re: Snapvault slow on one specific volume?)
I think that if you clone from template it creates a full disk, although I'm not certain of that.
________________________________
From: Glenn Walker [mailto:ggwalker@mindspring.com] Sent: Monday, October 13, 2008 10:24 PM To: Darren Sykes; Page, Jeremy; toasters@mathworks.com Subject: RE: Snapvault slow on one specific volume?