While the subject is 'i2p', this isn't an email about the high utilization that the mapping itself causes; rather, it's about the aftermath of the i2p mapping process.
I recently performed an upgrade of some of our NearStore systems (used for email compliance, among other things), and was expecting a pretty serious utilization curve while the i2p mapping was performed as part of the upgrade (we upgraded from 7.0.6, not 7.1.X). This came and passed with high utilization of CPU and disk, but no real issues.

However, the i2p mapping apparently keeps the inode-to-path map for every file/inode in the metadata. We have volumes with terabytes of SnapLock data; the most heavily utilized holds about 110 million files (the second most utilized holds about 82 million). As this data has been marked as 'archive' in the Enterprise Vault system, it isn't being written to anymore, so I'm not terribly concerned with the SnapMirror update taking a while; however, it is throttled down pretty low, as our WAN is not yet up to snuff (an upgrade is coming later this year). Apparently, because i2p touches each inode, we have about 110 million * 4 KB (~420 GB) to transfer for the 'update' to our 4.2TB volume. Likewise, the 3.2TB volume with 82 million files had about 313 GB to transfer.
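If you want to run the same back-of-envelope estimate against your own volumes, it's just the file count times roughly 4 KB of i2p metadata per inode. A throwaway sketch follows; the 4 KB per inode is an approximation from what we observed, not an official figure.

# Rough estimate of the SnapMirror delta left behind by the i2p scan.
# The ~4 KB of changed metadata per inode is an approximation based on
# what we observed, not an official number.
KB_PER_INODE = 4

def i2p_delta_gib(file_count, kb_per_inode=KB_PER_INODE):
    """Approximate size, in GiB, of the metadata churn i2p leaves behind."""
    return file_count * kb_per_inode / (1024.0 * 1024.0)

print("%.0f GiB" % i2p_delta_gib(110e6))  # ~420 for the 4.2TB / 110M-file volume
print("%.0f GiB" % i2p_delta_gib(82e6))   # ~313 for the 3.2TB / 82M-file volume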
If you have volumes with a significant number of files, be aware that the SnapMirror update after the i2p process completes post-7.2.2 upgrade (or even 7.1.1, I suppose) will be fairly significant in size.
Hope this helps some of you out there so that you are not caught unaware.
'evening
Glenn
Thanks for the warning Glenn, that would surely have bitten me before too long.
-----Original Message-----
From: "Glenn Walker" ggwalker@mindspring.com
Date: Tue, 10 Jul 2007 22:37:51
To: toasters@mathworks.com
Subject: Lessons Learned - Upgrade to 7.2.2 (i2p)
On 7/11/07, Glenn Walker ggwalker@mindspring.com wrote:
I recently performed an upgrade of some of our nearstore systems (used for email compliance, among other things), and was expecting a pretty serious utilization curve while the i2p mapping was performed as part of the upgrade (we upgraded from 7.0.6, not 7.1.X). This came and passed with high utilization of CPU and DISK, but no real issues.
Hi, this is an interesting topic. A colleague and I upgraded a FAS3020c and an R200 to 7.2.3 last night, and the upgrade itself went smoothly. The FAS3020c had about 40 volumes per controller, totalling about 10-15 TB of data, most of it Oracle-on-NFS databases and medical imaging (PACS) volumes with many, many inodes per volume.
After the reboot, CPU and disk utilization were very high on both controllers. On one controller we had console access, but NFS/CIFS/web interface response was so slow that we couldn't mount anything for over an hour while all the volumes were doing the i2p mapping. The other controller was under high load too, but was able to serve clients immediately after the reboot.
On the "slow" controller, it took until 05:30 this morning for all volumes to finish the i2p mapping (we performed the upgrade at around 21:30 the day before).
Fortunately, we were more or less prepared for this. I do have two additional questions though:
- We were kind of surprised that one controller did so much worse than the other after the reboot, since from our observations the load on both systems, in terms of data sizes, inodes used, etc., is pretty similar. What determines the time required for the i2p recalculation to finish? Does it depend on the number of inodes used, the number of blocks used, the level of fragmentation? Free volume or aggregate space? What are the determining factors, so we can plan for this in future upgrades?
- Would there have been any way to throttle this, especially since we were on 7.2 (FlexShare)? All we could basically do was monitor the progress with "wafl scan status", but is it possible to run this as a low(er)-priority task, or to suspend it for a number of non-critical volumes? (See the sketch below for the sort of thing I mean.)
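To make that concrete, here is a rough sketch of what I have in mind, not something we actually ran: the filer name, volume names and SSH access are made-up placeholders, the chained "priv set -q advanced; ..." is an assumption about how the filer accepts commands, and whether FlexShare's per-volume priorities reach the i2p scanner at all is exactly what I'm asking.

#!/usr/bin/env python
# Sketch only: lowers FlexShare priority on some non-critical volumes and
# then polls the WAFL scanner status to watch the i2p progress.
import subprocess
import time

FILER = "filer-a"                            # placeholder hostname
NONCRITICAL = ["vol_pacs01", "vol_pacs02"]   # placeholder volume names

def on_filer(command):
    """Run one console command on the filer over SSH and return its output."""
    return subprocess.check_output(["ssh", FILER, command],
                                   universal_newlines=True)

# Ask FlexShare to deprioritize the non-critical volumes for the duration.
# (FlexShare also has a per-volume system= option meant to weight system
# work such as scanners against user work; whether either knob actually
# affects the i2p scanner is the open question.)
on_filer("priority on")
for vol in NONCRITICAL:
    on_filer("priority set volume %s level=low" % vol)

# Poll the scanners; "wafl scan status" requires advanced privilege on the
# console, hence the chained "priv set -q advanced". Stop with Ctrl-C.
while True:
    print(on_filer("priv set -q advanced; wafl scan status"))
    time.sleep(300)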
Thanks in advance, Filip
Wow, that's no good at all. Is there a BURT open for this at NetApp?
-Blake
I believe there is a BURT for the i2p mapping taking a while (or at least for the upgrade from 7.1 to 7.2 trashing the current i2p map and recreating it when that's probably not necessary).
I don't believe there is a BURT open for the huge metadata changes to the files - this should probably be a KB instead, but I've been unable to create one via NOW.
Perhaps some enterprising young TSE will create one for me?? I know they read this list ;)
Filip, to answer your questions (which I completely glossed over earlier):
We did not notice any actual degradation in service: while the CPU/disk load was pretty significant, the i2p work appeared to run as a low-priority process, since client load was unaffected. That said, the system as a whole is not typically heavily utilized from a general I/O perspective.
I know of no way to throttle this, but I'm hoping a NetApp person can point one out.