Two scenarios:
1. R200, 7.2.2, two 12 TB aggregates, 85 volumes of various sizes, 70-80% full; snapvault secondary, ALWAYS receiving snapvaults and nearly ALWAYS spinning to tape.
2. 6070, 7.2.4, two 8 TB aggregates, 27 volumes between 150 and 700 GB, 60-70% full; recent migration (via volume snapmirrors) from 980's, serves home directories to NFS and CIFS clients.
We stopped all snapmirrors/snapvaults to the R200, terminated CIFS, and shut off NFS. CPU utilization remained 90%+ and disk reads remained high for the next 6-7-8 hours until it finally quieted down.
After volume snapmirroring the home dirs from the 980's to the 6070, we re-pointed the snapvaults (modify; start -r) and allowed NFS and CIFS access. 5 days later, file server performance still blows (worse than the 980's -- D'oh! Egg, meet Face; Face, meet Egg), with moderate CPU load 30-40-50% but high disk reads.
In both cases, we determined that lots of WAFL scans were going on:
- container block reclamation
- volume deswizzling
- active bitmap rearrangement (of course...this NEVER seems to stop)
and a few other types that have already scrolled out of my buffer.
So, we figure--eventually--the WAFL scans will catch up and performance will go back to normal on the 6070. The performance of the R200 always blows.
So...here finally is the question: do toasters need a day off?
Said another way, to maintain acceptable performance and low/fast response times, do toasters need a goodly amount of (relatively) idle time in order for these WAFL scans to complete?
Could the sub-optimal performance of another group of six 6070's that we have, that we *pound on* 7x24, be explained in part by always having at least the container block reclamation scan going on?
Until next time...
The MathWorks, Inc. 508-647-7000 x7792 3 Apple Hill Drive, Natick, MA 01760-2098 508-647-7001 FAX tmerrill@mathworks.com http://www.mathworks.com ---
Hi
I don't know whether toasters need a day off, though I always appreciate one.
My experience has been that there are several bugs in pre-7.2.4 releases which can cause lots of WAFL scans and generally poor performance; so my advice here would be to upgrade (preferably both) the boxes to 7.2.5.1 and then recheck performance.
cheers Kenneth
Date: Fri, 26 Sep 2008 12:41:54 -0400
From: tmerrill@mathworks.com
To: toasters@mathworks.com
Subject: do toaster need a day off? (WAFL scans)

> So...here finally is the question: do toasters need a day off?
Container block reclamation (CBR) scans kick off when a snapshot is deleted. It's an asynchronous (in 99% of cases) way to free blocks that are no longer owned by any snapshot or the active file system. The only time this should become a problem is if there is a lot of data churn and a very tight snapshot schedule. Also, having many volumes take, and therefore roll off, snapshots at the same time can create some CPU-based performance issues.
Deswizzling, of course, has to do with flexvol VSM destinations. If you complete a massive baseline with a lot of high-delta snapshots, it may be worth giving the filer a bit of downtime to work its way up the snapshot list and complete these scans. Once data is deswizzled, you should see a read performance benefit if you are reading from the VSM destinations or cascading the snapmirror. We added a "checkpoint" system in 7.2.4p6 which makes the deswizzler much more efficient in that it does not reset back to the earliest snapshot each time a new VSM transfer completes. It remembers which snapshot it was working on, and where it was.
Active Bitmap scans, *from what I understand*, prevent a theoretical problem from occurring (meaning one that has NEVER happened, but someone deemed that it could). I have never seen these scans create any type of performance problem whatsoever. I learned to block them out after a couple of years of looking at perfstat output.
(need to verify the 7.2.4p6 version on the deswizzler checkpoint, I can't for the life of me find that information right now)
In short, the answer is "no, not usually", but if you have some concerns, I would open a case with support and have them take a look at things.
Hope this helps a bit.
- Michael Strickland, NetApp Support
-----Original Message-----
From: Todd C. Merrill [mailto:tmerrill@mathworks.com]
Sent: Friday, September 26, 2008 12:42 PM
To: toasters@mathworks.com
Subject: do toaster need a day off? (WAFL scans)

> So...here finally is the question: do toasters need a day off?
---
Folks, Before I write one myself or modify the NFS top tool on the NOW toolchest:
http://now.netapp.com/NOW/download/tools/ntaptop/
does anybody have an equivalent one for `qtree stats`?
Until next time...
---
I don't have a filer in front of me to test, but isn't there a `stats show qtree`?
On 12/9/08 6:16 PM, "Todd C. Merrill" tmerrill@mathworks.com wrote:
> does anybody have an equivalent one for `qtree stats`?
You mean qtree stats?
It shows CIFS and NFS ops per qtree - doesn't work at the vol level (wish it would).
Glenn
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Nicholas Bernstein
Sent: Wednesday, December 10, 2008 1:00 PM
To: Toasters
Subject: Re: qtree "top" tool

> I don't have a filer in front of me to test, but isn't there a stats show qtree?
Yes, I know about `qtree stats`. But, just like nfsstat, it gives you cumulative stats.
The NFS top tool I mentioned does an incremental nfsstat. I'm looking for the same--*incremental* qtree stats, in order to debug high usage as it is happening.
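For what it's worth, the incremental idea can be sketched in a few lines: sample the cumulative counters twice and subtract. This is only a rough sketch, not the NOW NFS top tool and not anything NetApp ships. It ASSUMES that each data row of `qtree stats` output is "volume qtree nfs_ops cifs_ops" separated by whitespace (so qtree names containing spaces would be missed), and the names `parse_qtree_stats`, `delta`, `top`, `watch`, and the `fetch` callable are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]        # (volume, qtree)
Ops = Tuple[int, int]        # (nfs_ops, cifs_ops)


def parse_qtree_stats(text: str) -> Dict[Key, Ops]:
    """Parse cumulative counters out of `qtree stats` output.

    ASSUMPTION: each data row is "volume qtree nfs_ops cifs_ops",
    whitespace-separated. Header and separator rows are skipped.
    """
    counters: Dict[Key, Ops] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[2].isdigit() and fields[3].isdigit():
            counters[(fields[0], fields[1])] = (int(fields[2]), int(fields[3]))
    return counters


def _diff(now: int, before: int) -> int:
    # A counter that went backwards was presumably reset; count from zero.
    return now - before if now >= before else now


def delta(prev: Dict[Key, Ops], curr: Dict[Key, Ops]) -> Dict[Key, Ops]:
    """Ops per qtree between two samples (qtrees new in curr start at 0)."""
    result: Dict[Key, Ops] = {}
    for key, (nfs, cifs) in curr.items():
        prev_nfs, prev_cifs = prev.get(key, (0, 0))
        result[key] = (_diff(nfs, prev_nfs), _diff(cifs, prev_cifs))
    return result


def top(deltas: Dict[Key, Ops], n: int = 10) -> List[Tuple[Key, Ops]]:
    """The n busiest qtrees by combined NFS + CIFS ops, busiest first."""
    return sorted(deltas.items(), key=lambda item: sum(item[1]), reverse=True)[:n]


def watch(fetch: Callable[[], str], interval: float = 10.0, n: int = 10) -> None:
    """Print the busiest qtrees every `interval` seconds until interrupted.

    `fetch` is a hypothetical callable returning the current `qtree stats`
    text, however you choose to run the command on the filer.
    """
    prev = parse_qtree_stats(fetch())
    while True:
        time.sleep(interval)
        curr = parse_qtree_stats(fetch())
        for (volume, qtree), (nfs, cifs) in top(delta(prev, curr), n):
            print(f"{volume:<16} {qtree:<24} {nfs:>10} {cifs:>10}")
        print()
        prev = curr
```

Keeping the parsing, the subtraction, and the polling loop separate means only `parse_qtree_stats` needs adjusting if the real output differs from the assumed layout.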
On Wed, 10 Dec 2008, Glenn Walker wrote:
> You mean qtree stats?
> It shows CIFS and NFS ops per qtree - doesn't work at the vol level (wish it would).
Until next time...
---
This is a good point.
Using the perfstat tool will give you some running incrementals (it zeroes for each iteration), but I think you want more of a tool like cifs top, no? I don't see why NetApp couldn't easily write something like this, and it would be a good tool to have: sysstat only goes so far.
Glenn
-----Original Message-----
From: Todd C. Merrill [mailto:tmerrill@mathworks.com]
Sent: Friday, December 12, 2008 2:25 PM
To: Glenn Walker
Cc: Nicholas Bernstein; Toasters
Subject: RE: qtree "top" tool

> I'm looking for the same--*incremental* qtree stats, in order to debug high usage as it is happening.
---