Hello fellow Toaster Users,
Last night I noticed that our F760 got a bit sluggish for the very first time. We are running 6.2.1R1D5. Around 1:00, when the scrub started, disk utilization went up to around 9600bB/s, with no more than 300 NFS ops and practically no network traffic. The scrub completed without finding any errors or problems on any of the disks. Here is some sysstat output:
 CPU   NFS  CIFS  HTTP    Net kB/s    Disk kB/s    Tape kB/s  Cache
                          in   out   read  write  read  write   age
 12%    52     0     0    13    12   1938    418     0      0    0s
 14%    55     0     0    13    11   2705    327     0      0    0s
 13%    34     0     0     8     8   2915    194     0      0    0s
 18%    55     0     0    11     9   3612    161     0      0    0s
 10%    18     0     0     4     3   2400    102     0      0    0s
 17%    80     0     0    17    14   3371    189     0      0    0s
 15%   106     0     0    24    22   2459    538     0      0    0s
 14%    39     0     0    11    16   1787    569     0      0    0s
 17%    65     0     0    15    22   2680    448     0      0    0s
 13%    54     0     0    14    16   1740    544     0      0    0s
 16%    54     0     0    11    15   2521    430     0      0    0s
 17%    90     0     0    19    22   2178    627     0      0    0s
 15%    62     0     0    18    27   1984    697     0      0    0s
 13%    50     0     0    12    15   1990    397     0      0    0s
As you can see, there is practically no network traffic and very few ops. Something that does have me a bit confused is the Cache Age... there is none? This system has been online for 400 days, 11:40 2000225709 NFS ops and I think it might just be time for a reboot. But before I reboot the system, I wanted to check with fellow users to see whether this is a DOT bug or just a common NetApp twitch. Thanks for your help and advice.
Best regards, Blake Folgner
For me, the disk activity is directly related to the scrub process: a scrub consists of reading the disks and rewriting whatever was badly read.
Stephane Bentebba stephane.bentebba@fps.fr writes:
For me, the disk activity is directly related to the scrub process: a scrub consists of reading the disks and rewriting whatever was badly read.
I'm not clear whether Blake's sysstat extract is from during the scrub or after it had finished. The disk read rates are tiny for an F760 doing a scrub: even the "around 9600bB/s" (should that be kB/s?) sounds very low. A scrub won't do any significant writing unless your discs are in a dire state (or it's an upgraded volume and this is the scrub that finishes off doing the zone checksums)!
The (very) significant performance issue that the sysstat shows is the "cache age" being 0s. A scrub doesn't normally have much effect on the cache age.
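To put numbers on those two observations, here is a minimal Python sketch that summarises the rows from Blake's extract. The field order (CPU, NFS, CIFS, HTTP, net in/out, disk read/write, tape read/write, cache age) is read off the header line of the extract; the function and key names are made up for illustration and are not part of any NetApp tool.

```python
# Rows copied from the sysstat extract quoted in this thread.
# Assumed field order, taken from the extract's header line:
# CPU NFS CIFS HTTP net-in net-out disk-read disk-write tape-read tape-write cache-age
SYSSTAT_ROWS = """\
12% 52 0 0 13 12 1938 418 0 0 0s
14% 55 0 0 13 11 2705 327 0 0 0s
13% 34 0 0 8 8 2915 194 0 0 0s
18% 55 0 0 11 9 3612 161 0 0 0s
10% 18 0 0 4 3 2400 102 0 0 0s
17% 80 0 0 17 14 3371 189 0 0 0s
15% 106 0 0 24 22 2459 538 0 0 0s
14% 39 0 0 11 16 1787 569 0 0 0s
17% 65 0 0 15 22 2680 448 0 0 0s
13% 54 0 0 14 16 1740 544 0 0 0s
16% 54 0 0 11 15 2521 430 0 0 0s
17% 90 0 0 19 22 2178 627 0 0 0s
15% 62 0 0 18 27 1984 697 0 0 0s
13% 50 0 0 12 15 1990 397 0 0 0s
"""


def parse_row(line):
    """Split one sysstat data row into the fields of interest (hypothetical names)."""
    f = line.split()
    return {
        "nfs_ops": int(f[1]),
        "disk_read_kb": int(f[6]),
        "disk_write_kb": int(f[7]),
        "cache_age": f[10],
    }


def summarize(text):
    """Return peak/mean disk read and NFS ops, plus the set of cache ages seen."""
    rows = [parse_row(line) for line in text.splitlines() if line.strip()]
    reads = [r["disk_read_kb"] for r in rows]
    ops = [r["nfs_ops"] for r in rows]
    return {
        "samples": len(rows),
        "peak_disk_read_kb": max(reads),
        "mean_disk_read_kb": sum(reads) / len(reads),
        "peak_nfs_ops": max(ops),
        "mean_nfs_ops": sum(ops) / len(ops),
        "cache_ages": {r["cache_age"] for r in rows},
    }


summary = summarize(SYSSTAT_ROWS)
for key, value in summary.items():
    print(key, value)
```

On these 14 samples the disk read rate never gets above a few thousand kB/s, and the cache age is 0s in every sample.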
So I'm left to speculate about
This system has been online for 400 days, 11:40 2000225709 NFS ops
and I think it might just be time for a reboot.
There could, I suppose, be something awful that happens after either 400 days or 2 * 10^9 NFS operations! (How does Blake manage so exactly 5000000 NFS operations a day, I wonder?) I admit to never achieving that much between reboots on any filer here.
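The "5000000 a day" remark can be checked with a few lines of Python. This is only a sketch of the arithmetic: it uses the whole-day figure of 400 days and ignores the "11:40" part of Blake's uptime line, since it is not clear from the post whether that is extra hours and minutes or a time of day.

```python
# Figures quoted from Blake's post.
TOTAL_NFS_OPS = 2_000_225_709
UPTIME_DAYS = 400          # assumption: whole days only, "11:40" ignored
SECONDS_PER_DAY = 24 * 60 * 60


def ops_per_day(total_ops, days):
    """Average NFS operations per day over the uptime."""
    return total_ops / days


def ops_per_second(total_ops, days):
    """Average NFS operations per second over the uptime."""
    return total_ops / (days * SECONDS_PER_DAY)


per_day = ops_per_day(TOTAL_NFS_OPS, UPTIME_DAYS)
per_second = ops_per_second(TOTAL_NFS_OPS, UPTIME_DAYS)

print(f"average ops/day:    {per_day:,.0f}")
print(f"average ops/second: {per_second:.1f}")
```

Under that assumption the long-run average comes out at a little over five million operations a day, or just under 58 a second.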
(Munge through archived logs... I see I made 292 days and 1296 million NFS ops back in 1998 but that was on an FAServer 450 and FASware 3.x. In more recent years I don't seem to have got above 133 days and 731 million NFS ops. Must be the frantic desire to keep up to date on ONTAP versions, I suppose.)
Chris Thompson Email: cet1@cam.ac.uk