We saw this on a 3070 running 8.0.2, but the culprit was *aggregate* snapshots. Every day at 2pm, I/O on the filer would grind to a halt for 30-60 seconds while disk utilization ran up to 100%. Your problem sounds very much like ours, so I would make sure to clarify which type of snapshots they mean. Aggregate snapshots served no purpose for us, so we turned them off, and the problem went away. As a bonus, we recovered the reserved space.
Are you seeing this every hour, or just 2 or 3 times a day? Type "snap sched -A" on your command line and see whether the times listed there match the times at which you are seeing the delays.
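If you have a log of when the stalls started, a few lines of Python can do the comparison for you. This is only a rough sketch: the function name, the 10-minute window, the schedule hours, and the timestamps are all made-up examples, and it assumes snapshots kick off at the top of the scheduled hour. Copy in the hours you see in your own "snap sched -A" output by hand.

```python
from datetime import datetime


def stalls_near_schedule(stall_times, snapshot_hours, window_minutes=10):
    """Return the stalls that began within window_minutes after a scheduled hour.

    stall_times    -- datetimes at which an I/O stall was observed
    snapshot_hours -- set of hours (0-23) taken from the snapshot schedule
    window_minutes -- hypothetical tolerance; tune it to your environment
    """
    matches = []
    for ts in stall_times:
        # Assumption: snapshots start at the top of the hour, so a stall
        # "matches" if it began in a scheduled hour, shortly after :00.
        if ts.hour in snapshot_hours and ts.minute < window_minutes:
            matches.append(ts)
    return matches


# Hypothetical example data, not real measurements.
snapshot_hours = {9, 14, 19}
stall_times = [
    datetime(2012, 10, 8, 14, 1),
    datetime(2012, 10, 8, 16, 30),
    datetime(2012, 10, 9, 14, 2),
]

matched = stalls_near_schedule(stall_times, snapshot_hours)
print(f"{len(matched)} of {len(stall_times)} stalls line up with the schedule")
```

If most of your stalls fall inside the window, the aggregate snapshot schedule is a likely suspect. If they are scattered across the hour, look elsewhere.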
On 10/9/12 1:07 AM, Peter D. Gray wrote:
I have a question. A month or so back we had massive performance problems on our 3170A, and the disks (SATA drives) were at 100% utilization for no apparent reason.
NetApp eventually tracked down the problem to snapshot deletion. It seems that blocks are zeroed as they are returned to the aggregate. It was taking longer than 60 minutes to delete the hourly snapshots.
So, my questions are:
Has anybody else seen this problem and been forced to avoid hourly snaps?
Is there any way to disable the zeroing of freed blocks?
Has anybody heard of a plan for a fix which would allow us to use hourly snaps again? It seems bizarre that a useful feature, which we have used for a long time, has become unusable.
Basically, it appears I/O is doubled because the blocks are written once when used and then again when freed.
Regards, pdg