Hi,
I will try to address a couple of points, although I am sure that the folks at netapp will give you a much more academic answer:
wackz is similar to wack, but it checks zombie files as well as regular files (or should I say inodes). You might ask what a zombie file/inode is. Great question. As I understand it, it is a file that was only partially erased from the system, something marked like a "bad block". I expect NA to come up with a more sophisticated answer. The regular wack command does not check these files/inodes (at least in the Ontap versions available today).
The system we ran wackz on had 9345431 inodes and about 600GB of data, and the run took somewhere in the vicinity of 60 minutes using Ontap 5.3.2. This is a great improvement in run time, and it comes from the change in Ontap versions, not from a change in the wack command.
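As a rough sanity check, those figures can be turned into a throughput number. This is only a sketch: the function names are mine, the 60 minutes is an approximation, and 1057771 is the iused count for /vol/cim0a from the df -i output quoted below. It also assumes that wackz run time scales linearly with the inode count, which may well not hold.

```python
def inodes_per_hour(inodes, minutes):
    """Observed throughput in inodes per hour."""
    return inodes / (minutes / 60.0)


def minutes_at_rate(inodes, rate_per_hour):
    """Minutes needed to process `inodes` at `rate_per_hour` (linear scaling assumed)."""
    return inodes / rate_per_hour * 60.0


# Figures from our run: 9345431 inodes in roughly 60 minutes on Ontap 5.3.2.
observed_rate = inodes_per_hour(9345431, 60)

# Extrapolate to the 1057771 used inodes on /vol/cim0a (assumption: linear scaling).
estimate = minutes_at_rate(1057771, observed_rate)

print(f"observed rate: {observed_rate:,.0f} inodes/hour")
print(f"estimate for /vol/cim0a: {estimate:.1f} minutes")
```

At our observed rate this works out to roughly 7 minutes for a volume of that size, ignoring any fixed startup cost.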
I have heard that the wackz command will very likely become default in some future undisclosed Ontap release.
Jay Soffian wrote:
We recently upgraded our clustered F740 pair to 5.2.3 from 5.2.2. This necessitated an upgrade of the firmware from 2.1_a2 to 2.2_a2, as well as a disk firmware upgrade from FB37 to FB59.
On one of our filers, the first disk scrub that ran after the upgrade (6 days after the upgrade) found a bunch of parity inconsistencies:
Sun Oct 3 02:50:39 EDT [viking: consumer]: Scrub found 212 parity inconsistencies
Sun Oct 3 02:50:39 EDT [viking: consumer]: Scrub found 0 media errors
Sun Oct 3 02:50:39 EDT [viking: consumer]: Disk scrubbing finished...
Apparently, with 5.2.3 the filer now generates autosupport mail on disk scrubs, so a couple days after this scrub, I received email from NetApp that they had noticed the error. (This is two anomalous autosupport emails in a row that NetApp has opened cases on, so I'd like to acknowledge NA on that.)
NA's recommendation was to re-run a disk scrub (which I did, and it completed w/o finding any errors) and then to run wackz (which I haven't done).
Unfortunately, I cannot find _any_ documentation on wackz on NOW. I've also searched the toasters archive for wackz and didn't find much. In particular, I'd like to know how much downtime I'm going to be saddled with. NA claims that wackz can process 25 million inodes/hour. Given:
viking> df -i
Filesystem              iused      ifree  %iused  Mounted on
/vol/cim0a/           1057771    2593665     29%  /vol/cim0a/
that means wackz should complete in < 10 minutes. From what I've heard about wack, I find this hard to believe (I've heard stories of wack running for 8+ hours). Or am I missing something?
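The arithmetic behind that estimate, as a minimal sketch. The function name is made up for illustration, the 25 million inodes/hour rate is NA's claim as relayed above and not something I have measured, and the calculation assumes run time is simply proportional to the number of used inodes.

```python
def estimated_minutes(iused, inodes_per_hour):
    """Estimated run time in minutes, assuming time is proportional to used inodes."""
    return iused / inodes_per_hour * 60.0


CLAIMED_RATE = 25_000_000  # inodes/hour, per NA's claim
IUSED_CIM0A = 1_057_771    # iused for /vol/cim0a from df -i

minutes = estimated_minutes(IUSED_CIM0A, CLAIMED_RATE)
print(f"estimated wackz time for /vol/cim0a: {minutes:.1f} minutes")
```

That comes to about 2.5 minutes at the claimed rate, so even a rate several times slower would still land under 10 minutes.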
I'm interested in hearing from anyone who has run wackz on a similar configuration and how long it took to complete. This filer has 21 9GB disks. The cim0a volume is composed of 3 raid groups (5+1, 5+1, 4+1). The vol0 volume is composed of a single raid group (1+1).
BTW - I'm still not clear on the difference between wack and wackz. The previous explanation to toasters (wackz runs faster, does things in an optimized fashion) wasn't very enlightening. wack must do something above and beyond wackz (or the other way around), otherwise why would both still be included in DOT?
RFE: It would be nice if the filer could run a wack (read-only) and report inconsistencies so that I could check the file system w/o downtime. If the filer requires an immutable filesystem to run wack, then DOT should allow you to tag a whole volume as read-only.
I'd like to thank Ron Thibault of NA for his assistance so far.
j.
Jay Soffian                jay@cimedia.com
UNIX Systems Engineer      404.572.1941
Cox Interactive Media