It was burt 18506, and is fixed in 5.3.5 (and all subsequent releases).
Wafl_check will not detect it.
Joan Pearson
At 03:02 PM 8/29/00, you wrote:
On Tue, 29 Aug 2000, Steve Losen wrote:
In some of the older versions of DOT there were some bugs in the WAFL directory format. One bug allowed two different files to be created in the same directory with the exact same name. Even though our version of DOT no longer has this bug, our volume still had three such pairs of files on it.
I forgot to mention earlier that when running our giant "find" to find duplicate filenames, we hit some files that could not be accessed via NFS. All the files had characters in their names with numeric codes greater than 0177 (dec 127), i.e., the high order bit was set. Not all files with such characters displayed the problem. The filenames had extensions such as .doc which indicates they were probably created with CIFS.
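As a rough illustration of the "high order bit set" test, here is a minimal Python sketch that flags names containing any byte greater than 0177. The function names are hypothetical and Python is not what the original search used. As noted above, only some such files showed the problem, so a match is a candidate, not a confirmed bad file.

```python
import os


def has_high_bit(name: bytes) -> bool:
    """True if any byte in the name is greater than 0177 (decimal 127)."""
    return any(b > 0o177 for b in name)


def high_bit_names(directory):
    """List entries in `directory` whose raw names contain high-bit bytes."""
    # A bytes path makes os.listdir return undecoded bytes names,
    # so no character-set guess is applied to the filenames.
    return [n for n in os.listdir(os.fsencode(directory)) if has_high_bit(n)]
```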
You could list the directory and see the files, but if you tried anything that accessed that particular file, you got "file not found". So if you did "ls" you saw the file, but if you did "ls -l", you got the error "foo: file not found". This is because ls -l must stat() each file to get owner, permissions, etc., and the stat() call was failing.
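The "ls works, ls -l fails" symptom can be checked for directly: list the directory, then stat each name and note the ones that come back "not found". A minimal Python sketch, assuming the client reports the failure as ENOENT (FileNotFoundError in Python); the function name and the injectable `stat` parameter are illustrative only.

```python
import os


def listed_but_unreachable(directory, stat=os.lstat):
    """Return names that a directory listing shows but stat cannot find."""
    dir_bytes = os.fsencode(directory)
    missing = []
    for name in os.listdir(dir_bytes):            # what plain "ls" sees
        try:
            stat(os.path.join(dir_bytes, name))   # what "ls -l" needs per entry
        except FileNotFoundError:
            missing.append(name)
    return missing
```

On a live filesystem, a file removed between the listing and the stat would be reported too, so an empty result is more meaningful than a non-empty one.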
Fortunately, dump/restore did not have problems with these files. We needed to move the old volume, and after we copied it to a new volume the files worked properly there. I presume this is another WAFL directory format bug left over from an earlier release.
So from our experience, it appears that volumes created prior to 5.3.5 may have some "cruft" in them and one way to ferret it out is to run a big find and see if find reports any "not found" errors or lists any duplicates. It might be best to run this on a snapshot since a user could remove a file out from under find on the live filesystem.
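The same sweep can be sketched in Python instead of find. This is only an illustration under assumptions: it presumes that a duplicated name appears twice in the directory listing the client receives, and that an inaccessible file fails with "not found" on stat. The names `duplicate_names` and `scan_tree` are made up for the example.

```python
import os
import stat
from collections import Counter


def duplicate_names(names):
    """Return the names that occur more than once in one directory listing."""
    return sorted(n for n, count in Counter(names).items() if count > 1)


def scan_tree(root):
    """Walk `root`; return (duplicate paths, paths that could not be found)."""
    duplicates, not_found = [], []
    stack = [os.fsencode(root)]
    while stack:
        directory = stack.pop()
        try:
            names = os.listdir(directory)
        except OSError:
            not_found.append(directory)
            continue
        for name in duplicate_names(names):
            duplicates.append(os.path.join(directory, name))
        for name in set(names):        # visit each distinct name once
            path = os.path.join(directory, name)
            try:
                info = os.lstat(path)
            except FileNotFoundError:
                not_found.append(path)
                continue
            if stat.S_ISDIR(info.st_mode):
                stack.append(path)     # descend without following symlinks
    return duplicates, not_found
```

Pointing `scan_tree` at a snapshot directory rather than the live tree avoids false "not found" reports from files deleted mid-scan.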
We have found that running several finds in parallel on different subtrees of the volume is much faster than a single find.
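One way to get the same effect from a single script is to hand each top-level subdirectory to its own worker. The sketch below uses a thread pool and a simple entry counter as a stand-in for the real per-subtree check; the names `count_entries` and `parallel_scan` and the default of four workers are assumptions for illustration, not measurements from our setup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def count_entries(subtree):
    """Stand-in for one "find": walk a subtree and count what it visits."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(subtree):
        total += len(dirnames) + len(filenames)
    return total


def parallel_scan(root, workers=4, scan=count_entries):
    """Run `scan` on each top-level subdirectory of `root` concurrently."""
    subtrees = [entry.path for entry in os.scandir(root)
                if entry.is_dir(follow_symlinks=False)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(scan, subtrees))
    return dict(zip(subtrees, results))
```

Files sitting directly in `root` are not covered by this split and would need a separate pass. Threads are a reasonable fit because each worker spends most of its time waiting on the filesystem.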
Unfortunately, the "inaccessible from NFS" file problem cannot be fixed with NFS. In our case, copying the volume fixed it. You may need to use ndmpcopy to make a new copy of the afflicted directory or you may be able to fix it by copying or renaming the file with CIFS.
When NetApp fixes a bug in WAFL, it would be nice if they would also provide some warning that old volumes may still have problems, and also provide a means to remedy them. As I pointed out earlier, incremental dumps of volumes with duplicate filenames cannot be restored with "restore -r". Anyone backing up their filers with incremental dumps will want to be sure that their volumes are free of duplicates.
Anyone backing up their filers with NFS will want to be sure that all files are accessible from NFS. We only saw the problem on regular files, but I don't see why a directory name could not have the same problem, making that whole subtree inaccessible.
Steve Losen scl@virginia.edu phone: 804-924-0640
University of Virginia ITC Unix Support