Just thought I'd share a caveat that we ran into in the recent past in upgrading from 5.2.x to 5.3.x. It took two separate cases and 50+ emails and phone calls to figure out, so hopefully this info will save somebody else quite a bit of time. (Many thanks to Mike Smith in tech support for his patience and perseverance in resolving it.)
5.3 (?) introduced two new options, wafl.convert_ucode and wafl.create_ucode, to convert directories upon access and create directories from scratch with unicode bits, respectively, even if accessed/created via an NFS client. We updated our rc files with these options and proceeded with business as usual after the upgrade.
Hours later, our entire automated testing environment came to a screeching halt with "permission denied" and other bizarre errors. For instance, an `rm -rf` of a directory would fail, leaving files around, when all the file and directory permissions would have allowed the removal:
    % rm -rf build.removeme
    rm: Unable to remove directory build.removeme/test/toolbox/map/map: File exists
    rm: Unable to remove directory build.removeme/test/toolbox/map: File exists
    ...
Cut to the chase: after getting packet snoops and trying various things on different filers, and the usual troubleshooting, NetApp was able to reproduce the "problem." It wasn't a problem. The docs state, regarding unicode conversion of directories:
"For example, converting a directory of 100,000 files takes the filer about 9 minutes on a netapp f540."
I read this, and said to myself, "Self, we don't have any directories that big," so I dismissed as trivial the time it would take to convert our directories. We had maximum 2,000 files in our largest directories.
Well, even with a semi-honkin' F740, two generations distant from the aforementioned F540, we were seeing `rm -rf` failures on small directories, with only tens of files. And the clients doing the `rm -rf` weren't all that speedy: a 200 MHz Sun Ultra2, a 270 MHz Ultra5, and even a SPARC20, if I recall correctly (among other types of UNIX boxes, like 200 MHz Pentium Linux boxes).
Turns out, this unicode conversion takes a non-trivial amount of time for *any* directory. System call traces showed the directory-reading calls returning only some of the files in that directory. Apparently, during the conversion, the "new" directory list can be incomplete. Hence, an `rm -rf` first gets an (incomplete) directory listing, removes all the files it was told about, does a `cd ..`, and then tries an rmdir(), at which point, since there are still files in the directory, it spits out an error like the above.
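To make that failure mode concrete, here is a minimal Python sketch (not from the original thread). It mimics one directory level of `rm -rf`: take one listing, unlink what was listed, then rmdir. The `partial_listing` function is a hypothetical stand-in for a listing taken mid-conversion; it simply drops every other entry and does not claim to model what the filer actually returns. POSIX allows rmdir() to fail with either EEXIST ("File exists") or ENOTEMPTY when entries remain, so the sketch accepts both.

```python
import errno
import os
import shutil
import tempfile


def remove_listed(path, list_dir):
    """Remove only the entries that list_dir reports, then rmdir.

    Mirrors one directory level of `rm -rf`: a single listing,
    an unlink per listed entry, then rmdir on the directory.
    """
    for name in list_dir(path):
        os.unlink(os.path.join(path, name))
    os.rmdir(path)  # raises OSError if the listing missed any entry


def partial_listing(path):
    """Hypothetical incomplete listing: silently drops every other entry."""
    return sorted(os.listdir(path))[::2]


def demo():
    """Return (errno, leftover names) after removal with a partial listing."""
    root = tempfile.mkdtemp()
    target = os.path.join(root, "build.removeme")
    os.mkdir(target)
    for i in range(10):
        open(os.path.join(target, f"file{i}"), "w").close()
    try:
        remove_listed(target, partial_listing)
    except OSError as exc:
        # The files that were never listed are still there.
        return exc.errno, sorted(os.listdir(target))
    finally:
        shutil.rmtree(root, ignore_errors=True)  # clean up the scratch tree
    return None, []


if __name__ == "__main__":
    err, leftovers = demo()
    print(errno.errorcode.get(err), leftovers)
```

Every file permission in the sketch allows the removal, yet rmdir still fails, which matches the symptom described above.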
Once we figured this out, we tried to forcibly convert all the directories by doing a `find . -print > /dev/null` to traverse the entire filer directory structure. Well, this apparently didn't work in all cases (my notes get sparse here). Our final workaround was to go to a CIFS client (an NT box), right-click on each top-level folder on the filer and select "properties," and let it finish calculating the folder size, thus doing the unicode conversion as usual (the way it had to be done before the wafl options in 5.3.x). So much for the interpretation of "wafl.convert_ucode" working via NFS access to the directory.
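For anyone who wants to script the traversal rather than run `find`, the following is a rough Python equivalent of `find . -print > /dev/null`. The function name `touch_every_directory` is made up for illustration. It only reads each directory once and counts what it sees. As noted above, this kind of NFS-side access did not convert everything in our case, so treat it as a sketch of the traversal and not as a reliable way to force conversion.

```python
import os


def touch_every_directory(root):
    """Walk the tree under root, listing every directory once.

    Returns (directories_listed, entries_seen). Nothing is modified;
    reading each directory is the same access `find . -print` performs.
    os.walk skips directories it cannot read by default.
    """
    dirs_listed = 0
    entries_seen = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs_listed += 1
        entries_seen += len(dirnames) + len(filenames)
    return dirs_listed, entries_seen


if __name__ == "__main__":
    # "." stands for the mount point of the filer volume on the NFS client.
    print(touch_every_directory("."))
```

Comparing the counts across two runs is one cheap way to notice a listing that changed between passes.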
At the conclusion of the call, I asked that the docs be expanded and made clearer regarding unicode conversions for future releases.
To date, we have not had any further problems. The "access denied" messages when trying to access non-unicode directories (those created via an NFS client pre-5.3 or with the wafl.create_ucode option unset) from read-only snapshots via CIFS clients have been eliminated. Yea!
Hope this helps others.
Until next time...
The Mathworks, Inc.
3 Apple Hill Drive, Natick, MA 01760-2098
508-647-7000 x7792
508-647-7001 FAX
tmerrill@mathworks.com
http://www.mathworks.com
---
On Wed, 16 Feb 2000, Todd C. Merrill wrote:
> Just thought I'd share a caveat that we ran into in the recent past in upgrading from 5.2.x to 5.3.x.
>
> 5.3 (?) introduced two new options, wafl.convert_ucode and wafl.create_ucode, to convert directories upon access and create directories from scratch with unicode bits, respectively, even if accessed/created via an NFS client.
>
> "For example, converting a directory of 100,000 files takes the filer about 9 minutes on a netapp f540."
This conversion took close to a day and a half on a 630 full of hundreds of thousands of files of gzipped archive data. Note that you can monitor the conversion by watching sysstat: there's a steady 300-400 disk reads per second going on while the conversion is being done.
Hal Siegel
Systems Engineering
Texas Microprocessor Division
Advanced Micro Devices
hal@beast.amd.com
Can one of you explain the need for this anyway ?
Eyal
Very strange. I would expect that any operations on a directory that was being converted would be "locked out" until the conversion was complete. The NFS or CIFS client would then either wait or time out as appropriate.
After all, isn't that how past directory format changes have been implemented?
Bruce