I have several concerns about the migration process from an AlphaServer 2100 (Digital UNIX 4.0D) to a NetApp 740. If anyone has specific information or suggestions about any of these issues, please mail me individually, or the list if that abides by list policy.
1) Filesystem layout
We have about 5000 user home directories and several (maybe 10) other logical areas that we will be migrating. Total disk space used initially will be small, about 35G. My guess is that it would be best to keep user directories as a logically separated area (from, say, Alpha binaries) because of differences in default quotas and snapshot and backup policies. Does this sound reasonable, or could it be best to lump them together?
2) Large directory constraints
Also, for the user areas, what are the practical constraints on the number of files in a directory? I am most used to AdvFS, under which large directories can be slow to open and read, if a user tries to do something like tab completion on a filename in a directory of 1000 entries.
Will I see the same kind of behavior with WAFL, and if so, does anyone have a good suggestion on breaking up the user space? We were thinking of using directory structure organization (nothing at logical partition level) for this.
3) Quotas
How sophisticated are the quota manipulation tools? Are there utilities for changing whole sets of quotas, or do admins just use homemade scripts for this? (And if so, does anyone have a pointer to a useful script set?) Also, any suggestions on the best way to migrate AdvFS quotas on user directories? I can foresee the C or scripting part of the AdvFS side of this, but I have no idea what tools we even have to work with on the WAFL side.
4) Backups
I am used to vdump and amanda, which don't split single backup sets across multiple tapes. Do the native filer backup utilities handle that? And are backups on a per logical partition basis?
5) Snapshot policies
These are defined on a per logical or per physical partition basis?
We will be getting the filer documentation before the filer, so that should answer my questions about snapshot policies, backup and quota utilities. But any additional info on these, pointers to public scripts, and insight on the file organization issues is appreciated. Also, references to other info sources are appreciated.
Thanks.
Filesystem layout
We have about 5000 user home directories and several (maybe 10) other logical areas that we will be migrating. Total disk space used initially will be small, about 35G. My guess is that it would be best to keep user directories as a logically separated area (from, say, Alpha binaries) because of differences in default quotas and snapshot and backup policies. Does this sound reasonable, or could it be best to lump them together?
In general, bigger volumes are easier to manage, because you have fewer of them.
Q-trees are a good tool for managing space within a single volume. You can assign different default quotas in different q-trees, and many people also do backup on a q-tree basis.
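For example, an /etc/quotas fragment along these lines (field layout quoted from memory, so check it against the filer documentation) assigns different per-user defaults in two q-trees:

```
#Quota target        type                 disk
*                    user@/vol/vol0/home  50M
*                    user@/vol/vol0/src   200M
/vol/vol0/home       tree                 20G
```

The `*` lines set a default quota for any user under that q-tree; the `tree` line caps the q-tree as a whole.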
(Q-trees used to be called "quota trees", since they let you assign a quota to a top-level subtree in the filesystem, but now we've added other functionality as well, such as control for UNIX versus NT style file security, so we just call them q-trees.)
However, snapshot schedules do apply on a per-volume basis, so to keep things in a single volume you will need to come up with a snapshot schedule that satisfies all users.
Large directory constraints
Also, for the user areas, what are the practical constraints on the number of files in a directory? I am most used to AdvFS, under which large directories can be slow to open and read, if a user tries to do something like tab completion on a filename in a directory of 1000 entries.
Will I see the same kind of behavior with WAFL, and if so, does anyone have a good suggestion on breaking up the user space? We were thinking of using directory structure organization (nothing at logical partition level) for this.
We did quite a bit of performance work on large directories back when NetCom still had all of their users' mailboxes in one large /usr/spool/mail. At 10K users (this was a *long* time ago) it started getting slow, and at 30K names, it really sucked. Somewhere between 30K and 100K users, we went in and re-worked our directory code to use a hashing scheme that's much more efficient, although NetCom (and other large ISPs) eventually moved away from the super-giant mail spool directory.
There are definitely performance issues with super-giant directories, but I don't think of 1K or even 10K as super giant. On the other hand, applications that *sort* all the names in a directory as they display it (like "ls" for instance, or "echo *") can get slow for reasons that have nothing to do with the file server itself. You might want to experiment.
Dave
[Experience with ISP-class mail spool management elided]
There are definitely performance issues with super-giant directories, but I don't think of 1K or even 10K as super giant. On the other hand,
[...]
It has been my experience that, when dealing with large directories (for values of large in the 10^5-10^7 file range), the most cost-effective performance optimization technique is to re-engineer the software in question to use another method. Splitting into different directories is usually the solution. For mail, that means using a subdirectory of $HOME as NetCom did; for file caches, in my case, doing progressive /[0-9]/[0-9]/[0-9]/whatever trees worked for a while. We eventually moved to Oracle. While Oracle isn't fast, it at least has a predictable, and very gradual, degradation curve.
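A minimal sketch of that progressive-bucketing idea (a hypothetical helper, in Python; here the bucket digits come from a hash of the name, though a numeric key's own digits work just as well):

```python
import hashlib
import os

def bucketed_path(root, name, depth=3):
    """Spread files across /[0-9]/[0-9]/[0-9]/... subtrees so that no
    single directory grows into the 10^5+ entry range. Each level has
    at most 10 subdirectories, so depth=3 gives 1000 buckets."""
    digest = hashlib.md5(name.encode()).hexdigest()
    # map the first `depth` hex digits of the hash onto decimal buckets
    buckets = [str(int(c, 16) % 10) for c in digest[:depth]]
    return os.path.join(root, *buckets, name)
```

The same name always hashes to the same bucket, so lookups need no index; the cost is that a plain `ls` of the cache no longer shows everything in one place.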
-j
On Thu, 25 Mar 1999, Joanna Gaski wrote:
Filesystem layout
My guess is that it would be best to keep user directories as a logically separated area (from, say, Alpha binaries) because of differences in default quotas and snapshot and backup policies. Does this sound reasonable, or could it be best to lump them together?
There are several levels of separating your data using NACs. From less granular to more granular they are: volume, quota tree, directory. This is also reflected in their quota system. I would suggest that you use one volume and several qtrees, so in a way you're lumping all the stuff together. I believe you want to snap binaries less often than user directories, because binaries usually don't change much. On the other hand, since they don't change much, snapping them as often as user home directories won't consume much space either.
Large directory constraints
Also, for the user areas, what are the practical constraints on the number of files in a directory? I am most used to AdvFS, under which large directories can be slow to open and read, if a user tries to do something like tab completion on a filename in a directory of 1000 entries.
Filename completion is done on the client, not the server, so that's really up to how fast your interactive system can process the data. The NAC should simply send the directory listing, which even with 1000 entries should be only a couple of kilobytes of data. The NAC has to search the directory structure only when you actually perform certain operations on the file, i.e. open, delete, etc. Once you open the file, the NAC should have no need to resolve the name again, unless someone else decides to open it.
Will I see the same kind of behavior with WAFL, and if so, does anyone have a good suggestion on breaking up the user space? We were thinking of using directory structure organization (nothing at logical partition level) for this.
The only issue I have seen with WAFL is that deleting a very deep (20,000-40,000 levels) directory structure brings the filer to a crawl. I spoke with some of the guys at NetApp about it. One said that delete operations are very expensive on any filesystem, though they shouldn't be that expensive on WAFL; the other (considerably higher up in the company) confirmed the latter. One of my fellow admins tells me that a product we use may create similarly obscenely deep (a couple thousand levels) directory structures. I have yet to find time to play with my abusive methods some more and provide hard evidence that something is wrong. So far I have noticed that the inode cache statistics are severely broken on the NAC in 5.2.1P1: the box claimed a cache a couple thousand percent efficient. I thought anything above 100% implied clairvoyance.
Quotas
How sophisticated are the quota manipulation tools? Are there utilities for changing whole sets of quotas, or do admins just use homemade scripts for this?
Quotas are really easy to manage; they are stored in a flat text file. However, it is a bummer that you have to run a quota resize on the NAC if you want the changes to take effect. I hear that NAC is planning to eliminate this object of annoyance.
(And if so, does anyone have a pointer to a useful script set?) Also, any suggestions on the best way to migrate AdvFS quotas on user directories? I can foresee the C or scripting part of the AdvFS side of this, but I have no idea what tools we even have to work with on the WAFL side.
If you can easily read the quotas for all users, you simply translate them into the quota file's very simple format. It should be only a couple of minutes of work for anyone even remotely capable of writing scripts. NetApp made this very easy.
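A sketch of what that translation could look like, assuming the AdvFS side has already been dumped to simple "username kbyte-limit" lines (the /etc/quotas field layout below is from memory and should be checked against the filer docs; the volume path is made up):

```python
def advfs_to_netapp(lines, volume="/vol/home"):
    """Translate 'username kbyte_limit' pairs (e.g. from a quota dump on
    the AdvFS side) into NetApp /etc/quotas-style lines following the
    'target type disk' shape. Blank lines and comments are skipped."""
    out = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        user, kbytes = line.split()[:2]
        out.append(f"{user}\tuser@{volume}\t{int(kbytes)}K")
    return out
```

After writing the result into /etc/quotas on the filer, you would still need the quota restart/resize step mentioned above before it takes effect.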
Backups
I should know this, but our backup team takes care of it.
Do the native filer backup utilities handle that? And are backups on a per logical partition basis?
I believe they can be done any way you want; there really are no physical partitions in WAFL. I believe you can dump a whole volume, a snapshot, a quota tree, a directory, a file, or some combination of those. Correct me on this, please.
Snapshot policies
These are defined on a per logical or per physical partition basis?
Well, like I said, there are really no physical partitions. Snapshot schedules are defined on a per-volume basis.
Also, references to other info sources are appreciated.
I invite you to read the white paper on the WAFL filesystem. That should answer more questions for you than the documentation:
http://www.netapp.com/technology/level3/3002.html
It may not be clear to some from the paper that snapshots only copy some metadata; that's why they are so fast. The filesystem never overwrites a block that is still in use by any snapshot, including the current filesystem view. Instead, changed data is written to another block, and a duplicate inode reflects the new location of the data. The snapshot simply keeps the list of old inodes. If the benefit of this is not clear to you, I can explain it in more detail on this forum or in private. Perhaps someone from NAC can do it better (please send me a copy of your explanation, if you do).
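A toy model of that copy-on-write idea (purely illustrative Python with invented names; it has nothing to do with the actual WAFL code):

```python
class ToyWAFL:
    """Toy copy-on-write model: a 'snapshot' is just a frozen copy of
    the inode map (name -> block number), never of the data blocks.
    Writes always go to fresh blocks, so blocks still referenced by a
    snapshot are left untouched."""

    def __init__(self):
        self.blocks = {}     # block number -> data
        self.inodes = {}     # filename -> block number (live view)
        self.snapshots = {}  # snapshot name -> frozen inode map
        self.next_block = 0

    def write(self, name, data):
        # never overwrite in place: allocate a new block for the data
        self.blocks[self.next_block] = data
        self.inodes[name] = self.next_block
        self.next_block += 1

    def snapshot(self, snap):
        # metadata-only: copy the inode map, not the data blocks
        self.snapshots[snap] = dict(self.inodes)

    def read(self, name, snap=None):
        inodes = self.snapshots[snap] if snap else self.inodes
        return self.blocks[inodes[name]]
```

Taking the snapshot costs one dictionary copy regardless of how much data the files hold, which is the "only some metadata" point above.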
Tom
On Fri, 26 Mar 1999 tkaczma@gryf.net wrote:
Quotas are really easy to manage, they are stored in a flat text file. However, it is a bummer that you have to run a quota resize on the NAC if you want the changes to take effect. I hear that NAC is planning to eliminate this object of annoyance.
Let me elaborate a bit on this. If you increase or decrease an existing quota, you do a quota resize, which is a pain, but it is fairly quick. If you add or remove a quota from the configuration, you need to restart the quota system on the NAC. This is a slow and thus painful process. In both cases you must log in or rshell to the NAC.
Tom