I wanna talk this out.
What do you mean about "100,000 files in a directory will cause you performance problems"?
I was told by Network Appliance that their WAFFLE filesystem (or whatever) hashes the directory entries internally, so there are *no* scaling issues with regards to directories with many entries in them.
Are you wrong? Or is Network Appliance wrong?
Please help.
Ed Henigin ed@texas.net
PS: just what the heck do you mean by 'setting up a directory hashing scheme'? Are you talking about the MTA and/or POP3 servers having their own databases of filename->inode mappings? Your terminology seems vague; maybe that's where my confusion stems from.
---------- Forwarded message ----------
Date: Wed, 02 Jul 1997 17:54:13 +0100
From: Dom Mitchell hdm@demon.net
To: Alan Judge Alan.Judge@indigo.ie
Cc: toasters@mathworks.com
Subject: Re: Filers for a large mail environment
[...deletia...]
Shared maildrops are a different issue. If you use something like qmail[1], OTOH, there are simply no locking issues whatsoever (and easy POP3 support, but no IMAP). We do something similar (although proprietary) for about 100,000 users, and it works well. Don't forget to set up a directory hashing scheme, though, as 100,000 files in a directory will also cause you performance problems. :-)
-Dom
-----End of forwarded message-----
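The "no locking issues whatsoever" point in the forwarded message comes from qmail-style Maildir delivery: each message gets a unique filename, is written into tmp/, and is then atomically renamed into new/, so concurrent deliveries never touch the same file and no mailbox lock is needed. A minimal sketch of that pattern (plain Python with simplified naming, not qmail's actual code):

    # Maildir-style lock-free delivery, sketched for illustration only.
    # Each message is written under a unique name in tmp/ and then
    # rename()d into new/; rename is atomic within one filesystem, so
    # concurrent writers never need a lock on a shared mbox file.
    import os, socket, time

    def deliver(maildir, message_bytes):
        for d in ("tmp", "new", "cur"):
            os.makedirs(os.path.join(maildir, d), exist_ok=True)
        # Unique-ish name from timestamp, pid and hostname (simplified).
        unique = "%d.%d.%s" % (time.time(), os.getpid(), socket.gethostname())
        tmp_path = os.path.join(maildir, "tmp", unique)
        with open(tmp_path, "wb") as f:
            f.write(message_bytes)
            f.flush()
            os.fsync(f.fileno())
        new_path = os.path.join(maildir, "new", unique)
        os.rename(tmp_path, new_path)   # atomic; the message "appears" at once
        return new_path

    if __name__ == "__main__":
        print(deliver("demo-maildir", b"From: test\n\nhello\n"))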
On Wed, 2 Jul 1997, Edward Henigin wrote:
I wanna talk this out.
What do you mean about "100,000 files in a directory will cause you performance problems"?
NetApps (running the current code) have problems with directory files larger than 8 Mbytes. How many files this is depends on the length of the filenames. I believe that they are trying to fix this problem.
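How many files fit in an 8 Mbyte directory file depends on the per-entry overhead as well as on the name length. A rough back-of-the-envelope sketch, assuming a classic Unix-style directory entry of about 8 bytes of fixed overhead plus the name padded to a 4-byte boundary (an assumption for illustration; WAFL's real on-disk layout may differ):

    # Rough estimate of how many entries fit in an 8 Mbyte directory file.
    # The per-entry cost (8 bytes fixed overhead + name padded to 4 bytes)
    # is an assumed, classic-Unix-style figure, not WAFL's documented format.
    def entries_in_directory(dir_bytes, avg_name_len, overhead=8, align=4):
        padded = (avg_name_len + align - 1) // align * align
        return dir_bytes // (overhead + padded)

    for name_len in (8, 16, 32, 64):
        n = entries_in_directory(8 * 1024 * 1024, name_len)
        print("avg %2d-char names: roughly %d entries" % (name_len, n))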
I ran into this on a filer that was running INN. The control/cancel dir had grown to 10MB, and the filer was hanging periodically with 100% CPU utilization.
joel
Joel Gallun wrote:
On Wed, 2 Jul 1997, Edward Henigin wrote:
I wanna talk this out. What do you mean about "100,000 files in a directory will cause you performance problems"?
NetApps (running the current code) have problems with directory files larger than 8 Mbytes. How many files this is depends on the length of the filenames. I believe that they are trying to fix this problem.
For those on release 4.x.
If anyone running usenet news (or with a real need for this sort of thing) is interested, please email me directly for information about a kernel we are testing. We could use a couple more people to help test the kernel out, and you might win some performance in the process.
I would prefer candidates who can give feedback on performance within a fairly short period of time (i.e., your filer should be very busy as I write this note).
Thanks, Ken.
Edward Henigin wrote:
I wanna talk this out.
What do you mean about "100,000 files in a directory will cause you performance problems"?
I was told by Network Appliance that their WAFFLE filesystem (or whatever) hashes the directory entries internally, so there are *no* scaling issues with regards to directories with many entries in them.
Are you wrong? Or is Network Appliance wrong?
Dunno. :-) However, I have noticed problems with large directories on our system, which were causing very poor performance. One person mentioned that it might have been my use of NFSv3 instead of v2, which allegedly has a slower "readdir" call...
In any event, it turned out to be a Solaris bug, filling the directories with thousands of .nfsxxxx turds, which was solved by the appropriate patch...
Ed Henigin ed@texas.net
PS: just what the heck do you mean by 'setting up a directory hashing scheme'? Are you talking about the MTA and/or POP3 servers having their own databases of filename->inode mappings? Your terminology seems vague; maybe that's where my confusion stems from.
Sorry, I mean setting up the MTA, so that instead of having:
/var/spool/mail/luser
as the mailbox, you have:
/var/spool/mail/l/u/luser
Which keeps the directories small. You need to arrange for the users to have their mail drops set up this way. If they never log in, the easiest thing to do is to set their home directories to be that dir, and then deliver mail to their home directory. Again, easy with qmail.
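A minimal sketch of that scheme (plain Python, hypothetical helper name; the two-level /var/spool/mail/l/u/luser layout is just the example above):

    # Derive the maildrop path from the first characters of the username so
    # that no single directory ends up holding all 100,000 mailboxes.
    import os

    def maildrop_path(user, root="/var/spool/mail", levels=2):
        # Pad very short usernames so e.g. "x" still yields two components.
        key = (user + "__")[:levels]
        return os.path.join(root, *key, user)

    print(maildrop_path("luser"))   # /var/spool/mail/l/u/luser
    print(maildrop_path("ed"))      # /var/spool/mail/e/d/ed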
-Dom
Are you wrong? Or is Network Appliance wrong?
Dunno. :-) However, I have noticed problems with large directories on our system, which were causing very poor performance. One person mentioned that it might have been my use of NFSv3 instead of v2, which allegedly has a slower "readdir" call...
No, NFSv3 has the same READDIR call as NFSv2. The problem is that it also has a READDIR+ call, which not only does what READDIR does but also, in addition to a bunch of filenames, returns the result of a GETATTR on each of the objects.
That's great for "ls -l" which does a stat() (generating a GETATTR in NFSv2) on each file.
That's not so great for, e.g., netnews, which often lists the contents of a directory solely so it can find which articles really exist. It doesn't care about anything other than the name, but most NFSv3 clients can't figure that out and use READDIR+ anyway, uselessly going out to each inode (potentially a separate disk read for each one) to collect information that won't be used.
This has nothing to do with the size of the directory, except for the fact that you have more files to look at. If you looked at the same number of files spread over a bunch of directories with a few dozen files each instead of one huge directory, the problem would be the same.
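One way to picture the difference is the two kinds of directory consumer: one only wants names, the other stats every entry. A small local sketch (plain Python, nothing NFS-specific) of the two access patterns; READDIR+ effectively charges every caller for the second pattern's per-entry attribute work, even when, as with news, only the first was needed:

    # Two ways to read a directory: names only, versus names plus attributes.
    # The second touches every inode (one stat per entry), which is the work
    # READDIR+ does on the server whether or not the caller wanted it.
    import os, sys

    def names_only(path):
        return os.listdir(path)                      # directory data only

    def names_and_attrs(path):
        result = []
        for name in os.listdir(path):
            st = os.stat(os.path.join(path, name))   # extra lookup per file
            result.append((name, st.st_size, st.st_mtime))
        return result

    if __name__ == "__main__":
        d = sys.argv[1] if len(sys.argv) > 1 else "."
        print(len(names_only(d)), "names;",
              len(names_and_attrs(d)), "entries with attributes")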
-- Karl Swartz - Technical Marketing Engineer Network Appliance kls@netapp.com (W) kls@chicago.com (H)
In any event, it turned out to be a Solaris bug, filling the directories with thousands of .nfsxxxx turds, which was solved by the appropriate patch...
What bug number and patch is that? I've seen other customers asking about ".nfs" files with Solaris 2.x clients, in ways that suggest that they might be getting bitten by that bug.
+--- In our lifetime, guy@netapp.com (Guy Harris) wrote:
|
| > In any event, it turned out to be a Solaris bug, filling the directories
| > with thousands of .nfsxxxx turds, which was solved by the appropriate
| > patch...
|
| What bug number and patch is that?  I've seen other customers asking
| about ".nfs" files with Solaris 2.x clients, in ways that suggest that
| they might be getting bitten by that bug.
|
I did not think this was a bug. If you do an lsof (list open files) on the files, you can get the process ID that "owns" the files. Typically it is a process that is still running. Killing and restarting the process should get rid of the files. At least this has done it for me every time.
I see this a lot with files that get deleted and such. This is in an environment of 25+ Suns running Solaris 2.5.1 with a NetApp F540.
If there is a patch that "solves" this "problem", I would love to see what their patch description is.
Thanks.
Alexei
I did not think this was a bug. If you do an lsof (list open files) on the files, you can get the process ID that "owns" the files. Typically it is a process that is still running.
I assume he meant that there are cases in some Solaris 2.x releases where either
1) a file *isn't* in use, but the NFS client code *thinks* it is, so attempts to remove the file get turned into attempts to rename it to a ".nfs" name;
2) if a file was in use, and an attempt to remove it got turned into an attempt to rename it, when the file ceased to be in use the NFS client code didn't realize this and remove the ".nfs" file;
or something such as that, so that you get ".nfs" files when you *shouldn't* get them, or ".nfs" files don't get cleaned up properly (not due to a client crashing before it has a chance to clean them up).
+---- Guy Harris writes:
| > In any event, it turned out to be a Solaris bug, filling the directories
| > with thousands of .nfsxxxx turds, which was solved by the appropriate
| > patch...
|
| What bug number and patch is that?  I've seen other customers asking
| about ".nfs" files with Solaris 2.x clients, in ways that suggest that
| they might be getting bitten by that bug.
I don't think that it is a bug as such. .nfs files are created when an open file is deleted; this is an old trick for creating temporary files, which are automatically deleted when the program exits, however it exits. Basically it's a way of adding state to a stateless system. I am not sure whether the NFS client ever tries to delete those files; I would guess that it attempts to but definitely doesn't always succeed, so there is a cron job on all NFS filers I have seen that cleans up old .nfs files, typically something like:
find / -name '.nfs*' -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune
/Michael
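For reference, the "old trick" Michael mentions is opening a file and unlinking it while keeping the descriptor: on a local filesystem the data survives until the last close, and an NFS client emulates that by quietly renaming the file to .nfsXXXX behind the application's back. A tiny sketch of the trick itself (any .nfs rename is the NFS client's doing, nothing in this code asks for it):

    # Open-then-unlink scratch file: the name disappears immediately, the
    # storage only when the descriptor is closed.  Over NFS the client keeps
    # the data reachable by renaming the file to .nfsXXXX until last close.
    import os, tempfile

    fd, path = tempfile.mkstemp(dir=".")
    os.unlink(path)                 # name gone; data stays while fd is open
    os.write(fd, b"scratch data that vanishes on close\n")
    os.lseek(fd, 0, os.SEEK_SET)
    print(os.read(fd, 64))
    os.close(fd)                    # now the storage (or .nfs file) goes away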
Guy Harris was referring to a client bug in Solaris 2; normally the files are deleted on the last close by the client.
All correct clients delete the .nfsXXXX turds on last close.
The cron job cleans up .nfsXXXX turds left over from clients that may have *crashed* while holding "open-but-deleted" files. The clients keep no state over reboot to clean these things out (they could, but why bother...) Most servers clean out old files.
For our box, a cron job run on the admin client would be appropriate to clean out old .nfsXXXX's left from crashing clients...
I don't think that it is a bug as such. .nfs files are created when an open file is deleted; this is an old trick for creating temporary files, which are automatically deleted when the program exits, however it exits. Basically it's a way of adding state to a stateless system. I am not sure whether the NFS client ever tries to delete those files; I would guess that it attempts to but definitely doesn't always succeed, so there is a cron job on all NFS filers I have seen that cleans up old .nfs files, typically something like:
find / -name '.nfs*' -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune
Brian Pawlowski wrote:
Guy Harris was referring to a client bug in Solaris 2; normally the files are deleted on the last close by the client.
All correct clients delete the .nfsXXXX turds on last close.
The cron job cleans up .nfsXXXX turds left over from clients that may have *crashed* while holding "open-but-deleted" files. The clients keep no state over reboot to clean these things out (they could, but why bother...) Most servers clean out old files.
For our box, a cron job run on the admin client would be appropriate to clean out old .nfsXXXX's left from crashing clients...
The original thing I referred to is allegedly solved by patch 103600-14. The circumstances involve multiply hard-linked files and high load.
-Dom
What do you mean about "100,000 files in a directory will cause you performance problems"?
I was told by Network Appliance that their WAFFLE filesystem
WAFL - Write Anywhere File Layout
(or whatever) hashes the directory entries internally, so there are *no* scaling issues with regards to directories with many entries in them.
Are you wrong? Or is Network Appliance wrong?
Neither is entirely correct. TR-30016 - Accelerated Performance for Large Directories (http://www.netapp.com/technology/level3/3006.html) explains what we do. Populating a directory with n files is an O(n^2) operation for a normal Unix file system; for a filer it's roughly O((n/2048)^2). Technically, that's still O(n^2), and thus it's not true that there are *no* scaling issues, but it's not a huge problem until you get really big, significantly bigger than 100,000 files.
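To put rough numbers on those formulas (arbitrary cost units; only the ratio matters, and it is the constant factor 2048^2, about 4.2 million, which is exactly why it is technically still O(n^2)):

    # Plug a few values of n into the costs quoted above: roughly n^2 for a
    # plain Unix file system versus (n/2048)^2 for a filer.  The units are
    # arbitrary; the ratio is always the constant factor 2048^2.
    for n in (10000, 100000, 1000000):
        plain = n ** 2
        filer = (n / 2048.0) ** 2
        print("n=%8d  plain=%.3g  filer=%.3g  ratio=%.0fx"
              % (n, plain, filer, plain / filer))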
As noted in another message, there is an issue which affects very large directories which are not currently in memory. This becomes most apparent at around 8MB or larger. I believe a fix is currently in test; contact NetApp Tech Support for details if this is affecting you.
-- Karl Swartz - Technical Marketing Engineer Network Appliance kls@netapp.com (W) kls@chicago.com (H)