Hello All:
There are a couple of possibilities on this one. Data ONTAP GX can handle "lots" of objects very fast. GX would fix this problem and remove the limitations, but a lot of people aren't ready for it yet. (Sabastian might consider it)
"Lots" is very subjective but the original case of using the "ls" command and taking a coffee break is typical of a case where the directory structure is just too large to fit into memory. Without more information, none of use can be sure that is really the problem. (We'd have to see statistics and look at lots of factors).
If anyone needs to access millions of files in a single directory and doesn't want to move to GX, then working with NetApp support to change the directory structure is the best option. You can hopefully get the performance back with small tweaks.
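To make that concrete, here is a rough Python sketch of the usual restructuring fix (purely illustrative; the function name and the fan-out numbers are my own invention, not anything NetApp support would prescribe): hash each filename into a fixed set of subdirectories so no single directory ever holds millions of entries.

    import hashlib
    import os

    def bucketed_path(root, filename, levels=2, width=2):
        # Hash the name and peel off hex pairs, e.g.
        # "report.txt" -> root/ab/cd/report.txt. Each level has a
        # fan-out of 256, so two levels spread millions of files
        # thinly across many small directories.
        digest = hashlib.md5(filename.encode()).hexdigest()
        parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
        return os.path.join(root, *parts, filename)

    path = bucketed_path("/vol/data", "report.txt")
    os.makedirs(os.path.dirname(path), exist_ok=True)

The exact scheme doesn't matter much; the point is bounding the size of any one directory.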
"Metadata" is sounding like a subjective term as well. There is no evidence that the metadata is a problem. The directory file is probably the problem (The directory file is a special file that has inode numbers and maps those numbers to names. It has to be read into system memory to send the information back for an "ls")
Just my $.02.
--April
----- Original Message ----
From: Blake Golliher <thelastman@gmail.com>
To: Peter D. Gray <pdg@uow.edu.au>
Cc: toasters@mathworks.com
Sent: Monday, October 22, 2007 6:26:56 PM
Subject: Re: WAFL metadata files
I'd argue that this is a general file system issue, not so much a wafl issue. I don't think wafl is particularly slow at this workload either; it does far better than most other NAS gear I've used for this workload. There are trickier approaches out there, like BlueArc's preloaded metadata cache to help speed it along, but that's just fixing the problem by tossing it all in memory. If you compare file system operation to file system operation between NetApp and BlueArc, I'm sure you'd find similar performance issues for a directory with millions of objects.
But I do hope some of those wafl guys can figure out a way to make file systems with lots of objects faster. It can be a huge pain.
-Blake
On 10/22/07, Peter D. Gray pdg@uow.edu.au wrote:
On Mon, Oct 22, 2007 at 12:55:48PM -0700, Blake Golliher wrote:
I have to deal with millions of objects in filesystems, and I highly recommend subdirectories. Look at your nfs_hist output: first do nfs_hist -z, then count to 30 and run nfs_hist again. It's a histogram of all NFS ops and how long they took, in millisecond buckets. I'd bet lookup is taking a very long time. When dealing with a large number of objects, sensible directory structures are key.
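For anyone who hasn't run it, the procedure described here looks something like this on the filer console (the prompt and annotations are illustrative; only the two nfs_hist invocations come from the advice above):

    filer> nfs_hist -z        (zero the counters)
    ... wait about 30 seconds under normal load ...
    filer> nfs_hist           (dump the histogram; watch the lookup buckets)

If the lookup counts pile up in the high-millisecond buckets, the directory structure is the first place to look.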
Yes, but to be fair, this is a weakness in the wafl filesystem. You cannot have everything, and wafl has made a trade-off in the way it stores file metadata that makes it slow to handle large numbers of files in a directory.
I am not sure if NetApp is planning any enhancements in this area, or even what would be possible.
Anybody care to comment?
Regards, pdg
--
See mail headers for contact information.