On Fri, 11 Feb 2005, John Stoffel wrote:
Maren> Hi, We have an application where we need to store between 4 and
Maren> 20 million small files on a large drive or a 3-drive
Maren> RAID system.
You should read up on how news servers used to deal with this issue. Basically, you want to keep down the number of entries per directory as much as possible. So instead of having /data/<millions of files> you would have /data/a/<a* files>, /data/b/<b* files>, ... /data/z/<z* files>. You would push this down as many levels as you needed.
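The layout described above can be sketched roughly as follows. This is only an illustration: the function name shard_path and its arguments are made up here, and it assumes file names start with plain alphanumeric characters. If the names cluster on a few prefixes, a hash of the name could be used for the directory levels instead.

```shell
#!/bin/sh
# shard_path BASE NAME [LEVELS]   (hypothetical helper, not from any tool)
# Prints the path where NAME would live under BASE, using the first
# LEVELS characters of NAME as nested directory names (default: 1 level).
shard_path() {
    base=$1
    name=$2
    levels=${3:-1}
    path=$base
    rest=$name
    i=0
    while [ "$i" -lt "$levels" ] && [ -n "$rest" ]; do
        tail=${rest#?}            # name with its first character removed
        first=${rest%"$tail"}     # the first character itself
        path=$path/$first
        rest=$tail
        i=$((i + 1))
    done
    printf '%s/%s\n' "$path" "$name"
}

# Example: shard_path /data apple 2   prints /data/a/p/apple
```

A caller would then do something like mkdir -p on the directory part of the result before writing the file there.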
Yes, that may help with seeking and reading, but the amount of time required to delete 1 million files is still huge. I've not tested it, though. I recall deleting this kind of structure that squid would set up, and it would take forever.
The application that is creating all these files is actually MySQL, and under UFS2 we got up to the limit of 1 million files in a single directory. Yes, it worked OK, but deleting the files took forever.
We are looking at hacking MySQL so that it creates a directory tree. Still, this is only going to solve one part of the problem: the time required for dropping such databases or removing tables is long.
We have used the more advanced InnoDB table format that MySQL offers, as it is meant to be faster and it uses only one file per table instead of the default 3 files.
Maren> We are finding that file systems like UFS just don't cut it,
Maren> never mind Linux extFS... These file systems can handle what we
Maren> are trying to do, but they tend to slow down as they get full,
Maren> and in some cases we will have to be able to delete millions of
Maren> files in one go.
Have you looked at ext3 with indexed directory entries? Supposedly it's a lot faster. Google for it, using 'ext3 dir_index'. I just enabled it on my 2.6.11-rc2-mm2 kernel system at home. No performance testing, though.
I will have a look at this....
tune2fs -O dir_index /dev/sdc1
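As far as I understand it, setting the flag on an existing filesystem only affects directories created afterwards; existing directories get indexed by a forced e2fsck run with -D on the unmounted filesystem. A dry-run sketch of the two steps follows. The run helper, the DRY_RUN variable and the enable_dir_index name are illustrative assumptions, and /dev/sdc1 is simply the device from the command above.

```shell
#!/bin/sh
# By default this only prints the commands. Set DRY_RUN=0 (as root, with
# the filesystem unmounted) to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD...   (hypothetical helper: echo the command or execute it)
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# enable_dir_index DEVICE   (hypothetical wrapper around the two steps)
enable_dir_index() {
    run tune2fs -O dir_index "$1"   # turn on hashed directory indexes
    run e2fsck -fD "$1"             # force a check and rebuild existing directories
}

# Example: enable_dir_index /dev/sdc1
```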
Maren> In this type of configuration unfortunately we can't bundle a
Maren> NetApp, as the technical overhead and cost would make the system
Maren> unfeasible.
What technical overhead? You just turn it on and it runs. Just say it's a cost issue and let it be.
This is part of a system that we need to hand over to customers, and sticking a NetApp into the equation would make most people panic, I think. Even so, I don't disagree that a NetApp would be ideal for this. It won't need huge throughput, just low latency and lots of random seeking.
Maren> I have so far seen that ReiserFS can do what we are looking for,
Maren> but we need a fs that will work on Linux, FreeBSD, Solaris
Maren> etc... We must have a path to scale up or deploy a heavy load
Maren> system and migrate to a big Sun box if required.
Maren> Does anyone know how well Solaris 10's ZFS would work for this
Maren> kind of application?
Well, since ZFS is only on Solaris, it doesn't meet your standard of being available on all platforms. The only one that I can think of, and I don't know about FreeBSD, is VxFS from Veritas. I'd look into that if I were you.
At least if we start on Solaris we can go from an old U2 or E250 up to huge monster machines, and one could maybe dare to go into x86 Solaris... It seems that Sun has been working very hard on this.
But first, I'd read up on how NNTP news servers handled the exact same issue, with millions of small files that needed to be added/deleted all the time. Follow their lead and rewrite your application to fit the requirements.
Yes, I think it would be a good place to start...
Maren.
John
John Stoffel - Senior Staff Systems Administrator - System LSI Group
Toshiba America Electronic Components, Inc. - http://www.toshiba.com/taec
john.stoffel@taec.toshiba.com - 508-486-1087
--------------------------------------------------------------
HKdotCOM Ltd
Tel: 852 2865-4865 ext 888   Fax: 852 2865-4100
leizaola@hk.com
AIM: MarenHKdotCOM   ICQ: 39905706   MSN: MarenHKdotCOM
--------------------------------------------------------------