I've got some questions that I'm hoping people on this list may be able to help answer. Background first.
We're working on an application that uses a lot of Berkeley DB hash files, via DB_File in Perl. Right now we are looking at about 30k individual databases. Overall, the application currently hits about 1000 of these databases a minute, with a few hundred reads per use and potentially an equal number of writes. Occasionally the application does grooming on the databases, rebuilding them to reclaim space and expire unused entries. Each database will be around 5 MB.
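For the curious, the grooming pass looks roughly like this (a minimal sketch, not our actual code; groom() and is_expired() are hypothetical stand-ins for our rebuild and expiry logic):

    use strict;
    use warnings;
    use DB_File;
    use Fcntl qw(O_RDONLY O_RDWR O_CREAT);

    # Rebuild one database: copy live entries into a fresh file,
    # dropping expired ones, then rename the new file over the old.
    sub groom {
        my ($path) = @_;
        my (%old, %new);
        tie %old, 'DB_File', $path, O_RDONLY, 0644, $DB_HASH
            or die "can't open $path: $!";
        tie %new, 'DB_File', "$path.tmp", O_RDWR|O_CREAT, 0644, $DB_HASH
            or die "can't create $path.tmp: $!";
        while (my ($k, $v) = each %old) {
            $new{$k} = $v unless is_expired($k, $v);
        }
        untie %old;
        untie %new;
        rename "$path.tmp", $path or die "rename failed: $!";
    }

    # Hypothetical expiry test; the real one checks entry age.
    sub is_expired { return 0 }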
We're testing on a pair of clustered 760s with half of the anticipated load. Each filer has two shelves of 18s. The load generated by the application is enough to max out the disks on both filers. CPU utilization is around 40% and ops are generally under 1000. Disk reads average about 12 MB/s; net out is around 5 MB/s per filer.
Now, here's the question. Under Berkeley DB you can control the hash bucket size, which defaults to the filesystem block size unless otherwise specified. Does anyone know what the optimum bucket size would be on a NetApp?
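In case it helps frame an answer, this is where we'd plug in a value, via DB_File's HASHINFO parameters (a sketch; the 4096 and the cachesize are placeholders, not recommendations):

    use DB_File;
    use Fcntl qw(O_RDWR O_CREAT);

    # HASHINFO overrides hash parameters at database creation time.
    my $info = DB_File::HASHINFO->new;
    $info->{bsize}     = 4096;    # bucket size in bytes (placeholder value)
    $info->{cachesize} = 1 << 20; # optional 1 MB in-memory cache

    my %h;
    tie %h, 'DB_File', 'example.db', O_RDWR|O_CREAT, 0644, $info
        or die "tie failed: $!";

Note that the bucket size only takes effect when the file is created, so we'd have to rebuild the databases to change it.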
Any help offered in tuning the application to best use the resources is greatly appreciated.
2004-02-04T17:51:50 Kelsey Cummings:
> We're working on an application that uses a lot of Berkeley DB hash files.
Make sure you either open source your app or confine it to use within one single building.
Sleepycat has done very clever stuff in redefining "redistribute": if you use Berkeley DB in an app that's used in more than one building, you're obliged to distribute your app as open source.
Sadly, oh so sadly, the Open Source Initiative has not recognized this bit of cleverness as violating their Open Source Definition, so Sleepycat's Berkeley DB license is so far still tolerated as an Open Source license. To my taste it demeans the term "Open Source" terribly.
-Bennett