It appeared to be caused by the application doing readdirs (along with other operations: read/write/getattr) on a specific directory, which at that time held about 70,000 files. We fixed the problem by disabling the readdirs within the application and also reducing the number of files in that directory down to about 45,000.
The problem is likely two-fold -- large (erm, huge) directories are going to be painful for the filer in the first place, since a 'simple' request explodes out into such a large answer. Secondly, I don't know if the same holds true for Linux, but Solaris will bypass the DNLC for directories over a given size -- so what's already a nasty request gets amplified because the client isn't caching it.
We don't know exactly which fix (stopping the readdirs or removing the files) did the trick, but after that the NetApp's CPU dropped back to normal, the webservers were happy, and the site was responsive again.
I believe it's READDIR+ that you disabled (disabling READDIR altogether would be ... interesting). That would definitely help in this situation, as READDIR+ effectively does a GETATTR for every file at the same time, so if the attribute data isn't being used, that's that much extra work for nothing.
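As a rough local analogy only (a sketch, not what the NFS client actually does -- the function names here are made up), the difference is between asking for names and asking for names plus an attribute lookup on every entry:

    import os

    def names_only(path):
        # Roughly what plain READDIR asks for: just the directory
        # entries (names), nothing per-file beyond that.
        return os.listdir(path)

    def names_with_attrs(path):
        # Roughly what READDIR+ asks for: every entry also gets its
        # attributes fetched, i.e. about one GETATTR per file -- some
        # 70,000 extra lookups for a directory like the one above if
        # nothing ever uses the results.
        return {name: os.stat(os.path.join(path, name))
                for name in os.listdir(path)}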
Avoid large directories like this -- they will always be a problem on any platform. Even doing something as ugly as using Apache's mod_rewrite to hash files out across multiple directory levels will boost performance in the end.
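A minimal sketch of that hashing idea (the paths and the two-level hex layout here are just assumptions to illustrate it):

    import hashlib
    import os

    def hashed_path(root, filename, levels=2):
        # Hash the filename and peel off two hex characters per level,
        # e.g. "page42.html" -> root/3f/a1/page42.html, so entries spread
        # out instead of piling up in one enormous directory.
        digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
        parts = [digest[i * 2:(i + 1) * 2] for i in range(levels)]
        return os.path.join(root, *parts, filename)

    print(hashed_path("/web/docs", "page42.html"))

Whatever writes the files and whatever rewrites the incoming URLs (mod_rewrite or otherwise) just have to agree on the same hash.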
You should also take a look at your config to see if you can identify what was constantly doing the lookups (e.g. Apache's mod_speling will do this). Even if it's not destroying things the way it was here, it will still impact performance, so you'll probably want to address it anyway.