On Mon, 4 Mar 2002, Jose Celestino wrote:
> Ahh, in the meantime I was able to get the output of filestats, it may be of some help....
> [snip]
> The getattr count seems way too high, and this may point to bad caching on the frontends. But could this bring the CPU to 100% most of the time? Could this be a WAFL issue related to the low available space on the volume?
Just out of curiosity, what's your "wafl.maxdirsize" option set to? Is there a chance you've got one directory that's reached its limit? You didn't mention what sort of directory structure your application is using, and with 9.9M files perhaps there's a directory that's full.
I suggest this because it bit us recently and produced _very_ similar symptoms to what you described. We had an application that hit the limit after putting 102,399 files in one directory, and then started looping, trying to rename a 102,400th file into it. The result was a load of 1800+ NFS ops/sec and artificially high CPU usage numbers, plus an /etc/messages file that dutifully logged the thousands upon thousands of "ENOSPC" errors, which our application patiently and persistently ignored. :-) After 24 hours our Cricket NFS ops/sec graphs looked bonkers.
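To make the failure mode concrete, here's a rough Python sketch of what such a "patient" writer loop looks like from the client side (not our actual application, just an illustration, and the paths in the example are made up). The stat()-before-retry is where a lot of the mysterious getattr traffic can come from:

import errno
import os
import time

def persistent_rename(src, dst):
    """Keep retrying a rename until it 'works' -- the pathological pattern.

    Every pass stats the target directory (an NFS getattr over the wire)
    and then retries the rename, which the filer keeps rejecting with
    ENOSPC because the directory has hit its maxdirsize limit.
    """
    while True:
        try:
            os.stat(os.path.dirname(dst))   # getattr on the already-full directory
            os.rename(src, dst)             # fails again with ENOSPC
            return
        except OSError as e:
            if e.errno == errno.ENOSPC:
                time.sleep(0.01)            # "be nice" and try again... forever
                continue
            raise

# e.g. persistent_rename("/mnt/vol0/spool/tmp.123", "/mnt/vol0/spool/full_dir/file.dat")

Multiply that by however many writer processes you have and you get exactly the kind of ops/sec and CPU numbers we saw.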
So, it might seem a little weird, but check your messages log for ENOSPC:
Wed Feb 20 16:05:22 PST [GbE-e7]: Returning ENOSPC for large dir (fsid 26082, inum 2135872)
and see if perhaps you've hit a directory size limit. An application that's trying to be "well behaved" and retry a failed creat() or rename() could be the source of all those mysterious getattrs. Upping the maxdirsize would alleviate that, as would splitting up any very full directories.
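If you want to hunt for the offending directory from a client, something like this little Python sketch can walk a mount and flag anything getting close. The 100,000-entry threshold is just a guess for illustration, since the real limit depends on your wafl.maxdirsize setting and file name lengths, and keep in mind that walking 9.9M files over NFS is itself a pile of getattrs, so run it off-hours:

import os
import sys

def find_full_dirs(root, threshold=100000):
    """Walk an NFS mount and report directories with huge entry counts."""
    for dirpath, dirnames, filenames in os.walk(root):
        count = len(dirnames) + len(filenames)
        if count >= threshold:
            print("%8d entries  %s" % (count, dirpath))

if __name__ == "__main__":
    # e.g. python find_full_dirs.py /mnt/vol0   (mount point is just an example)
    find_full_dirs(sys.argv[1])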
-- Chris
--
Chris Lamb, Unix Guy
MeasureCast, Inc.
503-241-1469 x247
skeezics@measurecast.com