Ah, in the meantime I was able to get the output of filestats; it may be of some help...
FILER> filestats volume webmail snapshot japc
VOL=webmail SNAPSHOT=japc
INODES=17653504 COUNTED_INODES=9942433 TOTAL_BYTES=373740074688 TOTAL_KB=383386132

FILE SIZE    CUMULATIVE COUNT    CUMULATIVE TOTAL KB
      1K              1410803                3019516
     10K              7413094               34777324
    100K              9373165               95819528
      1M              9888040              272893780
     10M              9942389              380408808
    100M              9942431              381046720
      1G              9942432              381185320
     MAX              9942433              383386132

AGE(ATIME)   CUMULATIVE COUNT    CUMULATIVE TOTAL KB
       0                    0                      0
     30D              3675160              168977132
     60D              4879663              208385196
     90D              6522408              239232636
    120D              7459500              268557600
     MAX              9942433              383386132

UID          COUNT       TOTAL KB
#64010       9915276     380263412
#0           27157       3122720

GID          COUNT       TOTAL KB
#64010       9891377     380150296
#0           27089       2372920
#1003        67          749800
#65534       23900       113116
Thus spake Jose Celestino, on Mon, Mar 04, 2002 at 05:41:29PM +0000:
Hi all.
We are currently experiencing heavy load on a filer serving as storage for a webmail farm:
FILER> sysstat 1
 CPU     NFS    CIFS    HTTP     Net kB/s    Disk kB/s     Tape kB/s  Cache
                                  in   out   read  write  read write    age
[...]
 63%    5583       0       0    1047  4996   3524     16     0     0      3
 70%    6002       0       0     999  6005   3836      0     0     0      3
 65%    5738       0       0    1067  5829   2671      0     0     0      3
 68%    5881       0       0     972  6195   3424     16     0     0      3
 83%    7174       0       0    1363  7401   5477      0     0     0      3
 88%    7951       0       0    1609  8026   3984      0     0     0      3
 91%    8041       0       0    1387  8357   7076     16     0     0      3
 87%    7732       0       0    1369  8508   4601      0     0     0      3
 87%    7258       0       0    1196  7554   6006    681     0     0      3
100%    6290       0       0    1039  6406   8108   5108     0     0      3
 95%    6953       0       0    1381  6488   7536   2783     0     0      3
 88%    8205       0       0    1427  8375   5456      0     0     0      3
 73%    6115       0       0     993  6408   5051     16     0     0      3
 79%    7046       0       0    1138  7779   2629      0     0     0      3
 83%    6851       0       0    1181  7212   8240      0     0     0      3
 86%    7888       0       0    1417  8185   5305     16     0     0      3
 79%    7435       0       0    1217  7646   1676      0     0     0      3
 50%    4001       0       0     664  4293   2490      0     0     0      3
 48%    4253       0       0     711  3939   1564     16     0     0      3
 46%    4115       0       0     681  4066   1265      0     0     0      3
[...]
The farm consists of 6 frontends (2 x Pentium III 800 MHz, 1 GB RAM, 100 Mbit Fast Ethernet each), running Apache and an altered c-client that does direct maildir access (no IMAP, direct filesystem access). The frontends reach about 250 concurrent sessions. There are nearly 1 million (1,000,000) maildirs stored here.
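For context, a minimal sketch of what such a maildir scan looks like (this is not the actual c-client code, just my assumption about the access pattern; over NFS, the opendir/readdir becomes READDIR RPCs and every stat() that misses the client's attribute cache becomes a GETATTR):

#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

/* Hypothetical maildir scan, roughly one per session;
 * not the real c-client code. */
long scan_maildir(const char *dir)
{
    char path[1024];
    struct dirent *de;
    struct stat st;
    long bytes = 0;
    DIR *d = opendir(dir);            /* READDIR traffic */

    if (d == NULL)
        return -1;
    while ((de = readdir(d)) != NULL) {
        if (de->d_name[0] == '.')     /* skip ".", ".." and dotfiles */
            continue;
        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (stat(path, &st) == 0)     /* one GETATTR per message file */
            bytes += st.st_size;
    }
    closedir(d);
    return bytes;
}

With ~250 sessions per frontend rescanning maildirs like this, I suspect this pattern accounts for much of the getattr/lookup traffic shown below.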
FILER> version
NetApp Release 6.0.1R2: Fri Feb 9 01:12:44 PST 2001
FILER> ifconfig -a
e0: flags=848043<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.103 netmask 0xffffff00 broadcast 192.168.1.255
        partner inet 192.168.1.104 (not in use)
        ether 00:a0:98:00:9f:0a (100tx-fd-up)
e2a: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d4 (auto-unknown-cfg_down)
e2b: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d5 (auto-unknown-cfg_down)
e2c: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d6 (auto-unknown-cfg_down)
e2d: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d7 (auto-unknown-cfg_down)
e7: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:03:47:22:85:5e (auto-1000sx-fd-down) flowcontrol full
lo: flags=948049<UP,LOOPBACK,RUNNING,MULTICAST,TCPCKSUM> mtu 4056
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1
        ether 00:00:00:00:00:00 (Shared memory)
The filer volume webmail:
FILER> df
Filesystem                 kbytes       used      avail  capacity  Mounted on
/vol/webmail/           406736736  383407488   23329248       94%  /vol/webmail/
/vol/webmail/.snapshot  101684180          0  101684180        0%  /vol/webmail/.snapshot
is mounted on each of the 6 frontends.
nfsstat gives me:
FILER> nfsstat
Server rpc:
TCP:
calls        badcalls     nullrecv     badlen       xdrcall
0            0            0            0            0
UDP:
calls        badcalls     nullrecv     badlen       xdrcall
35032599676  0            0            0            0

Server nfs:
calls        badcalls
39327565562  0
Server nfs V2: (25634001711 calls)
null          getattr            setattr          root           lookup             readlink       read
0        0%   2872301897   11%   41557694    0%   0         0%   5182588465   20%   124663    0%   16772257357  65%
wrcache       write              create           remove         rename             link           symlink
0        0%   604329011     2%   7689267     0%   19040366  0%   12918825      0%   11991858  0%   26947         0%
mkdir         rmdir              readdir          statfs
167656   0%   827086        0%   108179901   0%   718       0%

Server nfs V3: (13693563851 calls)
null          getattr            setattr          lookup             access         readlink       read
0        0%   5446218       0%   43552013    0%   2360996181   17%   28976674  0%   97560     0%   10689673307  78%
write         create             mkdir            symlink            mknod          remove         rmdir
410587211 3%  5298393       0%   161919      0%   4             0%   0         0%   15451799  0%   82070         0%
rename        link               readdir          readdir+           fsstat         fsinfo         pathconf
12977697 0%   9969929       0%   110291020   1%   0             0%   928       0%   928       0%   0             0%
commit
0        0%
The getattr count seems way too high and may point to poor attribute caching on the frontends (getattrs plus lookups add up to roughly a quarter of the 39 billion calls above). But could this alone drive the CPU to 100% most of the time? Could this be a WAFL issue related to the low free space on the volume?
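One knob I may also look at: the df above shows the snapshot reserve (about 100 GB) completely unused, so part of it could be returned to the active file system to ease the free-space pressure. Something like this, where the percentage is just illustrative:

FILER> snap reserve webmail 10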
I'll first increase the NFS attribute cache timeouts on the clients to try to lower the getattrs. But I fear this won't help much; further optimization, from the bottom up, is needed.
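On the frontends that would be something along these lines (Linux-style mount options shown only as an illustration, the exact syntax depends on the client OS, and the mountpoint is made up; actimeo=60 holds cached attributes for 60 seconds instead of the default few seconds):

mount -o vers=3,udp,actimeo=60 filer:/vol/webmail /webmail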
Any ideas to help optimize performance in this scenario are welcome.
If you need any further info (I wanted to send filestats output, but it is taking an eternity...), please ask.
TIA.
-- Jose Celestino japc@co.sapo.pt SysAdmin::SAPO.pt http://www.sapo.pt
main(){printf("%xu%xk%x!\n",15,12,237);}