On Mon, Mar 04, 2002 at 05:41:29PM +0000, Jose Celestino wrote:
FILER> ifconfig -a
e0: flags=848043<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.103 netmask 0xffffff00 broadcast 192.168.1.255
        partner inet 192.168.1.104 (not in use)
        ether 00:a0:98:00:9f:0a (100tx-fd-up)
e2a: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d4 (auto-unknown-cfg_down)
e2b: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d5 (auto-unknown-cfg_down)
e2c: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d6 (auto-unknown-cfg_down)
e2d: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:20:fc:1e:63:d7 (auto-unknown-cfg_down)
e7: flags=8042<BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 00:03:47:22:85:5e (auto-1000sx-fd-down) flowcontrol full
lo: flags=948049<UP,LOOPBACK,RUNNING,MULTICAST,TCPCKSUM> mtu 4056
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1
        ether 00:00:00:00:00:00 (Shared memory)
Hmm... are you only using the built-in 100Mbps Ethernet?
Why not use the gig interface (e7)?
The filer volume webmail:
FILER> df
Filesystem                kbytes       used      avail capacity  Mounted on
/vol/webmail/          406736736  383407488   23329248      94%  /vol/webmail/
/vol/webmail/.snapshot 101684180          0  101684180       0%  /vol/webmail/.snapshot
Looks like you aren't using snapshots. Why not do a:
snap reserve webmail 1
and shrink the snapshot reserve as far as possible, since it isn't in use?
As someone else said, going above 90% full can be hairy. But since you aren't using snapshots, reclaiming that reserve should keep you well under 90% anyway, unless I misunderstand how snapshots work.
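For a rough idea of what reclaiming the reserve would buy, here is the arithmetic as a small Python sketch. It assumes the snap reserve is a percentage of the whole volume (active space plus reserve); the function name is only illustrative, and the figures are copied from the df output above.

```python
# Figures copied from the df output above (kbytes).
active_kb = 406736736     # size of /vol/webmail/
used_kb = 383407488       # used in /vol/webmail/
reserve_kb = 101684180    # size of /vol/webmail/.snapshot

def capacity_after(new_reserve_pct):
    """Return (active size in KB, percent used) for a new snap reserve.

    Assumption: the reserve is a percentage of active + reserve space.
    """
    total_kb = active_kb + reserve_kb
    new_reserve_kb = total_kb * new_reserve_pct // 100
    new_active_kb = total_kb - new_reserve_kb
    return new_active_kb, 100.0 * used_kb / new_active_kb

# Current reserve as a share of the whole volume.
current_pct = 100.0 * reserve_kb / (active_kb + reserve_kb)

# What df would roughly report after "snap reserve webmail 1".
new_active_kb, new_used_pct = capacity_after(1)

print(f"current reserve: {current_pct:.0f}% of the volume")
print(f"with a 1% reserve: {new_active_kb} KB active, {new_used_pct:.0f}% used")
```

Under that assumption the reserve currently takes about a fifth of the volume, and giving it back would bring the reported capacity down from 94% to somewhere in the mid 70s.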
The getattr count seems way too big, and this may point to poor caching on the frontends. But could this bring the CPU to 100% most of the time? Could this be a WAFL issue related to the low available space on the volume?
getattr is how an NFS client checks whether anything has changed on the server. That is a good thing in itself, but a very high count may mean your frontend servers aren't caching attributes. Read further down for more questions from me.
What kind of filer is this?
Here is what my setup is:
6 POP servers, 3 IMAP servers, 4 combo (IMAP/WEB/POP), F760 as filestorage for Maildir format mailboxes
The 6 POP and 3 IMAP servers are Solaris 8 on SPARC.
The 4 combo servers are FreeBSD 4.5-STABLE.
f760-1-mc-mpls> sysstat 2
 CPU   NFS  CIFS  HTTP    Net kB/s    Disk kB/s    Tape kB/s Cache
                          in   out   read write   read write   age
 48%  1268     0     0  2182  4825   9121  3787      0     0     1
 41%  1140     0     0  2391  8735   8882   536      0     0     1
 34%  1265     0     0  2129  6337   5212     0      0     0     1
 35%  1286     0     0   451 10372   7669     0      0     0     1
 30%  1318     0     0   484  7207   5592     0      0     0     1
 35%  1463     0     0   495 10960   5616     0      0     0     1
 43%  1514     0     0   602 13488   6101   260      0     0     1
 54%  1567     0     0   586 14221   7018  7672      0     0     1
 36%  1427     0     0   551 12698   4454     0      0     0     1
 34%  1472     0     0   613  9743   5318     0      0     0     1
 29%  1400     0     0   473  7173   5218     0      0     0     1
 27%  1449     0     0   368  5183   4944     0      0     0     1
 27%  1058     0     0   329  6395   6130     0      0     0     1
f760-1-mc-mpls> nfsstat
Server rpc:
TCP:
calls       badcalls    nullrecv    badlen      xdrcall
802611889   6           0           0           6

UDP:
calls       badcalls    nullrecv    badlen      xdrcall
43142383    16          0           0           16

Server nfs:
calls       badcalls
845754229   0

Server nfs V2: (43054127 calls)
null           getattr        setattr        root           lookup         readlink       read
38 0%          22636988 53%   18956 0%       0 0%           15156029 35%   149737 0%      547873 1%
wrcache        write          create         remove         rename         link           symlink
0 0%           403447 1%      752060 2%      791185 2%      207132 0%      27 0%          1127 0%
mkdir          rmdir          readdir        statfs
167242 0%      2499 0%        2219554 5%     233 0%

Server nfs V3: (802700102 calls)
null           getattr        setattr        lookup         access         readlink       read
0 0%           194662657 24%  2922584 0%     110992483 14%  139483257 17%  4025240 1%     313928250 39%
write          create         mkdir          symlink        mknod          remove         rmdir
4515215 1%     2643548 0%     154071 0%      125 0%         0 0%           4375337 1%     63317 0%
rename         link           readdir        readdir+       fsstat         fsinfo         pathconf
4403724 1%     1988483 0%     2025756 0%     16504930 2%    3452 0%        6 0%           7667 0%
commit
0 0%
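To put a number on the getattr question, the share of getattr calls can be worked out from the counters above. This is only a sketch of the arithmetic in Python; the variable and function names are made up for illustration, and the counters are cumulative since the last reset, so they say nothing about any particular time of day.

```python
# Counters copied from the nfsstat output above.
v2_calls, v2_getattr = 43054127, 22636988
v3_calls, v3_getattr = 802700102, 194662657

def share(part, whole):
    """Percentage of `whole` that `part` represents."""
    return 100.0 * part / whole

v2_share = share(v2_getattr, v2_calls)
v3_share = share(v3_getattr, v3_calls)

# Combined share across both protocol versions.
total_calls = v2_calls + v3_calls
combined_share = share(v2_getattr + v3_getattr, total_calls)

print(f"V2 getattr: {v2_share:.1f}%")
print(f"V3 getattr: {v3_share:.1f}%")
print(f"combined:   {combined_share:.1f}% of {total_calls} calls")
```

Since almost all of the traffic here is V3, the combined figure lands close to the V3 one, roughly a quarter of all calls.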
My data rates are as high as or higher than yours, but my ops per second are lower for whatever reason. I am not running 1M mailboxes, though, only about 25K (which is a huge difference).
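For comparing against your own sysstat numbers, a quick way to boil my sample down to averages is sketched below in Python. The tuples are copied from the sysstat output above; the kB-per-op figure is only a crude ratio of network output to NFS ops, since not every op is a read.

```python
# (cpu %, nfs ops/s, net in kB/s, net out kB/s) per 2-second sample,
# copied from the sysstat output above.
samples = [
    (48, 1268, 2182, 4825),
    (41, 1140, 2391, 8735),
    (34, 1265, 2129, 6337),
    (35, 1286, 451, 10372),
    (30, 1318, 484, 7207),
    (35, 1463, 495, 10960),
    (43, 1514, 602, 13488),
    (54, 1567, 586, 14221),
    (36, 1427, 551, 12698),
    (34, 1472, 613, 9743),
    (29, 1400, 473, 7173),
    (27, 1449, 368, 5183),
    (27, 1058, 329, 6395),
]

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

avg_cpu = mean(s[0] for s in samples)
avg_ops = mean(s[1] for s in samples)
avg_out = mean(s[3] for s in samples)

# Crude ratio: outbound network kB per NFS op.
kb_per_op = avg_out / avg_ops

print(f"avg CPU {avg_cpu:.0f}%, avg NFS ops/s {avg_ops:.0f}")
print(f"avg net out {avg_out:.0f} kB/s, about {kb_per_op:.1f} kB per op")
```

Running the same calculation on a sysstat sample from the busy filer would show whether it is moving less data per operation, which is what a getattr-heavy workload would look like.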
What operating system are your frontends using?
Solaris and FreeBSD both have very good NFS code.