Edward,

Telnet to the netapp and type "sysstat 1". Let it run for about 20 seconds and send us the output. Also, type "ifconfig -a" and send us the output.

Mike

-----Original Message-----
From: Edward Hibbert [mailto:EH@dataconnection.com]
Sent: Tuesday, February 05, 2002 2:24 PM
To: toasters
Subject: Slow write performance

We're seeing some performance problems on an F720 which you guys might be able to help with (even though it's not exactly top of the range nowadays).

What we see is:
- We're driving it over NFS v3.
- We get about 1500 ops/sec out of it, of which one third are writes and two thirds are reads. There aren't many file open/close operations.
- The operations are to random locations in large (10GB) files.
- The CPU is running at about 75%, and the network input and output are below the throughput of the link we have to it. We've seen both CPU and network go higher if we do simple copy tests.
- Looking at it via pktt trace, something approaching 15% of WRITE operations take long enough for the clients to time out and retransmit (so at least 1 second). None of the READ operations do.
- The retransmissions appear to come in bunches. For example, we'll see a few seconds where the filer doesn't respond, during which time the retransmissions will come in, then it will wake up and send some responses back.
- The rest of the time, the WRITEs are very fast (sub-ms).

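For anyone wanting to reproduce the retransmission figure from their own trace, here is a minimal sketch of the counting involved. It assumes the trace has already been decoded into (client, xid, op, timestamp) tuples for the RPC calls; that record layout and the name retransmit_rate are hypothetical, not anything pktt itself emits, and the sample values are made up. A call counts as retransmitted if the same client sends the same XID more than once.

```python
from collections import Counter

def retransmit_rate(calls):
    """Return {op: fraction of distinct requests that were sent more than once}.

    calls: iterable of (client, xid, op, timestamp) tuples, one per RPC call
    seen on the wire. This layout is an assumption for illustration only.
    """
    sends = Counter()  # (client, xid, op) -> number of times the call appeared
    for client, xid, op, _ts in calls:
        sends[(client, xid, op)] += 1

    total = Counter()    # op -> distinct requests
    retried = Counter()  # op -> distinct requests seen more than once
    for (_client, _xid, op), count in sends.items():
        total[op] += 1
        if count > 1:
            retried[op] += 1
    return {op: retried[op] / total[op] for op in total}

# Made-up sample: one of three WRITEs is retransmitted, no READs are.
sample = [
    ("hostA", 1, "WRITE", 0.000),
    ("hostA", 2, "READ", 0.001),
    ("hostA", 3, "WRITE", 0.002),
    ("hostA", 1, "WRITE", 1.100),  # same client and XID again: a retransmission
    ("hostB", 1, "WRITE", 1.200),  # same XID from another client: a new request
    ("hostB", 2, "READ", 1.201),
]
rates = retransmit_rate(sample)
print(rates)
```

Keying on the client as well as the XID matters because different clients can legitimately reuse the same XID.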
This appears to have worsened recently. We tried a couple of things:
- We thought that this might be because the disk had got full and fragmented, so we zapped a bunch of data.
- We rebooted.
Neither of these seemed to help much.

sysstat consistently shows a cache age of 1. This, and the bursty nature of the delays, suggest to me that I'm just hitting it too hard, and there's some kind of periodic cache-flushing operation going on, but do any of you folk have any other suggestions?
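
One way to test the "periodic" part of that guess is to group the retransmission timestamps from the trace into bursts and look at the spacing between burst starts. The sketch below does only that; the function names, the 1-second grouping threshold and the sample timestamps are all assumptions for illustration, not measurements from this filer.

```python
def burst_starts(timestamps, max_gap=1.0):
    """Return the start time of each burst.

    A new burst begins whenever the gap since the previous retransmission
    exceeds max_gap seconds (the threshold is an arbitrary assumption).
    """
    starts = []
    prev = None
    for ts in sorted(timestamps):
        if prev is None or ts - prev > max_gap:
            starts.append(ts)
        prev = ts
    return starts

def intervals(starts):
    """Return the time between consecutive burst starts."""
    return [later - earlier for earlier, later in zip(starts, starts[1:])]

# Made-up retransmission times, in seconds from the start of the trace.
retrans = [5.0, 5.2, 5.9, 17.0, 17.4, 29.5]
starts = burst_starts(retrans)
gaps = intervals(starts)
print(starts, gaps)
```

If the intervals cluster around one value, that would be consistent with some regular flush-like activity; widely scattered intervals would point more toward load spikes on the client side.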

Edward Hibbert
Internet Applications Group
Data Connection Ltd
Tel: +44 131 662 1212
Fax: +44 131 662 1345
Email: eh@dataconnection.com
Web: http://www.dataconnection.com