In message Pine.BSF.4.00.9903042345180.26075-100000@staff2.texas.net, Jonah Yokubaitis writes:
On Thu, 4 Mar 1999, Tom Limoncelli wrote:
|If you monitor this mailing list you will get the impression that the
|NetApp products have nothing but problems. That's because people
|without problems don't need to post. It's sort of like going to an AA
|meeting and complaining, "Gosh, you're all alcoholics! Doesn't anyone
|here NOT have a drinking problem?"
I agree with this as well. We have run into many bugs with our NetApps, but have never had an outage longer than 5 minutes, and have NEVER lost any data. That, IMO, is a strong statement for network attached storage.
We've had an F330 running since 10/95. In all cases except one, downtime was limited to a ~5 minute outage. In 8/98 we had a major problem that brought us down for ~1.5 hours and left us crippled for the remainder of the day. During an outage we have always been able to get hold of very helpful and knowledgeable support via the 800 number.
After the major outage, a lot of questions were asked by management about the server. (It also provided us with the final argument to get our cluster.) I came up with these numbers:
System in use: ~1034 days == 24816 hours
Total Downtime: ~10 hours (being harsh and counting "crippled" as down)
Percent Downtime: 10 / 24816 * 100 == ~0.0403
Percent Uptime: ~99.9597
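For anyone who wants to re-run the arithmetic, here is a minimal Python sketch using the figures above. The variable names are just illustrative and nothing here comes from a NetApp tool. The one thing to watch is that the ratio of hours is a fraction and has to be multiplied by 100 before it reads as a percentage.

```python
# Figures taken from the post; variable names are illustrative only.
days_in_service = 1034
hours_in_service = days_in_service * 24   # 24816 hours
downtime_hours = 10                       # harsh estimate, counting "crippled" as down

# Ratio of downtime to total hours is a fraction, not yet a percentage.
downtime_fraction = downtime_hours / hours_in_service
downtime_percent = downtime_fraction * 100
uptime_percent = 100 - downtime_percent

print(f"hours in service:  {hours_in_service}")
print(f"downtime fraction: {downtime_fraction:.6f}")
print(f"downtime percent:  {downtime_percent:.4f}%")
print(f"uptime percent:    {uptime_percent:.4f}%")
```

Run as written, this reports a downtime fraction of about 0.000403, which is roughly 0.0403% downtime and 99.9597% uptime.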
I know it's not very scientific, but being able to work out an approximate uptime that matches the numbers NetApp's marketing guys hand out gives me a warm fuzzy feeling.
We also are running NetApp Release 4.3.5D2 and refuse to upgrade. "If it ain't broke, don't break it"
That was my feeling until 8/98. Then we experienced bug 4157...
Bug ID 4157
Title
Removes of very large files can cause the filer to deadlock
Problem Description
Simultaneous removes of files that together take up more than 1GB of space (i.e. a few large files or thousands of smaller ones) may cause a filer to deadlock and not provide file service. Multiple reboots may be required to clear the deadlock condition.
Release Fixed
5.0
Unpleasant doesn't begin to describe this beast. Not only did we need to reboot the filer, but also the system which issued the delete. Our filer has ~200 clients, any one of which could have done so... not a pretty picture.
It was a known bug at the time; my only saving grace was telling management that the release it was fixed in was still in "Early Access". I'd seriously consider moving up to a 5.x release.
|I used to be a big Auspex fan. Now I'm a big NetApp fan. (actually, I'm
|a fan of big NetApps :-) ). The CIFS/NFS integration is so tight it's a
|thing of beauty. The RAID, wafl, and other features make it difficult
|to even evaluate any other box. Now re-evaluating BudTool on the other
|hand...
Bug 4157 was my worst experience with our filer. It occurred while we were still fighting to get an HA file server (F630 cluster) approved for purchase. And I still like 'em. We got an F740 cluster last month and are going to be retiring our F330 in several months.
And the support is great (well, there have been some issues with bad/wrong parts being sent recently, but we have never been unable to get hold of knowledgeable personnel in an emergency).
Mmmm....if only a native legato client :-)
Yuch. Native dump format is so much friendlier to work with than proprietary dump formats. That plus NDMP is why we went with BudTool. It won't be an issue for the filers once there's a native client, but I wouldn't back up all my other hosts with it. Anyone know what's going to happen with this issue once BudTool is absorbed by Legato? I heard a rumor that BudTool v5.0 has already been scrapped. (Is there an Intelliguard mailing list?) If Legato and their tape format is our only upgrade path, we'll start looking at other products.
jason
---
Jason D. Kelleher              kelleher@susq.com
Susquehanna Partners, G.P.     610.617.2721 (voice)
401 City Line Ave, Suite 220   610.617.2916 (fax)
Bala Cynwyd, PA 19004-1122