In message <Pine.BSF.4.00.9903042345180.26075-100000(a)staff2.texas.net>,
Jonah Yokubaitis writes:
>
>On Thu, 4 Mar 1999, Tom Limoncelli wrote:
>
>|If you monitor this mailing list you will get the impression that the
>|NetApp products have nothing but problems. That's because people
>|without problems don't need to post. It's sort of like going to an AA
>|meeting and complaining, "Gosh, you're all alcoholics! Doesn't anyone
>|here NOT have a drinking problem?"
>
>I agree with this as well. We have run into many bugs with our
>netapps, but never have had an outage longer than 5 minutes, and NEVER
>have lost any data. That IMO is a strong statement for network
>attached storage.
We've had an F330 running since 10/95. In all cases except one,
downtime was limited to a ~5 minute outage. In 8/98 we had a major
problem that brought us down for ~1.5 hours and left us crippled
for the remainder of the day. During an outage we have always been
able to get hold of very helpful and knowledgeable support via the
800 number.

After the major outage, a lot of questions were asked by management
about the server. (It also provided us with the final argument to
get our cluster.) I came up with these numbers:
    System in use:     ~1034 days == 24816 hours
    Total Downtime:    ~10 hours (being harsh and counting
                       "crippled" as down)
    Percent Downtime:  10 / 24816 * 100 == ~0.0403
    Percent Uptime:    ~99.9597
I know it's not very scientific, but being able to work out an
approximate uptime that matches the numbers NetApp's marketing guys
hand out gives me a warm fuzzy feeling.
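
For anyone who wants to redo the arithmetic, here is a minimal sketch
in Python using only the figures quoted above. The variable names are
illustrative, and the 10 hours is the deliberately harsh estimate, not
a measured value. Note that the ratio has to be multiplied by 100 to
be read as a percentage; with these inputs it comes to roughly 0.04%
downtime, or about 99.96% uptime.

```python
# Rough availability arithmetic from the figures in the message.
days_in_use = 1034                  # ~10/95 through 8/98, as quoted
hours_in_use = days_in_use * 24     # 24816 hours
downtime_hours = 10                 # harsh estimate, counts "crippled" as down

# Fraction of time down, then the same value expressed as a percentage.
downtime_fraction = downtime_hours / hours_in_use
downtime_percent = downtime_fraction * 100
uptime_percent = 100 - downtime_percent

print(f"hours in use:     {hours_in_use}")
print(f"downtime percent: {downtime_percent:.4f}")
print(f"uptime percent:   {uptime_percent:.4f}")
```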
>We also are running NetApp Release 4.3.5D2 and refuse to upgrade.
>"If it aint broke, don't break it"
That was my feeling until 8/98. Then we experienced bug 4157...
    Bug ID 4157

    Title
        Removes of very large files can cause the filer to deadlock

    Problem Description
        Simultaneous removes of files that together take up more
        than 1GB of space (i.e. a few large files or thousands of
        smaller ones) may cause a filer to deadlock and not provide
        file service. Multiple reboots may be required to clear
        the deadlock condition.

    Release Fixed
        5.0
Unpleasant doesn't begin to describe this beast. Not only did we
need to reboot the filer, but also the system which issued the
delete. Our filer has ~200 clients, any one of which could have
done so... not a pretty picture.

It was a known bug at the time; my only saving grace was telling
management that the release it was fixed in was still in "Early
Access". I'd seriously consider moving up to a 5.x release.
>|I used to be a big Auspex fan. Now I'm a big NetApp fan. (actually, I'm
>|a fan of big NetApps :-) ). The CIFS/NFS integration is so tight it's a
>|thing of beauty. The RAID, wafl, and other features make it difficult
>|to even evaluate any other box. Now re-evaluating BudTool on the other
>|hand...
Bug 4157 was my worst experience with our filer. It occurred while
we were still fighting to get an HA file server (F630 cluster)
approved for purchase. And I still like 'em. We got an F740
cluster last month and are going to be retiring our F330 in several
months.

And the support is great (well, some issues with bad/wrong parts
being sent recently -- but we have never been unable to get hold of
knowledgeable personnel in an emergency).
>Mmmm....if only a native legato client :-)
Yuch. Native dump format is so much friendlier to work with than
proprietary dump formats. That plus NDMP is why we went with
BudTool. It won't be an issue for the filers once there's a native
client, but I wouldn't back up all my other hosts with it. Anyone
know what's going to happen with this issue once BudTool is
absorbed by Legato? I heard a rumor that BudTool v5.0 has already
been scrapped. (Is there an Intelliguard mailing list?) If Legato
and their tape format are our only upgrade path, we'll start looking
at other products.
jason
---
Jason D. Kelleher               kelleher(a)susq.com
Susquehanna Partners, G.P.      610.617.2721 (voice)
401 City Line Ave, Suite 220    610.617.2916 (fax)
Bala Cynwyd, PA 19004-1122