On Tue 2 Mar, 1999, Graham Knight <grahamk(a)ast.lmco.com> wrote:
> Our encounter (albeit brief) with 5.2.1 was pretty disastrous too. I'm
> camped at 5.1.2P2 and have no reason to move. Except for the occasional
> 5.2.1 floppy boot for the fiber channel diagnostics. ;-)
I've a long story about 5.1.2...are we sitting comfortably? I shall begin:
Well, we'd decided to move to 5.1.2 as a Y2K compliance upgrade. I
fished around on the NOW site to check whether there were any patch
releases available.
You see we've been bitten several times by the following scenario:
o upgrade to recommended release.
o get hit by *known* bug.
o get told that we should have gone straight to the latest official patch
release (now apparently the x.y.zPa releases, but once x.y.z.Da ..)
which, bizarrely, is invisible in the NOW site, unless you go fishing.
o upgrade *again*
So I fished around the un-linked-to parts of the Software Library
and found 5.1.2P1, P2, D1 and D2. It now also has D5, D6, and god-(and
NetApp)-only knows what else. I never touch X releases. If any other
letters are in use, I wouldn't know.
I read over the descriptions and decided that 5.1.2P2 was the latest
and greatest and officially sanctioned with a 'P' release moniker and
that that was what we should go for.
My colleagues, gaining experience with the filers, while I went off to
upgrade our DNS/etc. infrastructure, undertook to upgrade our remaining
4 F330's, F520, 3 F630's and F540. They'd carefully tested the upgrade
on another F330 a week earlier. All was apparently go-for-launch.
After the upgrade there was a problem with the quotas on one filer -
we couldn't turn them on because the filer would crash. Because we'd
upgraded from 4.3.1 they couldn't back out without going to 5.0 first,
which they didn't have, and didn't know how to find in NOW (I'm the only
angler). I proposed they get NetApp support on the case and progress
the issue - as the filer was still running and serving.
For extra entertainment value 2 filers threw 3 disks between them
(1 threw 2 in quick, but not lethal, succession - scary!) and another
started misbehaving during restores the next day.
Then another machine threw a disk, and *crashed* as a result.
We logged calls for some failures, and a call was automagically logged
only for the last one, from the autosupport mail (even though they're
_all_ configured for autosupport), and cores were ftp'd for analysis.
We were told that we needed to use wackz (not wackq, not wacky.. wackz
*sigh* the Great UnDocumented, what a confidence builder) from, get
this, 5.1.2P2D7 to fix the quotas problem on our one limping filer.
5. 1. 2. P2 _D7_ ferchrissakes!!!
Dutifully my colleague did so.
Now I, in my greater experience, would then have booted from the 5.1.2P2D7
floppies, installed the 5.1.2P2D7 sysfiles, downloaded, and then
rebooted, avoiding at all costs the known-to-be-buggy 5.1.2P2 code that
still lurked on the filer.
Sadly my colleagues just ran the wackz and rebooted...back to the
known buggy 5.1.2 install. *sigh* I was very convincing, and somewhat
impassioned, and the upgrade was done, belatedly, but hopefully without
any damage during the brief 5.1.2P2 run-time. I'm probably overstating
the danger, but _D7_! Sheesh!
Anyway. Long story, doubt many of you are still with me, but the moral
I draw is this:
NetApp have now achieved an overly baroque and divaricated release tree
and have obscured it behind a screen of cgi's and HTML. They've developed
wonderful support mechanisms that don't seem to be applied 100%, which
means they can't be relied upon. These issues need to be addressed before
they get worse.
Being more forthright about all the patchings at each release, and making
clear statements about which one the developers and support folk think
customers should be running, rather than hiding all but the first release
in the Software Library, would be very good. Patches being made for one
specific customer site should perhaps be furnished only to that site, not
placed into the public Software Library.
Now I love NetApp kit and software for their evident virtues and I
consistently champion them in my company, but I'm getting somewhat
disenchanted, even annoyed, by the increasingly less-than-polished releases
of DOT and the support structure backing us up.
The dismal state of backup-and-restore technology is also causing me
to worry about the long-term ability of NetApp boxes to fulfill our
needs. Maybe I'll change my opinion about that when we get Veritas
Netbackup working...*if* we get it working. But that's another story, for
another day. 8)
I rather hope someone in NetApp takes up these points, seeing as so many of
the fine folks on this list are NetApp stalwarts. This is the sort of
feedback that I hesitate to give to front-line support folk lest they
take it personally, or just bury it. This list at least hits doers at all
levels in the NetApp organisation!
There, that's off my chest... And everyone lived happily ever after.
>
> One day, I'd really like to see NetApp move to a user patchable OS. Anyone
> else?
Well, the patches may be big, but NetApp boxes are user patchable. We don't
need our NetApp SE's to do everything for us. 8)
>
> Graham
>
>-- End of excerpt from Graham Knight