On Tue 2 Mar, 1999, Graham Knight grahamk@ast.lmco.com wrote:
Our encounter (albeit brief) with 5.2.1 was pretty disastrous too. I'm camped at 5.1.2P2 and have no reason to move. Except for the occasional 5.2.1 floppy boot for the fiber channel diagnostics. ;-)
I've a long story about 5.1.2...are we sitting comfortably? I shall begin:
Well, we'd decided to move to 5.1.2 as a Y2K compliance upgrade. I fished around on the NOW site to check whether there were any patch releases available.
You see, we've been bitten several times by the following scenario:
 o upgrade to recommended release.
 o get hit by *known* bug.
 o get told that we should have gone straight to the latest official patch release (now apparently the x.y.zPa releases, but once x.y.z.Da ..) which, bizarrely, is invisible in the NOW site, unless you go fishing.
 o upgrade *again*
So I fished around the un-linked-to parts of the Software Library and found 5.1.2P1, P2, D1 and D2. It now also has D5, D6, and god-(and NetApp)-only knows what else. I never touch X releases. If any other letters are in use, I wouldn't know.
I read over the descriptions and decided that 5.1.2P2 was the latest and greatest and officially sanctioned with a 'P' release moniker and that that was what we should go for.
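For anyone trying to keep track of these labels in a script, here is a minimal sketch that only splits a label such as 5.1.2P2D7 into its parts. The pattern is inferred from the labels quoted in this post, not from any NetApp documentation; the names parse_release, Release, p and d are hypothetical, and the code deliberately does not try to say whether a P or a D release is "newer", since that ordering is exactly what isn't documented.

```python
import re
from typing import NamedTuple, Optional

# Assumed label shape: x.y[.z], then an optional P<n>, then an optional D<n>.
# A dot before the letter suffix is tolerated (e.g. "5.1.2.P2D7").
_LABEL = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?\.?(?:P(\d+))?\.?(?:D(\d+))?$")


class Release(NamedTuple):
    major: int
    minor: int
    point: int
    p: Optional[int]  # number after "P", or None if absent
    d: Optional[int]  # number after "D", or None if absent


def parse_release(label: str) -> Release:
    """Split a release label into numeric parts; raise ValueError if it doesn't fit."""
    m = _LABEL.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognised release label: {label!r}")
    major, minor, point, p, d = m.groups()
    return Release(
        int(major),
        int(minor),
        int(point) if point else 0,  # "5.0" is treated as 5.0.0
        int(p) if p else None,
        int(d) if d else None,
    )


if __name__ == "__main__":
    for label in ("4.3.1", "5.0", "5.1.2P2", "5.1.2.P2D7"):
        print(label, parse_release(label))
```

Labels with other letters (the X releases, for instance) are rejected with ValueError rather than guessed at.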
My colleagues, gaining experience with the filers, while I went off to upgrade our DNS/etc. infrastructure, undertook to upgrade our remaining 4 F330's, F520, 3 F630's and F540. They'd carefully tested the upgrade on another F330 a week earlier. All was apparently go-for-launch.
After the upgrade there was a problem with the quotas on one filer - we couldn't turn them on because the filer would crash. Because we'd upgraded from 4.3.1 they couldn't back out without going to 5.0 first, which they didn't have, and didn't know how to find in NOW (I'm the only angler). I proposed they get NetApp support on the case and progress the issue - as the filer was still running and serving.
For extra entertainment value 2 filers threw 3 disks between them (1 threw 2 in quick, but not lethal, succession - scary!) and another started misbehaving doing restores the next day.
Then another machine threw a disk, and *crashed* as a result.
We logged calls for some failures, but a call was automagically logged from the autosupport mail only for the last one (even though they're _all_ configured for autosupport), and cores were ftp'd for analysis.
We were told that we needed to use wackz (not wackq, not wacky.. wackz *sigh* the Great UnDocumented, what a confidence builder) from, get this, 5.1.2P2D7, to fix the quotas problem on our one limping filer.
5. 1. 2. P2 _D7_ ferchrissakes!!!
Dutifully my colleague did so.
Now I, in my greater experience, would then have booted from the 5.1.2P2D7 floppies, installed the 5.1.2P2D7 sysfiles, downloaded and then rebooted, avoiding at all costs the known-to-be-buggy 5.1.2P2 code that still lurked on the filer.
Sadly my colleagues just ran the wackz and rebooted...back to the known buggy 5.1.2 install. *sigh* I was very convincing, and somewhat impassioned, and the upgrade was done, belatedly, but hopefully without any damage during the brief 5.1.2P2 run-time. I'm probably overstating the danger, but _D7_! Sheesh!
Anyway. Long story, doubt many of you are still with me, but the moral I draw is this:
NetApp have now achieved an overly baroque and divaricated release tree and have obscured it behind a screen of cgi's and HTML. They've developed wonderful support mechanisms that don't seem to be applied 100%, which means they can't be relied upon. These issues need to be addressed before they get worse.
Being more forthright about all the patchings at each release, and making clear statements about which one the developers and support folk think customers should be running, rather than hiding all but the first release in the Software library would be very good. Patches being made for one specific customer site should perhaps be furnished only to that site, not placed into the public software library.
Now I love NetApp kit and software for their evident virtues and I consistently champion them in my company, but I'm getting somewhat disenchanted, even annoyed by the increasingly less-than-polished releases of DOT and the support structure backing us up.
The dismal state of backup-and-restore technology is also causing me to worry about the long-term ability of NetApp boxes to fulfill our needs. Maybe I'll change my opinion about that when we get Veritas Netbackup working...*if* we get it working. But that's another story, for another day. 8)
I rather hope someone in NetApp takes up these points, seeing as so many of the fine folks on this list are NetApp stalwarts. This is the sort of feedback that I hesitate to give to front-line support folk lest they take it personally, or just bury it. This list at least hits doers at all levels in the NetApp organisation!
There, that's off my chest... And everyone lived happily ever after.
One day, I'd really like to see NetApp move to a user patchable OS. Anyone else?
Well, the patches may be big, but NetApp boxes are user patchable. We don't need our NetApp SE's to do everything for us. 8)
Graham
-- End of excerpt from Graham Knight