John,
Did you have a look at fastpath?
It seems that every time we open a case for an upgrade to 9.3, NetApp support tries to make sure we've looked into this, so it must've bitten a lot of folks.
We did run into a pretty big bug on the upgrade from 9.1 to 9.3P15, and we have a case and core in with support now. I've seen NFS stop serving from a node in at least 3 clusters roughly 2-5 hours after the upgrade. We fix it by identifying the unresponsive node and either powering it down or taking it down via NMI/SP. It will not respond to normal takeover commands. Preliminary core analysis (no full core analysis yet) points to at least one bug that is fixed in 9.3P17.
Typically, when we roll out updates it can take months, given the number of nodes and clusters. So we stick with whatever P patch we rolled on the first set of nodes, and by the time we finish the upgrades, 1-3 newer P patches have been released.
Given this experience, always use the latest P patch available for the intermediate release, especially if it will take you a while to roll it through your entire deployment. I also recommend taking a look at going to 9.5. It sounds nuts, but we've had better stability with that release. We moved to it because of specific features we needed (CIFS/SMB enhancements and FlexCache/FlexGroups).
Regards,
Douglas