Hello Toasters!
I hope to clarify the multiple-controller issue with a fairly long-winded note. If I leave out anyone's point about RAID groups or volumes spanning FCAL controllers, let me know, or let the whole list know. I'm game.
I suspect this precautionary rule of thumb arose from a hardware bug in the 700 series, burt 19290. There's a short description on NOW here:
http://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=19290
It's cryptic and brief, and should probably be reworded. I'll look into that. The long story is that arbitrating multiple PCI bus requests on 700-series hardware didn't go as smoothly as we'd hoped, resulting in performance degradation in this case. How often does the case come up, you ask? The worst case is when writes arrive over 100tx ethernet on a quad ethernet card (which generates a lot of interrupts) in slot 1, 2 or 3; an FCAL controller is also in slot 1, 2 or 3; and another FCAL controller with the destination disks for the write is in slot 4, 5, 6 or 7.

It's worth pointing out that gigabit controllers buffer data much more efficiently and aren't throttled by waiting for PCI interrupts. A single ethernet card doesn't have much data to unload, so it can get the bus readily. A quad card has lots of data to unload and very little buffer space on the controller. Scheduling an interrupt for the NIC to unload its data, plus two more to load data onto the two FCAL controllers, will not go as fast as the quad card would like. This is why the NIC starts reporting h/w overflows and bus underruns in ifstat.
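If a toy model helps picture why the quad card loses: the little Python simulation below just shows that a small on-card buffer filling at line rate overflows once the wait for a bus grant gets long enough. Every number in it (buffer size, drain rate, grant latency) is a made-up assumption for illustration, not a spec of any real controller.

# Toy model of the slot 1/2/3 contention case: a quad 100tx card with a
# tiny on-card buffer receives at line rate, but can only dump its buffer
# to memory when it wins the PCI bus.  All numbers are assumptions.

FILL_MBPS = 4 * 100 / 8.0     # four 100tx ports of inbound data, in MB/s
DRAIN_MBPS = 200.0            # assumed burst rate to memory once the bus is granted
BUFFER_MB = 0.064             # assumed 64 KB of buffer on the card

def ms_until_overflow(grant_wait_ms, horizon_ms=1000.0, step_ms=0.01):
    """How long until the buffer overflows, given how long the card waits
    for a bus grant between drain bursts.  Returns None if it never does."""
    level = 0.0
    waiting = grant_wait_ms
    t = 0.0
    while t < horizon_ms:
        level += FILL_MBPS * step_ms / 1000.0              # packets keep arriving
        if waiting > 0.0:
            waiting -= step_ms                              # still queued behind FCAL interrupts
        else:
            level = max(level - DRAIN_MBPS * step_ms / 1000.0, 0.0)
            if level == 0.0:
                waiting = grant_wait_ms                     # buffer empty; give the bus back
        if level > BUFFER_MB:
            return t                                        # h/w overflow: what ifstat complains about
        t += step_ms
    return None

for wait in (0.1, 0.5, 2.0):
    print(wait, "ms grant latency ->", ms_until_overflow(wait), "ms to overflow")

With short grant waits the buffer reaches a steady state; stretch the wait and it overflows almost immediately, which is the quad-card symptom described above.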
For the raid group / volume distinction: writes are allocated per volume, and the free space in the given raid groups determines where the writes go. NVRAM isn't divided on a per-volume or per-raid-group basis, and the write data, by and large, isn't in NVRAM. The write transaction description sits in NVRAM until the write is committed to disk, with the data normally served from system memory. Teeny tiny writes are the exception to this.

If you blow out an FCAL controller (which really doesn't happen that often), WAFL won't see the disks on the end of that controller. If a volume, or a raid group within a volume, is split across this hypothetically blown controller, the affected volume will lose disks; if a raid group loses more disks than its parity can cover, the volume goes offline. Performance and redundancy are at odds in this configuration, but the exposure is low.
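If it helps, here's a purely conceptual sketch of the NVRAM point. The names and structure are mine, not anything out of ONTAP; it just illustrates "the description goes to NVRAM, the data stays in system memory until it's committed."

# Conceptual sketch only -- not ONTAP code, and all names are invented.

nvram_log = []        # stand-in for the battery-backed transaction log
buffer_cache = {}     # stand-in for plain system memory

def log_write(file_id, offset, data):
    buffer_cache[(file_id, offset)] = data          # the payload lives in RAM
    nvram_log.append({"op": "write", "file": file_id,
                      "offset": offset, "length": len(data)})
    # (Very small writes are the exception noted above; their payloads may
    # ride along in the log as well.)

def consistency_point(commit_to_disk):
    # Once the data is safely on disk, the log entries can be thrown away.
    for (file_id, offset), data in buffer_cache.items():
        commit_to_disk(file_id, offset, data)
    buffer_cache.clear()
    nvram_log.clear()

log_write(7, 0, b"x" * 4096)
print(len(nvram_log), "log entry before the commit")
consistency_point(lambda f, o, d: None)   # pretend the disks took it
print(len(nvram_log), "log entries after the commit")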
So, the following configurations are safe for multi-adapter volumes:
+ 800-series filers
+ GbE workloads
+ non-write-intensive workloads
+ writes directed at multiple volumes simultaneously
+ 700-series filers implementing the workaround in burt 19290
Anyone left?
WAFL volumes can, and often do, span FCAL adapters. In many cases there is a performance benefit to spreading a volume across multiple FCALs, thanks to improved load balancing between the adapters. Barring burt 19290, performance will not suffer if you split a RAID group (or volume) across adapters; most of our F840 performance benchmarks are run in exactly this configuration.
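For the load-balancing point, here's a back-of-envelope sketch. The per-loop bandwidth figure and adapter names are assumptions, not measurements; it only shows why a volume split across two FCAL adapters has a higher aggregate ceiling than one parked entirely behind a single loop.

# Back-of-envelope only: assumed per-loop bandwidth, hypothetical adapters.

LOOP_MBPS = 100.0        # assumed usable bandwidth of one FC-AL loop (MB/s)

def volume_bandwidth(disks_per_adapter):
    """Aggregate bandwidth ceiling for a volume, given how many of its
    disks sit behind each adapter (dict: adapter -> disk count)."""
    # Each loop contributes at most LOOP_MBPS, and only if it actually
    # holds some of the volume's disks.
    return sum(LOOP_MBPS for count in disks_per_adapter.values() if count > 0)

print(volume_bandwidth({"7a": 14, "8a": 0}))   # all disks on one loop: 100.0
print(volume_bandwidth({"7a": 7,  "8a": 7}))   # split across two loops: 200.0

Follow-up questions are welcome, but may not be answered until Monday. Enjoy!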