The root problem here is nodes that are not part of the cluster getting resets from Linux hosts. I'm not a low-level SCSI expert, but we once had a problem that resulted in resets being sent and causing issues, and I think I remember hearing that they affect the entire zone, meaning everything that can "see" the initiator will be told to reset.
It's nondisruptive to change from your zoning setup to a more optimal one where each zone contains a single initiator and a single target. Also, you mentioned "hard" zoning: did you mean that literally, i.e. your zones have physical port locations in them?
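To illustrate what I mean by single-initiator, single-target zones, here is a rough Python sketch that expands a set of initiators and targets into one zone per pair. The alias names and WWPNs are made up for the example, and it only builds and prints the zone membership; it does not talk to any switch.

```python
from itertools import product


def single_pair_zones(initiators, targets):
    """Build one zone per (initiator, target) pair.

    Both arguments map an alias name to a WWPN string.
    Returns a dict of zone name -> [initiator WWPN, target WWPN].
    """
    zones = {}
    for (i_name, i_wwpn), (t_name, t_wwpn) in product(
        initiators.items(), targets.items()
    ):
        zones[f"{i_name}__{t_name}"] = [i_wwpn, t_wwpn]
    return zones


# Hypothetical aliases and WWPNs, for illustration only.
initiators = {
    "host1_hba0": "10:00:00:00:c9:00:00:01",
    "host2_hba0": "10:00:00:00:c9:00:00:02",
}
targets = {
    "filer1_0a": "50:0a:09:80:00:00:00:01",
    "filer2_0a": "50:0a:09:80:00:00:00:02",
}

zones = single_pair_zones(initiators, targets)
for name, members in zones.items():
    print(name, members)
```

With two initiators and two targets this gives four zones of two members each, so no initiator can ever "see" another initiator.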
On Mon, Feb 2, 2015 at 9:10 AM, Momonth momonth@gmail.com wrote:
On Mon, Feb 2, 2015 at 1:03 PM, Borzenkov, Andrei andrei.borzenkov@ts.fujitsu.com wrote:
My best guess is that the filer ports were configured as initiators by default and somehow conflicted with the host HBAs (the filer will try to use any LUNs it finds as disks). Do you use two-port zones or fan-out (single initiator, multiple targets)? Note that the motherboard replacement procedure recommends disconnecting ports until they are properly configured.
Due to "historical reasons" our zones are "two initiators, multiple targets". I know it's sub-optimal, but that's the way it is. Such zones have always worked with controlled failovers, ONTAP upgrades etc.
When the NetApp technician arrived, I specifically asked him whether it would be best to disable the respective ports on the fabrics for the filer in question (as I bet I had seen this behaviour once before), but the answer was "no, it should not affect anything".
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters