We have 3 email servers running Solaris 8 and Veritas 4.1, and a FAS960c pair running Data ONTAP 7.0.1R1. All of these are connected to a pair of FC SAN switches.
Each email server has 4 LUNs, 2 from each filer, for a total of 12 LUNs.
Dynamic multi-pathing (DMP) works fine on two of the email servers. However, on the third server, DMP works for two LUNs and fails for the other two. The two LUNs that fail both come from the same filer. If we run "cf takeover" on either filer, those two LUNs get I/O errors; it doesn't matter which filer runs "cf takeover".
We have been trying to isolate the problem. The email servers are essentially identical, with the same patches and the same Veritas package installed.
lun show -v on the filers does not show any difference that matters. All LUNs are "Multiprotocol Type: solaris".
We ran the Veritas command to show all the paths to the LUNs, and we do not see any obvious problem with the LUNs that fail. They have the right number of paths and the paths are correct. The paths that are supposed to be primary are indeed primary, and likewise for secondary.
We have ports 7a and 7b from both filers connected to one switch, and ports 9a and 9b from both filers connected to the other. The email hosts all have dual-port QLogic HBAs, with the two ports plugged into different switches. The switches are not directly connected to each other.
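As a sanity check on what "the right number of paths" means for this wiring, here is a rough Python sketch that enumerates the paths one would expect per LUN. It is only a model, not output from any real tool. The names filer1/filer2, switchA/switchB and hba0/hba1 are placeholders, and it assumes that every filer target port is zoned to the host HBA port on the same switch and that paths through the partner filer show up as secondary.

```python
from itertools import product

# Wiring as described above. All device names here are placeholders
# (assumptions), not the real names of the filers, switches, or HBA ports.
FILERS = ["filer1", "filer2"]
SWITCH_TARGET_PORTS = {"switchA": ["7a", "7b"], "switchB": ["9a", "9b"]}
HBA_TO_SWITCH = {"hba0": "switchA", "hba1": "switchB"}


def expected_paths(owner):
    """Return (hba, switch, filer, port, role) tuples for a LUN owned by `owner`.

    Assumption: each HBA port sees every filer target port on its own switch,
    and paths through the non-owning (partner) filer are secondary.
    """
    paths = []
    for hba, switch in HBA_TO_SWITCH.items():
        for filer, port in product(FILERS, SWITCH_TARGET_PORTS[switch]):
            role = "primary" if filer == owner else "secondary"
            paths.append((hba, switch, filer, port, role))
    return paths


if __name__ == "__main__":
    for owner in FILERS:
        paths = expected_paths(owner)
        primary = sum(1 for p in paths if p[4] == "primary")
        print(f"LUN on {owner}: {len(paths)} paths, "
              f"{primary} primary, {len(paths) - primary} secondary")
```

Under those assumptions each LUN should show 8 paths, 4 primary and 4 secondary, split evenly across the two switches. If the zoning or LUN mapping differs, the expected counts change accordingly.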
A little more background --
A week ago we were running an older version of Veritas, had only one SAN switch (instead of two), and had only one FC target adapter in each filer, and we had this same problem. So the problem has persisted even through this major SAN reconfiguration and software upgrade.
Does anyone have any ideas?
We are thinking about creating two new LUNs, migrating the data over from the bad LUNs, and then deleting the bad LUNs.
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support