Another common mistake in Solaris/FCP configurations is unintentional use of the mpxio_set script. I've seen some FCP customers run "mpxio_set -e" by mistake; the end result is that all paths appear to be primary paths. "mpxio_set -e" should only be run for iSCSI configurations.
Here are some notes about this:
* If ALUA is enabled on the filer, you must not use the "mpxio_set -e" command.
* If you had previously used the mpxio_set script to add NetApp settings to the /kernel/drv/scsi_vhci.conf file, ALUA will not work. You MUST remove those settings with the "mpxio_set -d" command and then reboot to enable ALUA (see the check sketched below).
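To check for this, a quick way to see whether the script left NetApp entries behind is to look at the MPxIO configuration file directly; the exact entries vary by Host Utilities version, so treat the following as a rough sketch rather than a definitive procedure:

    # Look for NetApp-specific entries previously added by "mpxio_set -e"
    grep -i netapp /kernel/drv/scsi_vhci.conf

    # If the filer has ALUA enabled and entries are present, back them out
    # and do a reconfiguration reboot so ALUA can take effect
    mpxio_set -d
    reboot -- -r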
________________________________
From: Angelescu, Silviu
Sent: Thursday, March 05, 2009 5:05 PM
To: sysadmin.linux@gmail.com; brad@shub-internet.org
Cc: jack1729@gmail.com; toasters@mathworks.com
Subject: Re: FCP Partner Path Misconfigured - Host I/O access through a non-primary and non-optimal path was detected.
Make sure ALUA is enabled on Solaris as well as on NetApp, and set the cfmode to single_image on NetApp.
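On the filer side, something along these lines on the Data ONTAP console should confirm both settings (the igroup name below is a placeholder, and syntax can differ between ONTAP releases, so verify against your version's documentation):

    fcp show cfmode                      # should report single_image
    igroup show -v solaris_igroup        # check the ALUA field for the Solaris igroup
    igroup set solaris_igroup alua yes   # enable ALUA on that igroup if it is off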
I suspect ALUA may be off on Solaris. If you installed the Host Utilities for Solaris, run "basic_config -ssd_set", then "reboot -- -r", then "stmsboot -e" (do not let stmsboot reboot for you; use "reboot -- -r" in the next step instead), and finally "reboot -- -r". This should work.
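Laid out in order, the host-side sequence described above looks roughly like this (commands as given above; prompts and option names may differ slightly between Host Utilities releases):

    basic_config -ssd_set   # apply the NetApp-recommended ssd settings from the Host Utilities
    reboot -- -r            # reconfiguration reboot to pick up those settings
    stmsboot -e             # enable MPxIO; decline when it offers to reboot
    reboot -- -r            # second reconfiguration reboot so MPxIO claims the devices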
Silviu.
________________________________
From: Linux Admin
To: Brad Knowles
Cc: Jack Lyons; NetApp Toasters List
Sent: Thu Mar 05 17:15:45 2009
Subject: Re: FCP Partner Path Misconfigured - Host I/O access through a non-primary and non-optimal path was detected.
I am a bit confused; even Solaris 10 with MPxIO has problems figuring out which path is the correct one to take.
It has 2 paths on the SAN (sw1 and sw2) and effectively 4 paths to each LUN (netapp1 and netapp2 are each connected to both sw1 and sw2).
Why would both Solaris and the ESX server access netapp1 LUNs via netapp2 for the most part?
Should I just kill the fabric mesh and have only netapp1-sw1 and netapp2-sw2 connections to simplify things?
Do the Host Utilities for Solaris help with that at all?
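They do help with exactly this: the Host Utilities include the sanlun tool, which shows, per LUN, which paths are primary (through the owning controller) and which go through the partner, and Solaris's own mpathadm shows MPxIO's view of the same paths. For example (output formats vary by version, and the device name below is just a placeholder):

    sanlun lun show -p                     # per-LUN path listing, flagging primary vs. proxy (partner) paths
    mpathadm list lu                       # logical units under MPxIO control, with path counts
    mpathadm show lu /dev/rdsk/c5t<LUN-WWN>d0s2   # detailed path and access state for one LUN

If most I/O to LUNs owned by netapp1 is going through netapp2, that will show up here as the non-primary (proxy) paths being the active ones.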
On Thu, Mar 5, 2009 at 3:59 PM, Brad Knowles brad@shub-internet.org wrote:
Linux Admin wrote:
Is Solaris 10 as bad about this as the ESX server? I also see the same issue with the Solaris server.
Ironically, I overheard my co-workers talking about this very subject outside my office just a few minutes ago. There are apparently still problems with ESX with Linux (and maybe Solaris) VMs where there is a minor hiccup and the pre-configured primary path is replaced by ESX with the alternate path. Unlike what happens when you do a VMotion, the OS is not stopped by ESX during this process -- all the VM sees is that the disks have gone away (albeit for a brief period of time), so it marks the disks as read-only and then you're hosed until someone logs in and reboots the VM or otherwise fixes it.

I think this is caused by a disconnect between the multipathing done at the ESX level and the lack of multipathing at the VM level. I strongly suspect you'll need to install multipathing software at the VM level that is compatible with the multipathing used by your storage -- so, EMC PowerPath if the underlying storage is on EMC, etc. Even if ESX is doing multipathing just fine, that doesn't necessarily mean that the hosted VMs can also automatically do multipathing in a way that will work with what the storage vendor supplies, what ESX does, and so on.

Our solution is that the storage is simply never allowed to go down, and we don't touch anything that is served to the ESX machines without first getting prior approval and coordination from the ESX administrators, as well as from the appropriate VM administrators. And when you have hundreds of VMs on a set of ESX servers in a single infrastructure which could potentially be VMotion'ed at any time from one node to another, that is a hell of a lot of pre-coordination. Even the slightest bump of a cable could do something that the storage or ESX doesn't like, they do their multipath thing, and then all the affected VMs on the box are screwed. Oops.

--
Brad Knowles <brad@shub-internet.org>
LinkedIn Profile: http://tinyurl.com/y8kpxu
If you like Jazz/R&B guitar, check out my friend bigsbytracks on YouTube at http://preview.tinyurl.com/bigsbytracks