I am a bit confused; even Solaris 10 with MPxIO has problems figuring out which is the correct path to take.
It has 2 paths on the SAN (sw1 and sw2), and effectively 4 paths to each LUN (netapp1 and netapp2 are each connected into both sw1 and sw2).
Why would both Solaris and the ESX server access netapp1's LUNs via netapp2 for the most part?
Should I just kill the fabric mesh and have netapp1-sw1 and netapp2-sw2 to simplify things?
Do the Host Utilities for Solaris help with that at all?
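For what it's worth, a few commands can show which path each LUN is actually using. This is only a sketch, assuming Solaris 10 with native MPxIO enabled and the NetApp Host Utilities installed; the long device name below is a made-up placeholder, and exact output varies by OS and Host Utilities version:

```shell
# List multipathed logical units known to MPxIO (Solaris 10 native multipathing)
mpathadm list lu

# Show path details (including path states) for one LUN;
# the device name here is a hypothetical example -- substitute your own
mpathadm show lu /dev/rdsk/c4t600000000000000000000000d0s2

# With the NetApp Host Utilities installed, sanlun distinguishes paths
# that go to the owning controller from proxy (partner) paths, which is
# exactly what you'd want to check for the netapp1-via-netapp2 question
sanlun lun show -p
```

If sanlun reports that the preferred (primary) paths are to the partner controller, that points at a path-priority/zoning issue rather than a host-side bug.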
Linux Admin wrote: Ironically, I overheard my co-workers talking about this very subject outside my office just a few minutes ago.
Is Solaris 10 as bad about this as the ESX server?
I also see the same issue with the Solaris server.
There are apparently still problems with ESX and Linux (and maybe Solaris) VMs where there is a minor hiccup and the pre-configured primary path is replaced by ESX with the alternate path. Unlike what happens when you do a VMotion, the OS is not paused by ESX during this process -- all the VM sees is that its disks have gone away (albeit for a brief period of time), so it remounts the filesystems read-only, and then you're hosed until someone logs in and reboots the VM or otherwise fixes it.
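When a Linux VM does get stuck like that, a full reboot is often avoidable. As a sketch, assuming an ext3 root with the default errors=remount-ro behavior (the device name below is hypothetical), an admin could check and recover with something like:

```shell
# See whether the kernel logged I/O errors and remounted a filesystem read-only
dmesg | tail -n 50
grep ' ro,' /proc/mounts     # lists filesystems currently mounted read-only

# Once the path flap is over, do a report-only check, then remount read-write
# (a writing fsck is only safe on an unmounted or read-only filesystem)
fsck -n /dev/sda1            # -n = report only, change nothing; hypothetical device
mount -o remount,rw /
```

Whether this is safe depends on how much was in flight when the disks vanished; if fsck reports real damage, a reboot and full check is still the conservative option.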
I think this is caused by a disconnect between the multipathing done at the ESX level and the lack of multipathing at the VM level. I strongly suspect you'll need to install multipathing software at the VM level that is compatible with the multipathing used by your storage -- e.g., EMC PowerPath if the underlying storage is EMC, and so on.
Even if ESX is doing multipathing just fine, that doesn't necessarily mean that the hosted VMs can also automatically do multipathing in a way that will work with what the storage vendor supplies, what ESX does, and so on.
Our solution is that the storage is simply never allowed to go down, and we don't touch anything that is served to the ESX machines without prior approval and coordination with the ESX administrators, as well as with the appropriate VM administrators.
And when you have hundreds of VMs on a set of ESX servers in a single infrastructure which could potentially be VMotion'ed at any time from one node to another, that is a hell of a lot of pre-coordination.
Even the slightest bump of a cable could do something that the storage or ESX doesn't like: they do their multipath thing, and then all the affected VMs on that box are screwed. Oops.
--
Brad Knowles <brad@shub-internet.org>
LinkedIn Profile: <http://tinyurl.com/y8kpxu>
If you like Jazz/R&B guitar, check out my friend bigsbytracks on YouTube at <http://preview.tinyurl.com/bigsbytracks>