Hi there
I have a two-node Fabric MetroCluster with A300 nodes and four Brocade 6510 switches, placed about 10 km from each other. Our ESXi 7.0 hosts connect via 16G FC using another four frontend Brocade 6510 switches. Our ESXi hosts see four paths for each LUN presented to them from the SVMs. ESXi shows all four paths as Active (I/O), which I find a bit odd because two of the paths are remote to the ESXi hosts… Both ESXi and the igroup on the NetApp are configured for ALUA… but I have read that there can be issues with this, and maybe we have those issues…
The main issue is that when we run a few performance tests across different datastores (local and remote to the ESXi host), we see OK performance for the local datastores (800+ MB/s), but when we test against a datastore that is remote to the ESXi host we see 50-60 MB/s, which is a huge difference and leads us to question the setup…
We are aware that writing to a remote datastore in particular involves a transfer to the remote controller (10 km); this controller then has to write to its peer (remote) controller (10 km), the peer then has to send an ack back (10 km), and the first controller then sends an ack to the ESXi host (10 km). So all in all 40 km plus waiting for the systems… It is not ideal, but is this kind of performance normal? The two nodes do have an Ethernet-based cluster interconnect which is currently linked at 1 Gb, and I am beginning to suspect that the data between the two nodes is going via this link. But the more I think about it, the less sense it makes.
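To put a rough number on the 40 km figure above, here is a back-of-envelope sketch. Everything in it is an assumption rather than a measurement from this setup: roughly 5 µs per km for light in fiber, a hypothetical 64 KiB I/O size, and a single outstanding I/O (queue depth 1). It ignores switch, controller and protocol overhead entirely, and the function names are made up for illustration.

```python
# Back-of-envelope only: every constant here is an assumption, not a measurement.
FIBER_US_PER_KM = 5.0   # approximate one-way propagation delay of light in fiber
IO_BYTES = 64 * 1024    # hypothetical I/O size; the test's real block size is unknown


def propagation_delay_us(total_km: float) -> float:
    """Propagation delay in microseconds over a total fiber distance."""
    return total_km * FIBER_US_PER_KM


def serial_throughput_mb_s(io_bytes: int, latency_us: float) -> float:
    """Throughput in MB/s with exactly one outstanding I/O.

    Bytes per microsecond is numerically equal to MB/s (10^6 bytes/s).
    """
    return io_bytes / latency_us


def implied_latency_us(io_bytes: int, throughput_mb_s: float) -> float:
    """Per-I/O latency that would produce the given serial throughput."""
    return io_bytes / throughput_mb_s


if __name__ == "__main__":
    delay = propagation_delay_us(40)                    # the 40 km from the post
    ceiling = serial_throughput_mb_s(IO_BYTES, delay)   # distance-only ceiling
    observed = implied_latency_us(IO_BYTES, 55.0)       # midpoint of 50-60 MB/s
    print(f"propagation delay over 40 km: {delay:.0f} us")
    print(f"distance-only ceiling at QD1: {ceiling:.1f} MB/s")
    print(f"latency implied by 55 MB/s:   {observed:.0f} us")
```

Under these assumptions, 40 km of fiber accounts for about 200 µs per write, while 55 MB/s at queue depth 1 would imply more than 1 ms per I/O. If the real test uses a deeper queue or a different block size the numbers shift, but distance on its own looks unlikely to explain the whole gap.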
Our FC inter-switch links are running at 8 Gb, but on the switches we see nothing near saturation on any port… and we of course also checked for port errors of any kind…
If anyone has a similar setup, any help would be great… we are doing a few more tests, but we are close to opening a case with NetApp…
/B
Hi
Your ESXi hosts will only see the paths to the cluster that owns the LUN. Because you have a two-node MetroCluster, only a single node will present the LUN. The DR partner in the MetroCluster setup doesn't announce the paths until it goes into switchover mode.
So you have a single node on each site, connected with a single connection to each switch?
If your ESXi hosts are also connected with a single link to each switch, I wouldn't expect four paths per LUN.
On the NetApp side, the write operation is the same whether you test against a remote or a local datastore. Once the write reaches the NetApp, it has to be committed by both systems before the acknowledgement is sent. The only difference is the frontend fabric. If you see that order of difference between remote and local connectivity, I would look into the ISLs of your frontend fabric.
Kind regards,
Wouter Vervloesem
Senior Consultant
Neoria NV
Prins Boudewijnlaan 41 - 2650 Edegem
T +32 3 451 23 82 | M +32 496 52 93 61
From: Toasters toasters-bounces@teaparty.net on behalf of Heino Walther hw@beardmann.dk
Date: Tuesday, 13 December 2022 at 22:31
To: "toasters@teaparty.net" toasters@teaparty.net
Subject: Two node MetroCluster performance issues?