I almost mentioned that the Cisco 3750 switch was a model where I'd seen problems with port hashing. The 3750 has been around for a while, so it might have improved over time. I've also seen odd problems related to those four SFP ports on the right-hand side of the switch.

A Nexus shouldn't have issues unless vPCs are in use. There are some odd special settings to make vPCs play nice with LACP with some vendors, including NetApp, and there have been a few vPC-related bugs as well.

From: Filip Sneppe [mailto:filip.sneppe@gmail.com]
Sent: Friday, February 03, 2017 10:28 AM
To: Steiner, Jeffrey <Jeffrey.Steiner@netapp.com>
Cc: NGC-wouter.vervloesem-neoria.be <wouter.vervloesem@neoria.be>; NGC-pdg-uow.edu.au <pdg@uow.edu.au>; toasters@teaparty.net
Subject: Re: super secret flags

Hi Jeffrey,

In our case(s), the determining factor was that the distr-func was set to "port" on the sender/source side of the SnapMirror relationship.

At the receiving end, this setting didn't matter. Yes, we are aware that the hashing algorithm does not need to be matched between both sides (including the hashing algorithm on the switch).

Also, I suspect it's not so much an LACP issue and we would probably have run into the same issue with a static multimode etherchannel too, although we've never tested this.

Before we had to break up our testing environment, we had tested and confirmed this behavior on Cisco 3750 and Nexus switches. Those aren't very exotic so we did worry about the performance drop.

P.S. Great thread, by the way. Thanks, Peter D. Gray, for that other hidden flag in your reply :-)

Best regards,

Filip

On Fri, Feb 3, 2017 at 9:24 AM, Steiner, Jeffrey <Jeffrey.Steiner@netapp.com> wrote:

There is no requirement for the LACP hashing configuration to be the same on both sides. On the whole, it doesn't make any difference if there's a mismatch.

The important thing that a lot of people miss is that LACP distribution policies are controlled by the sending device. There is no negotiation. For example, you can have ONTAP using IP hashing, while the switch is using src-dst-MAC hashing. That might be a bad idea, such as with a routed environment where only 2 MAC addresses are talking, but it doesn't create a compatibility problem.
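As a rough illustration of the routed case above, here is a minimal Python sketch. It assumes a sender picks a member link by hashing some fields and taking the result modulo the number of links. CRC32, the 4-link count, and all addresses are stand-ins chosen for the example; this is not the hash ONTAP or any particular switch actually uses.

```python
import zlib

N_LINKS = 4  # hypothetical number of member links in the aggregate


def pick_link(key: str, n_links: int = N_LINKS) -> int:
    """Map a hash key to a member link index (CRC32 is only a stand-in hash)."""
    return zlib.crc32(key.encode()) % n_links


def mac_key(frame: dict) -> str:
    """Key for a src-dst-MAC style policy."""
    return f"{frame['src_mac']}-{frame['dst_mac']}"


def ip_key(frame: dict) -> str:
    """Key for a src-dst-IP style policy."""
    return f"{frame['src_ip']}-{frame['dst_ip']}"


def links_used(frames, key_func) -> set:
    """Return the set of member links the sender would use for these frames."""
    return {pick_link(key_func(f)) for f in frames}


# Routed environment: every frame goes from the host's MAC to the router's MAC,
# while the destination IP varies (made-up addresses).
frames = [
    {
        "src_mac": "02:00:00:00:00:01",
        "dst_mac": "02:00:00:00:00:fe",
        "src_ip": "10.0.0.10",
        "dst_ip": f"10.1.0.{i}",
    }
    for i in range(1, 33)
]

mac_links = links_used(frames, mac_key)
ip_links = links_used(frames, ip_key)
print("src-dst-MAC hashing uses links:", sorted(mac_links))
print("src-dst-IP hashing uses links:", sorted(ip_links))
```

With only one MAC pair in play, the MAC-based key never changes, so every frame lands on the same link. The IP-based key varies with the destination, so the traffic can spread. Each end makes this choice on its own for the traffic it sends, which is why a mismatch between the two ends is not a compatibility problem.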

I've seen a few older switches that really don't like port hashing. I'm not sure exactly what's happening, but it seemed like the architecture of the switch wasn't expecting the same IP/MAC to appear on multiple different ports. It would pass traffic, but the CPU utilization jumped up significantly when any kind of port hashing was being used. Changing to IP solved the problem.
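To make "the same IP/MAC on multiple ports" concrete, here is a small Python sketch. It assumes hash-modulo link selection with CRC32 as a stand-in hash; the addresses, port numbers, and 4-link count are made up, and it does not reproduce any vendor's real algorithm. It only shows which links a single source IP would be seen on under each policy, not why a switch might struggle with that.

```python
import zlib
from collections import defaultdict

N_LINKS = 4  # hypothetical number of member links


def pick_link(key: str) -> int:
    """Pick a member link by hashing the key (CRC32 is only a stand-in)."""
    return zlib.crc32(key.encode()) % N_LINKS


def ip_key(flow) -> str:
    """IP-style policy: only the address pair feeds the hash."""
    src_ip, _src_port, dst_ip, _dst_port = flow
    return f"{src_ip}-{dst_ip}"


def port_key(flow) -> str:
    """Port-style policy: addresses plus TCP ports feed the hash."""
    src_ip, src_port, dst_ip, dst_port = flow
    return f"{dst_ip}:{dst_port}-{src_ip}:{src_port}"


def links_per_source(flows, key_func) -> dict:
    """For each source IP, collect the member links its traffic is sent on."""
    seen = defaultdict(set)
    for flow in flows:
        seen[flow[0]].add(pick_link(key_func(flow)))
    return dict(seen)


# Ten TCP connections between the same two hosts, differing only in source port.
flows = [("10.0.0.10", 50000 + i, "10.0.0.20", 5000) for i in range(10)]

ip_view = links_per_source(flows, ip_key)
port_view = links_per_source(flows, port_key)
print("IP hashing, links per source:", ip_view)
print("port hashing, links per source:", port_view)
```

Under the IP-style key, the one address pair always maps to a single link. Under the port-style key, the same source address shows up on more than one member link, which is the situation described above.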

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Vervloesem Wouter
Sent: Friday, February 03, 2017 9:15 AM
To: NGC-pdg-uow.edu.au <pdg@uow.edu.au>
Cc: toasters@teaparty.net
Subject: Re: super secret flags

We don't think the issue was caused by LACP.

The difference between a configuration that replicated 'fast' and one that replicated 'slow' was whether distr-func was set to IP (fast) or port (slow).

In both situations we used "multimode_lacp" as the mode.

Regards,

Wouter Vervloesem
Storage Consultant

Neoria NV
Prins Boudewijnlaan 41 - 2650 Edegem
T +32 3 451 23 82 | M +32 496 52 93 61

On 2 Feb 2017, at 23:06, Peter D. Gray <pdg@uow.edu.au> wrote:

On Thu, Feb 02, 2017 at 03:41:19PM +0100, Filip Sneppe wrote:

Hi Jeffrey and others,

I have come across this a couple of times (the first time I encountered this
I logged a case for it: 2005111796). Unfortunately I have never had the time
to troubleshoot this. In case 2005111796, support observed packet loss in
the setup with port-based hashing, but we had to destroy our
(test/troubleshooting) setup before we could get to the bottom of this.
Since then, I have come across this on several occasions. More often than
not it was not a real issue since those SnapMirrors ran across WAN links,
or SnapMirror runs at night and can take all the time it wants, but on
1Gbps/10Gbps LANs where SM updates need to be fast, it is an issue.
However, I found out there is a TR that mentions that SnapMirror
performance could be impacted by port-based ifgrps so I've never bothered
to open any additional cases for this.

Can anyone else confirm this behavior?

Is this an LACP ifgrp? We have had no issues with LACP on cluster mode.
On 7-mode, we saw many missed LACP packets, but like you never investigated
fully because it kept working. One thing our network guys drum into us is that the LACP setting
MUST agree at both ends.

Regards,
pdg

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters