Greetings, fellow toaster admins.
I hope someone can shed some light on the IP distribution function for LACP ifgrps.
We have a four-port multimode LACP ifgrp, a0a, using interfaces e2a-e2d. We observe SnapMirror traffic egressing port e2c.
The XOR of the last two bits of the source and destination IPs is either 0x0 or 0x3, so I am expecting traffic to egress either the first or fourth port in the ifgrp.
Assuming e2a is port 0, e2b is port 1, etc., I would expect traffic to egress either e2a (0x0) or e2d (0x3).
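The expectation above can be written out as a short sketch. The addresses are made up (taken from the documentation ranges), and the member order e2a=0 through e2d=3 is the same unconfirmed assumption as in the post, not documented ONTAP behaviour.

```python
import ipaddress

# ASSUMPTION: ifgrp members are indexed in name order; the post could not confirm this.
PORTS = ["e2a", "e2b", "e2c", "e2d"]

def expected_port(src: str, dst: str) -> str:
    """XOR the low two bits of both addresses and use the result as a port index."""
    s = int(ipaddress.ip_address(src))
    d = int(ipaddress.ip_address(dst))
    return PORTS[(s ^ d) & 0b11]

# Hypothetical intercluster LIF addresses, chosen only for illustration.
print(expected_port("192.0.2.10", "198.51.100.10"))  # low bits 10 ^ 10 = 00 -> e2a
print(expected_port("192.0.2.9", "198.51.100.10"))   # low bits 01 ^ 10 = 11 -> e2d
```

Under this model no address pair with an XOR of 0x0 or 0x3 can land on e2c, which is why the observed egress port is surprising.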
What am I missing? I can’t find any details about the actual hashing function for IP distribution or port member indexing in an ifgrp to confirm my assumptions.
Each cluster is a single A250 HA pair running ONTAP 9.15.1.
I’m typing this with my thumbs so apologies for any typos or insufficient detail.
Best wishes Stephen
"Stephen" == Stephen Stocke scstocke@gmail.com writes:
> Greetings fellow toaster admins I hope someone can shed some light on the IP distribution function for lacp ifgrps.
You need to give more information on your setup, especially what kind of switches you're using and how they're configured.
> We have a four port, multi mode lacp ifgrp, a0a, using interfaces e2a-e2d. We observe SnapMirror traffic egressing port e2c.
So? Why do you care?
> The XOR of the last two bits of the source and destination IPs are either x0 or x3 so I am expecting traffic to egress either the first or fourth port in the ifgrp.
> Assuming e2a is port 0, e2b is port 1, etc., I would expect traffic to egress either e2a (0x0) or e2d (0x3).
Share your config (CLI output) so we can look at it.
> What am I missing? I can’t find any details about the actual hashing function for IP distribution or port member indexing in an ifgrp to confirm my assumptions.
> Both clusters are a single A250 HA pair running 9.15.1.
Are you seeing performance problems? Are you seeing that your traffic isn't being balanced across all your links?
I guess I really don't understand the problem you're trying to solve, unless you're just looking for info on why it works this way, which might really be a NetApp-only answer.
Neoria - Public
Hello Stephen,
This document, although already 12 years old, is still a good reference: https://www.netapp.com/media/19900-tr-4847.pdf
See page 7 and the following pages regarding load balancing. The “port” distribution function is generally known to be the preferred one when configuring an ifgrp, but the best choice may also depend on the brand and type of switch stack to which the controllers' NICs are connected.
Good luck & best regards,
Peter Tas.
From: John Stoffel john@stoffel.org Date: Tuesday, 4 February 2025 at 18:29 To: Stephen Stocke scstocke@gmail.com Cc: toasters@teaparty.net toasters@teaparty.net Subject: Re: Igfrp IP Distribution Function
_______________________________________________ toasters mailing list -- toasters@lists.teaparty.net To unsubscribe send an email to toasters-leave@lists.teaparty.net
Hello
Thank you to everyone who replied on and off list.
I believe I have a good understanding of ifgrp configuration and the purpose of the various distribution functions. In this case I'm asking about the specifics of the IP distribution function: how the hash is actually calculated from a given pair of source and destination IP addresses, and how the result is used to select the egress interface. As John suggests, this may be a case where only NetApp can answer the question.
Quoting from TR-4847 https://www.netapp.com/media/19900-tr-4847.pdf which Peter linked to (emphasis mine):
IP: Second-best load distribution method, since the IP addresses of both sender (LIF) and client are used to deterministically select the particular physical link that a packet traverses. Although deterministic in the selection of a port, the *balancing is performed using an advanced hash function*. This has been found to work under a wide variety of circumstances, but particular selections of IP addresses might still lead to unequal load distribution.
The only information I can find about the 'advanced hash function' is in an old 7-mode KB https://kb.netapp.com/Legacy/ONTAP/7Mode/How_does_load_balancing_on_a_VIF_work which states (emphasis mine):
*As of Data ONTAP 7.3.2, a multimode or LACP ifgrp/vif uses an implementation of a "SuperFastHash"*, utilizing the last 16 bits of the source and destination IP addresses (-b ip), the last 16 bits of the source and destination MAC addresses (-b mac), or the last 16 bits of the source and destination IP addresses in combination with the source and destination TCP port (-b port).
The output of the algorithm results in a far more dynamic, more balanced distribution than the algorithm used in versions of Data ONTAP prior to 7.3.2. The result is still the same, however, in that each TCP stream will associate with only one interface, allowing for only one port's worth of bandwidth per TCP stream.
Data ONTAP releases prior to 7.3.2:
Documentation *prior to 7.3.2* states this as the formula:
*((source_address XOR destination_address) % number_of_links)*
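As a worked example of the pre-7.3.2 formula quoted above, here is a minimal sketch. It assumes the whole 32-bit address is XORed (the KB does not spell out the operand width) and uses made-up addresses from the documentation ranges.

```python
import ipaddress

def legacy_link_index(src: str, dst: str, number_of_links: int) -> int:
    """(source_address XOR destination_address) % number_of_links."""
    s = int(ipaddress.ip_address(src))
    d = int(ipaddress.ip_address(dst))
    return (s ^ d) % number_of_links

# Hypothetical addresses, for illustration only.
src, dst = "192.0.2.9", "198.51.100.10"
print(legacy_link_index(src, dst, 4))  # 4 links: only the low two bits of the XOR matter
print(legacy_link_index(src, dst, 3))  # 3 links: every bit of the XOR contributes
```

With a power-of-two link count the modulo reduces to the low bits of the XOR, which is the shortcut used in the original question. That shortcut only holds for this legacy formula.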
When I was asked to look into a SnapMirror traffic imbalance, I started by calculating the IP hash using the 'standard' load balancing formula you would expect for source and destination IP. However, the results did not match the behaviour we are seeing. After doing more research today, I ran across the old KB quoted above and also this GitHub page: https://github.com/qdrddr/ontap-lacp. However, it looks like that code was written to replicate empirical data rather than being the actual SuperFastHash algorithm used by NetApp.
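For reference, here is a rough sketch of what a SuperFastHash-based selection could look like. The hash is written from Paul Hsieh's publicly described SuperFastHash and only handles inputs whose length is a multiple of four bytes. How ONTAP builds the key from the "last 16 bits" of each address is not documented in anything quoted here, so the byte order and concatenation in `guessed_link_index` (a hypothetical name) are pure assumptions and the result should not be expected to match a real system.

```python
import ipaddress

MASK32 = 0xFFFFFFFF

def super_fast_hash(data: bytes) -> int:
    """SuperFastHash restricted to inputs whose length is a multiple of 4."""
    if not data:
        return 0
    if len(data) % 4:
        raise ValueError("this sketch omits the tail handling for other lengths")
    h = len(data)
    for i in range(0, len(data), 4):
        lo = int.from_bytes(data[i:i + 2], "little")
        hi = int.from_bytes(data[i + 2:i + 4], "little")
        h = (h + lo) & MASK32
        tmp = ((hi << 11) ^ h) & MASK32
        h = ((h << 16) ^ tmp) & MASK32
        h = (h + (h >> 11)) & MASK32
    # Final avalanche steps.
    h ^= (h << 3) & MASK32
    h = (h + (h >> 5)) & MASK32
    h ^= (h << 4) & MASK32
    h = (h + (h >> 17)) & MASK32
    h ^= (h << 25) & MASK32
    h = (h + (h >> 6)) & MASK32
    return h

def guessed_link_index(src: str, dst: str, number_of_links: int) -> int:
    # ASSUMPTION: key = low 16 bits of src, then low 16 bits of dst, big-endian.
    s = int(ipaddress.ip_address(src)) & 0xFFFF
    d = int(ipaddress.ip_address(dst)) & 0xFFFF
    key = s.to_bytes(2, "big") + d.to_bytes(2, "big")
    return super_fast_hash(key) % number_of_links

# Hypothetical addresses, for illustration only.
print(guessed_link_index("192.0.2.9", "198.51.100.10", 4))
```

The point of the sketch is only that, with a mixing hash, the low two bits of the addresses no longer predict the link on their own.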
*Why do I care?* Well, mostly just academic curiosity about the 'better mousetrap' that NetApp built. The imbalance we are seeing isn't causing any issues. It was just one of those curious things that we investigated to understand why it was happening. Also, it will be much easier for me to tweak the intercluster LIF IPs to get the desired traffic balancing in the short term than to wait for a maintenance window to tear down the data ifgrps and rebuild them with the 'Port' distribution function. (We will rebuild them eventually.)
Thanks again. I hope someone can shed a bit more light on the 'SuperFastHash' or whatever has replaced it in ONTAP 9.
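If the real selection function were known, picking LIF addresses would come down to a brute-force search like the sketch below. `group_by_link` and `xor_mod` are hypothetical helper names. The selector passed in here is the pre-7.3.2 XOR formula, used purely as a stand-in because the ONTAP 9 function is not known, and the addresses are made up.

```python
import ipaddress
from collections import defaultdict

def group_by_link(candidates, peer, number_of_links, pick):
    """Group candidate local addresses by the link index that pick() returns."""
    groups = defaultdict(list)
    for addr in candidates:
        groups[pick(addr, peer, number_of_links)].append(addr)
    return dict(groups)

def xor_mod(src, dst, n):
    # Stand-in selector (legacy formula); NOT known to be what ONTAP 9 uses.
    return (int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))) % n

# Hypothetical candidate LIF addresses and peer address.
candidates = [f"192.0.2.{host}" for host in range(8, 16)]
for link, addrs in sorted(group_by_link(candidates, "198.51.100.10", 4, xor_mod).items()):
    print(link, addrs)
```

Swapping `xor_mod` for a verified implementation of the actual hash would turn this into a usable address picker. Until then it only illustrates the approach.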
Best wishes Stephen
On Tue, 4 Feb 2025 at 20:06, Peter Tas Peter.Tas@neoria.be wrote:
Hi Stephen,
There is also a TR on SnapMirror best practices, containing a decent networking chapter starting on page 10. This might be worth the read for you as well. It does state that the “port” distribution function is preferred for an ifgrp hosting intercluster LIFs for SnapMirror relationships. https://www.netapp.com/media/17229-tr4015.pdf?v=127202175503P (this one is dated 2024, btw. 😉)
Multiple intercluster LIFs per node (up to a maximum of 8) are possible to optimize multipathing of network streams for SnapMirror. It would be worth considering a separate ifgrp with 2 NICs for that purpose, provided you do not need the bandwidth of 4 NICs for front-end networking. Reconfiguring an ifgrp containing 4 NICs to remove 2 of them is not a big deal, even online.
Just my 5 cents.
BR, Peter.
From: Stephen Stocke scstocke@gmail.com Date: Wednesday, 5 February 2025 at 00:32 To: Peter Tas Peter.Tas@neoria.be Cc: John Stoffel john@stoffel.org, toasters@teaparty.net toasters@teaparty.net Subject: Re: Igfrp IP Distribution Function