I upgraded our AFF8020 (2 nodes) yesterday from 9.1P10 to 9.3P21, as a first step towards 9.7P12.
Unfortunately, the two 10Gb interfaces (e0c, e0d) on the second controller won't connect after the upgrade. It looks like a layer 1 problem, with no link lights on either the NetApp or the switch side. Yet it seems unlikely that two physical paths would fail at the same time, and too coincidental that it happened (apparently) during the giveback. Reseating the fiber jumpers and the SFP optics hasn't helped.
Has anyone else seen anything like this?
I've engaged (third party) support, but so far, no joy.
I've tried shut/no shut on the switch side (Cisco 9K), and also tried the same sort of thing on the NetApp side (advanced mode: network port modify -node <nodename> -port <portname> -up-admin false, on both ports, then back to true).
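For reference, the NetApp-side bounce described above can be written out as a small Python sketch that only builds the command lines; it does not connect to a cluster. The node name "node2" is a placeholder, and `set -privilege advanced` is assumed to be how advanced mode was entered.

```python
def port_bounce_commands(node, ports):
    """Return ONTAP CLI lines that take every port down, then bring them back up."""
    # Assumption: advanced privilege is entered once at the start of the session.
    commands = ["set -privilege advanced"]
    # First pass sets -up-admin false on all ports, second pass sets it to true,
    # matching the "both ports, then back to true" order in the message.
    for state in ("false", "true"):
        for port in ports:
            commands.append(
                f"network port modify -node {node} -port {port} -up-admin {state}"
            )
    return commands


if __name__ == "__main__":
    # "node2" is a hypothetical node name used only for illustration.
    for line in port_bounce_commands("node2", ["e0c", "e0d"]):
        print(line)
```

Generating the lines first makes it easy to review the exact sequence before pasting anything into the cluster shell.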
Also, e0c and e0d were bundled into an if_group (a0a) on the NetApp side, and a VPC port channel on the Cisco side (identically on both nodes, so the config seems fine since it still works on node1). I tried deleting node2's a0a, but that failed because a LIF had its home port set to that a0a. So I migrated that LIF "permanently" to the working node, deleted node2's a0a, and then tried the -up-admin false/true trick again on the e0c and e0d ports, but still no joy.
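The order of operations matters here: the LIF has to be re-homed before the interface group can be deleted. A minimal Python sketch of that sequence follows; it only assembles command strings. The vserver and LIF names are hypothetical, and the new home port (node1's a0a) is an assumption, since the message does not say which port the LIF was moved to.

```python
def ifgrp_teardown_commands(vserver, lif, good_node, bad_node, ifgrp="a0a"):
    """Return ONTAP CLI lines that re-home a LIF, then delete the old ifgrp."""
    return [
        # Re-home the LIF so nothing references the ifgrp on the failing node.
        # Assumption: the new home port is the same-named ifgrp on the good node.
        f"network interface modify -vserver {vserver} -lif {lif} "
        f"-home-node {good_node} -home-port {ifgrp}",
        # Move the LIF to its new home port.
        f"network interface revert -vserver {vserver} -lif {lif}",
        # With no LIF homed on it, the ifgrp can be removed.
        f"network port ifgrp delete -node {bad_node} -ifgrp {ifgrp}",
    ]


if __name__ == "__main__":
    # "svm1" and "lif_data1" are hypothetical names used only for illustration.
    for line in ifgrp_teardown_commands("svm1", "lif_data1", "node1", "node2"):
        print(line)
```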
I noticed the MTU was set to 9216 on the Cisco side and 1500 on the NetApp side, but again, this is identical on both nodes, and is working on node1, so even though it looks suspicious, it's probably not the problem. I tried setting the NetApp side to 9000 (it didn't accept 9216), but that didn't help either.
I'm thinking my next step is to delete node1's a0a and remove the corresponding port channels on the Cisco side, then try swapping known-good fibers and optics onto the problem ports to determine which piece is the root cause.
What does the toasters brain trust say to all this?
Hi Brian,
I recall having the same issue a few years ago, and for me it was fixed by replacing the SFPs. Their firmware was no longer compatible with the newer ONTAP release and/or NIC driver, so the link was always down. Are you using official NetApp SFPs, or are you coding them on your own?
Best,
Alexander Griesser Head of Systems Operations
ANEXIA Internetdienstleistungs GmbH
E-Mail: AGriesser@anexia-it.com Web: http://www.anexia-it.com
Headquarters address, Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt Managing Director: Alexander Windbichler Company register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601
-----Original Message----- From: Brian Parent bparent@ucsd.edu Sent: Friday, 26 March 2021 02:23 To: toasters@teaparty.net Subject: NICs fail to connect after upgrade to OnTap 9.3P21
-- Brian Parent Information Technology Services Department ITS Computing Infrastructure Operations Group its-ci-ops-help@ucsd.edu (team email address for Service Now) UC San Diego (858) 534-6090
Thanks for the history, Alexander. These SFPs came from NetApp with the initial purchase, so hopefully I can depend on them continuing to work with newer OnTap releases.
Good news though, it's all working now. Though rebooting didn't help, I was able to halt the node, then use the SP to power cycle it, after which the e0c and e0d NICs connected.
Re:
From: Alexander Griesser AGriesser@anexia-it.com Date: Fri, 26 Mar 2021 05:05:01 +0000 Subject: AW: NICs fail to connect after upgrade to OnTap 9.3P21 To: Brian Parent bparent@ucsd.edu, "toasters@teaparty.net" toasters@teaparty.net