TO/GB (takeover/giveback) fixed this issue now, JFYI.

Best,

Alexander Griesser
Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: AGriesser@anexia-it.com
Web: http://www.anexia-it.com

Headquarters address (Klagenfurt): Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Commercial register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601

From: Alexander Griesser
Sent: Wednesday, June 17, 2020 17:15
To: 'Jeff Bryer' <bryer@sfu.ca>; toasters@teaparty.net
Subject: RE: CSM MismatchRemoteDevice

Bingo:

CLUSTER::*> net port show -fields remote-device-id -broadcast-domain Cluster
  (network port show)
node   port remote-device-id
------ ---- --------------------
Node1  e0a  sw01
Node1  e0b  sw02
Node2  e0a  sw01
Node2  e0b  sw02
Node3  e0a  sw01
Node3  e0b  sw02
Node4  e0a  -
node4  e0b  -
8 entries were displayed.

Still trying to get that fixed without TO/GB, though 😊

From: Toasters <toasters-bounces@teaparty.net> on behalf of Jeff Bryer
Sent: Wednesday, June 17, 2020 16:14
To: toasters@teaparty.net
Subject: Re: CSM MismatchRemoteDevice

If you want to try troubleshooting it before the Support portal is back up:

We just had a case for this issue (it was triggered for us by upgrading from 9.5 to 9.7).

"The message occurs when the Cluster Session Manager (CSM) establishes a connection between nodes over the cluster network interface, but the node's remote device IDs do not match."
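
To see which two remote device IDs a given event is complaining about, the EMS line can be picked apart with a regular expression. This is only a rough sketch: the pattern is modeled on the two csm.mismatchRemoteDevice lines quoted in this thread, not on any documented format, and `parse_mismatch` and `classify` are hypothetical helper names.

```python
import re

# Pattern modeled only on the csm.mismatchRemoteDevice lines quoted in this
# thread; other ONTAP releases may word the message differently (assumption).
EVENT_RE = re.compile(
    r"csm\.mismatchRemoteDevice: CSM connection between source LIF (?P<lif>\S+) "
    r"and destination address (?P<dest_addr>\S+) might not be optimal "
    r"for session (?P<session>\w+)\. "
    r"The source is currently connected to (?P<src_dev>\S+) remote device "
    r"and the destination is currently connected to (?P<dst_dev>\S+) remote device"
)


def parse_mismatch(line):
    """Return the named fields of a csm.mismatchRemoteDevice line, or None."""
    match = EVENT_RE.search(line)
    return match.groupdict() if match else None


def classify(event):
    """Hypothetical helper: label the two situations reported in the thread."""
    if "-" in (event["src_dev"], event["dst_dev"]):
        # One side has no remote device ID at all (shown as "-").
        return "remote device unknown"
    return "different remote devices"


sample = (
    "6/17/2020 06:28:18  node1  ERROR  csm.mismatchRemoteDevice: CSM connection "
    "between source LIF 1012 and destination address 169.254.221.115 might not be "
    "optimal for session 0b05995fad05cdb0. The source is currently connected to "
    "CLUSTER-sw02 remote device and the destination is currently connected to "
    "- remote device."
)
event = parse_mismatch(sample)
print(event["src_dev"], event["dst_dev"], classify(event))
```

A "-" on either side points at a node that has not learned its neighbor, while two different switch names just mean the two LIFs sit on different cluster switches.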

 

Make sure CDP is enabled:
node run -node * options cdpd.enable

Look for a blank remote-device-id:
net port show -role cluster -fields remote-device-id

We had to do a takeover and giveback of the node with the missing ID to clear the problem.

 

As mentioned in this KB article:

https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/How_to_troubleshoot_the_EMS__%22csm.mismatchRemoteDevice%22_message_in_the_event_log
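
On a larger cluster, the "look for a blank remote-device-id" step above can be done by script once the command output has been saved. A minimal sketch, assuming the three-column layout pasted elsewhere in this thread (including rows that the CLI wraps onto two lines); `ports_missing_remote_device` is a made-up name, not an ONTAP tool.

```python
def ports_missing_remote_device(cli_output):
    """Return (node, port) pairs whose remote-device-id column is '-'."""
    missing = []
    pending_node = None  # node name from a row the CLI wrapped onto two lines
    for line in cli_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        first = fields[0]
        # Skip the prompt, the expanded-command echo, header, separator, footer.
        if ("::" in first or first.startswith(("(", "---"))
                or fields == ["node", "port", "remote-device-id"]
                or fields[-1] == "displayed."):
            continue
        if len(fields) == 1:
            # Wrapped row: the node name sits alone on its own line.
            pending_node = first
            continue
        if len(fields) == 2 and pending_node is not None:
            node, port, device = pending_node, fields[0], fields[1]
        elif len(fields) == 3:
            node, port, device = fields
        else:
            continue
        pending_node = None
        if device == "-":
            missing.append((node, port))
    return missing


sample = """\
CLUSTER::*> net port show -fields remote-device-id -broadcast-domain Cluster
  (network port show)
node   port remote-device-id
------ ---- --------------------
Node1 e0a  sw01
Node1 e0b  sw02
Node4
       e0a  -
node4
       e0b  -
8 entries were displayed.
"""
print(ports_missing_remote_device(sample))
```

Any node that shows up in the result would be the candidate for the takeover/giveback described above.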

 

 


From: Toasters <toasters-bounces@teaparty.net> on behalf of Heino Walther <hw@beardmann.dk>
Sent: Wednesday, June 17, 2020 5:22 AM
To: Alexander Griesser; Jason Gorrie
Cc: toasters@teaparty.net
Subject: Re: CSM MismatchRemoteDevice

 

And even without any switches configured:

DK01NETAPP01::> switch show
  (storage switch show)
This table is currently empty.

Go figure...

/Heino

On 17.06.2020 at 14:20, "Alexander Griesser" <AGriesser@anexia-it.com> wrote:

    In your case, it is able to read the switch name - in my case, it's not (name is just "-").
    So maybe this is also the reason for the message, because CDP was disabled in the first place.

    I think what I'm looking for here is a way to "restart" this CSM daemon or to restart the validation/detection process, if that's possible at all.

    -----Original Message-----
    From: Heino Walther <hw@beardmann.dk>
    Sent: Wednesday, June 17, 2020 14:18
    To: Alexander Griesser <AGriesser@anexia-it.com>; Jason Gorrie <jbgorrie@uwaterloo.ca>
    Cc: toasters@teaparty.net
    Subject: Re: CSM MismatchRemoteDevice

    OK, mine looks like this... and as you can see two nodes are connected to the same switches...

    So I get this error...
    6/17/2020 14:10:40  DK01NETAPP01-03  ERROR         csm.mismatchRemoteDevice: CSM connection between source LIF 1001 and destination address 169.254.48.70 might not be optimal for session 0a05a7bb5da3abb3. The source is currently connected to nasw02 remote device and the destination is currently connected to nasw01 remote device.

    (I get other such messages... so I assumed it was because of the cabling)

    DK01NETAPP01::> network device-discovery show -platform CN1610
    Node/       Local  Discovered
    Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
    ----------- ------ ------------------------- ----------------  ----------------
    DK01NETAPP01-04/cdp
                e3a    nasw01                    0/2               CN1610
                e3c    nasw02                    0/2               CN1610
    DK01NETAPP01-01/cdp
                e0a    nasw01                    0/3               CN1610
                e0c    nasw01                    0/4               CN1610
    DK01NETAPP01-03/cdp
                e3a    nasw01                    0/1               CN1610
                e3c    nasw02                    0/1               CN1610
    DK01NETAPP01-02/cdp
                e0a    nasw02                    0/3               CN1610
                e0c    nasw02                    0/4               CN1610
    8 entries were displayed.
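
The cabling question (does every node reach both cluster switches?) can be checked mechanically from output like the above. This is a sketch under assumptions: it expects the `network device-discovery show` layout pasted in this thread, already filtered to the cluster switches, and `switches_per_node` and `single_switch_nodes` are hypothetical helper names. A node with only one port listed would be flagged as well.

```python
from collections import defaultdict


def switches_per_node(cli_output):
    """Map each node to the set of switches seen on its listed ports."""
    seen = defaultdict(set)
    node = None
    for line in cli_output.splitlines():
        fields = line.split()
        if not fields or fields[-1] == "displayed.":
            continue  # blank line or the "N entries were displayed." footer
        joined = "".join(fields)
        if joined.endswith(("/cdp", "/lldp")):
            # Node header such as "DK01NETAPP01-01/cdp" (or "Node1    /cdp").
            node = joined.rsplit("/", 1)[0]
        elif node is not None and len(fields) == 4:
            _port, device, _interface, _platform = fields
            seen[node].add(device)
    return dict(seen)


def single_switch_nodes(cli_output):
    """Return nodes whose listed ports all reach the same switch."""
    per_node = switches_per_node(cli_output)
    return sorted(n for n, devs in per_node.items() if len(devs) == 1)


sample = """\
DK01NETAPP01-04/cdp
            e3a    nasw01                    0/2               CN1610
            e3c    nasw02                    0/2               CN1610
DK01NETAPP01-01/cdp
            e0a    nasw01                    0/3               CN1610
            e0c    nasw01                    0/4               CN1610
DK01NETAPP01-03/cdp
            e3a    nasw01                    0/1               CN1610
            e3c    nasw02                    0/1               CN1610
DK01NETAPP01-02/cdp
            e0a    nasw02                    0/3               CN1610
            e0c    nasw02                    0/4               CN1610
8 entries were displayed.
"""
print(single_switch_nodes(sample))
```

For the listing above this flags DK01NETAPP01-01 and DK01NETAPP01-02, the two nodes with both cluster ports on one switch.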


    Maybe it's the switch config? Sadly, I don't have any remote management access to my switches...

    I'm on ONTAP 9.7P2, so a pretty new version...

    /Heino



    On 17.06.2020 at 14:13, "Alexander Griesser" <AGriesser@anexia-it.com> wrote:

        I would say that the cabling is correct:

        CLUSTER::> network device-discovery show -platform CN1610
        Node/       Local  Discovered
        Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
        ----------- ------ ------------------------- ----------------  ----------------
        Node1/cdp
                    e0a    sw01                      0/1               CN1610
                    e0b    sw02                      0/1               CN1610
        node2/cdp
                    e0a    sw01                      0/2               CN1610
                    e0b    sw02                      0/2               CN1610
        node3/cdp
                    e0a    sw01                      0/3               CN1610
                    e0b    sw02                      0/3               CN1610
        node4/cdp
                    e0a    sw01                      0/4               CN1610
                    e0b    sw02                      0/4               CN1610
        8 entries were displayed.

        -----Original Message-----
        From: Heino Walther <hw@beardmann.dk>
        Sent: Wednesday, June 17, 2020 14:11
        To: Alexander Griesser <AGriesser@anexia-it.com>; Jason Gorrie <jbgorrie@uwaterloo.ca>
        Cc: toasters@teaparty.net
        Subject: Re: CSM MismatchRemoteDevice

        Hi there

        I have the same messages in my 4-node cluster.
        I believe it's because the cluster cabling isn't correct (in my case, anyway). We have two interconnected cluster switches, and I think the problem is that we hooked up two cluster ports of the same node to the same switch.
        We are in the process of migrating to a new cluster pair, so we don't want to investigate it further because the clustering works. But I would suggest checking that each node has one cluster cable going to each of your cluster switches...

        /Heino

        On 17.06.2020 at 14:01, "Toasters on behalf of Alexander Griesser" <toasters-bounces@teaparty.net on behalf of AGriesser@anexia-it.com> wrote:

            All cluster ports up here:

            CLUSTER::> net int show -role cluster
              (network interface show)
                        Logical    Status     Network            Current       Current Is
            Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
            ----------- ---------- ---------- ------------------ ------------- ------- ----
            Cluster
                        Node1_clus1 up/up    169.254.101.103/16 node1        e0a     true
                        Node1_clus2 up/up    169.254.210.167/16 node1        e0b     true
                        Node2_clus1 up/up    169.254.44.143/16  node2        e0a     true
                        Node2_clus2 up/up    169.254.161.155/16 node2        e0b     true
                        Node3_clus1 up/up    169.254.15.190/16  node3        e0a     true
                        Node3_clus2 up/up    169.254.223.100/16 node3        e0b     true
                        Node4_clus1 up/up    169.254.183.224/16 node4        e0a     true
                        node4_clus2 up/up    169.254.221.115/16 node4        e0b     true
            8 entries were displayed.


            CLUSTER::> net port show -broadcast-domain Cluster
              (network port show)

            Node: node1
                                                              Speed(Mbps) Health
            Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
            --------- ------------ ---------------- ---- ---- ----------- --------
            e0a       Cluster      Cluster          up   9000  auto/10000 healthy
            e0b       Cluster      Cluster          up   9000  auto/10000 healthy

            Node: node2
                                                              Speed(Mbps) Health
            Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
            --------- ------------ ---------------- ---- ---- ----------- --------
            e0a       Cluster      Cluster          up   9000  auto/10000 healthy
            e0b       Cluster      Cluster          up   9000  auto/10000 healthy

            Node: node3
                                                              Speed(Mbps) Health
            Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
            --------- ------------ ---------------- ---- ---- ----------- --------
            e0a       Cluster      Cluster          up   9000  auto/10000 healthy
            e0b       Cluster      Cluster          up   9000  auto/10000 healthy

            Node: node4
                                                              Speed(Mbps) Health
            Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
            --------- ------------ ---------------- ---- ---- ----------- --------
            e0a       Cluster      Cluster          up   9000  auto/10000 healthy
            e0b       Cluster      Cluster          up   9000  auto/10000 healthy
            8 entries were displayed.

            -----Original Message-----
            From: Jason Gorrie <jbgorrie@uwaterloo.ca>
            Sent: Wednesday, June 17, 2020 13:54
            To: Alexander Griesser <AGriesser@anexia-it.com>
            Subject: Re: CSM MismatchRemoteDevice

            Hi,

            I have pages of those on a newly expanded cluster (was 6 nodes, now 10 for a tech refresh).
            CDP is enabled all over, and “net device-discovery show” shows good data.
            Currently 9.5P12.

            I do have one of the ports down (bad cable, RMA taking 4+ days) so perhaps that is why?
            —
            Jason

            > On Jun 17, 2020, at 00:33, Alexander Griesser <AGriesser@anexia-it.com> wrote:
            >
            > Hey toasters,
            > 
            > Has anyone ever experienced such messages in the event log? Mine is getting flooded here.
            > It’s a 4-Node cluster (9.6P1) and someone forgot to enable CDP on two of the nodes, which (according to the syslog translator) might seem to be the reason for this issue.
            > 
            > CLUSTER::>  event log show
            > Time                Node             Severity      Event
            > ------------------- ---------------- ------------- ---------------------------
            > 6/17/2020 06:28:18  node1           ERROR         csm.mismatchRemoteDevice: CSM connection between source LIF 1012 and destination address 169.254.221.115 might not be optimal for session 0b05995fad05cdb0. The source is currently connected to CLUSTER-sw02 remote device and the destination is currently connected to - remote device.
            > 
            > https://mysupport.netapp.com/site/bugs-online/syslog-translator/details?eventId=5e85e63097855c5d2ee02338
            > 
            > The corrective action in this link just says:
            > „Ensure that the Cisco Discover Protocol (CDP) is running on the nodes and switches. In addition, ensure that the cluster ports are up and the cluster LIFs are configured and hosted according to the suggested cluster configuration."
            > 
            > I’ve now enabled CDP on all nodes, the cluster ports are up, and all the LIFs are where they should be.
            > Do I need to restart this CSM service or something like that in order to retry CDP resolution and get rid of this message?
            > 
            > Thanks,
            > 
            > Alexander Griesser


            _______________________________________________
            Toasters mailing list
            Toasters@teaparty.net
            https://www.teaparty.net/mailman/listinfo/toasters



