Oh, well that's different entirely. :)
The cluster may be out of quorum, which is causing this issue.
Did you capture the aforementioned commands?
"RPC timeout" here means that the API call is being forwarded across the cluster to other nodes via RPC. Since those nodes are down, the commands are failing.
Keep in mind that having two nodes in a cluster powered off is not a normal scenario. If you are doing maintenance, you would want to set those nodes to "eligibility false" so they don't participate in the cluster during maintenance. You also want to make sure epsilon is not on either of those nodes, and move it if it is.
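A rough sketch of that prep, run from a healthy node before the HA pair is powered off. This is from memory of the 8.3 advanced-privilege CLI, so check it against the documentation for your release; <healthy-node> is a placeholder, and I am assuming for illustration that epsilon currently sits on na101node-4a:

```
cdot::> set -privilege advanced
cdot::*> cluster show
cdot::*> cluster modify -node na101node-4a -epsilon false
cdot::*> cluster modify -node <healthy-node> -epsilon true
cdot::*> cluster modify -node na101node-4a -eligibility false
cdot::*> cluster modify -node na101node-4b -eligibility false
```

In advanced mode, cluster show includes an Epsilon column, so you can skip the two epsilon commands if neither node in the pair holds it. Set eligibility back to true once the nodes are up again.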
-----Original Message-----
From: vladimir.zhigulin@gmail.com [mailto:vladimir.zhigulin@gmail.com] On Behalf Of Momonth
Sent: Wednesday, March 30, 2016 10:27 AM
To: Parisi, Justin
Cc: NGC-tmacmd-gmail.com; toasters@teaparty.net
Subject: Re: NetApp SDK for cDOT: any API call fails if a cluster node is not available
A correction to my initial state:
1. I have the whole HA pair (i.e. both nodes) powered off.
On Wed, Mar 30, 2016 at 3:30 PM, Parisi, Justin <Justin.Parisi@netapp.com> wrote:
Try narrowing your API call to a specific node. It’s possible it’s trying to query the node that’s down and causing the timeout.
I initially noticed this behavior with the "diagnosis-alert-get-iter" call, which doesn't require a node parameter. But even a simple call like "version" fails.
The API might not be smart enough to know to ignore a node that is not up.
The reality proves otherwise =) I'm on 8.3.1.
Also be sure to check that it did fail over properly, as tmac mentioned, and that the cluster is in quorum. (set diag; cluster show; cluster ring show)
Since both nodes are down, no failover actually took place.
Here is what I get:
cdot::*> cluster ring show
.. <output of healthy nodes here> ..

Warning: Unable to list entries on node na101node-4a. RPC: Port mapper failure - RPC: Timed out
         Unable to list entries on node na101node-4b. RPC: Port mapper failure - RPC: Timed out
30 entries were displayed.