The ssh timeout is set to 600 and all the others are set to 60.  We don't use anything other than ssh to get into them anyway.  Those are the timeouts that govern session behavior.  Both systems are on the same subnet and we haven't had any problems with timeouts or latency with them.

For the vol copy, this is the end of the data from the ssh command:

VOLCOPY: Starting on volume 1.
Volume 'restore_test' is now restricted.
18:03:32 MDT : vol copy restore 0 : begun, 11583164 MB to be copied.
[root@cutthroat etc]# date
Wed Oct 30 18:13:36 MDT 2013

You can see it ran for ~10 minutes, exited with no errors, and nothing in the logs.  When I run it again, it behaves the same, but I noticed the size to be copied gets a little lower each time.  A few thousand more of these and it will be done :-)

And for Peter, we don't support any Windows systems in our lab, so we are pretty PowerShell averse.

I'll open a call with NetApp and figure out how to trace things better.  I am going to try an interactive shell with both just to see how they work out before I do that though.

Thanks,

Jeff


On Wed, Oct 30, 2013 at 7:06 PM, Jordan Slingerland <Jordan.Slingerland@independenthealth.com> wrote:

I thought I should add that I use NDMP copy almost every day, and only via ssh from an admin host as you describe.  Typically I do it interactively, though mainly just because it is easier to deal with special characters and spaces without ending up in nested single quote, double quote, escape character hell.

 

One last thought: I certainly wouldn’t jump to this conclusion, but I once had an issue with ssh sessions being timed out irregularly when “idle”. It turned out that a firewall had hit its maximum number of ARP entries (with some help from a rogue device doing very wide ping sweeps) and was apparently killing off the connections that it deemed the most idle.

 

--JMS

 

From: tmac [mailto:tmacmd@gmail.com]
Sent: Wednesday, October 30, 2013 8:47 PM
To: Jordan Slingerland
Cc: Jeff Cleverley; <Toasters@teaparty.net>
Subject: Re: ndmpcopy and vol copy via ssh?

 

You may wish to explore this ssh option:

 

ServerAliveInterval

 

You can set it in your .ssh config file. Most people set the value to 60 so that an alive message is sent every 60 seconds.

 

We ended up having to do this across our WAN to keep clients connected overnight.
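A minimal sketch of the same keepalive setting applied per command rather than in the config file. `ServerAliveInterval` and `ServerAliveCountMax` are standard OpenSSH client options; the wrapper name `ssh_keepalive` and the `SSH` override variable are hypothetical, and `toasterA` is just the placeholder filer name used elsewhere in this thread. Whether keepalives help in this particular case is an open question, since the hosts here share a subnet.

```shell
# Sketch only: wrapper name and SSH override are illustrative assumptions.
SSH=${SSH:-ssh}

# Run a command over ssh with client-side keepalives enabled.
# ServerAliveInterval=60: send a keepalive after 60 seconds with no data
# received from the server.
# ServerAliveCountMax=3: disconnect after three unanswered keepalives.
ssh_keepalive() {
    host=$1; shift
    $SSH -o ServerAliveInterval=60 -o ServerAliveCountMax=3 "$host" "$@"
}

# Roughly equivalent entry in the ssh client config file:
#   Host toasterA
#       ServerAliveInterval 60
```

Usage would look like `ssh_keepalive toasterA vol copy status`.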


--tmac

 

Tim McCarthy

Principal Consultant

 

          

 


 

On Wed, Oct 30, 2013 at 8:41 PM, Jordan Slingerland <Jordan.Slingerland@independenthealth.com> wrote:

Hmm, is it a matter of using ssh interactively, vs non-interactively?

i.e.:

ssh toasterA ndmpcopy ….

vs.

ssh toasterA

>ndmpcopy ……

 

 

Also, if you have not already, look at these options:

autologout.console.timeout  

autologout.telnet.timeout   

ssh.idle.timeout

 

and see if you are hitting one of them.    
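One way to check all three from the admin host in a single pass is sketched below. It assumes the 7-Mode behavior that `options <name>` with no value prints the current setting; the function name `show_timeouts` and the `SSH` override variable are hypothetical, and `toasterA` is a placeholder filer name.

```shell
# Sketch only: function name and SSH override are illustrative assumptions.
SSH=${SSH:-ssh}

# Print the current value of each session timeout option on one filer.
show_timeouts() {
    filer=$1
    for opt in autologout.console.timeout \
               autologout.telnet.timeout \
               ssh.idle.timeout; do
        $SSH "$filer" options "$opt"
    done
}
```

Running `show_timeouts toasterA` against both the source and the destination makes it easy to compare them. The units may differ between options, so it is worth checking each value against the documentation for your release.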

 

 

 

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Cleverley
Sent: Wednesday, October 30, 2013 8:23 PM
To: <Toasters@teaparty.net>
Subject: ndmpcopy and vol copy via ssh?

 

Greetings,

Is it acceptable to use ssh from an admin host to run ndmpcopy and vol copy commands?

It seems to work initially, but both seem to die after varying periods of time with no useful explanations in the log files.  I found a KB about someone connecting to the filer and pressing Ctrl-C, thinking the console is idle, but that doesn't apply here unless an ssh command from a cron job or something has the same effect.  Running ndmpd status and vol copy status verifies that there are no copies running.

All filers are running 8.1.2P4; the source is a 6080 and the destination is a 6290, with 10G networking on both.

 

I'm doing restore time tests of entire volumes using various methods, and these two are part of the list.  The vol copy ran for ~10 minutes out of the 466 estimated, then stopped.  No errors on the command line.  It does run, so all the hosts.equiv entries and permissions are good.  I'm sure it did not do the 11TB in 10 minutes :-)

The ndmpcopy command ran for ~80 minutes and silently quit.  It restored 217G, almost all of it in the smaller directories.  The larger directories didn't show up and there are no abort messages in the logs.  They just quit logging.

For the ndmpcopy I had a script run one ndmpcopy for each directory (9 total).  It looked like several were running at the same time, which seems OK.  I was hoping to use parallel threads to speed things up.
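A rough sketch of what such a per-directory script might look like, with each job's output and ssh exit status captured so that a silent exit at least leaves a trace on the admin host. This is not the actual script. The filer names, the source volume `src_vol`, the log directory, and the function name `run_copies` are all hypothetical, and the ndmpcopy arguments are kept to a bare source and destination path. Directory names containing spaces or colons would need different bookkeeping.

```shell
#!/bin/sh
# Sketch only: filer names, volume paths and log location are assumptions.
SSH=${SSH:-ssh}
SRC_FILER=${SRC_FILER:-toasterA}
DST_FILER=${DST_FILER:-toasterB}
LOGDIR=${LOGDIR:-/tmp/ndmpcopy-logs}

# Start one ndmpcopy per directory in the background, then wait for each
# and report the exit status that ssh returned for it.
run_copies() {
    mkdir -p "$LOGDIR" || return 1
    pids=""
    for dir in "$@"; do
        # Keep stdout and stderr of every job in its own log file.
        $SSH "$SRC_FILER" ndmpcopy "/vol/src_vol/$dir" \
            "$DST_FILER:/vol/restore_test/$dir" > "$LOGDIR/$dir.log" 2>&1 &
        pids="$pids $!:$dir"
    done
    failed=0
    for entry in $pids; do
        pid=${entry%%:*}
        dir=${entry#*:}
        if wait "$pid"; then
            echo "$dir: ssh exited 0"
        else
            echo "$dir: ssh exited non-zero, see $LOGDIR/$dir.log"
            failed=1
        fi
    done
    return $failed
}
```

A call would be `run_copies dir1 dir2 dir3`. The status reported is what the local ssh client returned, which may or may not reflect the result of ndmpcopy on the filer, so the per-job logs are still the first place to look.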

Any ideas on the silent failures?

Thanks,

Jeff


--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters

 



