I have an NFS client (CentOS 5) that is mounting a volume from a filer across a WAN with about 20ms of latency (40ms round trip). We have an application that makes a "backup" copy of a directory of data before modifying it. The directory can contain a lot of tiny files (for example 600 files, taking up less than 12MB). I'm using NFS v3 over TCP (also tried v3 over UDP).
The NFS copy (using cp -a) is glacial. The Wireshark packet captures show the client and the server in a deadly lock-step request/reply exchange (typically GETATTR and SETATTR). As far as I can tell, the client and server are not allowing any windowing/pipelining of requests. Is there a setting I am missing to enable more outstanding requests (on the client or the server)? All the tuning advice I've found seems to be about fixing issues with large bulk transfers. Is the NFS protocol inherently limited in this way? I expected this kind of behavior from SMB 1, but not from NFS.
Thanks.
arnold
Generally, the more files, the slower things are going to be, but 600 files doesn't really seem like that many.
Are there any long delays, in excess of several seconds, in your packet capture? How long is the transfer actually taking? How does it compare to a transfer of the same data as a single tar file?
Mount settings
Flags: rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys
AFAIK these are the defaults.
I have not seen any delays on the order of seconds. What I see is a steady stream of one request, one response. What I was hoping to see was multiple requests followed by multiple responses (to deal with the latency).
arnold
Do you require NFSv3?
Why not try NFSv2 - that will bypass the multitude of *ATTR calls going on a bit and may be better.
Only testing will tell, of course.
I just tried NFSv2; similar results. Same deadly embrace of request/reply, which is death on a link with latency.
For the current test case, it took over 5 minutes to copy 11MB of data in 600 files.
For reference
NFS (WAN 20ms) to NFS (WAN 20ms): 5 minutes
NFS (WAN 20ms) to local disk: 50 seconds
Local disk to NFS (WAN 20ms): 3 minutes 50 seconds
Local disk to local disk: 0.285 seconds
NFS (LAN 1ms) to NFS (LAN 1ms): 5 seconds
[I see that Peter also added data. This is a bit of throwback for me, I've been around long enough to deal with the fun Usenet spool directories and joys of lots of small files]
The irony here is that all I want is to make a copy of a directory, which would be a trivial operation on the filer itself; I just need another reference to the directory, and the filer can make the copy when data is modified. I'm wondering if I can do this with the ONTAP API (I don't know the APIs yet). The other possible workaround is using a host that is on the same LAN as the filer to do the copies. I just can't believe that there isn't a simple solution to this.
Thanks everyone for the ideas and information.
arnold
Try NFSv4. Compound ops will help with high latency links.
Just a curiosity... what if you use rsync?
NFS1 <--> Host1 <--WAN--> Host2 <--> NFS2
where you rsync/scp from the NFS1 mount on Host1 to Host2, either to local disk or to an NFS mount there, e.g.:
rsync -avHP /nfs1/mydir/ host2:/nfs2/mydir/
or
scp -r /nfs1/mydir/ host2:/nfs2/mydir/
again...just curious.
I do this because the WAN between my buildings is so bad that my host hangs; sometimes it comes back, but most of the time it requires a reset.
--tmac
1. If this is NetApp to NetApp, you may try ndmpcopy.
2. If you need something more vendor-agnostic, you may want to tar the data at the source, copy it as a single file to the remote site, and untar it there. You can copy every subdirectory separately in parallel, which will also speed things up. You may also want to tune the TCP window to overcome some of the latency issues; you'd have to use something like rsync over HPN-SSH to leverage it.
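The pack/ship/unpack idea can be sketched as a small shell function. The function name and paths below are illustrative; the point is that one sequential archive transfer replaces hundreds of per-file round trips:

```shell
# pack_move_unpack SRCDIR ARCHIVE DESTDIR
# Pack SRCDIR into a single archive, then unpack it under DESTDIR.
# In real use the archive would cross the WAN between the two tar
# steps (cp to the NFS mount, scp, rsync, ...); one large sequential
# transfer avoids a GETATTR/SETATTR round trip per small file.
pack_move_unpack() {
    srcdir=$1; archive=$2; destdir=$3
    tar czf "$archive" -C "$(dirname "$srcdir")" "$(basename "$srcdir")"
    # ... ship "$archive" across the WAN here ...
    mkdir -p "$destdir"
    tar xzf "$archive" -C "$destdir"
}
```

Splitting the tree into a few archives and shipping them in parallel, as suggested above, widens the effective window further.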
An old-school Unix trick is to use tar piped to tar in another directory,
e.g.
(cd /source/dir && tar cf - .) | (cd /dest/dir && tar xf -)
This page gives a good overview of tar, scp and rsync for doing what you are talking about. http://www.crucialp.com/resources/tutorials/server-administration/how-to-cop...
Arnold,
Just out of curiosity, have you tried a different transfer method like rsync to see what the results are? Do you have any dead symbolic links in the directory? We used to run into issues with cp because it tried to resolve the link even though it wasn't going to copy it.
Jeff
I would also use rsync if performance was at a premium. You can have rsync communicate with an rsync daemon on the remote side and even compress the data in transit.
This also could be an issue of mount options and the TCP window. Large TCP windows are especially helpful when dealing with a slow WAN connection, and if attribute caching is disabled on the NFS mount that will hurt you as well.
Try mounting with the "noatime" option, as this prevents at least one request per file access. And "nodiratime", if you have it.
NFS attributes can be cached, though it is not clear if this will help. If your copy only touches each file once, the attribute cache won't help. But if your application is reading and writing to these files too, attribute caching may help:
acregmin=n The minimum time in seconds that attributes of a regular file should be cached before requesting fresh information from a server. The default is 3 seconds.
acregmax=n The maximum time in seconds that attributes of a regular file can be cached before requesting fresh information from a server. The default is 60 seconds.
acdirmin=n The minimum time in seconds that attributes of a directory should be cached before requesting fresh information from a server. The default is 30 seconds.
acdirmax=n The maximum time in seconds that attributes of a directory can be cached before requesting fresh information from a server. The default is 60 seconds.
actimeo=n Using actimeo sets all of acregmin, acregmax, acdirmin, and acdirmax to the same value. There is no default value.
This option also may be useful:
nocto Suppress the retrieval of new attributes when creating a file.
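As a sketch only, the options above might be combined into a single /etc/fstab line like the following. The server name, export path, and cache lifetime are illustrative, and note that nocto relaxes close-to-open cache coherence, so it is only safe when no other client is writing the same files:

```
# illustrative fstab entry: cache attributes for 60s, skip atime
# updates, and suppress the close-to-open attribute refetch
filer:/vol/data  /mnt/data  nfs  rw,vers=3,proto=tcp,hard,noatime,actimeo=60,nocto  0 0
```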
Tom
That's how your application (cp -a) works: there is no chance of multiple outstanding requests when none are issued. For read/write the system at least has a chance to do read-ahead or write-behind buffering, but for scanning a directory tree sequentially no magic is done.
You can change your procedure to first generate a list of files (using ls -f or possibly find, avoid stat() calls). Split that list into parts and work on each part in parallel, for example with cpio or pax, 'tar -T' should work too.
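That list-then-parallelize procedure might look something like the following sketch. The function name and the default of 8 jobs are illustrative, and it assumes GNU xargs and cp as found on a typical CentOS box:

```shell
# parallel_copy SRC DST [JOBS] -- copy each entry of SRC into DST with
# up to JOBS cp processes at once, so several NFS requests are in
# flight instead of the one-request/one-reply lock step.
parallel_copy() {
    src=$1; dst=$2; jobs=${3:-8}
    mkdir -p "$dst"
    # ls -f returns entries unsorted and without stat() calls;
    # filter out the "." and ".." entries it includes.
    ls -f "$src" | grep -v '^\.\{1,2\}$' | \
        xargs -I{} -P "$jobs" cp -a "$src/{}" "$dst/"
}
```

For deep trees you would feed a find-generated list to cpio, pax, or 'tar -T' chunks the same way; the parallelism is what hides the 40ms round trip.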
Linux can issue parallel requests to the NFS server, there is a limit of 16 per mount or per server depending on kernel version, and that limit is tunable.
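If that slot limit turns out to be the bottleneck, the knob is the sunrpc slot table. The value 128 below is illustrative, and on kernels as old as CentOS 5's the setting generally has to be in place before the filesystem is mounted:

```
# /etc/sysctl.conf -- allow more concurrent in-flight RPCs per mount
sunrpc.tcp_slot_table_entries = 128
sunrpc.udp_slot_table_entries = 128
```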
Greetings,
Michael van Elst
Thanks everyone for the ideas. I should have listed some of the other things we already tried. We had played around with generating the list with a find and feeding a parallel copy, and got some speedup that way, but I was hoping I was just missing something obvious that was keeping cp/tar/cpio/rsync from issuing the next write before the previous one completed.
The "nocto" option is one I haven't tried, but it looks interesting. NFSv4 is not an immediate option (but could be in the long run). Another option is to write a custom copy program that does parallel writes, but we need to understand more first. Generating the list is surprisingly slow (slow enough to exceed my time budget); I would have expected READDIR/READDIRPLUS to be a little smarter, but that was not apparent in the tests I have run so far. If there is a way to make the filer do the copy itself, that would be fine as well.
The client is making a copy of a directory on the same filer. The filer just happens to be remote.
Client <---WAN---> Filer
So if there was a directory called Filer:/dir/a, the Client wants to make Filer:/dir/a-copy before the contents of /dir/a get modified.
Thanks again.
arnold
I am not sure whether this is applicable to your problem, but we had very good experiences with the bbcp (BaBar copy: http://www.slac.stanford.edu/~abh/bbcp/ , http://www.nics.tennessee.edu/computing-resources/data-transfer/bbcp ) program when we migrated our Exchange mailboxes over a WAN link.
"The BBCP utility is capable of breaking up your transfer into multiple simultaneously transferring streams, thereby transferring data faster than single-streaming utilities"
It can work from a list but can also recursively transfer complete directory trees.
Christoph
The client is making a copy of a directory on the same filer. The filer just happens to be remote.
If the directory is on the same filer, use ndmpcopy. Or better yet... why are you making this copy at all? It sounds like what you are trying to accomplish with /dir/a-copy could be accomplished with a snapshot.
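On a 7-Mode filer that could be as simple as the following (volume and path names are illustrative, and NDMP must be enabled with "ndmpd on"); because ndmpcopy runs filer-side, the data never crosses the WAN at all:

```
filer> ndmpcopy /vol/vol1/dir/a /vol/vol1/dir/a-copy
```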
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Arnold de Leon Sent: Friday, October 18, 2013 2:03 AM To: toasters Subject: Re: Slow copy of a directory full of files via an NFS client across a WAN
Thanks everyone for the ideas. I should have listed some of the other things we already tried. We had played around with generating the list with a find and feeding a parallel copy and got some speed up that way but I was hoping that I was just missing something obvious that was keeping cp/tar/cpio/rsync issuing the next write before the previous one.
The "nocto" option is one I haven't tried but looks interesting. NFSv4 is not an immediate option (but could be in the long run). Another option is write a custom copy program that does parallel writes but we need to understand more. Generating the list is being surprisingly slow (slow enough to exceed my time budget). I would have expected READDIR/READDIR+ to be a little smarter but this was not being apparent with the tests I have run so far. If there is way to make the filer do the copy itself that would be ok as well.
The client is making a copy of a directory on the same filer. The filer just happens to be remote.
Client <---WAN---> Filer
So if there there was directory called Filer:/dir/a the Client wants to make Filer:/dir/a-copy before the contents of "/dir/a" get modified.
Thanks again.
arnold
On Thu, Oct 17, 2013 at 10:24 PM, Michael van Elst <mlelstv@serpens.demailto:mlelstv@serpens.de> wrote: On Thu, Oct 17, 2013 at 12:56:43PM -0700, Arnold de Leon wrote:
I have not seen any delays that are on the order of seconds. What I see is a steady stream of 1 requests 1 response. What I was hoping to see is a multiple requests followed by multiple responses (to deal w/ the latency).
That's how your application (cp -a) works: there is no chance to have multiple outstanding requests when none are issued. For read/write the system at least has a chance to do read-ahead or write-behind buffering, but for scanning a directory tree sequentially no magic is done.
You can change your procedure to first generate a list of files (using ls -f or possibly find; avoid stat() calls), split that list into parts, and work on each part in parallel, for example with cpio or pax; 'tar -T' should work too.
Linux can issue parallel requests to the NFS server; there is a limit of 16 per mount or per server depending on kernel version, and that limit is tunable.
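On kernels of the CentOS 5 era, the limit Michael mentions is the sunrpc slot table. A sketch of raising it (the value 128 is illustrative; this is a config fragment, requires root, and must take effect before the mount is established):

```shell
# Raise the number of in-flight RPC requests per TCP transport
# (historical default: 16). Takes effect for mounts made afterwards.
echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries

# To make it persistent on CentOS 5, add to /etc/modprobe.conf:
#   options sunrpc tcp_slot_table_entries=128

# Remount so the client builds its slot table at the new size:
umount /mnt/filer
mount /mnt/filer
```

Note that more slots only help if the workload actually has concurrent requests to issue; a single sequential cp -a still won't fill them.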
Greetings, -- Michael van Elst Internet: mlelstv@serpens.de "A potential Snark may lurk in every tree."
A snapshot might do it. Is there a way to create a snapshot on demand for just one directory? A writeable snapshot clone would definitely do it.
arnold
On Fri, Oct 18, 2013 at 5:53 AM, Jordan Slingerland <Jordan.Slingerland@independenthealth.com> wrote:
No, you can only snapshot an entire volume. The benefit is that it is nearly instant (creating a snapshot only copies the root inode, which is 4KB).
There are then, obviously, change-rate considerations if you are using snapshots. These may be negligible, or they may be significant to prohibitive.
You can create or delete a snapshot of the entire volume on the fly with `snap create ${volume} ${snapshotname}`.
A writable snapshot is also an option if you have a flexclone license: same idea, except the flexclone copy will be writable. The flexclone only takes up 4KB of additional space until you begin to write to it. The flexclone parent volume will continue to grow commensurate with your change rate and snapshot schedule as long as the flexclones exist.
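On a 7-mode filer console, the flexclone Jordan describes is built from a backing snapshot roughly like this (volume, snapshot, and client names are made up for illustration; these are filer console commands, not runnable locally):

```shell
# Data ONTAP 7-mode console. All names are illustrative.
snap create vol_data backup_base                 # ~4KB snapshot of the volume
vol clone create vol_data_copy -b vol_data backup_base   # writable clone
exportfs -p rw=client1 /vol/vol_data_copy        # export the clone via NFS
```

The clone is writable immediately and shares all unmodified blocks with the parent; when the backup is no longer needed, `vol clone split` or simply destroying the clone releases the dependency on the backing snapshot.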
--Jordan
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Arnold de Leon Sent: Friday, October 18, 2013 10:42 AM To: toasters Subject: Re: Slow copy of a directory full of files via an NFS client across a WAN
Summary
It appears that our best bet is to use the NetApp APIs and the "clone" commands. This would be most efficient: the data never leaves the filer, and the "copies" don't take any additional space (except for metadata) until they get modified.
We need to do a little more research and testing.
Thanks.
It sounds to me like that is the best tool for the job.
The only gotcha I can think of is that these clones either need to be deleted or split at some point, or they will just grow indefinitely with changes. Also, you are limited to 255 snapshots per volume, and each flexclone is going to use a snapshot.
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Arnold de Leon Sent: Friday, October 18, 2013 8:30 PM To: toasters Subject: Re: Slow copy of a directory full of files via an NFS client across a WAN
Jordan,
For quite some time now, if you clone with the 'clone' command you will create sis-clones, which do not depend on a snapshot (as opposed to 'lun clone' or 'vol clone'). Therefore no splitting is necessary, nor will you run into the 255-snapshot limit.
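A sis-clone is created per file on the filer console, roughly as follows (paths are illustrative; there is no directory form, so a real run would loop over every file):

```shell
# Data ONTAP 7-mode console; paths are illustrative.
# Each file must be cloned individually -- no recursive option.
clone start /vol/vol_data/dir/a/file1 /vol/vol_data/dir/a-copy/file1
clone status        # check progress of outstanding clone operations
```

Each clone shares the source file's blocks until either copy is modified, so the "copy" consumes essentially no space up front.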
OTOH, since you'd have to clone file by file (there's no directory cloning), it could get tedious, and you might run back into the whole WAN-traffic problem. It might just be faster to 'ndmpcopy' the directory (one command, executed on the filer, negligible WAN traffic), perhaps followed by a dedupe run to regain the space. That would have (at least) the same space effect as file cloning with far less network traffic, and might therefore be not only simpler but also faster.
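The ndmpcopy-plus-dedupe route would look roughly like this on a 7-mode console (volume and path names are made up; filer console commands, not runnable locally):

```shell
# Data ONTAP 7-mode console; paths are illustrative.
# The copy runs entirely on the filer, so only NDMP control
# traffic crosses the WAN -- no file data goes to the client.
ndmpcopy /vol/vol_data/dir/a /vol/vol_data/dir/a-copy

# Afterwards, reclaim the duplicated blocks with a dedupe pass
# (requires dedupe/A-SIS to be licensed and enabled on the volume):
sis on /vol/vol_data
sis start -s /vol/vol_data      # -s scans existing data, not just new writes
```

The tradeoff versus per-file cloning: ndmpcopy briefly consumes the full 12MB until dedupe runs, but it is one command instead of 600.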
My 2c
On 10/21/2013 8:15 PM, Jordan Slingerland wrote:
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters