A number of replies I've received about this have suggested that the problem might be the block size used. I don't think this is implicated: both the ONTAP dumps (dump and restore both fast) and the Solaris dumps (dump fairly fast, restore cripplingly slow) are blocked at 63 KB. I've confirmed that they _really_ are so blocked by having my read-and-discard-via-rmt(8) program show the lengths actually returned by the read operations.
Thanks to all those who responded, but I'm still in search of new ideas.
Chris Thompson Email: cet1@cam.ac.uk
I am running ONTAP 6.4.3P2 on an F810. It has a SCSI-attached DLT7000 drive used primarily for its own backups, but we also use the filer's rmt(8) implementation to dump a few Solaris ufs partitions that way.
We've been doing a round of checking that we can restore from our backups. Everything went fine until I tried to restore these Solaris partitions. The problem isn't the content of the restores, but the appallingly slow transfer rate. This isn't ufsrestore's fault per se, as I've used a program that just does 63 KB reads via the rmt interface and disposes of the data [*], and I can still get at best 400 KB/sec out of it. That's at least a factor of 10 less than it should be: dumping the Solaris partitions achieves something like 4000 KB/sec [not quite up to the 7000 KB/sec that the ONTAP dumps run at --- there's a 100 Mbit/s network link involved --- but acceptable]. It's not a problem with tape reading as such: ONTAP restores run at essentially the same speed as ONTAP dumps.
We certainly didn't have this problem the last time we tried this, about a year ago. What's changed since then?
. we were using an F740 that has been replaced by the F810 [the DLT7000 tape drive is the same one]
. we were running ONTAP 6.2.2 then, 6.4.3P2 now
. we were running Solaris 8 then, Solaris 9 now (well patched, in both cases) - but as mentioned above, it's not ufsrestore per se
Anyone got any ideas on this one? Or similar experiences?
Chris Thompson Email: cet1@cam.ac.uk
[*] "Disposes of the data" after it's been sucked through a pipe on the Solaris host. I suppose I could write a program to call rcmd(3) directly rather than relying on rsh(1) to do it for me, but I really doubt that this can be where the bottleneck lies.