toasters February 2000

toasters@lists.teaparty.net

102 participants
113 discussions

Re: Centralized backup of multiple filers
by Marion Hakanson 18 Feb '00

18 Feb '00

Barry Lustig wrote: > > Do any of the available backup solutions support a "dump to disk mode" > like amanda? Amanda uses the disk on the local backup machine as the cache. > After the backup completes to disk amanda dumps that file to tape. This > takes care of the streaming issues. This of course would be a problem with > large NetApp filesystems. But imagine if the packages (veritas, legato, > workstation solutions, etc.) took the incoming data stream and wrote it as > configurable chunks (100Mb, 1GB, etc.) to disk. They could then flush those > chunks to tape. Hmm, I recall reading that CommVault Systems was working on a product which migrated (HSM-style) older files to optical jukebox, and thence to tape (via backup system), and made all the migrated stuff look like a really huge "snapshots" area: http://www.commvault.com/net_apps.html It's not _exactly_ what the original poster was talking about, but it seems similar in feel. And automatic. Anyone had any experience with this product? It would seem pretty cool, especially since it looks like they have this integrated with Unix (NFS) and NT file servers, database servers, etc. Regards, -- Marion Hakanson <hakanson(a)cse.ogi.edu> CSE Computing Facilities

1 0

RE: Problems encountered with AIT-2 tape drive on F740 server
by Roth, William 18 Feb '00

18 Feb '00

David, AIT-2 in native mode isn't supported yet but... Depending on the compressibility of the data, 50 GB may fill the tape. What kind of data are we dealing with, binaries, graphics, image, etc.? What quantity of data are we dealing with on the filer, before any compression? Are you using the dump command, or third party software to manage your backups? If manually using dump, what is the command you're entering? I understand that you're using nrst0a, I don't know why the abort occurs. Is there anything in /etc/messages or /etc/ndmplog.yyyymmdd that help define why the abort occurs? If you're using third party backup software, are there any messages that help us define the cause of the abort? Based on the problem you've described, I don't think that going from 5.3.4R2 to 5.3.4R3 is going to resolve the issue. Thanks, Bill Roth -----Original Message----- From: David H. Brierley [mailto:dhb@ssd.ray.com] Sent: Thursday, February 17, 2000 3:05 PM To: toasters(a)mathworks.com Subject: Problems encountered with AIT-2 tape drive on F740 server I am having a problem on one of my machines and I was wondering if anyone has encountered anything similar. I have an F740 running 5.3.4R2, with a directly attached AIT-2 tape library. When I first tested the tape drive it performed wonderfully. Once I was happy with the 740 and with the tape drive, I moved the bulk of the data from a pair of old F230's over to the F740. The problem I then encountered is that when I am dumping to the tape it writes 50 gig and then aborts. I am using the "36C" AIT-2 tapes and I am accessing the tape drive using device "nrst0a", so I should be able to write approximately 72 gig of data to the tape. I have tried several different tapes and get the same results. I also tried setting the tape drive into DLT7000 emulation and I still get the same results. The qeustion now is: is there something wrong with the tape drive or is this a problem with 5.3.4R2? Should I try upgrading to 5.3.4R3? I have another F740 with a DLT7000 library on it and it seems to work just fine. Did I blunder badly by switching to AIT-2? Please help!!!!! -- David H. Brierley Raytheon Electronic Systems, Naval & Maritime Integrated Systems Engineering Technology, Operating Systems Support Group

1 0

RE: Centralized backup of multiple filers
by Kelly Wyatt 17 Feb '00

17 Feb '00

And another... if you use the filer in a multiprotocol manner, won't you lose the NT acls if you back it up from UNIX? -- Kelly Wyatt, Kelly.Wyatt(a)SAS.com Systems Programmer Integrated Solutions Consulting SAS Institute Inc. / SAS Campus Drive / Cary, NC 27513 http://www.sas.com -----Original Message----- From: Bruce Sterling Woodcock [mailto:sirbruce@ix.netcom.com] Sent: Wednesday, February 16, 2000 6:10 AM To: Brian Tao; Darrell Fuhriman Cc: toasters(a)mathworks.com Subject: Re: Centralized backup of multiple filers > I thought they were already quite busy. If they're so busy that > you don't feel you can attach a local tape drive, I'm not really > understanding how they're unbusy enough to hammer them with a > bunch of rsyncs. Another point: Is buying and attaching all that disk to the SUN systems really cheaper than simply attaching a local tape drive? And setting up a cluster with SnapMirror would probably be more expensive, but not prohibitively so. Bruce

2 1

Centralized backup of multiple filers
by Brian Tao 17 Feb '00

17 Feb '00

This is my plan, after having debated the merits of distributed tape libraries on each filer vs. centralized tape library with network backup. I've posted separately to both the toasters and bigbackup mailing lists (even though I figure most people on the second list are also on the first). - backup clients are 12 filers (mostly F740's), each with multiple 100 Mbps Ethernet interfaces - backup servers are 2 Sun E420R's with enough CPU, memory, U2SCSI and Gigabit interfaces to keep things humming - each filer has 1 or 2 100 Mbps interfaces plugged into a switch, with the backup servers on Gigabit (probably something like a Catalyst 3524XL: 24 10/100 + 2 Gigabit) - each backup server will have four U2SCSI channels or two FC-AL loops, initially with half a terabyte of local disk and an Exabyte X80 library with 4 Mammoth2 drives (expandable to 8) - stage 1 backup: filesystems on all the filers will be replicated to the tape servers' local drives (probably rsync over NFS) - stage 2 backup: local filesystems are streamed to tape This seems to work around most of the "problems" associated with backing up directly to tape, with a few extra side benefits thrown in. I can only realistically expect a peak of 8 to 10 MB/sec from our filers (for some of them, there is only "busy" hours and "really busy" hours). That's not enough to keep the tape drives streaming and happy. To do that, I'd have to multiplex backup streams to a single tape, and I always thought that was a bad idea. Hard drives, of course, have no "streaming" issues. They'll take the data however fast or slow the Netapps can send them. Once the Netapp filesystems have been replicated to local disk, you blast them out to tape. With compression turned on, I figure I'll need about 20MB/sec per tape drive to keep them chugging along. Less shoeshining, less wear-and-tear on the media, longer tape drive MTBF. Since all the filer filesystems are consolidated on local storage, you can slice-n-dice your backup sets to fit whatever drive/tape/time constraints you may have. This also gives you a nearline copy of all your data. Combined with the Netapp's snapshots, I should never ever have to go to tape to retrieve a current generation copy of a file that was accidentally deleted or corrupted. Disaster recovery of a downed filesystem can also come off local disk instead of tape. If you use commercial tape backup software, you don't have to worry about buying and maintaining licenses for all the Netapps: all the software sees is one server backing up its own drives to a tape stacker. This may result in savings greater than the cost of the local drive storage. I haven't had an opportunity to really test out how fast rsync works over NFS with the particular hardware setup described above, so that's the weak link. If the results from trial runs on a non- dedicated Ultra2 can be scaled up to a quad CPU E420R, I don't think there will be a problem. Multiple rsyncs can be fired up concurrently to keep the filers busy. For the amount of data we have (300GB at present), I expect the tape drives will only be busy for about an hour doing a weekly full backup, and only a few minutes each day for differentials. Anyone else doing it like this? -- Brian Tao (BT300, taob(a)risc.org) "Though this be madness, yet there is method in't"

5 6

Re: Centralized backup of multiple filers
by Darrell Fuhriman 17 Feb '00

17 Feb '00

Brian Tao <taob(a)risc.org> writes: > I can only realistically expect a peak of 8 to 10 MB/sec from our > filers (for some of them, there is only "busy" hours and "really busy" Is that from experience or theory? > out to tape. With compression turned on, I figure I'll need about > 20MB/sec per tape drive to keep them chugging along. Less Does M2 have the streaming issues of DLT? You'll certainly never push that much data into a DLT7000 or AIT. ~10MB/sec with bursts to 12 is the best you'll get. I think M2 is probably too new to know how it really behaves in the real world. > constraints you may have. This also gives you a nearline copy of all > your data. Combined with the Netapp's snapshots, I should never ever If you're doing that, it would seem to me it would make more sense to use a second (perhaps in a cluster) toaster and mirror the volumes using snapmirror. > I haven't had an opportunity to really test out how fast rsync > works over NFS with the particular hardware setup described above, so > that's the weak link. If the results from trial runs on a non- It's not all that fast (somewhere i have some numbers I ran). If you're syncing lots of smaller files, you'll burn through memory like mad, too. > there will be a problem. Multiple rsyncs can be fired up concurrently > to keep the filers busy. For the amount of data we have (300GB at I thought they were already quite busy. If they're so busy that you don't feel you can attach a local tape drive, I'm not really understanding how they're unbusy enough to hammer them with a bunch of rsyncs. Frankly, I think you've made the solution much more complicated than need/should be. Backups should be done as simply as possible -- you've added lots of opportunity for things to break, which really don't need to be there. > Anyone else doing it like this? I suspect not. Darrell

3 4

Does anyone have experience/knowledge of 9K frames ?
by Eyal Traitel 17 Feb '00

17 Feb '00

Hi ! Does anybody knows if it is supported now in NetApp ? What about Cisco switches ? Has anyone uses it with trunking also ? Eyal.

1 0

Re: wafl unicode options and conversion times
by Todd C. Merrill 17 Feb '00

17 Feb '00

Bruce Sterling Woodcock wrote: > From: Joan Pearson <jpearson(a)netapp.com> > > Yes, while a directory is being converted, all access to it should be > > locked out. Perhaps the nfs client was using information it had > > already cached before the conversion. > > Yeah, but doesn't the cache check something on the directory before > it makes assumptions? Access or modify time? > > Maybe the filer isn't updating that at the start of the conversion > process. I dunno. There's gotta be a way to defeat the client > cacheing other than mounting "noac". This was all "first contact" stuff, from both a freshly rebooted filer and client. No caching as far as I understand it was involved. In our debugging, we tried the "noac" option; it didn't work (for obvious reasons looking back on it now). The first contact initiates the conversion; unfortunately for us, it was the dirents() from the `rm -rf` returning an incomplete directory listing. Joan, the call was 83902. On Wed, 16 Feb 2000, Hal Siegel wrote: > This conversion took close to a day and a half on a 630 full of > hundreds of thousands of files of gzipped archive data. Interesting. I interpreted the docs as saying a single directory with 100,000 files took 9 minutes. I never expected a directory with 20-30 files to be converted in anything more than a negligible time. My expectations were just out of line with the behaviour of the filer. NetApp folks, perhaps this could be expounded upon in future docs with more warnings and other examples. > Note that you > can monitor the conversion taking place by watching sysstat - there's a > steady 3-400 disk reads per second going on while the conversion is > being done. Looking at our MRTG data a few days later, I figured that out. Doh! Eyal Traitel wrote: > Can one of you explain the need for this anyway ? It's really only a concern in a mixed CIFS/NFS environment, which we are heavily. The docs can explain the wafl ucode options much better than I can. But, one problem with not having the unicode information in the directory structure is this, which we ran into all the time: o create a directory via an NFS client o do not access this directory from a CIFS client o create a snapshot o try to access this directory via a CIFS client via the snapshot You will get "permission denied." The CIFS access tries to create the unicode information in the directory upon creation or access, but, since it's in a read-only snapshot, it can't. The wafl ucode options allow you to create directories with the unicode information already there, even if created via an NFS client, or to convert the directories upon any access, NFS or CIFS. It is the latter case that takes a non-trivial amount of time, enough to impact normal behavior. Until next time... The Mathworks, Inc. 508-647-7000 x7792 3 Apple Hill Drive, Natick, MA 01760-2098 508-647-7001 FAX tmerrill(a)mathworks.com http://www.mathworks.com ---

1 0

wafl unicode options and conversion times
by Todd C. Merrill 16 Feb '00

16 Feb '00

Just thought I'd share a caveat that we ran into in the recent past in upgrading from 5.2.x to 5.3.x. It took two separate cases and 50+ emails and phone calls to figure out, so hopefully this info will save somebody else quite a bit of time. (Many thanks to Mike Smith in tech support for his patience and perseverance in resolving it.) 5.3 (?) introduced two new options, wafl.convert_ucode and wafl.create_ucode, to convert directories upon access and create directories from scratch with unicode bits, respectively, even if accessed/created via an NFS client. We updated our rc files with these options and proceeded with business as usual after the upgrade. Hours later, our entire automated testing environment came to a screeching halt with "permission denied" and other bizarre errors. For instance, an `rm -rf` of a directory would fail, leaving files around, when all the file and directory permissions would have allowed the removal: % rm -rf build.removeme rm: Unable to remove directory build.removeme/test/toolbox/map/map: File exists rm: Unable to remove directory build.removeme/test/toolbox/map: File exists ... Cut to the chase: after getting packet snoops and trying various things on different filers, and the usual troubleshooting, NetApp was able to reproduce the "problem." It wasn't a problem. The docs state, regarding unicode conversion of directories: "For example, converting a directory of 100,000 files takes the filer about 9 minutes on a netapp f540." I read this, and said to myself, "Self, we don't have any directories that big," so I dismissed as trivial the time it would take to convert our directories. We had maximum 2,000 files in our largest directories. Well, even with a semi-honkin' F740, two generations distant from the aforementioned F540, we were seeing `rm -rf` failures on small directories, with only tens of files. And, the clients doing the `rm -rf` weren't all that speedy, a 200 MHz Sun Ultra2, a 270 MHz Ultra5, and even a SPARC20, if I recall correctly (among other types of UNIX boxes, like 200 MHz Pentium Linux boxes). Turns out, this unicode conversion takes a non-trivial amount of time for *any* directory. System call traces showed directory entry calls returning not all files in that directory. Apparently, during the conversion, the "new" directory list can be incomplete. Hence, an `rm -rf` first gets an (incomplete) directory listing, then removes all those files, does a `cd ..` and then tries an rmdir(), at which point, since there are still files in the directory, it spits out an error like the above. Once we figured this out, we tried to forcibly convert all the directories by doing a `find . -print > /dev/null` to traverse the entire filer directory structure. Well, this apparently didn't work (my notes get sparse here) in all cases. Our final workaround was to go to a CIFS client (an NT box), right-click on each top level folder on the filer and select "properties," and let it finish calculating the folder size, thus doing the unicode conversion as usual (the way it had to be done before the wafl options in 5.3.x). So much for the interpretation of "wafl.convert_ucode" working via NFS access to the directory. At the conclusion of the call, I requested the docs to be expounded upon and made clearer regarding unicode conversions for future releases. To date, we have not had any further problems. The "access denied" messages when trying to access non-unicode directories (those created via an NFS client pre-5.3 or with the wafl.create_ucode option unset) from read-only snapshots via CIFS clients has been eliminated. Yea! Hope this helps others. Until next time... The Mathworks, Inc. 508-647-7000 x7792 3 Apple Hill Drive, Natick, MA 01760-2098 508-647-7001 FAX tmerrill(a)mathworks.com http://www.mathworks.com ---

4 3

New Gigabit Ethernet cards in cluster?
by Michael Rogan 16 Feb '00

16 Feb '00

Hello, Has anyone tried to use the new Gigabit Ethernet II (Intel) cards using a single vif in a cluster? I have tried these cards individually and failed the cluster over and everything works just fine but if I put them in a single/multi vif and fail the cluster, I can ping the machine, I can telnet to it and it states that it has failed over, I can http to it; however when I try NFS connections to it, it does not respond? Any suggestions? Config file (Partner rc is nearly identical): ## # RC, file san1 -- MJR # Network 172.17.1.0 to be changed after testing -- MJR # Modified for testing -- MJR ## hostname san1 options dns.domainname blackberry.net options dns.enable on options nis.enable off # Setup Virtual Interfaces ifconfig e6 flowcontrol none ifconfig e7 flowcontrol none vif create single sanvif e6 e7 vif favor e6 # configure interfaces ifconfig e0 192.168.200.110 netmask 255.255.255.0 mediatype 100tx-fd partner 192.168.200.111 ifconfig sanvif 172.17.1.110 netmask 255.255.255.0 trusted partner sanvif # Test, works if vif not configured - MJR #ifconfig e6 172.17.1.110 netmask 255.255.255.0 partner e6 route add default 192.168.200.1 1 savecore exportfs -a nfs on vol snapmirror on ## Michael Michael J. Rogan BlackBerry Sys. Adm. Research In Motion 519-888-7465 www.blackberry.net www.rim.net

1 0

Please add me to your list
by WAY,PETER (HP-Loveland,ex1) 16 Feb '00

16 Feb '00

Peter Way

1 0

← Newer
1
2
3
4
5
6
7
8
9
...
12
Older →

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

toasters February 2000