On the evening of Saturday, February 20, we upgraded from BudTool 4.5 to 4.6. Since then, we've not been able to do local NDMP backups on our filer. Has anyone out there experienced similar problems? If so, please throw some suggestions my way.
We have support calls open with both NetApp (being very helpful considering it's probably not an issue with their system) and Intelliguard (with what we pay them I'd assume they could at least hire warm bodies). Here's what we've come up with so far:
1) Intelliguard told us to add the line "ROBOT_SCSI_LUN 1" to the jbmgr_config for the filer. This allowed the jbmgr daemon to start.
audreyii: /usr/budtool/bud [225] % cat jbmgr_config.audreyii.2
ROBOT_DEV_HOST nfssrv1
ROBOT_DEV_NAME spt0
ROBOT_SCSI_ID 2
ROBOT_SCSI_BUS 2
ROBOT_SCSI_LUN 1
DATA_DEV_NAME0 nrst0a
DATA_DEV_HOST0 nfssrv1
PASSWORD xxxxxxxx
USEBARCODE N
VALIDATECMD "$BTHOME/bin/jbupdate audreyii:2 -a"
audreyii: /usr/budtool/bud [226] %
2) Running the BudTool probe-scsi command shows the devices:
audreyii: ~ [230] % probe-scsi -h nfssrv1
Copyright (c) 1990-1999 by Intelliguard Software Inc.
probe-scsi BudTool 4.6
SCSI Bus 0
SCSI Bus 1
SCSI Bus 2
  Target 2
    Unit 0 Removable Tape EXABYTE EXB-8500 CC37
    Unit 1 Removable Jukebox Device EXABYTE TZ Media Changer CC37
SCSI Bus 3
SCSI Bus 4
audreyii: ~ [231] %
3) After placing 2 labeled BudTool tapes into the jukebox, running jbupdate returns instantly, showing no tapes in the jukebox (it doesn't even attempt to read the tape labels).
audreyii: ~ [236] % jbupdate audreyii:2 -a
------------------------------------------------------------------
------ W A R N I N G -------
You have selected to force an update on all 5 tapes.
This process requires several minutes per tape for unbarcoded tapes.
Large jukeboxes with unbarcoded tapes can take several hours.
The PID of this process is 8721.
Use a HUP, INT, TERM or QUIT to terminate this process if you do not want to continue.
------------------------------------------------------------------
Update complete.
audreyii: ~ [237] % jbmgr -H audreyii:2 ls -a
0:
1:
2:
3:
4:
audreyii: ~ [238] %
4) We can perform remote NDMP backups to our BudTool master media server, but this isn't practical at ~2.2 GB/hr with 70+ GB on our filer. (It's an old, slow F330 which I'm supposed to be replacing instead of trouble-shooting...)
5) Gave Intelliguard a copy of our config file, the jbmgr diag log, and the probe-scsi output for analysis. The diag file does show some errors:
6) After not receiving much help from Intelliguard, we called NetApp. They suggested we upgrade from 5.1.2P2 to 5.1.2R3P1 to remove the possibility that this was related to an NDMPD bug (6417).
7) Spoke to Intelliguard again Monday and informed them that we were going to upgrade the filer. We decided to see if the upgrade solved the problem. (They managed to fix a btcp issue after I waited 4 hours for a call back. I'm still waiting for our sales rep to tell me exactly what kind of turnaround to expect when paying extra for 24x7 support.)
8) Nothing changed after the upgrade Monday night. I informed NetApp; they opened another call and had us send them more info.
9) Told Intelliguard, gave them another copy of the diag file.
10) Configured BudTool to do lev 1 remote backups of the filer; this should hold us until the weekend.
11) Had to take a personal day Tuesday. Hoped I'd get a couple of voice mails or e-mails telling me that someone had figured out what was wrong. (Finding out we were stupid and didn't have BudTool configured correctly would have been a relief.)
12) Wednesday we received some follow-up calls from NetApp, the call was escalated, and we're currently expecting a call.
13) We verified that tapes can be manually loaded into the jukebox drive and dump run on the console of the filer. Pretty confident that the jukebox didn't coincidentally die during the BudTool upgrade.
14) Will file missing persons reports for Intelliguard employees in another 24 hours. (Anyone have the number for the police in Dublin, CA?)
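One sanity check on steps 1 and 2 is to confirm that the robot address in jbmgr_config matches what probe-scsi reports for the media changer (bus 2, target 2, unit 1). The following is only a minimal sketch: it assumes the config file is plain "KEY value" lines as shown in the listing above, and the helper names parse_jbmgr_config and robot_address are hypothetical, not part of BudTool.

```python
def parse_jbmgr_config(text):
    """Parse 'KEY value' lines into a dict (format assumed from the listing)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, so quoted values stay intact.
        key, _, value = line.partition(" ")
        config[key] = value.strip()
    return config


def robot_address(config):
    """Return the robot's (bus, target, lun) as integers."""
    return (
        int(config["ROBOT_SCSI_BUS"]),
        int(config["ROBOT_SCSI_ID"]),
        int(config["ROBOT_SCSI_LUN"]),
    )


# Contents of jbmgr_config.audreyii.2 as posted (password already masked).
CONFIG_TEXT = '''\
ROBOT_DEV_HOST nfssrv1
ROBOT_DEV_NAME spt0
ROBOT_SCSI_ID 2
ROBOT_SCSI_BUS 2
ROBOT_SCSI_LUN 1
DATA_DEV_NAME0 nrst0a
DATA_DEV_HOST0 nfssrv1
PASSWORD xxxxxxxx
USEBARCODE N
VALIDATECMD "$BTHOME/bin/jbupdate audreyii:2 -a"
'''

# Media changer location read off the probe-scsi output: bus 2, target 2, unit 1.
PROBED_CHANGER = (2, 2, 1)

config = parse_jbmgr_config(CONFIG_TEXT)
matches = robot_address(config) == PROBED_CHANGER
print("robot address matches probe-scsi:", matches)
```

With the values posted here the two agree, so the addressing itself does not look like the culprit.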
So far, I'm confident it's a BudTool issue. (Many thanks/apologies to NetApp if it is.) I'd be very grateful if someone could give me a possible avenue of investigation. I'm at a dead-end and looking at over 30 hours of backups to lev 0 our filer this weekend... :(
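As a rough check on the "over 30 hours" figure, the arithmetic can be sketched as below. It assumes a flat 70 GB at a steady 2.2 GB/hr (the filer holds "70+" GB and the rate is approximate, so the real number will be somewhat higher); backup_hours is just an illustrative helper.

```python
def backup_hours(data_gb, rate_gb_per_hr):
    """Estimated hours to move data_gb at a steady rate_gb_per_hr."""
    return data_gb / rate_gb_per_hr


# 70 GB on the filer, ~2.2 GB/hr over remote NDMP (both figures from the post).
hours = backup_hours(70, 2.2)
print(f"{hours:.1f} hours")  # 31.8 hours
```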
thanks,
jason
---
Jason D. Kelleher                kelleher@susq.com
Susquehanna Partners, G.P.       610.617.2721 (voice)
401 City Line Ave, Suite 220     610.617.2916 (fax)
Bala Cynwyd, PA 19004-1122
We installed Budtool 4.6 on our Budtool Host (a Sun running Solaris 2.6). Our Media server is a NetApp 760. We did not have any problems doing NDMP backups. So maybe you should remove Budtool, and install 4.6 from scratch.