During some testing I found that when a dump is already in progress to our filer's local tape library (one drive), another dump can be run. (Okay, so I was at home and thought that the previous dump would have finished.) Both dumps were apparently running, each thinking that it was successfully writing to the same tape.
An mt status command run during this period showed that the drive was ready, but checking the status several times showed that writing was occurring (block numbers were changing). Also, the fileno value increased several more times than it should have.
The behaviour I would expect is that if a dump is already running, another attempt to dump to the tape would result in a device busy message. Did I witness one dump clobbering another?
The filer is an F740, the library is a Qualstar 6110, and Data ONTAP is version 5.2.1.
Thanks for any insight.
So you have one tape drive hooked up to the filer, and you ran two rsh dumps to the same drive?
I just tried the same experiment on a filer running 5.3:
(First Xwindow)
tooting% rsh dickens dump 0f rst0a /etc
DUMP: Dumping tape file 1 on rst0a
DUMP: creating "/vol/home/../snapshot_for_backup.3" snapshot.

(Second Xwindow)
tooting% rsh dickens dump 0f rst0a /etc
DUMP: Dumping tape file 1 on rst0a
DUMP: open of rst0a failed.
DUMP: tape open failed, and can't ask questions when run via "rsh".
DUMP: DUMP IS ABORTED
So the behaviour you expected is what happened here: the second dump could not open the drive and aborted. Looking through the code, I don't see any obvious changes in how we "lock" and "unlock" tape devices between 5.2.1 and 5.3.
Do you have some sort of log of the two dumps that were running? It would be interesting to see exactly what was happening from their points of view...
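Until the cause is known, one possible stopgap is to serialize the dumps on the admin host that launches them. The following is only a sketch under stated assumptions: the names tape_lock, DeviceBusy and run_dump, and the lock-file location, are hypothetical and not part of Data ONTAP. It guards only dumps started from the same host through this wrapper, and it does not change how the filer itself locks the drive. The rsh command line is the one used in the experiment above.

```python
import fcntl
import os
import subprocess
import tempfile
from contextlib import contextmanager


class DeviceBusy(Exception):
    """Raised when another dump on this host already holds the device lock."""


@contextmanager
def tape_lock(device, lock_dir=None):
    # Hypothetical helper: one lock file per tape device name, e.g. "rst0a".
    lock_dir = lock_dir or tempfile.gettempdir()
    path = os.path.join(lock_dir, "dump-%s.lock" % device)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # Non-blocking exclusive lock: fail at once instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            raise DeviceBusy("%s: device busy" % device)
        yield path
    finally:
        # Closing the descriptor releases the lock, also if the dump fails.
        os.close(fd)


def run_dump(filer, device, path):
    # Hypothetical wrapper around the rsh dump shown in the thread.
    with tape_lock(device):
        return subprocess.call(["rsh", filer, "dump", "0f", device, path])
```

With this in place, a second run_dump("dickens", "rst0a", "/etc") started while the first is still running would raise DeviceBusy on the admin host rather than reach the filer. Because the lock is released when the launching process exits, a crashed wrapper should not leave the device blocked.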
Stephen Manley
File System Recovery Sleuth