I finally got fed up with deleting snapshots by hand whenever they started using too much disk space, so I wrote a simple snapshot monitor that automatically deletes snapshots when they use more than 100% of the space reserved for them.
Now I configure each filer to schedule more snapshots than I really want and rely on "snap reserve" alone to limit how much disk space the snapshots consume. If a filer's data turnover rises, the monitor automatically scales the number of snapshots back.
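For reference, the filer-side setup uses the "snap sched" and "snap reserve" console commands. The exact syntax varies between Data ONTAP releases, and the numbers below are just an illustration, not a recommendation, but via rsh it looks roughly like this:

```shell
# Schedule more snapshots than strictly needed: 2 weekly, 6 nightly,
# and 8 hourly (taken at 8:00, 12:00, 16:00, and 20:00).
# Illustrative values -- tune to your own turnover.
rsh uranium snap sched 2 6 8@8,12,16,20

# Cap snapshot usage at 20% of the disk.  na_snapmon then trims
# snapshots whenever "df" reports the reserve at 100% or more.
rsh uranium snap reserve 20
```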
For example:
Jun 26 10:30:36 sodium na_snapmon: rsh uranium snap delete hourly.3
Jun 26 10:33:03 sodium na_snapmon: rsh neptunium snap delete nightly.1
Jun 26 10:34:49 sodium na_snapmon: rsh thorium snap delete hourly.2
Jun 26 10:40:04 sodium na_snapmon: rsh neptunium snap delete hourly.3
It also tries to delete dump snapshots that are more than two days old, and it ignores snapshots that have been specially created or renamed.
The best way to run this is as a root cron job, every ten minutes or so.
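For example, a root crontab entry along these lines runs the monitor every ten minutes (the install path and snapshot mount points are site-specific, of course):

```shell
# /etc/crontab entry: check snapshot usage every ten minutes.
# Older crons without step syntax need 0,10,20,30,40,50 instead of */10.
*/10 * * * * root /usr/local/sbin/na_snapmon /na/uranium/.snapshot /na/neptunium/.snapshot
```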
This runs on Linux now, but it should also work on Solaris, and on other Unix systems with minimal modification.
------- start of cut text --------------
#!/usr/bin/perl
# na_snapmon - monitor snapshot usage and keep it under control
# Copyright (C) 1998 Daniel Quinlan
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
use Getopt::Std;
require "find.pl";

sub usage;
sub trim_snapshots;
sub snap_delete;

my $limit;
my $prog;

$prog = $0;
$prog =~ s@.*/@@;

getopts("hd");

if ($opt_h) {
    &usage;
    exit 0;
}
foreach $dir (@ARGV) {
    my $df_worked = 0;
    if (! -d $dir) {
        warn("$prog: $dir: no such snapshot mount point");
        next;
    }
    open (DF, "df -k $dir |");
    while (<DF>) {
        if (/[0-9]%/) {
            $df_worked = 1;
            ($fs, $blocks, $used, $avail, $capacity, $mount) = split;
            # trim only when usage is at 100% or more of the reserve
            if ($capacity !~ /^[0-9]?[0-9]%/) {
                $fs =~ s/:.*//;
                trim_snapshots($fs, $dir);
            }
        }
    }
    close (DF);
}
sub trim_snapshots {
    my ($filer, $dir) = @_;
    my (@list, @snapshots, %atime);
    # read snapshot list, skipping "." and ".."
    opendir(DIR, $dir) || die "can't opendir $dir: $!";
    @snapshots = grep { ! /^\.\.?$/ && -d "$dir/$_" } readdir(DIR);
    closedir DIR;
    # sort by atime, most recently accessed first (-A is age in days)
    foreach $entry (@snapshots) {
        $atime{$entry} = -A "$dir/$entry";
    }
    @snapshots = sort {$atime{$a} <=> $atime{$b}} @snapshots;
    # maybe delete something, starting with the oldest snapshot
    while ($s = (pop @snapshots)) {
        # try to delete old dump snapshots (-A is in days, so older than
        # two days), but since this might fail, keep trying to delete
        # after this one
        if ($s =~ /^snapshot_for_dump\.[0-9]+$/ && $atime{$s} > 2) {
            snap_delete($filer, $s);
            next;
        }
        # skip manual snapshots
        next if ($s !~ /^(hourly|nightly|weekly)\.[0-9]+$/);
        # leave one snapshot at all times
        next if ($s =~ /^hourly\.0$/);
        snap_delete($filer, $s);
        last;
    }
}
sub snap_delete {
    my ($filer, $snapshot) = @_;

    if ($opt_d || ($> != 0)) {
        print "debug: rsh $filer snap delete $snapshot\n";
    } else {
        system("logger -t na_snapmon \"rsh $filer snap delete $snapshot\"");
        system("rsh $filer snap delete $snapshot");
    }
}
sub usage {
    print <<EOF;
usage: $prog [-hd] [list of snapshot mount points]

  -h   print this help
  -d   debugging mode: don't do, just show
EOF
}
------- end ----------------------------