Greetings,
We have an 8.0.1 filer that runs as a NearStore for SV backups. The group using this normally kept 120 nightly backups. The aggregates were getting full, so they decided to delete all snapshots older than 90 days. This was done with a script that fed a list of snapshots into snap delete <vol> <snapshot>.
Some of the lists were not in reverse order so it deleted nightly.91 before nightly.119. I know deleting them in this type of order will work anyway, but it causes the system to do more processing.
The problem is the disk space is continuing to grow on all volumes in the aggregates. An aggr show_space of the aggregates shows everything increasing in used space. The volumes have no space reservation.
I'm trying to figure out how this mass deletion of snapshots (several hundred) is causing an increase in space usage. Any ideas?
Thanks,
Jeff
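For anyone scripting a similar cleanup, here is a minimal sketch of how the list could be ordered before it is fed to snap delete. It assumes the nightly.N naming where a higher N is an older snapshot, so the highest number goes first. The helper names, the volume name sv_vol1, and the cutoff argument are hypothetical, and the script only prints command lines rather than talking to a filer.

```python
import re


def snapshot_index(name):
    """Return the numeric suffix of a name such as 'nightly.91'."""
    match = re.search(r"\.(\d+)$", name)
    if match is None:
        raise ValueError(f"no numeric suffix in {name!r}")
    return int(match.group(1))


def deletion_order(snapshots):
    """Sort snapshot names so the highest-numbered (assumed oldest) comes first."""
    return sorted(snapshots, key=snapshot_index, reverse=True)


def delete_commands(volume, snapshots, min_index):
    """Build 'snap delete' lines for snapshots numbered min_index or higher."""
    doomed = [s for s in snapshots if snapshot_index(s) >= min_index]
    return [f"snap delete {volume} {s}" for s in deletion_order(doomed)]


# Hypothetical volume name and snapshot list, for illustration only.
for line in delete_commands("sv_vol1", ["nightly.91", "nightly.5", "nightly.119"], 91):
    print(line)
```

Sorting on the numeric suffix rather than on the string avoids nightly.100 landing before nightly.91 in a plain text sort.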
Two points:
1. Space reclamation on NetApp happens in the background. For a large file it can take quite a long time.
2. Deleting a snapshot does not mean you will free anything :) Remember that a snapshot shares blocks with other snapshots and with the active file system, so deleting a snapshot simply decrements reference counts. You can estimate how much space will actually be freed *before* deleting by using "snap reclaimable".
--- With best regards
Andrey Borzenkov Senior system engineer Service operations
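The block-sharing point can be illustrated with a toy model. This is only a sketch: it treats each snapshot and the active file system as a set of block numbers, which is an assumption made for illustration and not how WAFL stores its metadata. The block numbers and the function name reclaimable are made up. On a real filer the estimate comes from "snap reclaimable".

```python
def reclaimable(snapshot, snapshots, active):
    """Blocks held only by `snapshot`, i.e. not referenced by any other
    snapshot or by the active file system."""
    others = set(active)
    for name, blocks in snapshots.items():
        if name != snapshot:
            others |= blocks
    return snapshots[snapshot] - others


# Made-up block numbers, for illustration only.
active = {1, 2, 3, 7}
snapshots = {
    "nightly.0": {1, 2, 3, 6},
    "nightly.1": {1, 2, 5, 6},
    "nightly.2": {1, 4, 5},
}

print(sorted(reclaimable("nightly.1", snapshots, active)))  # []
print(sorted(reclaimable("nightly.2", snapshots, active)))  # [4]
```

In this model, deleting nightly.1 frees nothing because every one of its blocks is still referenced elsewhere, while deleting nightly.2 frees only block 4.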
-----Original Message----- From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Cleverley Sent: Saturday, December 17, 2011 4:05 AM To: toasters@teaparty.net Subject: Disk space increasing after snapshots are deleted.
Andrey,
I am aware of both points. We figured it would take a while for the space to get reclaimed, but the space used in every volume in the aggregate was increasing. We know from the environment that there are daily changes made that should free up something in most of the volumes. Even if nothing did free up, we would expect no change in space utilization, not an increase in usage.
They thin provision all the volumes to 40 TB, so there is no deduplication on anything. They also don't use compression of any sort that would have this type of effect. I suggested they try disabling SV for a while to let the filer try to catch up. From what I can tell they did this and rebooted the filer. It looks like there is more space available now than there was yesterday, but it does not seem to have freed up very much space.
We still don't know why the space was increasing during the deletions.
Thanks,
Jeff
On Sat, Dec 17, 2011 at 12:56 AM, Borzenkov, Andrey andrey.borzenkov@ts.fujitsu.com wrote:
-- Jeff Cleverley Unix Systems Administrator 4380 Ziegler Road Fort Collins, Colorado 80525 970-288-4611 _______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
What with snapshots and the filesystem trickery that makes them possible, it may help to temporarily forget about the ongoing deletions and focus on the space consumption. It can be hard to identify who is writing; besides filehandles held open for long-term writing, kernels can write seemingly on their own initiative for core dumps and the like.
Sometimes it helps to ogle raw ip traffic to look for hosts sending lots of traffic to an address owned by a filer.
Todd,
This filer is a NearStore and it only gets SV traffic. It's easy to see which SV transfers are running. There would only be a few different volumes with active SV transfers at a time. A df followed by another one 5 minutes later showed that more than 100 different volumes had increased their space usage. It acts like the reverse of the 7.3 upgrade, where metadata for the snapshots is moved to the aggregate. This acts as if the aggregate were pushing everything back to the volumes.
This is the same filer I posted about the vol options root command failing earlier in the week. We did a software update from 8.0.1 to 8.0.1P1 which is what I run on most of my NearStores. We did this because the root command worked on one of my 3140 NS with that OS. That software update should have delivered new files, so we don't expect we have any corrupt files anywhere.
Thanks,
Jeff
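The two-samples comparison described above can be automated along these lines. This is a rough sketch that assumes df-style text with the filesystem name in the first column and used kilobytes in the third; the sample figures and volume names are made up, and the column positions may need adjusting for real output.

```python
def parse_df(text):
    """Map filesystem name -> used KB from df-style text (assumed column layout)."""
    used = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row; expect name, total, used, ...
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        used[fields[0]] = int(fields[2])
    return used


def grown(before, after):
    """Return {filesystem: KB increase} for entries whose used space went up."""
    first, second = parse_df(before), parse_df(after)
    return {
        name: second[name] - first[name]
        for name in first
        if name in second and second[name] > first[name]
    }


# Made-up samples taken five minutes apart, for illustration only.
sample_1 = """Filesystem kbytes used avail capacity Mounted on
/vol/sv_a/ 1000 400 600 40% /vol/sv_a/
/vol/sv_b/ 1000 500 500 50% /vol/sv_b/
"""
sample_2 = """Filesystem kbytes used avail capacity Mounted on
/vol/sv_a/ 1000 450 550 45% /vol/sv_a/
/vol/sv_b/ 1000 500 500 50% /vol/sv_b/
"""

print(grown(sample_1, sample_2))
```

Comparing the resulting list against the volumes that actually had SV transfers running during the interval shows which increases cannot be explained by incoming data.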
On Sat, Dec 17, 2011 at 10:05 AM, Bennett Todd bet@rahul.net wrote:
What with snapshots and the filesystem trickery that makes them possible, it may help to temporarily forget about the ongoing deletions and focus on the space consumption. It can be hard to identify who is writing; besides filehandles held open for long-term writing, kernels can write seemingly on their own initiative for core dumps and the like.
Sometimes it helps to ogle raw ip traffic to look for hosts sending lots of traffic to an address owned by a filer.
Jeff,
When you say that the volumes have no space reservation, do you mean snapshot reservation, or space guarantees?
Volume sizes within an aggregate can behave very non-intuitively if they don't have any kind of space guarantee, especially if there is a mix of guaranteed and non-guaranteed volumes in that aggregate. Are there one or more "guarantee=volume" volumes in that aggregate? If any of these are growing, it will look like the available space for any non-guaranteed volumes is shrinking, which kinda looks like these non-reserved volumes are growing (which they aren't). The mass snapshot deletion may be a red herring.
Andy
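The effect described above can be shown with a deliberately simplified model. It assumes that a guaranteed volume reserves its full size in the aggregate and that a non-guaranteed volume consumes only what it has written; real Data ONTAP accounting has more terms than this. All names and sizes (in TB) are hypothetical.

```python
def aggregate_free(aggr_size, guaranteed_sizes, thin_used):
    """Free space in the aggregate under a simplified accounting model.

    Guaranteed volumes reserve their full size; thin volumes consume
    only what they have actually written.
    """
    return aggr_size - sum(guaranteed_sizes) - sum(thin_used.values())


def thin_available(volume, volume_size, aggr_size, guaranteed_sizes, thin_used):
    """Apparent free space of one thin volume, capped by the aggregate."""
    own_room = volume_size - thin_used[volume]
    return min(own_room, aggregate_free(aggr_size, guaranteed_sizes, thin_used))


# Hypothetical sizes in TB: two thin volumes whose used space never changes.
thin_used = {"sv1": 30, "sv2": 20}

before = thin_available("sv1", 100, 100, [10], thin_used)
after = thin_available("sv1", 100, 100, [25], thin_used)  # guaranteed volume grew
print(before, after)
```

In this model the thin volume's available space drops from 40 to 25 although nothing was written to it, which is the kind of apparent growth the question is trying to rule out.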
----- Original Message ----- From: "Jeff Cleverley" jeff.cleverley@avagotech.com To: "Andrey Borzenkov" andrey.borzenkov@ts.fujitsu.com Cc: toasters@teaparty.net Sent: Saturday, December 17, 2011 11:48:32 AM Subject: Re: Disk space increasing after snapshots are deleted.
On Sat, Dec 17, 2011 at 9:48 PM, Andrew Siegel abs@blueskystudios.com wrote:
Jeff,
When you say that the volumes have no space reservation, do you mean snapshot reservation, or space guarantees?
Both. Since these are NearStores and almost everything is snapshots, snap reserve is turned off. The snapshots will take however much space they need anyway so there is no real need to have a reserve. We also thin provision the volumes with no space guarantee and size the volumes to the size of the aggregates. This eliminates having to manage space for a bunch of volumes. We don't have to constantly resize volumes. All that has to be done is monitor the aggregate space.
Volume sizes within an aggregate can behave very non-intuitively if they don't have any kind of space guarantee, especially if there is a mix of guaranteed and non-guaranteed volumes in that aggregate. Are there one or more "guarantee=volume" volumes in that aggregate? If any of these are growing, it will look like the available space for any non-guaranteed volumes is shrinking, which kinda looks like these non-reserved volumes are growing (which they aren't). The mass snapshot deletion may be a red herring.
Excellent idea, but unfortunately it doesn't appear to be the issue. One aggregate has 3 volumes and the other has 1 volume with the volume guarantee set. The 3 I looked at have "junk" in the name and are 20 MB. Since this is the minimum volume size, they were obviously doing some testing. The other volume is the root volume they were planning on trying to migrate to. This was the volume from one of my other posts. It is a fixed size and the space in it isn't growing. It just has a copy of the current vol0 in it.
I did notice the WAFL reserve used space is also increasing. Is this normal as other volumes grow, or is there something else that could cause this space usage to grow?
Thanks,
Jeff