My question about file folding (after reading about it here and on NOW)...
Why would a block end up in a snapshot if it is the same as the previous snapshot? Isn't the point of snapshots only to save the 'old version' of blocks that have been changed?
OK, second question, does it only check the most recent snapshot or does it drill through all existing snaps?
Now that I think more on the subject, third question... what happens to the snapshot "n" that references blocks in snap "n+1" when "n+1" rolls off the end and expires?
--Chuck
-----Original Message----- From: Mitko Blazeski [mailto:Mitko.Blazeski@proact.se] Sent: Thursday, December 12, 2002 3:10 AM To: 'Kumar, Rahul'; 'toasters@mathworks.com' Subject: RE: file folding impact
File folding describes the process of checking the data in the most recent snapshot, and, if it is identical to the snapshot currently being created, just referencing the previous snapshot instead of taking up disk space writing the same data in the new snapshot. Disk space is saved by sharing unchanged file blocks between the active version of the file and the version of the file in the latest snapshot, if any.
/Mitko
-----Original Message----- From: Kumar, Rahul [mailto:rahul.kumar@eds.com] Sent: December 12, 2002 05:19 To: stefan.holzwarth@adac.de; mats.oberg@tietoenator.com; toasters@mathworks.com Subject: RE: file folding impact
BTW, what does this option do?
Rahul
-----Original Message----- From: stefan.holzwarth@adac.de [mailto:stefan.holzwarth@adac.de] Sent: Wednesday, December 11, 2002 7:08 PM To: mats.oberg@tietoenator.com; toasters@mathworks.com Subject: AW: file folding impact
We have been using file folding on all volumes for two months (F760, 2 TB, 3000 users, CIFS only). No noticeable impact on CPU usage. No trouble. Snapshot size has decreased a little. Regards, Stefan Holzwarth
-----Original Message----- From: Öberg Mats [mailto:mats.oberg@tietoenator.com] Sent: Wednesday, December 11, 2002 14:20 To: toasters@mathworks.com Subject: file folding impact
Hi, I'm thinking about enabling file folding on one of our filers. Has anybody tried this? If so, what kind of performance decrease/size gain can be expected from enabling it?
---- Mats
Why would a block end up in a snapshot if it is the same as the previous snapshot? Isn't the point of snapshots only to save the 'old version' of blocks that have been changed?
Chuck,
Microsoft designed CIFS, in their infinite wisdom, such that it rewrites the entire file instead of updating changed blocks. Netapp Ontap detects this and updates the changed blocks only. This feature is proprietary to Netapp and is meant to save a lot of snapshot space.
Sorry I can't answer your other questions.
/Brian/
My question about file folding (after reading about it here and on NOW)...
Why would a block end up in a snapshot if it is the same as the previous snapshot? Isn't the point of snapshots only to save the 'old version' of blocks that have been changed?
Snapshots do not copy any blocks, so including the same block in multiple snapshots incurs no overhead. It works like this. Each data block in a volume has an associated bit map of length 32 (20 on older releases). These bits correspond to snapshots. For example, if bit #3 is set, then the block is part of snapshot #3. The filer keeps track of which logical name (e.g. "hourly.0") is associated with each bit. The size of the bit map limits the number of snapshots that can exist in a volume.
Here's what happens when you create a snapshot. The filer picks an unused snapshot number and sets the corresponding bit for all blocks currently in use by the live filesystem. Note that many of these blocks may have had other bits set when other snapshots were created previously. So a block can be a member of multiple snapshots. It may even be a member of all snapshots. This is not unusual at all. If a file is older than your oldest snapshot, then the blocks that comprise the file are members of all snapshots.
When a snapshot is deleted, the bit corresponding to the snapshot is cleared for each block that has the bit set. If a block is left with no bits set and if the block is not part of the live filesystem, then it is freed for reuse.
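The bookkeeping described above can be sketched as a toy model. This is only an illustration of the per-block bit map idea, not filer code: the `Volume` class, its method names, and the block numbers are all made up for the example.

```python
# Toy model of the per-block snapshot bit map described above.
# Every name here is hypothetical; this is not how the filer is implemented.

MAX_SNAPSHOTS = 32  # width of the per-block bit map


class Volume:
    def __init__(self):
        self.snap_bits = {}  # block number -> bit map of snapshots holding it
        self.live = set()    # block numbers used by the live filesystem
        self.names = {}      # bit index -> logical snapshot name

    def write_new_block(self, block):
        """Add a freshly allocated block to the live filesystem."""
        self.live.add(block)
        self.snap_bits.setdefault(block, 0)

    def remove_from_live(self, block):
        """Drop a block from the live filesystem (e.g. its file was deleted)."""
        self.live.discard(block)

    def create_snapshot(self, name):
        """Pick an unused bit and set it on every block currently in use."""
        bit = next((i for i in range(MAX_SNAPSHOTS) if i not in self.names), None)
        if bit is None:
            raise RuntimeError("no free snapshot bits in this volume")
        self.names[bit] = name
        for block in self.live:
            self.snap_bits[block] |= 1 << bit
        return bit

    def delete_snapshot(self, bit):
        """Clear the bit everywhere; free blocks with no bits that are not live."""
        del self.names[bit]
        freed = []
        for block in list(self.snap_bits):
            self.snap_bits[block] &= ~(1 << bit)
            if self.snap_bits[block] == 0 and block not in self.live:
                del self.snap_bits[block]
                freed.append(block)
        return sorted(freed)


vol = Volume()
vol.write_new_block(1)
vol.write_new_block(2)
s0 = vol.create_snapshot("hourly.1")  # blocks 1 and 2 get bit s0
vol.remove_from_live(2)               # block 2 now lives only in the snapshot
s1 = vol.create_snapshot("hourly.0")  # only block 1 gets bit s1
freed = vol.delete_snapshot(s0)       # block 2 has no bits left and is not live
print(freed, bin(vol.snap_bits[1]))
```

Block 1 belongs to both snapshots and the live filesystem, so deleting the older snapshot frees only block 2.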
So long as a block is a member of any snapshot, it cannot be modified, so it cannot be freed for reuse. The only way to get that block back is to delete all the snapshots that it belongs to.
You may wonder how you can possibly modify a file once it has been snapshotted because you can't modify any of its blocks. The filer simply makes changes in new data blocks and links them into the file in place of the snapshotted blocks. The snapshotted blocks are left untouched. However, they are no longer part of the live filesystem.
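A minimal sketch of that idea, assuming a file is just a list of block numbers and that `storage` maps block numbers to data. The function `modify_block` and the allocation scheme (next unused number) are invented for illustration.

```python
# Sketch of "write changes to new blocks" for a snapshotted file.
# The data structures and names are assumptions made for this example.

def modify_block(file_blocks, index, new_data, storage, snapshotted):
    """Return the live file's block list after changing one block."""
    old = file_blocks[index]
    if old in snapshotted:
        # The old block is frozen: allocate a new block and link it in instead.
        new = max(storage) + 1
        storage[new] = new_data
        updated = list(file_blocks)
        updated[index] = new
        return updated
    # Not in any snapshot: safe to overwrite in place.
    storage[old] = new_data
    return file_blocks


storage = {0: b"aaaa", 1: b"bbbb"}
live_file = [0, 1]
snapshot_file = list(live_file)   # the snapshot refers to the same blocks
snapshotted = set(snapshot_file)

live_file = modify_block(live_file, 1, b"BBBB", storage, snapshotted)
print(live_file, snapshot_file, storage)
```

After the change the live file points at block 2, while the snapshot still sees the untouched blocks 0 and 1.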
So when you look at a snapshot of a file, you are not looking at a copy. You are looking at the actual data blocks that comprised the file when the snapshot was made. In fact, the entire volume is treated this way including the directories, inodes, etc. So when you look at a snapshot, you are looking at the actual blocks that comprised the volume at the moment the snapshot was taken. File permissions, owner, group, timestamps, etc., are all frozen in time. (This can be a security issue. If a sensitive file has been left world readable and you change the permissions to protect it, you have to remember that any snapshotted copy of the file still has the wrong permissions, so anyone can still read the file in the snapshot. Your only option is to delete all snapshots where the file is world readable.)
OK, second question, does it only check the most recent snapshot or does it drill through all existing snaps?
Don't know.
Now that I think more on the subject, third question... what happens to the snapshot "n" that references blocks in snap "n+1" when "n+1" rolls off the end and expires?
Don't think about what happens to the snapshot, think about what happens to the blocks. If a block is a member of snapshots n and n+1, then the block has at least two bits set (n and n+1). When n+1 expires, bit n+1 is cleared for all blocks where it is set. Since bit n is still set, the block is not freed, leaving snapshot n still intact.
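In bit terms, the same reasoning looks like the short sketch below. The helper `expire` and the choice of n = 4 are arbitrary, used only to make the arithmetic concrete.

```python
# Worked example: a block held by snapshots n and n+1, then n+1 expires.
# The function name and the value of n are arbitrary choices for illustration.

def expire(bits, snap):
    """Clear the bit for an expired snapshot in a block's bit map."""
    return bits & ~(1 << snap)


n = 4
bits = (1 << n) | (1 << (n + 1))       # block is a member of n and n+1
bits = expire(bits, n + 1)             # snapshot n+1 rolls off
still_in_n = bool(bits & (1 << n))     # bit n is untouched
no_snapshot_holds_it = bits == 0       # only then could the block be freed
print(bin(bits), still_in_n, no_snapshot_holds_it)
```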
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support
-----Original Message----- From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com]On Behalf Of Steve Losen Sent: Thursday, December 12, 2002 1:37 PM To: Chuck Tomasi Cc: 'Mitko Blazeski'; 'Kumar, Rahul'; 'toasters@mathworks.com' Subject: Re: file folding impact
My question about file folding (after reading about it here and on NOW)...
Why would a block end up in a snapshot if it is the same as the previous snapshot? Isn't the point of snapshots only to save the 'old version' of blocks that have been changed?
Snapshots do not copy any blocks, so including the same block in multiple snapshots incurs no overhead. It works like this. Each data block in a volume has an associated bit map of length 32 (20 on older releases). These bits correspond to snapshots. For example, if bit #3 is set, then the block is part of snapshot #3. The filer keeps track of which logical name (e.g. "hourly.0") is associated with each bit. The size of the bit map limits the number of snapshots that can exist in a volume.
[snip...]
Ok, all of that makes sense to me. But I am still unclear on what filefolding does exactly.
This explanation fits in well with how I think, so if someone could present the basic concept of filefolding in a similar manner, I would very much appreciate it. :)
Thanks, Jordan
You had to ask ... :-)
When a client modifies a file, it often does it in a way that replaces the entire file, by writing a whole new file from start to finish. This is particularly true of editors and word processors. Let us suppose that the old file is in a snapshot. When the client writes the new file, it cannot overwrite the original file because its blocks are in a snapshot. So the new file is written to freshly allocated blocks and the old blocks are removed from the live filesystem, but remain in the snapshot of course. If the modification to the file is very small, such as adding a few sentences to the end of a large document, you end up duplicating a lot of data. Up to the point where you added to the end of the document, the blocks in the snapshot and the blocks in the live file contain duplicate data. File folding detects this and "stitches" the old snapshotted blocks back into the live file, and frees the freshly allocated blocks. That way small changes to large snapshotted files don't consume so much disk space.
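To make the "stitching" concrete, here is a rough sketch. It assumes folding only compares blocks at the same offset in the snapshotted file and the rewritten file, which is a guess made for illustration; the block size of 4 bytes and the names `split_blocks` and `fold` are likewise invented.

```python
# Rough sketch of file folding on a fully rewritten file.
# Assumption: only blocks at the same offset are compared.

BLOCK_SIZE = 4  # tiny block size so the example is easy to read


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fold(snapshot_blocks, new_blocks):
    """Decide, per block of the new file, whether to reuse the snapshot block.

    Returns (layout, freed): layout holds ("old", i) where the snapshotted
    block i is stitched back in, or ("new", i) where the fresh block is kept.
    freed counts the freshly allocated blocks that can be released.
    """
    layout = []
    freed = 0
    for i, block in enumerate(new_blocks):
        if i < len(snapshot_blocks) and snapshot_blocks[i] == block:
            layout.append(("old", i))
            freed += 1
        else:
            layout.append(("new", i))
    return layout, freed


old_file = b"abcdefghijkl"
new_file = old_file + b"mnop"  # a small addition at the end of the document

layout, freed = fold(split_blocks(old_file), split_blocks(new_file))
print(layout, freed)
```

Only the appended block stays in fresh storage; the other three freshly written blocks are identical to the snapshot and can be freed.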
I don't know how clever or aggressive file folding is -- the more thoroughly it looks for duplicated blocks, the more space it will recover, but the more CPU it will consume. Because snapshots must be preserved intact, you can only fold two blocks that have identical data. If you add a single byte to the beginning of a text file, you completely throw off the original block boundaries, making folding impossible.
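The effect of shifting the block boundaries can be seen in a small example. It uses 4-byte blocks and a same-offset comparison, both of which are simplifications chosen for the illustration; the helper names are hypothetical.

```python
# Why one byte added at the front defeats folding: every block boundary shifts.
# Block size and helper names are assumptions made for this example.

BLOCK_SIZE = 4


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def foldable_count(old, new):
    """Count blocks that hold identical data at the same offset in both files."""
    pairs = zip(split_blocks(old), split_blocks(new))
    return sum(1 for a, b in pairs if a == b)


old_file = b"abcdefghijkl"
appended = foldable_count(old_file, old_file + b"X")   # byte added at the end
prepended = foldable_count(old_file, b"X" + old_file)  # byte added at the start
print(appended, prepended)
```

With the byte appended, all three original blocks still match; with the byte prepended, none do, even though the files differ by a single byte.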
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support