We wanted to use SnapVault to protect a volume containing 70+ million files (probably also around 30TB of data, though it de-dupes down to less than 6TB). However, it appears that with SnapVault a full file scan is performed prior to the block-based replication, and that scan can take around 24 hours. I'm assuming it will do this on subsequent differential vaults as well; though the block-transfer part should be much shorter, we'll still need to wait for the file scan to complete.
As we'd like to "back up" this data at least once a day, would we be better positioned by using SnapMirror? My belief is that it does *not* scan all of the files first and simply replicates changed blocks.
We'd need to keep more snapshots on the source storage to meet our retention requirements (or maybe further replicate the volume on the destination side?).
Thanks, Ray
In 7-Mode, SnapVault logically goes through the whole filesystem to find changed blocks -- dedupe, for example, is not 'seen' at all by SnapVault. SnapMirror, on the other hand, just looks at the blocks and doesn't care how big or small the filesystem is.
In High File Count (HFC) situations (or with highly dedupeable data) I always advise using SnapMirror, if at all possible.
It transfers the data deduped (and compressed, if the source is compressed) and can also compress on the wire (don't do this if your source data is already compressed...).
And yes, you could continue the replication chain with SnapVault (e.g. locally on the secondary, keeping only weeklies but reaching further back). This could offset the extra storage you might need on the primary (e.g. if you don't do weeklies on the primary at all at the moment).
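For illustration (filer and volume names invented; check the exact syntax against your release's Data Protection Guide), a daily VSM schedule goes in the destination filer's /etc/snapmirror.conf, and 7-Mode wants a named connection entry before it will accept compression=enable:

    # /etc/snapmirror.conf on the destination
    # fields: source  destination  arguments  schedule (minute hour day-of-month day-of-week)
    conn_data=multi(fas-src-e0a,fas-dst-e0a)
    conn_data:vol_data  fas-dst:vol_data_sm  compression=enable  0 23 * *

    # and on the primary, keep more local snapshots to cover retention
    # (snap sched <vol> <weekly> <nightly> <hourly>)
    fas-src> snap sched vol_data 8 14 0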
My 2c
Sebastian
I used to do ~10M files (10TB volume) for a roaming profile repository via SnapMirror about 5 years ago. It worked like a charm, and that was 7-Mode. We would SnapMirror to another filer, and then do an NDMP dump of the secondary volume to a VTL.
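From memory, the setup on the destination filer was roughly this (names invented, so verify the syntax on your version):

    fas-dst> vol create vol_profiles_sm aggr1 12t
    fas-dst> vol restrict vol_profiles_sm      # VSM destination must be restricted before init
    fas-dst> snapmirror initialize -S fas-src:vol_profiles fas-dst:vol_profiles_sm
    fas-dst> snapmirror status                 # watch the baseline, then schedule updates in snapmirror.conf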
On 8/22/16, 9:59 AM, "toasters-bounces@teaparty.net on behalf of Sebastian Goetze" <toasters-bounces@teaparty.net on behalf of spgoetze@gmail.com> wrote:
In 7-Mode, SnapVault logically goes through the whole filesystem to find changed blocks. E.g. dedupe is not 'seen' at all by SnapVault. SnapMirror on the other hand just looks at the blocks and doesn't care how big or small the filesystem is.
In High File Count (HFC) situations (or highly dedupeable data) I always advise to use SnapMirror, if at all possible.
It transfers the data deduped (and compressed, if the source is compressed) and can also compress on the wire (don't do this, if your source data is already compressed...).
And yes, you could continue the replication sequence with SnapVault (e.g local on the secondary, e.g. only weekly's but further back). This could offset the possibly more storage you might need on the primary (e.g. if you don't yet do weekly's at all on the primary at the moment).
My 2c
Sebastian
On 8/22/2016 6:35 PM, Ray Van Dolson wrote: > We wanted to use SnapVault to protect a volume containing 70+ million > files (probably also has around 30TB of data, though it de-dupes down > to less than 6TB). However, it appears that with SnapVault a full file > scan is preformed prior to the block-based replication, and that scan > can take around 24 hours. I'm assuming it will do this on subsequent > differential vaults, though the block transfer part should be much > shorter we'll still need to wait for the file scan to complete. > > As we'd like to "back up" this data at least once a day, would we be > better positioned by using SnapMirror? My belief is that it does *not* > scan all of the files first and simply replicates changed blocks. > > We'd need to keep more snapshots on the source storage to meet our > retention requirements (or maybe further replicate the volume on the > destination side?). > > Thanks, > Ray > _______________________________________________ > Toasters mailing list > Toasters@teaparty.net > http://www.teaparty.net/mailman/listinfo/toasters
_______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
I'm sure 10M inodes in 10TB worked AOK for you, Steve. It did for us as well, no problem. But multiply that by 10x, to 100M inodes, and *then* it's a problem. Then several volumes with >100M inodes, and it starts to hurt. A lot. I think a FlexVol (qtree or whatever) with 10M inodes in it isn't very much, and wasn't even 5 years ago.
Luckily we're rid of all this file-tree-walk c**p in cDOT.
All the bg scanners still in ONTAP that walk file trees, like the redirect scanner that implicitly runs after you've done an aggregate reallocate, are just a PITA... *sigh* They literally can never finish. I've tried, over a long period of time, and it's just "#(¤%/%(¤.
I think some of these bg scanners are probably still in Kahuna as well, but I don't know for sure. Redirect scanning was moved to Vol Affinity in wafl_exempt though, in 8.3.1 (I think it was, correct me if I'm wrong!).
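For anyone who wants to watch the pain, this is roughly how you kick it off and then poke at it (advanced privilege needed for the scanner view; from memory, so treat it as a sketch):

    fas> reallocate start -A aggr0     # aggregate-level reallocate; redirect scanner runs afterwards
    fas> priv set advanced
    fas*> wafl scan status             # lists the WAFL background scanners grinding away per volume
    fas*> priv set admin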
Regards, M
On 8/22/16, 9:59 AM, "toasters-bounces@teaparty.net on behalf of Sebastian Goetze"<toasters-bounces@teaparty.net on behalf of spgoetze@gmail.com> wrote:
In 7-Mode, SnapVault logically goes through the whole filesystem to find changed blocks. E.g. dedupe is not 'seen' at all by SnapVault. SnapMirror on the other hand just looks at the blocks and doesn't care how big or small the filesystem is.
In High File Count (HFC) situations (or highly dedupeable data) I always advise to use SnapMirror, if at all possible.
It transfers the data deduped (and compressed, if the source is compressed) and can also compress on the wire (don't do this, if your source data is already compressed...).
And yes, you could continue the replication sequence with SnapVault (e.g local on the secondary, e.g. only weekly's but further back). This could offset the possibly more storage you might need on the primary (e.g. if you don't yet do weekly's at all on the primary at the moment).
My 2c
Sebastian
VSM will be much faster.
Just to be clear... when you say
"In High File Count (HFC) situations (or with highly dedupeable data) I always advise using SnapMirror, if at all possible."
That is referring to Volume SnapMirror (VSM) in 7-Mode. Qtree SnapMirror (QSM) will have all of the same issues as SnapVault.
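To make the difference concrete, here's how the two look in /etc/snapmirror.conf (paths invented): VSM takes bare volume names, QSM takes /vol/... qtree paths, and only the volume form is block-based:

    # VSM -- whole volume, block-level, dedupe preserved
    fas-a:vol_users            fas-b:vol_users_m            -  0 23 * *
    # QSM -- qtree path, file/inode-level, walks the tree much like SnapVault
    fas-a:/vol/vol_users/q1    fas-b:/vol/vol_users_m/q1    -  0 23 * *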
--rdp
Absolutely, yes! I should have written VSM.
(Does anybody use QSM?)
Sebastian
Not us!
Sounds like we'll plan to shift to VSM instead of SnapVault.
(That, and think about migrating to cDOT -- challenging for us, as we have a lot of 7-Mode N-Series hardware.)
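(My understanding is that on cDOT the vault function runs on the SnapMirror engine as a block-based XDP relationship, so the daily "backup" would look something like this -- names invented and syntax from memory:)

    cluster2::> snapmirror create -source-path svm1:vol_data -destination-path svm2:vol_data_sv -type XDP -policy XDPDefault -schedule daily
    cluster2::> snapmirror initialize -destination-path svm2:vol_data_sv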
Thanks for everyone's responses.
Ray
"Does anybody use QSM?"
Yes, we use a lot of it. Pre-8.3 cDOT, VSM mandated that the destination always be the same or a greater ONTAP version, and we have lots of one-to-many relationships that cross the world... in some cases a filer may be the source of one relationship and the destination of others.
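As a rough sketch of the fan-out (names invented): each destination filer carries its own entry for the same source qtree in its /etc/snapmirror.conf, and QSM, being logical replication, doesn't mind the destinations running different ONTAP versions:

    # /etc/snapmirror.conf on fas-emea
    fas-us:/vol/vol_proj/q_eng   fas-emea:/vol/vol_proj_m/q_eng   -  0 2 * *
    # /etc/snapmirror.conf on fas-apac
    fas-us:/vol/vol_proj/q_eng   fas-apac:/vol/vol_proj_m/q_eng   -  0 4 * *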
--rdp