Maybe also review the dedupe savings you are achieving and consider
disabling it on volumes getting roughly 10% savings or less, since the
metadata required for dedup is something like 7%.
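To review per-volume savings, something like this should work (the
vserver name is a placeholder and the field names are as I remember
them on 8.3, so verify):

    volume show -vserver vs1 -volume * -fields sis-space-saved, sis-space-saved-percent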
You will likely need to get a rough baseline of how long a single dedupe
run takes on a source volume.
Then look at doing multiple volumes and see how much the process slows
down.
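One way to get that baseline (a sketch; the detailed view includes
last-operation begin and end times you can diff):

    volume efficiency show -vserver vs1 -volume vol_a -instance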
With spinning media, the dedupe process can take a while on larger volumes.
You may need to get a little creative: group some of the volumes
together and create schedules for the deduping to happen at specific
times. I really doubt you are going to be able to dedupe 30-40 volumes
totalling ~100TB every hour. You should probably stagger the runs
throughout the day with different schedules; see the sketch below.
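For example, two staggered cron schedules (the names and hours here are
just illustrative):

    job schedule cron create -name eff_group_a -hour 0,6,12,18 -minute 0
    job schedule cron create -name eff_group_b -hour 3,9,15,21 -minute 0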
You can also create custom volume efficiency policies that only run for
a certain duration (the default "-" means no duration limit; otherwise a
whole number of hours, up to 999), and the QoS policy (the one for
volume efficiency operations, not general QoS!) can be set to background
(run, but do not impede client work) or best-effort (may slightly impact
operations).
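Roughly like this on 8.3 (the policy name is made up; check the syntax
on your version):

    volume efficiency policy create -vserver vs1 -policy eff_bg_4h -schedule eff_group_a -duration 4 -qos-policy background
    volume efficiency modify -vserver vs1 -volume vol_a -policy eff_bg_4h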
As far as the timing, you may want to look at scripting with PowerShell
or the NetApp SDK: wait for a volume efficiency operation to finish,
then do snapshots and/or mirroring.
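A minimal sketch with the NetApp PowerShell Toolkit (the cluster,
vserver, and volume names are placeholders, and the cmdlet and parameter
names are from memory, so verify against the toolkit docs):

    # Sketch only: kick off dedupe, poll until the efficiency op goes
    # idle, then update the mirror. All names below are placeholders.
    Import-Module DataONTAP
    Connect-NcController cluster1-mgmt -Credential (Get-Credential)

    Start-NcSis -Name vol_a -VserverContext vs1
    while ((Get-NcSis -Name vol_a -VserverContext vs1).Status -ne 'idle') {
        Start-Sleep -Seconds 60   # poll once a minute
    }
    # DestinationLocation is "vserver:volume"; verify the parameter name.
    Invoke-NcSnapmirrorUpdate -DestinationLocation 'vs1_dr:vol_a'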
We could spend lots of time on this. I hope this helps at least a little.
Thanks to a list member for the NetApp docs pointer and the opportunity
for a little RTFM before I reply to the list!
Looks like my colleague’s recollection of earlier versions still applies
to 8.3. Essentially, we've been snapmirroring un-deduped data.
My next question is, is it realistic to run dedupes hourly on 30-40
volumes totalling ~100TB? Because that's a much easier proposition than
amending our SLAs to lengthen our snapshot cycles.
And to get both on the same cycle, is it possible to make snapshots
dependent on dedupe finishing, or do we just assume dedupe will
complete? And if we assume that and it doesn't, what are the
consequences? For example, if a dedupe that usually finishes at 5
minutes after the hour isn't done, and snapmirror runs then, will
snapmirror be syncing a full hour of full-sized changes?
Last question for now: assuming both are on the same schedule, how do I
get the current lost space back? Will it be reclaimed when the schedules
are synced and the snapshots have rolled off? Or do I need to destroy and
recreate the target volumes?
Hope to hear from you, especially from any other shop running both
snapmirror and dedupe.
Randy
(and if a solution requires DOT9 we do have an upgrade on our roadmap)
Replying to just you for now. Hopefully this will help and you can
report back.
Look at this...
https://library.netapp.com/ecm/ecm_download_file/ECMLP2348026
Page 142.
I think you may need to work out some scheduling and that may help.
Going to look a bit more... I have another idea, but that may only be
ONTAP 9 related.
From: Rue, Randy rrue@fredhutch.org
Sent: Saturday, December 17, 2016 11:14 AM
Subject: snapmirror source and target aggregate usage don't match?
To: toasters@teaparty.net
Hello All,
We run two 8.3 filers with a list of vservers and their associated
volumes, with each volume snapmirrored (volume level) from the active
primary cluster to matching vserver/volumes on the passive secondary.
Both clusters have a similar set of aggregates of just about equal size.
Both clusters’ aggregates contain the same list of volumes of the same
size, with the same space total/used/available on both sets.
But on the target cluster the same aggregates are reporting 30% more used
space.
This is about on par with the dedupe savings we’re getting on the
primary, so when I first noticed this my thought was to check that
dedupe was OK on the target. But the webUI reports that no “storage
efficiency” is available on a replication target, and I ended up
thinking this meant that the secondary data would have to be full-sized.
I even recall asking someone and having this confirmed, but can’t recall
if that came from the vendor SE or our VAR SE or a support tech.
Now we’re approaching the space limit of the secondary cluster and I’m
looking deeper. At this point, as it appears that for each volume the
total/used/free space matches after dedupe on the source, I’m thinking that
dedupe properties aren’t exposed on the target but the data is still a true
copy of the deduped original. This is supported by the fact that dedupe
stats are visible on the target via the CLI and show the same savings as
on the source.
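For example (vserver/volume names here are generic):

    volume efficiency show -vserver vs_dr -volume vol_a
    volume show -vserver vs_dr -volume vol_a -fields sis-space-saved-percent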
Note that we’re also snapshotting these volumes, and while we’re deduping
daily, we’re snapshotting hourly. A colleague mentioned remembering that
this could mean mirrored data that’s not deduped yet is being replicated
full-size. But if so, wouldn’t this be reflected in the dedupe stats on the
target?
OK, just found that “storage aggregate show -fields
usedsize,physical-used” on the primary/source cluster shows that used and
physical-used are about identical for all aggrs. On the secondary/target,
used is consistently larger than physical-used and the total difference
makes up the 30% I’m “missing.”
Is this a problem with my reporting? Are we actually OK and I need to look
at physical-used instead of used? Or if we’re not OK, where is the space
being used and can I get it back?
Thanks in advance for your guidance…
Randy