Hey all,
I've got a cluster running cluster-mode 8.0.5 (don't laugh) that has an aggregate reporting a much higher used size than I can account for based on the volumes it contains.
According to 'aggr show' and 'df -A', the aggregate has around 17T of space consumed.
bc-gx-4b::> aggr show -aggregate gx4b_1 -fields size, usedsize, availsize
aggregate size    usedsize availsize
--------- ------- -------- ---------
gx4b_1    18.40TB 17.08TB  1.32TB
Per my database, though - a tally of all the volumes contained on this aggregate - the total space consumed by the volumes is only about 12.5T, so a significant amount is being soaked up by something.
I get the same numbers from the command line as well:
ssh admin@bc-gx-4b "set -units MB; vol show -aggregate gx4b_1 -fields used" | egrep "^bc" | awk '{print $3}' | sed 's/[^0-9]*//g' | paste -sd+ | bc
12528994
so the sum of the volumes is about 12.5T, but the aggregate thinks there is 17T used.
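As an aside, the stripping and summing can be folded into the awk step; this is just a sketch built on the same assumptions as the pipeline above (MB units, volume rows starting with "bc", used size in the third column):

# Sum the 'used' column (in MB) for every volume on the aggregate in a single awk pass.
ssh admin@bc-gx-4b "set -units MB; vol show -aggregate gx4b_1 -fields used" \
  | awk '/^bc/ { gsub(/[^0-9]/, "", $3); total += $3 } END { printf "%d MB\n", total }'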
it's been in this state for some time. There haven't been any volumes recently moved off or deleted, so there isn't any space being recovered in the background.
'vol show -state offline' and 'set diag; vol lost-found show' aren't reporting anything.
Any ideas on how I might figure out what is sucking up the unreported space?
Do you use deduplication by any chance?
---
With best regards,
Andrei Borzenkov, Senior system engineer, FUJITSU
From: Mike Thompson - Sent: Wednesday, December 09, 2015 11:07 AM - To: toasters@teaparty.net - Subject: cdot missing disk space
Thanks Andrei - sorry, I forgot to add this to my previous response in the thread: no dedupe in use.
On Wed, Dec 9, 2015 at 1:26 AM, andrei.borzenkov@ts.fujitsu.com wrote:
Do you use deduplication by any chance?
Were these set up with snap protect? Check the auto grow settings.
Nope, no snap protect, and autosize is off on all volumes.
If I move any volume off to another aggregate on another filer, the allocated space ends up matching the used space (well below this 12GB minimum for these nearly empty volumes).
If I move an empty volume from another filer to this aggregate, the allocated space swells up to around this 12GB number, so it seems to be something at the filer or aggregate level.
On Wed, Dec 9, 2015 at 11:52 AM, Tim McCarthy tmacmd@gmail.com wrote:
Were these set up with snap protect?
Check the auto grow settings.
Can you post the details of one of these volumes? And of the aggregate you have them in? It smells like there's some sort of minimum volume size setting somewhere.
Or maybe there's an aggregate level snapshot sitting around?
Can you upgrade? You're in cluster mode, so hopefully it shouldn't be too hard to move to 8.1, then 8.2, and on to 8.3, since there are lots of nice bug fixes.
sample vol and aggr details below
Can't upgrade - we're not on support, and we have about 1PB in production across two clusters. I am seeing this effect on both clusters, but not on all filers, which is strange.
If I move vols between affected and unaffected filers/aggrs, the allocated vs. used figures normalize for the volumes on the unaffected node, then re-inflate when the volumes are moved back to the original node/aggr.
                               Volume    Allocated    Used
vol created on problem aggr    v164402   11932348KB   1924KB
after move to unaffected aggr  v164402   2872KB       2872KB
after move back to orig aggr   v164402   12005552KB   75284KB

resized to 1TB, moved to another aggr, and back to orig aggr:
                               v164402   6003080KB    37972KB
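For anyone wanting to reproduce the comparison, a rough sketch of the steps. The per-volume Allocated/Used figures are the 'aggr show_space' numbers referenced further down, reached here via the nodeshell, and the 'volume move' option names are an assumption from later cDOT releases, so check the syntax on 8.0.x first:

# Capture per-volume Allocated/Used/Guarantee on the source aggregate (nodeshell passthrough):
system node run -node bc-gx-4b -command "aggr show_space gx4b_1"

# Move the test volume to an unaffected aggregate (and later back), re-running show_space after each step.
# Option names below may differ on 8.0.x - verify with "volume move ?" before using.
volume move start -vserver bc -volume v164402 -destination-aggregate gx1a_1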
it appears to be related to the size of the volume. we thin-provision all volumes by setting them to a large size and setting the space guarantee to none.
vols sized at 2TB end up being allocated 12GB of space when created. vols sized to 1TB end up with 6GB allocated when created, even though they are completely empty.
what's strange is this is happening on some filers/aggregates but not others.
these clusters originated running ONTAP GX, and were upgraded to ONTAP 8 c-mode some time ago.
the aggregates experiencing this allocation 'inflation' seem to be those that were created after the cluster was upgraded to ONTAP 8.
the aggregates that were originally created as GX aggrs have identical 'allocated' and 'used' values in 'aggr show_space'.
for example:
recently built aggregate under Ontap8:
Aggregate                 Allocated         Used              Avail
Total space               18526684036KB     13662472732KB     1233809368KB   <- massive difference!
aggregate originated under Ontap GX:
Aggregate                 Allocated         Used              Avail
Total space               32035928920KB     32035928920KB     1404570796KB   <- identical, which is what i would expect
maybe this allocation scheme changed at an aggregate level under Ontap 8? perhaps it's expected behavior?
things do seem to normalize as the volumes begin to fill up, though, so I believe this space is not permanently gone. But it certainly appears to be unavailable for use, since tons of space is allocated to volumes that hold very little data, and we have a LOT of volumes in these clusters.
it definitely makes it look like we are missing a substantial percentage of our disk space when trying to reconcile the tallied volume usage against the aggregate used/remaining sizes.
example volume and affected aggregate details:
bc-gx-4b::*> vol show v164346 -instance (volume show)
Virtual Server Name: bc
Volume Name: v164346
Aggregate Name: gx4b_1
Volume Size: 2TB
Name Ordinal: base
Volume Data Set ID: 4041125
Volume Master Data Set ID: 2151509138
Volume State: online
Volume Type: RW
Volume Style: flex
Volume Ownership: cluster
Export Policy: default
User ID: jobsys
Group ID: cgi
Security Style: unix
Unix Permissions: ---rwxr-x--x
Junction Path: /bc/shows/ID2/DFO/0820
Junction Path Source: RW_volume
Junction Active: true
Parent Volume: v164232
Virtual Server Root Volume: false
Comment:
Available Size: 1.15TB
Total Size: 2TB
Used Size: 146.7MB
Used Percentage: 42%
Autosize Enabled (for flexvols only): false
Maximum Autosize (for flexvols only): 2.40TB
Autosize Increment (for flexvols only): 102.4GB
Total Files (for user-visible data): 31876689
Files Used (for user-visible data): 132
Maximum Directory Size: 100MB
Space Guarantee Style: none
Space Guarantee In Effect: true
Minimum Read Ahead: false
Access Time Update Enabled: true
Snapshot Directory Access Enabled: true
Percent of Space Reserved for Snapshots: 0%
Used Percent of Snapshot Reserve: 0%
Snapshot Policy: daily
Creation Time: Tue Dec 08 11:22:58 2015
Language: C
Striped Data Volume Count: -
Striped Data Volume Stripe Width: 0.00B
Current Striping Epoch: -
One data-volume per member aggregate: -
Concurrency Level: -
Optimization Policy: -
Clone Volume: false
Anti-Virus On-Access Policy: default
UUID of the volume: 17fa4c6d-9de1-11e5-a888-123478563412
Striped Volume Format: -
Load Sharing Source Volume: -
Move Target Volume: false
Maximum Write Alloc Blocks: 0
Inconsistency in the file system: false
bc-gx-4b::*> aggr show -aggregate
    far4a_1   gx1a_1    gx1a_2    gx1b_1    gx2a_1    gx2b_1    gx3a_1    gx4b_1    near1b_1  near3b_1
    root_1a   root_1b   root_2a   root_2b   root_3a   root_3b   root_4a   root_4b   slow2a_1  systems

bc-gx-4b::*> aggr show -aggregate gx4b_1
Aggregate: gx4b_1
UUID: c624f85e-96d3-11e3-a6ce-00a0980bb25a
Size: 18.40TB
Used Size: 17.25TB
Used Percentage: 94%
Available Size: 1.15TB
State: online
Nodes: bc-gx-4b
Number Of Disks: 63
Disks: bc-gx-4b:0a.64, bc-gx-4b:0e.80, ... bc-gx-4b:0a.45
Number Of Volumes: 411
Plexes: /gx4b_1/plex0(online)
RAID Groups: /gx4b_1/plex0/rg0, /gx4b_1/plex0/rg1, /gx4b_1/plex0/rg2
Raid Type: raid_dp
Max RAID Size: 21
RAID Status: raid_dp
Checksum Enabled: true
Checksum Status: active
Checksum Style: block
Inconsistent: false
Ignore Inconsistent: off
Block Checksum Protection: on
Zoned Checksum Protection: -
Automatic Snapshot Deletion: on
Enable Thorough Scrub: off
Volume Style: flex
Volume Types: flex
Has Mroot Volume: false
Has Partner Node Mroot Volume: false
Is root: false
Wafliron Status: -
Percent Blocks Scanned: -
Last Start Error Number: -
Last Start Error Info: -
Aggregate Type: aggr
Number of Quiesced Volumes: -
Number of Volumes not Online: -
Number of LS Mirror Destination Volumes: -
Number of DP Mirror Destination Volumes: -
Number of Move Mirror Destination Volumes: -
Number of DP qtree Mirror Destination Volumes: -
HA Policy: sfo
Block Type: 64-bit
On Wed, Dec 9, 2015 at 12:50 PM, John Stoffel john@stoffel.org wrote:
Can you post the details of one of these volumes? And of the aggregate you have them in? It smells like there's some sort of minimum volume size setting somewhere.
Or maybe there's an aggregate level snapshot sitting around?
Can you upgrade? You're in cluster mode, so hopefully it shoudln't be too hard to move to 8.1, then 8.2 and onto 8.3, since there's lots of nice bug fixes.
I didn't do the math, but just an idea:
ONTAP calculates an average file size of 32K to determine the number of inodes per volume. Every inode takes 192 Bytes from the INODEFILE. So, if you take 2TB / 32KB * 192B = 12GB... (actually I *did* do the math right now, I was curious...)
It seems this is enforced on *newer* aggregates... According to KB https://kb.netapp.com/support/index?page=content&id=3011432 you can't go below the VolSize/32K limit.
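Spelling that arithmetic out for the two volume sizes in this thread (plain shell arithmetic, nothing ONTAP-specific - just a sanity check of the 32K/192-byte assumption above):

# Expected inode-file overhead = (volume size / 32KiB assumed average file size) * 192 bytes per inode
for size_tib in 1 2; do
    bytes=$(( size_tib * 1024 * 1024 * 1024 * 1024 ))
    inodes=$(( bytes / (32 * 1024) ))
    overhead_gib=$(( inodes * 192 / 1024 / 1024 / 1024 ))
    echo "${size_tib}TiB volume -> ${inodes} inodes -> ~${overhead_gib} GiB of inode file"
done
# Prints ~6 GiB for 1TiB and ~12 GiB for 2TiB, which roughly matches the
# 6003080KB and 11932348KB allocations reported for the empty volumes above.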
How about provisioning the volumes small (and thin) and letting them *AutoGrow* to 2TB? That way, they're not using up space (not even for inodes), yet are able to contain up to 2TB...
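For example, something along these lines - the volume name is made up, and the autosize option names are from memory of later releases, so verify them with 'volume modify ?' on 8.0.x before relying on this:

# Create the volume small and thin (v_example is a placeholder name):
volume create -vserver bc -volume v_example -aggregate gx4b_1 -size 100GB -space-guarantee none
# Let autosize grow it toward the 2TB it would otherwise have been created at:
volume modify -vserver bc -volume v_example -autosize true -max-autosize 2TB -autosize-increment 100GB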
Just my 2c
Sebastian
Thanks Sebastian
I figured it was something like that as well - thanks for the link, it makes sense.
I'll have to experiment with the autogrow stuff - I recall we were steered away from it by NetApp themselves, saying it wasn't all that stable in ONTAP 8.0, but we never did fool with it. I will give it a shot though.
Thanks everyone
I used autogrow on GX; it was one feature that was rock-solid. What version of GX are you on? GX itself was quite solid with the last version released. Are you the last person still using it :)? I thought I'd have been!
Just caught that "originated" bit - C-mode is just a newer version of GX, so I'd assume the autogrow should work just fine.
On Thu, Dec 10, 2015 at 3:37 PM, Douglas Siggins siggins@gmail.com wrote:
I used autogrow on GX, it was one feature that was rock-solid. What version of GX are you on? GX itself was quite solid with the last version released. Are you the last person still using it :) ? I thought i'd have been!
Indeed. On 7.3.7P1, immediately after creating a 1TB volume with space guarantee "none", the volume consumes approximately 6GB of allocated space:
Volume   Allocated   Used    Guarantee
t        5965988KB   772KB   none
which more or less matches the inofile size, although the space actually consumed by it is essentially zero:
16K ---------- 1 root root 6.1G Dec 11 10:11 inofile
I was under the impression that the inofile grows as needed, but that probably changed at some point.
Thank you!
---
With best regards,
Andrei Borzenkov
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Sebastian Goetze Sent: Thursday, December 10, 2015 7:14 PM To: Mike Thompson; John Stoffel Cc: toasters@teaparty.net Lists Subject: Re: cdot missing disk space
I didn't do the math, but just an idea:
ONTAP calculates an average file size of 32K to determine the number of inodes per volume. Every inode takes 192 Bytes from the INODEFILE. So, if you take 2TB / 32KB * 192B = 12GB... (actually I *did* do the math right now, I was curious...)
It seems this is enforced on *newer* aggregates... According to KB https://kb.netapp.com/support/index?page=content&id=3011432 you can't go below the VolSize/32K limit.
How about provisioning the volumes small (and thin) and letting them *AutoGrow* to 2TB? That way, they're not using up space (not even for inodes), yet are able to contain up to 2TB...
Just my 2c
Sebastian
On 12/10/2015 8:43 AM, Mike Thompson wrote: sample vol and aggr details below can't upgrade, not on support, we have about 1PB in production across two clusters. I am seeing this effect on both clusters, but not on all filers, which is strange. if I move vols between affected and unaffected filers/aggrs, the allocated vs used normalizes for the volumes on the unaffected node, then re-inflate when moved back to the original node/aggr
Volume Allocated Used vol created on problem aggr v164402 11932348KB 1924KB after move to unaffected aggr v164402 2872KB 2872KB after move back to orig aggr v164402 12005552KB 75284KB
resized to 1TB, moved to another aggr, and back to orig aggr
v164402 6003080KB 37972KB it appears to be related to the size of the volume. we thin-provision all volumes by setting them to a large size and set space guarantee to none.
vols sized at 2TB end up being allocated 12GB of space when created. vols sized to 1TB end up with 6GB allocated when created, even though they are completely empty. what's strange is this is happening on some filers/aggregates but not others.
these clusters originated running Ontap GX, and then were upgraded to Ontap8 c-mode some time ago.
the aggregates that seem to be experiencing this allocation 'inflation' seem to be those that were created after the cluster was upgraded to Ontap8. the aggregates that were originally created as GX aggrs, have identically matching 'allocated' and 'used' values in 'aggr show_space' for example: recently built aggregate under Ontap8:
Aggregate Allocated Used Avail Total space 18526684036KB 13662472732KB 1233809368KB <- massive difference! aggregate originated under Ontap GX:
Aggregate Allocated Used Avail Total space 32035928920KB 32035928920KB 1404570796KB <- identical, which is what i would expect maybe this allocation scheme changed at an aggregate level under Ontap 8? perhaps it's expected behavior? things do seem to normalize as the volumes begin to fill up though, so I believe that this space is not truly gone permanently, but it certainly appears to be not available for use, since tons of space is allocated to volumes that have very little data in them, and we have a LOT of volumes in these clusters.
it's definitely making it look like we are missing a substantial percentage of our disk space, when trying to reconcile the sum of data used when tallying up volume size, and comparing it to the aggregate used/remaining sizes. example volume and affected aggregate details:
bc-gx-4b::*> vol show v164346 -instance (volume show)
Virtual Server Name: bc Volume Name: v164346 Aggregate Name: gx4b_1 Volume Size: 2TB Name Ordinal: base Volume Data Set ID: 4041125 Volume Master Data Set ID: 2151509138 Volume State: online Volume Type: RW Volume Style: flex Volume Ownership: cluster Export Policy: default User ID: jobsys Group ID: cgi Security Style: unix Unix Permissions: ---rwxr-x--x Junction Path: /bc/shows/ID2/DFO/0820 Junction Path Source: RW_volume Junction Active: true Parent Volume: v164232 Virtual Server Root Volume: false Comment: Available Size: 1.15TB Total Size: 2TB Used Size: 146.7MB Used Percentage: 42% Autosize Enabled (for flexvols only): false Maximum Autosize (for flexvols only): 2.40TB Autosize Increment (for flexvols only): 102.4GB Total Files (for user-visible data): 31876689 Files Used (for user-visible data): 132 Maximum Directory Size: 100MB Space Guarantee Style: none Space Guarantee In Effect: true Minimum Read Ahead: false Access Time Update Enabled: true Snapshot Directory Access Enabled: true Percent of Space Reserved for Snapshots: 0% Used Percent of Snapshot Reserve: 0% Snapshot Policy: daily Creation Time: Tue Dec 08 11:22:58 2015 Language: C Striped Data Volume Count: - Striped Data Volume Stripe Width: 0.00B Current Striping Epoch: - One data-volume per member aggregate: - Concurrency Level: - Optimization Policy: - Clone Volume: false Anti-Virus On-Access Policy: default UUID of the volume: 17fa4c6d-9de1-11e5-a888-123478563412 Striped Volume Format: - Load Sharing Source Volume: - Move Target Volume: false Maximum Write Alloc Blocks: 0 Inconsistency in the file system: false
bc-gx-4b::*> aggr show -aggregate far4a_1 gx1a_1 gx1a_2 gx1b_1 gx2a_1 gx2b_1 gx3a_1 gx4b_1 near1b_1 near3b_1 root_1a root_1b root_2a root_2b root_3a root_3b root_4a root_4b slow2a_1 systems bc-gx-4b::*> aggr show -aggregate gx4b_1
Aggregate: gx4b_1 UUID: c624f85e-96d3-11e3-a6ce-00a0980bb25a Size: 18.40TB Used Size: 17.25TB Used Percentage: 94% Available Size: 1.15TB State: online Nodes: bc-gx-4b Number Of Disks: 63 Disks: bc-gx-4b:0a.64, bc-gx-4b:0e.80, ... bc-gx-4b:0a.45 Number Of Volumes: 411 Plexes: /gx4b_1/plex0(online) RAID Groups: /gx4b_1/plex0/rg0, /gx4b_1/plex0/rg1, /gx4b_1/plex0/rg2 Raid Type: raid_dp Max RAID Size: 21 RAID Status: raid_dp Checksum Enabled: true Checksum Status: active Checksum Style: block Inconsistent: false Ignore Inconsistent: off Block Checksum Protection: on Zoned Checksum Protection: - Automatic Snapshot Deletion: on Enable Thorough Scrub: off Volume Style: flex Volume Types: flex Has Mroot Volume: false Has Partner Node Mroot Volume: false Is root: false Wafliron Status: - Percent Blocks Scanned: - Last Start Error Number: - Last Start Error Info: - Aggregate Type: aggr Number of Quiesced Volumes: - Number of Volumes not Online: - Number of LS Mirror Destination Volumes: - Number of DP Mirror Destination Volumes: - Number of Move Mirror Destination Volumes: - Number of DP qtree Mirror Destination Volumes: - HA Policy: sfo Block Type: 64-bit
On Wed, Dec 9, 2015 at 12:50 PM, John Stoffel <john@stoffel.org> wrote:
Can you post the details of one of these volumes? And of the aggregate you have them in? It smells like there's some sort of minimum volume size setting somewhere.
Or maybe there's an aggregate level snapshot sitting around?
Can you upgrade? You're in cluster mode, so hopefully it shouldn't be too hard to move to 8.1, then 8.2 and on to 8.3, since there are lots of nice bug fixes.
I have a feeling that the inode FILE has a space guarantee... Therefore, even with a volume guarantee of NONE, the space for the inodefile, being guaranteed, is taken from the aggregate up front, just to make sure the metadata for the assumed maximum number of files in the volume can always be stored...
Sebastian
On Fri, Dec 11, 2015, 08:24 andrei.borzenkov@ts.fujitsu.com < andrei.borzenkov@ts.fujitsu.com> wrote:
Indeed. On 7.3.7P1, immediately after creating a 1TB volume with space guarantee “none”, this volume consumes approximately 6GB of allocated space:
Volume Allocated Used Guarantee
t 5965988KB 772KB none
which more or less matches inodefile size, although the space actually consumed by it is zero:
16K ---------- 1 root root 6.1G Dec 11 10:11 inofile
I was under the impression that the inofile grows as needed, but probably that changed at some point.
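As an aside, the apparent-size vs. allocated-blocks difference is the classic sparse-file pattern; you can reproduce the same effect with any sparse file on a Unix host (illustration only, nothing ONTAP-specific here):

truncate -s 6G sparsefile   # create a file with a 6GiB apparent size and no data blocks
ls -lh sparsefile           # shows 6.0G, the apparent size
du -h sparsefile            # shows 0, the blocks actually allocated
ls -ls sparsefile           # first column is allocated blocks, like the 16K in the listing above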
Thank you!
With best regards,
Andrei Borzenkov
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Sebastian Goetze
Sent: Thursday, December 10, 2015 7:14 PM
To: Mike Thompson; John Stoffel
Cc: toasters@teaparty.net Lists
Subject: Re: cdot missing disk space
I didn't do the math, but just an idea:
ONTAP calculates an average file size of 32K to determine the number of inodes per volume. Every inode takes 192 Bytes from the INODEFILE. So, if you take 2TB / 32KB * 192B = 12GB... (actually I *did* do the math right now, I was curious...)
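Plugging in the numbers as a sanity check (back-of-the-envelope arithmetic only, using the 32K average file size and 192-byte inode entry quoted above):

# expected inode-file reservation = (volume size / 32KiB per file) * 192 bytes per inode
echo $(( 2 * 1024**4 / (32 * 1024) * 192 / 1024**3 ))   # 2TB volume -> 12 (GiB)
echo $(( 1 * 1024**4 / (32 * 1024) * 192 / 1024**3 ))   # 1TB volume -> 6 (GiB)

which lines up nicely with the ~12GB and ~6GB allocations Mike is seeing on freshly created volumes.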
It seems this is enforced on *newer* aggregates... According to KB https://kb.netapp.com/support/index?page=content&id=3011432 you can't go below the VolSize/32K limit.
How about provisioning the volumes small (and thin) and letting them *AutoGrow* to 2TB? That way, they're not using up space (not even for inodes), yet are able to contain up to 2TB...
Just my 2c
Sebastian
On 12/10/2015 8:43 AM, Mike Thompson wrote:
sample vol and aggr details below
can't upgrade, not on support, we have about 1PB in production across two clusters. I am seeing this effect on both clusters, but not on all filers, which is strange.
if I move vols between affected and unaffected filers/aggrs, the allocated vs used normalizes for the volumes on the unaffected node, then re-inflates when they are moved back to the original node/aggr:

Volume      Allocated      Used
v164402     11932348KB     1924KB      <- vol created on problem aggr
v164402     2872KB         2872KB      <- after move to unaffected aggr
v164402     12005552KB     75284KB     <- after move back to orig aggr
resized to 1TB, moved to another aggr, and back to orig aggr
v164402 6003080KB 37972KB
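(For reference, the moves above would be something along the lines of the following on cDOT; treat it as a sketch, since the exact volume move syntax and option names may differ on 8.0.x, and the destination aggregate here is just one picked from the list earlier in the thread:)

bc-gx-4b::> volume move start -vserver bc -volume v164402 -destination-aggregate gx1a_1
bc-gx-4b::> volume move show -vserver bc -volume v164402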
it appears to be related to the size of the volume. we thin-provision all volumes by setting them to a large size and set space guarantee to none.
vols sized at 2TB end up being allocated 12GB of space when created. vols sized to 1TB end up with 6GB allocated when created, even though they are completely empty.
what's strange is this is happening on some filers/aggregates but not others.
these clusters originated running Ontap GX, and then were upgraded to Ontap8 c-mode some time ago.
the aggregates experiencing this allocation 'inflation' seem to be those that were created after the cluster was upgraded to Ontap8.
the aggregates that were originally created as GX aggrs have identically matching 'allocated' and 'used' values in 'aggr show_space'
for example:
recently built aggregate under Ontap8:
Aggregate                       Allocated            Used                 Avail
Total space                     18526684036KB        13662472732KB        1233809368KB   <- massive difference!

aggregate originated under Ontap GX:

Aggregate                       Allocated            Used                 Avail
Total space                     32035928920KB        32035928920KB        1404570796KB   <- identical, which is what I would expect
Not sure what the cdot equivalent would be, but you could check "aggr show_space" from the nodeshell.
Thanks Rob!
'node run local aggr show_space' seems to have pointed me to where the space is being consumed.
It would appear that every volume on this node/aggregate is being allocated a minimum of 12GB of space, regardless of how much is actually being used - and there are around 450 volumes on this node, so that adds up quickly to several TB.
Volume      Allocated        Used             Guarantee
v155739     12114904KB       375536KB         none
v152384     12116928KB       442628KB         none
v151867     12119996KB       458980KB         none
v3943       13931776KB       2349252KB        none
v160916     113845300KB      102425476KB      none
v160922     6106299552KB     6079321492KB     none
v164234     12080808KB       152172KB         none
v164239     12080980KB       152332KB         none
v164244     12080680KB       152044KB         none
v164249     12080872KB       152268KB         none
v164254     12080860KB       152228KB         none
v164259     12080876KB       152200KB         none
...
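Summing allocated vs used across all of those lines puts a number on the overhead; something like this works from a shell (a rough sketch, assuming the Volume/Allocated/Used/Guarantee column layout shown above):

ssh admin@bc-gx-4b "node run local aggr show_space gx4b_1" \
  | awk '/KB.*none$/ {gsub(/KB/,""); alloc += $2; used += $3}
         END {printf "allocated %.2f TB, used %.2f TB, overhead %.2f TB\n",
              alloc/2^30, used/2^30, (alloc-used)/2^30}'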
this behavior seems to be specific to this filer (or aggregate; there is only one aggr besides the root aggr on this node). our other filers/aggregates seem to be allocating normally.
we set the space guarantee style to 'none' and size all of our volumes at 2TB, to basically thin provision everything and stick a 2TB 'quota' on them. we also set the snap reserve to 0% on all volumes.
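(For illustration, on a current cDOT release that recipe would look roughly like the line below; the volume name and junction path are made up, and option names may differ on 8.0.x, so treat it as a sketch rather than exact syntax:)

bc-gx-4b::> volume create -vserver bc -volume v_example -aggregate gx4b_1 -size 2TB -space-guarantee none -percent-snapshot-space 0 -junction-path /bc/example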
so I'm not sure what would be tweaking this filer or aggregate into having a 'floor' value for allocated space per volume.
any ideas from anyone appreciated
On Wed, Dec 9, 2015 at 4:22 AM, Rob Bush bushrsa@gmail.com wrote:
Not sure what the cdot equivalent would be, but you could check "aggr show_space" from the nodeshell.
What is on these volumes? After a quick test I actually see a similar discrepancy, although on a smaller scale: on a non-space-guaranteed volume with an empty 512M LUN, used space is 900K but allocated space is 8M.
With best regards,
Andrei Borzenkov
Hi There,
I think the CDOT equivalent to good old "aggr show_space" is:
cluster-shell::> storage aggregate show-space
That will show the volume "footprints" and aggregate metadata sizes.
There was (is) the potential for some pretty serious issues with de-duplication in 8.0.5/8.1. Resolved in 8.1.2P4, I think. I assume both 7-mode and CDOT would be affected.
See the NetApp Support Bulletin / KB entry with ID 7010056. Essentially ONTAP might leak stale deduplication metadata fingerprints. The KB recommends running "sis check -c" and "sis status -l".
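For example, from the nodeshell that would look roughly like the lines below (a sketch only: 'sis check' is a diagnostic-level command, the volume path is just the example volume from earlier in the thread, and the exact options should be confirmed against the KB for your release):

bc-gx-4b::> node run local sis status -l
bc-gx-4b::> node run local sis check -c /vol/v164346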
Good luck!
Cheers, Robb.