Hey all,
I've got a cluster running cluster-mode 8.0.5 (don't laugh) that has an aggregate reporting a much higher used size than I can account for based on the volumes it contains.
According to 'aggr show' and 'df -A', the aggregate has around 17T of space consumed.
bc-gx-4b::> aggr show -aggregate gx4b_1 -fields size, usedsize, availsize
aggregate size    usedsize availsize
--------- ------- -------- ---------
gx4b_1    18.40TB 17.08TB  1.32TB
Per my database, though - a tally of all the volumes contained on this aggregate - the total space consumed by the volumes is only about 12.5T, so a significant amount is being soaked up by something.
I get the same numbers from the command line as well:
ssh admin@bc-gx-4b "set -units MB; vol show -aggregate gx4b_1 -fields used" | egrep "^bc" | awk '{print $3}' | sed 's/[^0-9]*//g' | paste -sd+ | bc
12528994
so the sum of the volumes is about 12.5T, but the aggregate thinks there is 17T used.
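As an aside, the stripping and summing can be folded into the awk step; this is just a sketch built on the same assumptions as the pipeline above (MB units, volume rows starting with "bc", used size in the third column):

# Sum the 'used' column (in MB) for every volume on the aggregate in a single awk pass.
ssh admin@bc-gx-4b "set -units MB; vol show -aggregate gx4b_1 -fields used" \
  | awk '/^bc/ { gsub(/[^0-9]/, "", $3); total += $3 } END { printf "%d MB\n", total }'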
it's been in this state for some time. There haven't been any volumes recently moved off or deleted, so there isn't any space being recovered in the background.
'vol show -state offline' and 'set diag; vol lost-found show' aren't reporting anything.
Any ideas on how I might figure out what is sucking up the unreported space?
Do you use deduplication by any chance?
---
With best regards,
Andrei Borzenkov, Senior system engineer, FUJITSU
From: Mike Thompson - Sent: Wednesday, December 09, 2015 11:07 AM - To: toasters@teaparty.net - Subject: cdot missing disk space
Thanks Andrei - sorry, I forgot to add this to my previous response in the thread: no dedupe in use.
On Wed, Dec 9, 2015 at 1:26 AM, andrei.borzenkov@ts.fujitsu.com wrote:
Do you use deduplication by any chance?
Were these set up with snap protect? Check the auto grow settings.
Nope, no snap protect, and autosize is off on all volumes.
If I move any volume off to another aggregate on another filer, the allocated space ends up matching the used space (well below this 12GB minimum for these nearly empty volumes).
If I move an empty volume from another filer to this aggregate, the allocated space swells up to around this 12GB number, so it seems to be something at the filer or aggregate level.
On Wed, Dec 9, 2015 at 11:52 AM, Tim McCarthy tmacmd@gmail.com wrote:
Were these set up with snap protect?
Check the auto grow settings.
Can you post the details of one of these volumes? And of the aggregate you have them in? It smells like there's some sort of minimum volume size setting somewhere.
Or maybe there's an aggregate level snapshot sitting around?
Can you upgrade? You're in cluster mode, so hopefully it shouldn't be too hard to move to 8.1, then 8.2, and on to 8.3, since there are lots of nice bug fixes.
sample vol and aggr details below
Can't upgrade - we're not on support, and we have about 1PB in production across two clusters. I am seeing this effect on both clusters, but not on all filers, which is strange.
If I move vols between affected and unaffected filers/aggrs, the allocated vs. used figures normalize for the volumes on the unaffected node, then re-inflate when the volumes are moved back to the original node/aggr.
                               Volume    Allocated    Used
vol created on problem aggr    v164402   11932348KB   1924KB
after move to unaffected aggr  v164402   2872KB       2872KB
after move back to orig aggr   v164402   12005552KB   75284KB

resized to 1TB, moved to another aggr, and back to orig aggr:
                               v164402   6003080KB    37972KB
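For anyone wanting to reproduce the comparison, a rough sketch of the steps. The per-volume Allocated/Used figures are the 'aggr show_space' numbers referenced further down, reached here via the nodeshell, and the 'volume move' option names are an assumption from later cDOT releases, so check the syntax on 8.0.x first:

# Capture per-volume Allocated/Used/Guarantee on the source aggregate (nodeshell passthrough):
system node run -node bc-gx-4b -command "aggr show_space gx4b_1"

# Move the test volume to an unaffected aggregate (and later back), re-running show_space after each step.
# Option names below may differ on 8.0.x - verify with "volume move ?" before using.
volume move start -vserver bc -volume v164402 -destination-aggregate gx1a_1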
it appears to be related to the size of the volume. we thin-provision all volumes by setting them to a large size and setting the space guarantee to none.
vols sized at 2TB end up being allocated 12GB of space when created. vols sized to 1TB end up with 6GB allocated when created, even though they are completely empty.
what's strange is this is happening on some filers/aggregates but not others.
these clusters originated running ONTAP GX, and were upgraded to ONTAP 8 c-mode some time ago.
the aggregates experiencing this allocation 'inflation' seem to be those that were created after the cluster was upgraded to ONTAP 8.
the aggregates that were originally created as GX aggrs have identical 'allocated' and 'used' values in 'aggr show_space'.
for example:
recently built aggregate under Ontap8:
Aggregate                 Allocated         Used              Avail
Total space               18526684036KB     13662472732KB     1233809368KB   <- massive difference!
aggregate originated under Ontap GX:
Aggregate                 Allocated         Used              Avail
Total space               32035928920KB     32035928920KB     1404570796KB   <- identical, which is what i would expect
maybe this allocation scheme changed at an aggregate level under Ontap 8? perhaps it's expected behavior?
things do seem to normalize as the volumes begin to fill up, though, so I believe this space is not permanently gone. But it certainly appears to be unavailable for use, since tons of space is allocated to volumes that hold very little data, and we have a LOT of volumes in these clusters.
it definitely makes it look like we are missing a substantial percentage of our disk space when trying to reconcile the tallied volume usage against the aggregate used/remaining sizes.
example volume and affected aggregate details:
bc-gx-4b::*> vol show v164346 -instance (volume show)
Virtual Server Name: bc
Volume Name: v164346
Aggregate Name: gx4b_1
Volume Size: 2TB
Name Ordinal: base
Volume Data Set ID: 4041125
Volume Master Data Set ID: 2151509138
Volume State: online
Volume Type: RW
Volume Style: flex
Volume Ownership: cluster
Export Policy: default
User ID: jobsys
Group ID: cgi
Security Style: unix
Unix Permissions: ---rwxr-x--x
Junction Path: /bc/shows/ID2/DFO/0820
Junction Path Source: RW_volume
Junction Active: true
Parent Volume: v164232
Virtual Server Root Volume: false
Comment:
Available Size: 1.15TB
Total Size: 2TB
Used Size: 146.7MB
Used Percentage: 42%
Autosize Enabled (for flexvols only): false
Maximum Autosize (for flexvols only): 2.40TB
Autosize Increment (for flexvols only): 102.4GB
Total Files (for user-visible data): 31876689
Files Used (for user-visible data): 132
Maximum Directory Size: 100MB
Space Guarantee Style: none
Space Guarantee In Effect: true
Minimum Read Ahead: false
Access Time Update Enabled: true
Snapshot Directory Access Enabled: true
Percent of Space Reserved for Snapshots: 0%
Used Percent of Snapshot Reserve: 0%
Snapshot Policy: daily
Creation Time: Tue Dec 08 11:22:58 2015
Language: C
Striped Data Volume Count: -
Striped Data Volume Stripe Width: 0.00B
Current Striping Epoch: -
One data-volume per member aggregate: -
Concurrency Level: -
Optimization Policy: -
Clone Volume: false
Anti-Virus On-Access Policy: default
UUID of the volume: 17fa4c6d-9de1-11e5-a888-123478563412
Striped Volume Format: -
Load Sharing Source Volume: -
Move Target Volume: false
Maximum Write Alloc Blocks: 0
Inconsistency in the file system: false
bc-gx-4b::*> aggr show -aggregate
    far4a_1   gx1a_1    gx1a_2    gx1b_1    gx2a_1    gx2b_1    gx3a_1    gx4b_1    near1b_1  near3b_1
    root_1a   root_1b   root_2a   root_2b   root_3a   root_3b   root_4a   root_4b   slow2a_1  systems

bc-gx-4b::*> aggr show -aggregate gx4b_1
Aggregate: gx4b_1
UUID: c624f85e-96d3-11e3-a6ce-00a0980bb25a
Size: 18.40TB
Used Size: 17.25TB
Used Percentage: 94%
Available Size: 1.15TB
State: online
Nodes: bc-gx-4b
Number Of Disks: 63
Disks: bc-gx-4b:0a.64, bc-gx-4b:0e.80, ... bc-gx-4b:0a.45
Number Of Volumes: 411
Plexes: /gx4b_1/plex0(online)
RAID Groups: /gx4b_1/plex0/rg0, /gx4b_1/plex0/rg1, /gx4b_1/plex0/rg2
Raid Type: raid_dp
Max RAID Size: 21
RAID Status: raid_dp
Checksum Enabled: true
Checksum Status: active
Checksum Style: block
Inconsistent: false
Ignore Inconsistent: off
Block Checksum Protection: on
Zoned Checksum Protection: -
Automatic Snapshot Deletion: on
Enable Thorough Scrub: off
Volume Style: flex
Volume Types: flex
Has Mroot Volume: false
Has Partner Node Mroot Volume: false
Is root: false
Wafliron Status: -
Percent Blocks Scanned: -
Last Start Error Number: -
Last Start Error Info: -
Aggregate Type: aggr
Number of Quiesced Volumes: -
Number of Volumes not Online: -
Number of LS Mirror Destination Volumes: -
Number of DP Mirror Destination Volumes: -
Number of Move Mirror Destination Volumes: -
Number of DP qtree Mirror Destination Volumes: -
HA Policy: sfo
Block Type: 64-bit
On Wed, Dec 9, 2015 at 12:50 PM, John Stoffel john@stoffel.org wrote:
Can you post the details of one of these volumes? And of the aggregate you have them in? It smells like there's some sort of minimum volume size setting somewhere.
Or maybe there's an aggregate level snapshot sitting around?
Can you upgrade? You're in cluster mode, so hopefully it shoudln't be too hard to move to 8.1, then 8.2 and onto 8.3, since there's lots of nice bug fixes.
I didn't do the math, but just an idea:
ONTAP calculates an average file size of 32K to determine the number of inodes per volume. Every inode takes 192 Bytes from the INODEFILE. So, if you take 2TB / 32KB * 192B = 12GB... (actually I *did* do the math right now, I was curious...)
It seems this is enforced on *newer* aggregates... According to KB https://kb.netapp.com/support/index?page=content&id=3011432 you can't go below the VolSize/32K limit.
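Spelling that arithmetic out for the two volume sizes in this thread (plain shell arithmetic, nothing ONTAP-specific - just a sanity check of the 32K/192-byte assumption above):

# Expected inode-file overhead = (volume size / 32KiB assumed average file size) * 192 bytes per inode
for size_tib in 1 2; do
    bytes=$(( size_tib * 1024 * 1024 * 1024 * 1024 ))
    inodes=$(( bytes / (32 * 1024) ))
    overhead_gib=$(( inodes * 192 / 1024 / 1024 / 1024 ))
    echo "${size_tib}TiB volume -> ${inodes} inodes -> ~${overhead_gib} GiB of inode file"
done
# Prints ~6 GiB for 1TiB and ~12 GiB for 2TiB, which roughly matches the
# 6003080KB and 11932348KB allocations reported for the empty volumes above.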
How about provisioning the volumes small (and thin) and letting them *AutoGrow* to 2TB? That way, they're not using up space (not even for inodes), yet are able to contain up to 2TB...
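For example, something along these lines - the volume name is made up, and the autosize option names are from memory of later releases, so verify them with 'volume modify ?' on 8.0.x before relying on this:

# Create the volume small and thin (v_example is a placeholder name):
volume create -vserver bc -volume v_example -aggregate gx4b_1 -size 100GB -space-guarantee none
# Let autosize grow it toward the 2TB it would otherwise have been created at:
volume modify -vserver bc -volume v_example -autosize true -max-autosize 2TB -autosize-increment 100GB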
Just my 2c
Sebastian
Thanks Sebastian
I figured it was something like that as well - thanks for the link, it makes sense.
I'll have to experiment with the autogrow stuff - I recall we were steered away from it by NetApp themselves, saying it wasn't all that stable in ONTAP 8.0, but we never did fool with it. I will give it a shot though.
Thanks everyone
I used autogrow on GX; it was one feature that was rock-solid. What version of GX are you on? GX itself was quite solid with the last version released. Are you the last person still using it :)? I thought I'd have been!
Just caught that "originated" bit - C-mode is just a newer version of GX, so I'd assume the autogrow should work just fine.
On Thu, Dec 10, 2015 at 3:37 PM, Douglas Siggins siggins@gmail.com wrote:
I used autogrow on GX, it was one feature that was rock-solid. What version of GX are you on? GX itself was quite solid with the last version released. Are you the last person still using it :) ? I thought i'd have been!
Indeed. On 7.3.7P1, immediately after creating a 1TB volume with space guarantee "none", the volume consumes approximately 6GB of allocated space:
Volume   Allocated   Used    Guarantee
t        5965988KB   772KB   none
which more or less matches the inofile size, although the space actually consumed by it is essentially zero:
16K ---------- 1 root root 6.1G Dec 11 10:11 inofile
I was under the impression that the inofile grows as needed, but that probably changed at some point.
Thank you!
---
With best regards,
Andrei Borzenkov
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Sebastian Goetze Sent: Thursday, December 10, 2015 7:14 PM To: Mike Thompson; John Stoffel Cc: toasters@teaparty.net Lists Subject: Re: cdot missing disk space
I didn't do the math, but just an idea:
ONTAP calculates an average file size of 32K to determine the number of inodes per volume. Every inode takes 192 Bytes from the INODEFILE. So, if you take 2TB / 32KB * 192B = 12GB... (actually I *did* do the math right now, I was curious...)
It seems this is enforced on *newer* aggregates... According to KB https://kb.netapp.com/support/index?page=content&id=3011432 you can't go below the VolSize/32K limit.
How about provisioning the volumes small (and thin) and letting them *AutoGrow* to 2TB? That way, they're not using up space (not even for inodes), yet are able to contain up to 2TB...
Just my 2c
Sebastian
On 12/10/2015 8:43 AM, Mike Thompson wrote: sample vol and aggr details below can't upgrade, not on support, we have about 1PB in production across two clusters. I am seeing this effect on both clusters, but not on all filers, which is strange. if I move vols between affected and unaffected filers/aggrs, the allocated vs used normalizes for the volumes on the unaffected node, then re-inflate when moved back to the original node/aggr
Volume Allocated Used vol created on problem aggr v164402 11932348KB 1924KB after move to unaffected aggr v164402 2872KB 2872KB after move back to orig aggr v164402 12005552KB 75284KB
resized to 1TB, moved to another aggr, and back to orig aggr
v164402 6003080KB 37972KB it appears to be related to the size of the volume. we thin-provision all volumes by setting them to a large size and set space guarantee to none.
vols sized at 2TB end up being allocated 12GB of space when created. vols sized to 1TB end up with 6GB allocated when created, even though they are completely empty. what's strange is this is happening on some filers/aggregates but not others.
these clusters originated running Ontap GX, and then were upgraded to Ontap8 c-mode some time ago.
the aggregates that seem to be experiencing this allocation 'inflation' seem to be those that were created after the cluster was upgraded to Ontap8. the aggregates that were originally created as GX aggrs, have identically matching 'allocated' and 'used' values in 'aggr show_space' for example: recently built aggregate under Ontap8:
Aggregate Allocated Used Avail Total space 18526684036KB 13662472732KB 1233809368KB <- massive difference! aggregate originated under Ontap GX:
Aggregate Allocated Used Avail Total space 32035928920KB 32035928920KB 1404570796KB <- identical, which is what i would expect maybe this allocation scheme changed at an aggregate level under Ontap 8? perhaps it's expected behavior? things do seem to normalize as the volumes begin to fill up though, so I believe that this space is not truly gone permanently, but it certainly appears to be not available for use, since tons of space is allocated to volumes that have very little data in them, and we have a LOT of volumes in these clusters.
it's definitely making it look like we are missing a substantial percentage of our disk space, when trying to reconcile the sum of data used when tallying up volume size, and comparing it to the aggregate used/remaining sizes. example volume and affected aggregate details:
bc-gx-4b::*> vol show v164346 -instance (volume show)
Virtual Server Name: bc Volume Name: v164346 Aggregate Name: gx4b_1 Volume Size: 2TB Name Ordinal: base Volume Data Set ID: 4041125 Volume Master Data Set ID: 2151509138 Volume State: online Volume Type: RW Volume Style: flex Volume Ownership: cluster Export Policy: default User ID: jobsys Group ID: cgi Security Style: unix Unix Permissions: ---rwxr-x--x Junction Path: /bc/shows/ID2/DFO/0820 Junction Path Source: RW_volume Junction Active: true Parent Volume: v164232 Virtual Server Root Volume: false Comment: Available Size: 1.15TB Total Size: 2TB Used Size: 146.7MB Used Percentage: 42% Autosize Enabled (for flexvols only): false Maximum Autosize (for flexvols only): 2.40TB Autosize Increment (for flexvols only): 102.4GB Total Files (for user-visible data): 31876689 Files Used (for user-visible data): 132 Maximum Directory Size: 100MB Space Guarantee Style: none Space Guarantee In Effect: true Minimum Read Ahead: false Access Time Update Enabled: true Snapshot Directory Access Enabled: true Percent of Space Reserved for Snapshots: 0% Used Percent of Snapshot Reserve: 0% Snapshot Policy: daily Creation Time: Tue Dec 08 11:22:58 2015 Language: C Striped Data Volume Count: - Striped Data Volume Stripe Width: 0.00B Current Striping Epoch: - One data-volume per member aggregate: - Concurrency Level: - Optimization Policy: - Clone Volume: false Anti-Virus On-Access Policy: default UUID of the volume: 17fa4c6d-9de1-11e5-a888-123478563412 Striped Volume Format: - Load Sharing Source Volume: - Move Target Volume: false Maximum Write Alloc Blocks: 0 Inconsistency in the file system: false
bc-gx-4b::*> aggr show -aggregate far4a_1 gx1a_1 gx1a_2 gx1b_1 gx2a_1 gx2b_1 gx3a_1 gx4b_1 near1b_1 near3b_1 root_1a root_1b root_2a root_2b root_3a root_3b root_4a root_4b slow2a_1 systems bc-gx-4b::*> aggr show -aggregate gx4b_1
Aggregate: gx4b_1 UUID: c624f85e-96d3-11e3-a6ce-00a0980bb25a Size: 18.40TB Used Size: 17.25TB Used Percentage: 94% Available Size: 1.15TB State: online Nodes: bc-gx-4b Number Of Disks: 63 Disks: bc-gx-4b:0a.64, bc-gx-4b:0e.80, ... bc-gx-4b:0a.45 Number Of Volumes: 411 Plexes: /gx4b_1/plex0(online) RAID Groups: /gx4b_1/plex0/rg0, /gx4b_1/plex0/rg1, /gx4b_1/plex0/rg2 Raid Type: raid_dp Max RAID Size: 21 RAID Status: raid_dp Checksum Enabled: true Checksum Status: active Checksum Style: block Inconsistent: false Ignore Inconsistent: off Block Checksum Protection: on Zoned Checksum Protection: - Automatic Snapshot Deletion: on Enable Thorough Scrub: off Volume Style: flex Volume Types: flex Has Mroot Volume: false Has Partner Node Mroot Volume: false Is root: false Wafliron Status: - Percent Blocks Scanned: - Last Start Error Number: - Last Start Error Info: - Aggregate Type: aggr Number of Quiesced Volumes: - Number of Volumes not Online: - Number of LS Mirror Destination Volumes: - Number of DP Mirror Destination Volumes: - Number of Move Mirror Destination Volumes: - Number of DP qtree Mirror Destination Volumes: - HA Policy: sfo Block Type: 64-bit
On Wed, Dec 9, 2015 at 12:50 PM, John Stoffel <john@stoffel.org> wrote:
Can you post the details of one of these volumes? And of the aggregate you have them in? It smells like there's some sort of minimum volume size setting somewhere.
Or maybe there's an aggregate level snapshot sitting around?
Can you upgrade? You're in cluster mode, so hopefully it shouldn't be too hard to move to 8.1, then 8.2 and on to 8.3, since there are lots of nice bug fixes.
I have a feeling that the inode FILE has a space guarantee... Therefore, even with a volume guarantee of NONE, the space for the inodefile, being guaranteed, is taken from the aggregate up front, just to make sure the metadata for the assumed maximum number of files in the volume can always be stored...
Sebastian
On Fri, Dec 11, 2015, 08:24 andrei.borzenkov@ts.fujitsu.com < andrei.borzenkov@ts.fujitsu.com> wrote:
Indeed. On 7.3.7P1, immediately after creating a 1TB volume with space guarantee “none”, this volume consumes approximately 6GB of allocated space:
Volume Allocated Used Guarantee
t 5965988KB 772KB none
which more or less matches inodefile size, although the space actually consumed by it is zero:
16K ---------- 1 root root 6.1G Dec 11 10:11 inofile
I was under the impression that the inofile grows as needed, but probably that changed at some point.
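As an aside, the apparent-size vs. allocated-blocks difference is the classic sparse-file pattern; you can reproduce the same effect with any sparse file on a Unix host (illustration only, nothing ONTAP-specific here):

truncate -s 6G sparsefile   # create a file with a 6GiB apparent size and no data blocks
ls -lh sparsefile           # shows 6.0G, the apparent size
du -h sparsefile            # shows 0, the blocks actually allocated
ls -ls sparsefile           # first column is allocated blocks, like the 16K in the listing above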
Thank you!
With best regards,
Andrei Borzenkov
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Sebastian Goetze
Sent: Thursday, December 10, 2015 7:14 PM
To: Mike Thompson; John Stoffel
Cc: toasters@teaparty.net Lists
Subject: Re: cdot missing disk space
I didn't do the math, but just an idea:
ONTAP calculates an average file size of 32K to determine the number of inodes per volume. Every inode takes 192 Bytes from the INODEFILE. So, if you take 2TB / 32KB * 192B = 12GB... (actually I *did* do the math right now, I was curious...)
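Plugging in the numbers as a sanity check (back-of-the-envelope arithmetic only, using the 32K average file size and 192-byte inode entry quoted above):

# expected inode-file reservation = (volume size / 32KiB per file) * 192 bytes per inode
echo $(( 2 * 1024**4 / (32 * 1024) * 192 / 1024**3 ))   # 2TB volume -> 12 (GiB)
echo $(( 1 * 1024**4 / (32 * 1024) * 192 / 1024**3 ))   # 1TB volume -> 6 (GiB)

which lines up nicely with the ~12GB and ~6GB allocations Mike is seeing on freshly created volumes.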
It seems this is enforced on *newer* aggregates... According to KB https://kb.netapp.com/support/index?page=content&id=3011432 you can't go below the VolSize/32K limit.
How about provisioning the volumes small (and thin) and letting them *AutoGrow* to 2TB? That way, they're not using up space (not even for inodes), yet are able to contain up to 2TB...
Just my 2c
Sebastian
On 12/10/2015 8:43 AM, Mike Thompson wrote:
sample vol and aggr details below
can't upgrade, not on support, we have about 1PB in production across two clusters. I am seeing this effect on both clusters, but not on all filers, which is strange.
if I move vols between affected and unaffected filers/aggrs, the allocated vs used normalizes for the volumes on the unaffected node, then re-inflates when they are moved back to the original node/aggr:

Volume      Allocated      Used
v164402     11932348KB     1924KB      <- vol created on problem aggr
v164402     2872KB         2872KB      <- after move to unaffected aggr
v164402     12005552KB     75284KB     <- after move back to orig aggr
resized to 1TB, moved to another aggr, and back to orig aggr
v164402 6003080KB 37972KB
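(For reference, the moves above would be something along the lines of the following on cDOT; treat it as a sketch, since the exact volume move syntax and option names may differ on 8.0.x, and the destination aggregate here is just one picked from the list earlier in the thread:)

bc-gx-4b::> volume move start -vserver bc -volume v164402 -destination-aggregate gx1a_1
bc-gx-4b::> volume move show -vserver bc -volume v164402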
it appears to be related to the size of the volume. we thin-provision all volumes by setting them to a large size and set space guarantee to none.
vols sized at 2TB end up being allocated 12GB of space when created. vols sized to 1TB end up with 6GB allocated when created, even though they are completely empty.
what's strange is this is happening on some filers/aggregates but not others.
these clusters originated running Ontap GX, and then were upgraded to Ontap8 c-mode some time ago.
the aggregates experiencing this allocation 'inflation' seem to be those that were created after the cluster was upgraded to Ontap8.
the aggregates that were originally created as GX aggrs have identically matching 'allocated' and 'used' values in 'aggr show_space'
for example:
recently built aggregate under Ontap8:
Aggregate                       Allocated            Used                 Avail
Total space                     18526684036KB        13662472732KB        1233809368KB   <- massive difference!

aggregate originated under Ontap GX:

Aggregate                       Allocated            Used                 Avail
Total space                     32035928920KB        32035928920KB        1404570796KB   <- identical, which is what I would expect
Not sure what the cdot equivalent would be, but you could check "aggr show_space" from the nodeshell.
Thanks Rob!
'node run local aggr show_space' seems to have pointed me to where the space is being consumed.
It would appear that every volume on this node/aggregate is being allocated a minimum of 12GB of space, regardless of how much is actually being used - and there are around 450 volumes on this node, so that adds up quickly to several TB.
Volume      Allocated        Used             Guarantee
v155739     12114904KB       375536KB         none
v152384     12116928KB       442628KB         none
v151867     12119996KB       458980KB         none
v3943       13931776KB       2349252KB        none
v160916     113845300KB      102425476KB      none
v160922     6106299552KB     6079321492KB     none
v164234     12080808KB       152172KB         none
v164239     12080980KB       152332KB         none
v164244     12080680KB       152044KB         none
v164249     12080872KB       152268KB         none
v164254     12080860KB       152228KB         none
v164259     12080876KB       152200KB         none
...
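Summing allocated vs used across all of those lines puts a number on the overhead; something like this works from a shell (a rough sketch, assuming the Volume/Allocated/Used/Guarantee column layout shown above):

ssh admin@bc-gx-4b "node run local aggr show_space gx4b_1" \
  | awk '/KB.*none$/ {gsub(/KB/,""); alloc += $2; used += $3}
         END {printf "allocated %.2f TB, used %.2f TB, overhead %.2f TB\n",
              alloc/2^30, used/2^30, (alloc-used)/2^30}'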
this behavior seems to be specific to this filer (or aggregate; there is only one aggr besides the root aggr on this node). our other filers/aggregates seem to be allocating normally.
we set the space guarantee style to 'none' and size all of our volumes at 2TB, to basically thin provision everything and stick a 2TB 'quota' on them. we also set the snap reserve to 0% on all volumes.
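(For illustration, on a current cDOT release that recipe would look roughly like the line below; the volume name and junction path are made up, and option names may differ on 8.0.x, so treat it as a sketch rather than exact syntax:)

bc-gx-4b::> volume create -vserver bc -volume v_example -aggregate gx4b_1 -size 2TB -space-guarantee none -percent-snapshot-space 0 -junction-path /bc/example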
so I'm not sure what would be tweaking this filer or aggregate into having a 'floor' value for allocated space per volume.
any ideas from anyone appreciated
On Wed, Dec 9, 2015 at 4:22 AM, Rob Bush bushrsa@gmail.com wrote:
Not sure what the cdot equivalent would be, but you could check "aggr show_space" from the nodeshell.
What is on these volumes? After a quick test I actually see a similar discrepancy, although on a smaller scale: on a non-space-guaranteed volume with an empty 512M LUN, used space is 900K but allocated space is 8M.
With best regards,
Andrei Borzenkov
Hi There,
I think the CDOT equivalent to good old "aggr show_space" is:
cluster-shell::> storage aggregate show-space
That will show the volume "footprints" and aggregate metadata sizes.
There was (is) the potential for some pretty serious issues with de-duplication in 8.0.5/8.1. Resolved in 8.1.2P4, I think. I assume both 7-mode and CDOT would be affected.
See the NetApp Support Bulletin / KB entry with ID 7010056. Essentially ONTAP might leak stale deduplication metadata fingerprints. The KB recommends running "sis check -c" and "sis status -l".
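For example, from the nodeshell that would look roughly like the lines below (a sketch only: 'sis check' is a diagnostic-level command, the volume path is just the example volume from earlier in the thread, and the exact options should be confirmed against the KB for your release):

bc-gx-4b::> node run local sis status -l
bc-gx-4b::> node run local sis check -c /vol/v164346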
Good luck!
Cheers, Robb.