Thanks Rob!

'node run local aggr show_space' seems to have pointed me to where the space is being consumed.

It would appear that every volume on this node/aggregate is being allocated a minimum of 12GB of space, regardless of how much is actually being used - and there are around 450 volumes on this node, so that adds up quickly to several TB.

Volume                          Allocated            Used       Guarantee
v155739                        12114904KB        375536KB            none
v152384                        12116928KB        442628KB            none
v151867                        12119996KB        458980KB            none
v3943                          13931776KB       2349252KB            none
v160916                       113845300KB     102425476KB            none
v160922                      6106299552KB    6079321492KB            none
v164234                        12080808KB        152172KB            none
v164239                        12080980KB        152332KB            none
v164244                        12080680KB        152044KB            none
v164249                        12080872KB        152268KB            none
v164254                        12080860KB        152228KB            none
v164259                        12080876KB        152200KB            none
...
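
To put a number on the per-volume overhead, the allocated-minus-used gap can be tallied straight from the show_space rows. This is only a minimal sketch, assuming the output keeps the four-column layout shown above with a KB suffix on both numbers. show_space_sample and overhead_report are hypothetical helper names, and the three sample rows are copied from the listing.

```shell
# Sample rows copied from the 'aggr show_space' listing above.
show_space_sample() {
cat <<'EOF'
v155739                        12114904KB        375536KB            none
v152384                        12116928KB        442628KB            none
v164234                        12080808KB        152172KB            none
EOF
}

# Print "volume gap_kb" for each row, then the summed gap.
# Rows whose 2nd and 3rd columns are not "<digits>KB" (headers, "...") are skipped.
overhead_report() {
  awk '
    $2 ~ /^[0-9]+KB$/ && $3 ~ /^[0-9]+KB$/ {
      alloc = $2; used = $3
      sub(/KB$/, "", alloc)      # strip the unit suffix
      sub(/KB$/, "", used)
      gap = alloc - used         # allocated but not used, in KB
      total += gap
      printf "%s %.0f\n", $1, gap
    }
    END { printf "total_kb %.0f\n", total }
  '
}

show_space_sample | overhead_report
```

In practice the nodeshell output would be piped into overhead_report in place of the sample rows, so that all ~450 volumes are counted.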

This behavior seems to be specific to this filer (or aggregate; there is only one aggr besides the root aggr on this node). Our other filers/aggregates seem to be allocating normally.

We set the space guarantee to 'none' and size all of our volumes at 2TB, basically to thin provision everything and put a 2TB 'quota' on each volume.
We also set the snap reserve to 0% on all volumes.

So I'm not sure what would be tweaking this filer or aggregate into having a 'floor' value for allocated space per volume.

Any ideas from anyone would be appreciated.



On Wed, Dec 9, 2015 at 4:22 AM, Rob Bush <bushrsa@gmail.com> wrote:

Not sure what the cdot equivalent would be, but you could check "aggr show_space" from the nodeshell.

On Dec 9, 2015 3:07 AM, "Mike Thompson" <mike.thompson@gmail.com> wrote:
Hey all,

I've got a cluster running cluster-mode 8.0.5 (don't laugh) with an aggregate that is reporting a much higher used size than I can account for based on the volumes it contains.

According to 'aggr show' and 'df -A', the aggregate has around 17T of space consumed.

bc-gx-4b::> aggr show -aggregate gx4b_1 -fields size, usedsize, availsize
aggregate size    usedsize availsize
--------- ------- -------- ---------
gx4b_1    18.40TB 17.08TB  1.32TB


Per my database, though, a tally of all the volumes on this aggregate puts the total space consumed by the volumes at only about 12.5T, so a significant amount is being soaked up by something.

I get the same numbers from the command line as well:

ssh admin@bc-gx-4b "set -units MB; vol show -aggregate gx4b_1 -fields used" | egrep "^bc" | awk '{print $3}' | sed 's/[^0-9]*//g' | paste -sd+ | bc
12528994


So the sum of the volumes is about 12.5T, but the aggregate thinks there is 17T used.
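
The exact size of the gap depends on how the units are read. A rough sketch of the arithmetic follows, assuming the MB total and the TB figure are either both binary (1TB = 1048576MB) or both decimal (1TB = 1000000MB). Which of the two this release uses is an assumption on my part, and gap_tb, vol_used_mb and aggr_used_tb are names made up for illustration.

```shell
# Figures taken from the command output above.
vol_used_mb=12528994   # summed 'vol show -fields used', in MB
aggr_used_tb=17.08     # usedsize from 'aggr show', in TB

# gap_tb MB_PER_TB
# Prints aggregate used size minus the volume total, in TB (two decimals).
gap_tb() {
  awk -v mb="$vol_used_mb" -v aggr="$aggr_used_tb" -v per="$1" \
    'BEGIN { printf "%.2f\n", aggr - mb / per }'
}

gap_tb 1048576   # binary reading:  prints 5.13
gap_tb 1000000   # decimal reading: prints 4.55
```

Either way, roughly 4.5 to 5TB of the aggregate's used space is not accounted for by the volumes' own usage.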

It's been in this state for some time. No volumes have been moved off or deleted recently, so there isn't any space being recovered in the background.

'vol show -state offline' and 'set diag; vol lost-found show' aren't reporting anything.

Any ideas on how I might figure out what is sucking up the un-reported space?


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters