Never seen this before
FAS3250 ONTAP 8.1.2
at 2am the volume hosting 110 VMs fills up and starts alerting
Get the call and delete several snapshots to take it to 83%, reboot the 110 VMs ;(
I have lots of tools (Splunk, InMon, etc.), but none of them can show why this volume filled up
ideas?
Ok, let's start here:
Are you thin provisioning? If so, it is possible that some other volume on the same aggregate got out of hand.
Do you have OnCommand Unified Manager or OnCommand Performance Manager? Those might have been able to predict this (if you are using cDOT).
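To make the thin-provisioning question concrete, here is a minimal sketch of the bookkeeping involved. Everything in it is hypothetical: the `Volume` fields, the `aggregate_report` helper, and all sizes are invented for illustration, and the real numbers would have to be read off the filer. It also simplifies the guarantee setting to "volume" (thick) versus anything else (thin).

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    size_gb: float   # provisioned size of the volume
    used_gb: float   # space the volume currently consumes in the aggregate
    guarantee: str   # "volume" = thick; anything else is treated as thin here


def aggregate_report(aggr_size_gb, volumes):
    """Summarise how far an aggregate is overcommitted (illustrative only)."""
    provisioned = sum(v.size_gb for v in volumes)
    # Thick volumes reserve their full size up front; thin volumes only
    # take what they have actually written.
    committed = sum(
        v.size_gb if v.guarantee == "volume" else v.used_gb for v in volumes
    )
    free_gb = aggr_size_gb - committed
    # How much the thin volumes are still allowed to grow.
    thin_growth_room_gb = sum(
        v.size_gb - v.used_gb for v in volumes if v.guarantee != "volume"
    )
    return {
        "overcommit_ratio": provisioned / aggr_size_gb,
        "free_gb": free_gb,
        "thin_growth_room_gb": thin_growth_room_gb,
        # True when thin volumes could grow past what the aggregate has left.
        "at_risk": thin_growth_room_gb > free_gb,
    }


if __name__ == "__main__":
    # Hypothetical aggregate and volumes, not taken from the thread.
    vols = [
        Volume("vm_datastore", 6000.0, 4000.0, "none"),
        Volume("other_vol", 8000.0, 5500.0, "none"),
    ]
    print(aggregate_report(10000.0, vols))
```

If `at_risk` comes back true, a neighbouring volume growing inside its own limits can still leave the aggregate without space for everyone else.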
--tmac
*Tim McCarthy* *Principal Consultant*
On Sun, Apr 26, 2015 at 8:49 AM, Fletcher Cocquyt fcocquyt@stanford.edu wrote:
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
ONTAP 8.1.2 has a dedupe bug that fills volumes with stale metadata, which requires running "sis start -s" occasionally. Not sure if that's what's going on here:
NetApp Knowledgebase - Stale metadata not automatically removed as part of the 'sis start' operation on the volume when running Data ONTAP® 8.1x
https://kb.netapp.com/support/index?page=content&id=7010056&actp=LIST
(This Knowledgebase article is not available to the public; a NetApp Support Site login is required to view it.)
Fred
From: tmac tmacmd@gmail.com
To: Fletcher Cocquyt fcocquyt@stanford.edu
Cc: "toasters@teaparty.net Lists" toasters@teaparty.net
Sent: Sunday, April 26, 2015 9:24 AM
Subject: Re: Vol 100% - which VM?
Fletcher,
Once you’ve caught up on your sleep, maybe you could share a bit more information about your volume space situation.
Which snapshots did you delete? Most recent? Least recent? Did you happen to notice the ‘snap delta’ of those snapshots you deleted?
How full has your volume been historically (e.g. last two weeks)? Maybe you’ve been skating close to the edge to begin with?
Are you storing any LUNs in this volume?
I’m assuming 7-mode here:
‘vol options <vol>’
If you are running dedupe, then ‘sis status -l <vol>’ and ‘sis check -c <vol>’ from diag mode.
aggr show_space -g <your-volume’s-container-aggregate>
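The question about how full the volume has been over the last two weeks can be turned into a rough trend estimate. The sketch below fits a straight line to daily "percent used" samples and extrapolates to 100%. The function name and the sample figures are hypothetical, and a linear fit is only an assumption: a sudden burst of churn like the one described in this thread would not show up in it ahead of time.

```python
def days_until_full(daily_used_pct):
    """Estimate days until a volume hits 100% from daily usage samples.

    daily_used_pct: one 'percent used' reading per day, oldest first.
    Returns None when usage is flat or shrinking.
    """
    n = len(daily_used_pct)
    if n < 2:
        raise ValueError("need at least two daily samples")

    # Ordinary least-squares slope, with day index 0..n-1 as x.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in enumerate(daily_used_pct)
    )
    slope = sxy / sxx  # percentage points gained per day

    if slope <= 0:
        return None
    return (100.0 - daily_used_pct[-1]) / slope


if __name__ == "__main__":
    # Hypothetical readings, not taken from the thread.
    history = [80.0, 82.0, 84.0, 86.0]
    print(days_until_full(history))
```

A small or shrinking result here would suggest the volume was already skating close to the edge before the night it filled.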
Francis Kim | Engineer
510-644-1599 x334 | fkim@berkcom.com
BerkCom | www.berkcom.com
NetApp | Cisco | Supermicro | Brocade | VMware
On Apr 26, 2015, at 7:11 AM, Fred Grieco <fredgrieco@yahoo.com> wrote:
Snapshot filling is the result of data churn. If there are no obvious data writes, the next place I'd look is whether someone is running a defrag or something similar. That doesn't change the data, but it does reshuffle the blocks, and so it grows the snapshots.
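The point about defrag can be illustrated with a toy model. This is not how ONTAP is implemented; it only assumes that a rewritten block lands in a new physical location while any snapshot keeps the old location pinned. The `ToyVolume` class and its methods are hypothetical names made up for this sketch.

```python
import itertools


class ToyVolume:
    """Toy copy-on-write volume: logical blocks map to physical block ids."""

    def __init__(self, n_blocks):
        self._next_id = itertools.count()
        # Logical block address -> physical block id of the live data.
        self.active = {lba: next(self._next_id) for lba in range(n_blocks)}
        self.snapshots = []

    def snapshot(self):
        # A snapshot pins whatever physical blocks are live right now.
        self.snapshots.append(frozenset(self.active.values()))

    def rewrite(self, lba):
        # Same logical data, written to a fresh physical block,
        # as a guest defrag would do.
        self.active[lba] = next(self._next_id)

    def snapshot_only_blocks(self):
        # Blocks no longer live but still held by at least one snapshot.
        live = set(self.active.values())
        pinned = set().union(*self.snapshots)
        return len(pinned - live)


if __name__ == "__main__":
    vol = ToyVolume(1000)
    vol.snapshot()
    for lba in range(400):   # "defrag" 40% of the blocks
        vol.rewrite(lba)
    print(len(vol.active), vol.snapshot_only_blocks())
```

The live data set stays at 1000 blocks throughout, yet 400 extra blocks are now held only by the snapshot, which is the kind of growth that shows up as a large snap delta without any change in what the guests think they store.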
Operations Manager is a good tool for identifying the bully VM(s).
In this case the DBA team was able to finger our bully VM, which had a huge amount of churn (100+ Gb/day) from data imports into a DB.
I made a new volume and Storage vMotioned the VM onto it, to isolate it and keep it from impacting the rest of the VMs.
Daily snap deltas are back in the 1% range for the volume.
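For a sense of scale, the arithmetic behind a figure like this can be sketched as follows. The volume size, free space, and retention period below are hypothetical, since the thread does not give them, and the churn is read as roughly 100 GB per day, which is also an assumption. The calculation is a worst case: it treats every day's churn as entirely new blocks that the retained snapshots must hold.

```python
def snapshot_space_pct(daily_churn_gb, retained_days, volume_size_gb):
    """Worst-case share of the volume held by snapshots, in percent."""
    return 100.0 * daily_churn_gb * retained_days / volume_size_gb


def days_of_headroom(free_gb, daily_churn_gb):
    """Days until free space is gone if churn keeps landing in snapshots."""
    return free_gb / daily_churn_gb


if __name__ == "__main__":
    # Hypothetical numbers: 4000 GB volume, 7 daily snapshots kept,
    # 680 GB free (83% full), one VM churning about 100 GB per day.
    print(snapshot_space_pct(100.0, 7, 4000.0))
    print(days_of_headroom(680.0, 100.0))
```

Under these made-up figures a single VM accounts for a double-digit share of the volume within a week of snapshots, which is why moving it to its own volume brings the daily deltas for everyone else back down.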
thanks
On Apr 26, 2015, at 6:28 AM, Edward Rolison <ed.rolison@gmail.com> wrote: