We've a DFM (an older version) that's misbehaving. At startup we get "Service did not reach 'started' state due to timeout". It does actually start and we can use the web interface... but after a few minutes (10ish) it starts to get very sluggish.
Restarting clears it up, but only (very) temporarily.
I've turned off PerfAdvisor, because I spotted some substantially large logs.
Is anyone able to offer me suggestions of where else to look to troubleshoot? It looks like dbsrv10.exe is consuming all of a processor, trying to do 'something', but I have no idea what that might be. Perhaps a post-restart 'tidy up'?
I suspect this might be an 'open a case' situation, but I thought I'd seek some wisdom here too.
Thanks, Ed.
How long have you been running this, and what version? I had to do some DB cleanup a couple of versions back (5-something, I think), and it really helped, mostly with storage space consumption.
Thanks
Steve Klise
From: Edward Rolison <ed.rolison@gmail.com>
Date: Wednesday, April 1, 2015 at 8:45 AM
To: "toasters@teaparty.net" <toasters@teaparty.net>
Subject: DFM misbehaving - 'going sluggish' about 10m after restart
Ed,
If you’re running an older version of DFM, there are some things that will happen as the database ages that can cause it to become sluggish.
One suggestion is to upgrade to a newer version - current for monitoring 7-mode systems is 5.2.1 - which buys you several things:
* Beginning with 5.0, OnCommand Unified Manager Core (formerly Operations Manager, formerly DataFabric Manager) is only available as a 64-bit application. This removes some limits on the number of CPU cores it can use and the amount of memory it will support.
* The upgrade process itself, to any 5.x version (although 5.2 does the most), will do a significant cleanup of dead items in the database, completely rebuild the database, including all indexes, and balance the data across the database pages. This can result in significant performance improvements.
You may also wish to open a case, as NetApp Customer Success Services does have the ability to assist with your issue.
Phil Bachman
Looks like my 'culprit' is the DFM Monitor service. It's fine until we start that. The thing that stands out in the logs is 'error inserting disk data' complaining about an invalid value - DiskUsedBlocks being negative, which I assume is an overflow problem on bigger drives.
Would anyone be able to tell me exactly what this service does? Is shutting it down for a few days a viable workaround? (e.g. will alerting still occur, etc.)
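To illustrate the overflow guess above: a minimal sketch of how a block count that no longer fits a signed 32-bit field would show up as a negative number. This is only an assumption about the cause, not confirmed DFM behaviour. The 512-byte block size, the helper names (`used_blocks`, `as_int32`) and the drive sizes are all hypothetical, chosen just to show the arithmetic.

```python
import ctypes

# Assumed block size in bytes (hypothetical; the real unit behind
# DiskUsedBlocks is not stated in this thread).
BLOCK_SIZE = 512

def used_blocks(used_bytes: int) -> int:
    """Convert a byte count to a whole number of blocks."""
    return used_bytes // BLOCK_SIZE

def as_int32(value: int) -> int:
    """Reinterpret an integer as a signed 32-bit value.

    ctypes integer types do no overflow checking, so values above
    2**31 - 1 wrap around and come out negative.
    """
    return ctypes.c_int32(value).value

one_tb = used_blocks(10**12)        # fits in a signed 32-bit field
two_tb = used_blocks(2 * 10**12)    # exceeds 2**31 - 1

print(one_tb, as_int32(one_tb))     # same value before and after
print(two_tb, as_int32(two_tb))     # wraps to a negative number
```

If something like this is what is happening, the insert would fail only for disks above the wrap-around size, which would fit the "bigger drives" theory.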
Hi Edward,
The monitor service is pretty critical. With it disabled, DFM will not gather data from your storage environment. If it’s having problems inserting monitoring data into your DFM DB, then you might already be missing events.
It really sounds like you’ve got some DB issues. NetApp offers a tool in the toolbox called dfmpurge, which does a great job of cleaning up and reloading the DFM database. It now ships with DFM 5.2.x. Its limiting factor, however, is that it’s written for a specific version of the Sybase DB. If the dfmpurge tool doesn’t match your Sybase version, it will abort the cleanup process and not make any changes to your DB.
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters