Ops Mgr/DFM Monitoring of CPs


Fred Grieco

21 Jan 2010

1:40 p.m.

Does anyone know a good way to monitor consistency points (CPs) and disk utilization in newer versions of OpsMgr, or by some other means? These correspond to the "CP ty" and "Disk Util" columns in sysstat -x output.

In previous versions of DFM and the management console (3.6 and 2.2), the management console had a view under the logical views called "storage system - CP Details." It is no longer listed in version 3.8. It was very handy, because a high number of back-to-back CPs corresponded to performance issues we had.

TIA,

Fred
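One approach that does not depend on the management console is to save sysstat -x output to a file and scan it for back-to-back CPs and high disk utilization. A minimal Python sketch follows. It assumes that "CP ty" and "Disk util" are the 15th and 16th whitespace-separated fields of each data line, and that back-to-back CPs appear as a leading B or b in "CP ty". The field positions and the 80% threshold are illustrative assumptions; column layout varies between Data ONTAP releases, so check both against the header of your own output.

```python
from dataclasses import dataclass

# Assumed zero-based positions of "CP ty" and "Disk util" after splitting a
# data line on whitespace. Verify against your own sysstat -x header.
CP_TYPE_COL = 14
DISK_UTIL_COL = 15


@dataclass
class Sample:
    cp_type: str
    disk_util: int


def parse_line(line, cp_col=CP_TYPE_COL, util_col=DISK_UTIL_COL):
    """Return a Sample for a data line, or None for headers and blank lines."""
    fields = line.split()
    if len(fields) <= max(cp_col, util_col):
        return None
    util = fields[util_col].rstrip("%")
    if not util.isdigit():
        return None  # header text or an unexpected layout
    return Sample(cp_type=fields[cp_col], disk_util=int(util))


def flag_samples(lines, util_threshold=80):
    """Yield samples showing a back-to-back CP or high disk utilization."""
    for line in lines:
        sample = parse_line(line)
        if sample is None:
            continue
        back_to_back = sample.cp_type[:1] in ("B", "b")
        if back_to_back or sample.disk_util >= util_threshold:
            yield sample
```

To use it, pass any iterable of lines, for example an open file holding captured output, to flag_samples and print or count what it yields.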


Suresh Rajagopalan

22 Jan

6:25 a.m.

New subject: Loop A failure not triggering failover

We have an active/active setup on our filers, with standard loop A/loop B cabling (no multipath HA).

We recently had an event in which an intermittent failure of loop A did not trigger a failover to the partner. I'd like to know why. According to the NetApp failover cause-and-effect document at

http://now.netapp.com/NOW/knowledge/docs/ontap/rel727/html/ontap/cluster/failing_over/reference/r_oc_fo_failover-events.html

this event should have caused a failover.

The log message from the filer on loop A was:

Sun Jan 17 15:41:56 PST [netapp1: fci.link.break:error]: Link break detected on Fibre Channel adapter 0e.

Is there an option or timeout setting that would make the failover happen?

Thanks

Suresh

LOhit

7:09 a.m.

New subject: Loop A failure not triggering failover

Hi Suresh,

I think a failover should have happened when the loop failed. The following is taken from the ONTAP docs:

How disk shelf comparison takeover works

Describes the way a node uses disk shelf comparison with its partner node to determine if it is impaired.

When communication between nodes is first established through the cluster interconnect adapters, the nodes exchange a list of disk shelves that are visible on the A and B loops of each node. If, later, a system sees that the B loop disk shelf count on its partner is greater than its local A loop disk shelf count, the system concludes that it is impaired and prompts its partner to initiate a takeover.

Note: Disk shelf comparison does not function for active/active configurations using software-based disk ownership, or for fabric-attached MetroClusters.

options cf.takeover.detection.seconds number_of_seconds

(But I think this affects only cluster interconnect timeouts, not the loop failure.)

The valid values for number_of_seconds are 10 through 180; the default is 15.

Attention: If the specified time is less than 15 seconds, unnecessary takeovers can occur, and a core might not be generated for some system panics. Use caution when assigning a takeover time of less than 15 seconds.
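The documented limits above can be checked before a value is applied. This is a small Python sketch, not anything shipped with Data ONTAP; the function name and messages are made up for illustration, and only the range (10 through 180), the default (15), and the caution about values below 15 come from the documentation quoted above.

```python
MIN_SECONDS = 10      # lowest documented value
MAX_SECONDS = 180     # highest documented value
DEFAULT_SECONDS = 15  # documented default


def check_detection_seconds(value):
    """Return (is_valid, message) for a proposed
    cf.takeover.detection.seconds value (hypothetical helper)."""
    if not MIN_SECONDS <= value <= MAX_SECONDS:
        return False, f"{value} is outside the valid range {MIN_SECONDS}-{MAX_SECONDS}"
    if value < DEFAULT_SECONDS:
        # The docs warn about unnecessary takeovers and missing cores here.
        return True, f"{value} is valid but below {DEFAULT_SECONDS}; use caution"
    return True, f"{value} is valid"
```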


-- LOhit

Coatney, Sue

10:01 p.m.

New subject: Loop A failure not triggering failover

The cf.takeover.on_disk_shelf_miscompare option needs to be turned on for takeover to happen when a disk shelf mis-compare happens.

Sue Coatney
High Availability Team
NetApp
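Putting the documented comparison rule together with the option named above, the decision can be modeled roughly as follows. This Python sketch is only a reading of the text in this thread, not Data ONTAP's actual logic; the class and field names are hypothetical, and takeover_on_miscompare simply stands in for the cf.takeover.on_disk_shelf_miscompare setting.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """What one node knows when it compares shelf counts (hypothetical model)."""
    local_a_loop_shelves: int
    partner_b_loop_shelves: int
    takeover_on_miscompare: bool   # models cf.takeover.on_disk_shelf_miscompare
    software_disk_ownership: bool = False
    fabric_metrocluster: bool = False


def comparison_applies(view):
    """Per the documentation note, the comparison does not function with
    software-based disk ownership or fabric-attached MetroClusters."""
    return not (view.software_disk_ownership or view.fabric_metrocluster)


def is_impaired(view):
    """A node considers itself impaired if its partner sees more shelves on
    the B loop than it sees on its own A loop."""
    return (comparison_applies(view)
            and view.partner_b_loop_shelves > view.local_a_loop_shelves)


def should_request_takeover(view):
    """A mis-compare leads to takeover only when the option is turned on."""
    return view.takeover_on_miscompare and is_impaired(view)
```

Under this model a node with software-based disk ownership never requests a takeover from a shelf mis-compare, whatever the option is set to.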


Suresh Rajagopalan

23 Jan

1:34 a.m.

New subject: Loop A failure not triggering failover

We use a 6030 series with software disk ownership. According to the documentation LOhit quoted, disk shelf comparison, and therefore this option, does not apply to us. Is that right?

Please note that this was an intermittent loop A failure, not a complete failure. Loop A kept going up and down, and we had no failover during this period.

Suresh
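To show how often the loop flapped, the link-break messages can be counted per adapter from a saved copy of the filer's messages log. The Python sketch below matches only the fci.link.break message format shown earlier in this thread; the function name is hypothetical, and other releases may word the message differently.

```python
import re
from collections import Counter

# Matches lines such as:
# Sun Jan 17 15:41:56 PST [netapp1: fci.link.break:error]: Link break
# detected on Fibre Channel adapter 0e.
LINK_BREAK = re.compile(
    r"\[(?P<host>[^:\]]+): fci\.link\.break:error\]: "
    r"Link break detected on Fibre Channel adapter (?P<adapter>\w+)\."
)


def count_link_breaks(lines):
    """Count link-break events per (host, adapter) pair."""
    counts = Counter()
    for line in lines:
        match = LINK_BREAK.search(line)
        if match:
            counts[(match.group("host"), match.group("adapter"))] += 1
    return counts
```

A high count for a single adapter over a short period is consistent with an intermittent loop failure rather than a single clean break.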

