Hello list,
First post here … my role w/NTAP in the last three years has been with the OnCommand Performance Manager (OPM) program.
For your first question: OPM 1.0 had many limitations; OPM 1.1 is the GA version and has many improvements in the troubleshooting analysis page. I can help you figure out 1.0, but we’ve fixed many of its confusing issues in the 1.1 release. We also have OPM 2.0 in the works, coming out within weeks. I suggest you switch to 2.0 once it ships. It’s a much improved product.
OPM 2.0 calculates an Aggregate and Node utilization metric, shows it front and center in the dashboard and allows you to set alert thresholds. All metrics in OPM 2.0 are hyperlinked with an explorer interface.
As for figuring out what is pushing an aggregate, again there are improvements in 1.1 to help with that and 2.0 is MUCH BETTER (and yes I know I’m speaking loudly).
All the best,
Joseph "Yossi" Weihs
Sr. Manager, Product Management
Manageability Products Group
NetApp
“Simplicity is complexity resolved” - Constantin Brâncuși<http://en.wikipedia.org/wiki/Constantin_Br%C3%A2ncu%C8%99i>
From: toasters-bounces(a)teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Basil
Sent: Thursday, June 25, 2015 8:51 AM
To: toasters(a)teaparty.net
Subject: OnCommand Performance Manager events
I'm using 1.0.0 r2 with a CDOT cluster running 8.2.3. When I look at an event, it says things like "3 victim volumes slow due to 5 bully volumes causing contention on aggr_SAS". When I drill in, I can see a list of volumes with their latency at that period, but no indication whether they're being considered victims or bullies. Has anyone had any experience with this software?
Another question- is there a way to look directly at an aggregate's utilization? If I drill into a volume, I can see the aggregate with "break down data by - components - disk operations", but that doesn't show me the aggregate itself. Just the contribution of this volume to the aggregate. When I look into an event, the aggregate utilization is listed, but I can't get to there on its own page with a controllable graph.
Lastly, if I know an aggregate is being pushed to 100% but I can't see any volumes pushing harder, is there a way I can check whether system work is responsible? My first thought was that it was dedupe, but that's scheduled to kick off with the lowest possible priority about two hours after this event occurred.
Cheers!
Basil
Hi,
I've been configuring aggregates on a 6-node cluster today and managed to
assign a disk to a node that the disk is not "cabled" to, i.e. it sits in a
disk shelf that is managed by a different HA pair:
na201::> disk show -container-type unknown
                     Usable           Disk    Container   Container
Disk                   Size Shelf Bay Type    Type        Name      Owner
---------------- ---------- ----- --- ------- ----------- --------- --------
6.45.16                   -    45  16 SAS     unknown     -         na201node-3a
The disk is "cabled" to na201node-1a, so I'm trying to change the ownership:
na201::> storage disk assign -disk 6.45.16 -owner na201node-1a -force
Error: command failed: Failed to assign disks. Reason: Disk 6.45.16 does not exist.
OK, this time trying to remove the ownership:
na201::> storage disk removeowner -disk 6.45.16 -force
Warning: Disks may be automatically assigned to the node because the disk's
auto-assign option is enabled. If the affected volumes are not offline, the
disks may be auto-assigned during the remove owner operation, which will cause
unexpected results. To verify that the volumes are offline, abort this
command and use "volume show".
Do you want to continue? {y|n}: y
Error: command failed: Disk 6.45.16 is not connected to na201node-3a
I'm out of ideas at this point .. any hints?
Cheers,
Perhaps a slightly daft question, but I can't find the answer rummaging.
Someone's just asked if there's a problem with going past 16-bit UIDs
(>65535).
Is anyone able to point me in the right direction?
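
For illustration only, here is a small Python sketch of what the 16-bit limit in the question means arithmetically. A 16-bit unsigned field holds 0 through 65535, so any code path that stored a UID in such a field would keep only the low 16 bits. Whether any particular filer, client, or protocol actually does this is not stated in the thread; the function names and sample UIDs below are hypothetical.

```python
UID16_MAX = 2**16 - 1  # 65535, the largest value a 16-bit unsigned field can hold


def truncate_to_16_bits(uid: int) -> int:
    """Return what survives if a UID is stored in a 16-bit unsigned field."""
    if uid < 0:
        raise ValueError("UIDs are non-negative")
    return uid & 0xFFFF  # keep only the low 16 bits


def fits_in_16_bits(uid: int) -> bool:
    """True when the UID round-trips through a 16-bit field unchanged."""
    return truncate_to_16_bits(uid) == uid


if __name__ == "__main__":
    # Sample values chosen arbitrarily for the demonstration.
    for uid in (1000, 65535, 65536, 70000):
        print(uid, fits_in_16_bits(uid), truncate_to_16_bits(uid))
```

The practical risk this models is aliasing: two distinct UIDs above 65535 could collapse onto the same small value if any component in the path truncates them.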
Yes, and when I'm just looking to see what just happened, that's where I
go. When I'm looking to have something light up at the command center to
generate an off-hours page, though, we use specific SNMP software (HP
Openview, to be specific) and we try to use it whenever possible.
On Mon, Jun 22, 2015 at 3:04 PM, Waltham, Christopher <
Christopher.Waltham(a)netapp.com> wrote:
> Can you use syslog (via EMS) for those instead?
>
> From: <toasters-bounces(a)teaparty.net> on behalf of "
> basilberntsen(a)gmail.com" <basilberntsen(a)gmail.com>
> Date: Monday, June 22, 2015 at 3:01 PM
> To: Steve Francis <sfrancis(a)logicmonitor.com>
> Cc: "toasters(a)teaparty.net" <toasters(a)teaparty.net>
>
> Subject: Re: snmp and cdot 8.3
>
> We use snmp for environmental things like failovers, power events,
> hardware events, etc.
>
> Sent from my BlackBerry 10 smartphone on the Bell network.
> *From: *Steve Francis
> *Sent: *Monday, June 22, 2015 2:55 PM
> *To: *Basil
> *Cc: *Jordan Slingerland; toasters(a)teaparty.net
> *Subject: *Re: snmp and cdot 8.3
>
> The CDOT MIB doesn't expose most of the things you care to monitor.
> 7-mode did, with the exception of things like volume or aggregate latency,
> but for CDOT, almost everything you want to know is only accessible via the
> API. Very little SNMP support.
>
> On Mon, Jun 22, 2015 at 9:59 AM, Basil <basilberntsen(a)gmail.com> wrote:
>
>> There's a MIB file for the CDOT box, as well as one for oncommand unified
>> manager. The one for CDOT is backward compatible to the 7-mode version- we
>> have our 7-mode systems and CDOT systems using the same MIB file.
>>
>> On Mon, Jun 22, 2015 at 12:51 PM, Jordan Slingerland <
>> Jordan.Slingerland(a)independenthealth.com> wrote:
>>
>>> In 7m you can find the mib in /etc/mib.
>>>
>>>
>>>
>>> I do a lot of snmp monitoring of 7 mode but not cm. Here are some oid’s
>>> I use in 7m polling. (not trap based)
>>>
>>>
>>>
>>>
>>>
>>> #Cp type oids
>>>
>>> cpFromTimeroid=.1.3.6.1.4.1.789.1.2.6.2.0
>>>
>>> cpFromSnapshotoid=.1.3.6.1.4.1.789.1.2.6.3.0
>>>
>>> cpFromLowWateroid=.1.3.6.1.4.1.789.1.2.6.4.0
>>>
>>> cpFromHighWateroid=.1.3.6.1.4.1.789.1.2.6.5.0
>>>
>>> cpFromLogFulloid=.1.3.6.1.4.1.789.1.2.6.6.0
>>>
>>> cpFromCpoid=.1.3.6.1.4.1.789.1.2.6.7.0
>>>
>>> cpTotaloid=.1.3.6.1.4.1.789.1.2.6.8.0
>>>
>>> cpFromFlushoid=.1.3.6.1.4.1.789.1.2.6.9.0
>>>
>>> cpFromSyncoid=.1.3.6.1.4.1.789.1.2.6.10.0
>>>
>>>
>>>
>>> #general performance
>>>
>>>
>>>
>>> cpuidleoid=.1.3.6.1.4.1.789.1.2.1.5.0
>>>
>>> cifsopsoid=.1.3.6.1.4.1.789.1.7.3.1.1.1.0
>>>
>>> nfssopsoid=.1.3.6.1.4.1.789.1.2.2.1.0
>>>
>>> cifsreadsoid=.1.3.6.1.4.1.789.1.7.3.1.1.5.0
>>>
>>> cifswritesoid=.1.3.6.1.4.1.789.1.7.3.1.1.6.0
>>>
>>>
>>>
>>> #failed components
>>>
>>> ##should be zero
>>>
>>> diskFailedCountoid=.1.3.6.1.4.1.789.1.6.4.7.0
>>>
>>> envOverTemperatureoid=.1.3.6.1.4.1.789.1.2.4.1.0
>>>
>>> failedFanCountoid=.1.3.6.1.4.1.789.1.2.4.2.0
>>>
>>> failedPowerSupplyoid=.1.3.6.1.4.1.789.1.2.4.4.0
>>>
>>>
>>>
>>>
>>>
>>> #system Status
>>>
>>> ##global status of 3 is normal
>>>
>>> GlobalStatusoid=.1.3.6.1.4.1.789.1.2.2.4.0
>>>
>>> ##interconnect status of 4 is up
>>>
>>> InterconnectStatusoid=.1.3.6.1.4.1.789.1.2.3.8.0
>>>
>>> #partner status of 2 is okay
>>>
>>> PartnerStatusoid=.1.3.6.1.4.1.789.1.2.3.4.0
>>>
>>> #Failover state of 2 is good. Means ready to take over if needed, but
>>> active-active
>>>
>>> FailoverStateoid=.1.3.6.1.4.1.789.1.2.3.2.0
>>>
>>> #nvram status of 1 means good
>>>
>>> NvramBatteryStatusoid=.1.3.6.1.4.1.789.1.2.5.1.0
>>>
>>>
>>>
>>> #ambient temp in C
>>>
>>> ambienttmpoid=.1.3.6.1.4.1.789.1.21.1.2.1.25.1
>>>
>>>
>>>
>>> ##advanced NFS stats
>>>
>>> nfsv3cRenamesoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.15.0
>>>
>>> nfsv3cAccessesoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.5.0
>>>
>>> nfsv3cCreatesoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.9.0
>>>
>>> nfsv3cGetattrsoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.2.0
>>>
>>> nfsv3cLinksoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.16.0
>>>
>>> nfsv3cLookupsoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.4.0
>>>
>>> nfsv3cMkdirsoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.10.0
>>>
>>> nfsv3cNullsoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.1.0
>>>
>>> nfsv3cReadCallsoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.7.0
>>>
>>> nfsv3cReaddirPlussoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.18.0
>>>
>>> nfsv3cReadDirsoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.17.0
>>>
>>> nfsv3cRemovesoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.13.0
>>>
>>> nfsv3cSymlinksoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.11.0
>>>
>>> nfsv3cWritesoid=.1.3.6.1.4.1.789.1.3.1.2.4.1.8.0
>>>
>>>
>>>
>>> snapmirrorWrittenBytesoid=.1.3.6.1.4.1.789.1.9.10.0
>>>
>>> snapmirrorReadBytesoid=.1.3.6.1.4.1.789.1.9.11.0
>>>
>>> snapmirrorActiveDstNumberoid=.1.3.6.1.4.1.789.1.9.12.0
>>>
>>> snapmirrorActiveSrcNumberoid=.1.3.6.1.4.1.789.1.9.13.0
>>>
>>> snapmirrorFilerTotalDstSuccessesoid=.1.3.6.1.4.1.789.1.9.14.0
>>>
>>> snapmirrorFilerTotalSrcSuccessesoid=.1.3.6.1.4.1.789.1.9.15.0
>>>
>>> snapmirrorFilerTotalSrcFailuresoid=.1.3.6.1.4.1.789.1.9.16.0
>>>
>>> snapmirrorFilerTotalDstFailuresoid=.1.3.6.1.4.1.789.1.9.17.0
>>>
>>> snapmirrorFilerTotalDstDefermentsoid=.1.3.6.1.4.1.789.1.9.18.0
>>>
>>>
>>>
>>>
>>>
>>> *From:* toasters-bounces(a)teaparty.net [mailto:
>>> toasters-bounces(a)teaparty.net] *On Behalf Of *Sayla, Mustafa
>>> *Sent:* Monday, June 22, 2015 12:31 PM
>>> *To:* toasters(a)teaparty.net
>>> *Subject:* snmp and cdot 8.3
>>>
>>>
>>>
>>> Does anyone have experience with monitoring CDOT 8.3 with SNMP? I have
>>> configured the CDOT system and configured the SNMP trap host and community
>>> string, yet we are not receiving traps. We were told by our third-party
>>> vendor SolarWinds that NetApp does not use standard SNMP OIDs. We are not
>>> using SNMP v3.
>>>
>>>
>>>
>>> Mustafa Sayla
>>>
>>> *Visit us on the Web at mesirowfinancial.com
>>> <http://mesirowfinancial.com>*
>>>
>>> *This communication may contain privileged and/or confidential
>>> information. It is intended solely for the use of the addressee. If you are
>>> not the intended recipient, you are strictly prohibited from disclosing,
>>> copying, distributing or using any of this information. If you received
>>> this communication in error, please contact the sender immediately and
>>> destroy the material in its entirety, whether electronic or hard copy.
>>> Confidential, proprietary or time-sensitive communications should not be
>>> transmitted via the Internet, as there can be no assurance of actual or
>>> timely delivery, receipt and/or confidentiality. This is not an offer, or
>>> solicitation of any offer to buy or sell any security, investment or other
>>> product.*
>>>
>>> _______________________________________________
>>> Toasters mailing list
>>> Toasters(a)teaparty.net
>>> http://www.teaparty.net/mailman/listinfo/toasters
>>>
>>>
>>
>>
>>
>
>
> --
> *Steve Francis* | Chief Product Officer
> steve(a)logicmonitor.com
> 805 698 0770
>
> <http://www.LogicMonitor.com>
> *Cloud-based performance monitoring*
>
> * <https://twitter.com/SteveFrancisLM> *
>
>
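
The status OIDs quoted in the thread above come with expected "healthy" values in their comments (global status 3, interconnect 4, partner 2, failover 2, NVRAM battery 1, failure counts 0). A minimal Python sketch of how polled values could be checked against those expectations follows. It assumes the readings have already been fetched by some SNMP poller and are passed in as a plain dictionary; the function names are hypothetical, the expected values are taken from the thread as written and have not been checked against the MIB, and the thread itself notes these OIDs were used for 7-mode polling.

```python
from typing import Dict, List

# OID -> (label, healthy value), copied from the comments in the quoted list.
# These values are as stated in the thread, not verified against the MIB.
HEALTHY = {
    ".1.3.6.1.4.1.789.1.2.2.4.0": ("GlobalStatus", 3),
    ".1.3.6.1.4.1.789.1.2.3.8.0": ("InterconnectStatus", 4),
    ".1.3.6.1.4.1.789.1.2.3.4.0": ("PartnerStatus", 2),
    ".1.3.6.1.4.1.789.1.2.3.2.0": ("FailoverState", 2),
    ".1.3.6.1.4.1.789.1.2.5.1.0": ("NvramBatteryStatus", 1),
    ".1.3.6.1.4.1.789.1.6.4.7.0": ("diskFailedCount", 0),
    ".1.3.6.1.4.1.789.1.2.4.2.0": ("failedFanCount", 0),
    ".1.3.6.1.4.1.789.1.2.4.4.0": ("failedPowerSupply", 0),
}


def normalize_oid(oid: str) -> str:
    """Accept OIDs written with or without the leading dot."""
    oid = oid.strip()
    return oid if oid.startswith(".") else "." + oid


def find_problems(readings: Dict[str, int]) -> List[str]:
    """Compare polled integer values with the healthy values listed above."""
    problems = []
    for raw_oid, value in readings.items():
        oid = normalize_oid(raw_oid)
        if oid not in HEALTHY:
            continue  # not one of the status OIDs this sketch knows about
        label, healthy = HEALTHY[oid]
        if value != healthy:
            problems.append(f"{label}: got {value}, expected {healthy}")
    return problems
```

A poller would call `find_problems` once per cycle and raise an alert when the returned list is non-empty; counters such as the cp and NFS statistics need rate calculations instead and are deliberately left out.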
What are the recommendations for the "best" way to preserve a copy of the
data on a filer for future use?
Background: A company has ceased operations and wants to preserve copies
of its data "just in case someone wants it in the future". All the
interesting data has been copied to a single "archive" filer. There
are about eight volumes using about 50TB (with dedupe but no
compression). This will be backup copy #1. The other filers that
previously held the data are being sold and are scheduled to be wiped.
The data is a mix of virtual machines and file shares.
Now I'm trying to see if there is a way to make another copy of the data
for safekeeping at a reasonable cost. Speed of restore would not be high
on the feature list of this secondary archive.
One of the first thoughts is to simply dump the data onto an AWS volume for
"cold storage". Sane? Tips/tricks? Data presumably gets scrubbed to prevent
bit rot. Conveniently available just about anywhere. Has a recurring cost
but avoids some upfront cost.
The second thought is to find a relatively inexpensive array of disks and
dump the data to that. The most likely benefit is that it is the fastest to
write to. Maybe cheapest, but in the end all the data is on non-spinning
disks without verification.
Tapes? I don't like tapes. I don't know if a working tape system is
available, but if there were, is this a good case for it? Is tape better
than disk for static storage?
Thanks.
arnold
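
Whichever medium is chosen, the concern above about data sitting "without verification" can be partly addressed by keeping a checksum manifest next to the archive. The following is a minimal Python sketch under stated assumptions: the archive is reachable as an ordinary directory tree, SHA-256 is an acceptable digest, and the function names are hypothetical. It is not a feature of any filer or cloud service mentioned in the post.

```python
import hashlib
import os
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large VM images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            manifest[full.relative_to(root).as_posix()] = sha256_of(full)
    return manifest


def verify(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Report files that are missing or whose contents no longer match."""
    problems = []
    for rel, expected in sorted(manifest.items()):
        full = root / rel
        if not full.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(full) != expected:
            problems.append(f"changed: {rel}")
    return problems
```

The manifest would be built once on the source copy, stored with each archive copy, and re-checked with `verify` after every transfer and at intervals afterwards. This detects silent corruption but does not repair it, which is one argument for keeping more than one copy.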
Hello,
We've recently acquired a 2 node FAS8020 cluster running Clustered
DataONTAP 8.2. We inherited a rather unusual/silly network topology, in
which we have two subnets (a publicly routable one and an RFC1918 one
for internal use only) running on the same VLAN. In other words, there's
no router between these two subnets.
In Linux, we're able to get boxes on the publicly routable subnet to
communicate with the internal subnet (10.1.0.0/16) as follows:
route add -net 10.1.0.0 netmask 255.255.0.0 dev eth0
This command tells the Linux box that it can send packets to 10.1.0.0/16
hosts out eth0 without going through an intervening router. However,
we've yet to find a way to do this with Clustered DataONTAP. We've played
around with routing groups and routes, but we cannot figure out a way
to tell Clustered DataONTAP to route through an interface. Does anyone
know if this is possible? NetApp has advised us to set up a router
between our two subnets. We share NetApp's concern over failover.
We'll set up a router if we can't figure out any other way to make this
work.
Thanks
John
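
To make the Linux behaviour described in this post concrete, here is a small Python sketch of a route lookup in which one entry has no gateway (an "on-link" route, as created by `route add ... dev eth0`). It only models the lookup logic; it says nothing about what Clustered DataONTAP supports. The `Route` type, the function names, and the default gateway 192.0.2.1 (a documentation address) are hypothetical.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    network: ipaddress.IPv4Network
    gateway: Optional[ipaddress.IPv4Address]  # None = on-link, no router involved
    device: str


def select_route(table: List[Route], destination: str) -> Optional[Route]:
    """Pick the matching route with the longest prefix, or None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    candidates = [route for route in table if dest in route.network]
    return max(candidates, key=lambda route: route.network.prefixlen, default=None)


def describe_next_hop(table: List[Route], destination: str) -> str:
    """Say whether a destination is reached directly or through a gateway."""
    route = select_route(table, destination)
    if route is None:
        return f"{destination}: no route"
    if route.gateway is None:
        return f"{destination}: on-link via {route.device}"
    return f"{destination}: via gateway {route.gateway} on {route.device}"


# The netmask form from the post is the same network as 10.1.0.0/16.
INTERNAL = ipaddress.ip_network("10.1.0.0/255.255.0.0")

TABLE = [
    # Hypothetical default route through a documentation-range gateway.
    Route(ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("192.0.2.1"), "eth0"),
    # Equivalent of: route add -net 10.1.0.0 netmask 255.255.0.0 dev eth0
    Route(INTERNAL, None, "eth0"),
]

if __name__ == "__main__":
    print(INTERNAL)
    print(describe_next_hop(TABLE, "10.1.2.3"))
    print(describe_next_hop(TABLE, "198.51.100.7"))
```

The on-link entry wins for 10.1.x.x addresses because its /16 prefix is longer than the default route's /0, so the host resolves the destination directly on the shared VLAN instead of handing the packet to a router.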
I recently upgraded one of my 7-mode controllers from 8.1P1 to 8.2.3P3 and ran into an issue during the second controller upgrade. I installed the upgrade version on both controllers, then performed the cf takeover from controller B to controller A, and it went fine. I then waited about 10 minutes and did the cf giveback, and that also went fine. After waiting another 10 minutes I tried the cf takeover from controller A to controller B and got the version mismatch error. I then issued cf takeover -n, and at that point controller A did not take over controller B but just rebooted controller B. This caused downtime for our VMware farm, as servers lost their connection to the datastore on controller B. This is with the NFS protocol. I only see this error in the logs:
Fri May 29 15:53:59 CDT [rddata1b: cf_takeover: kern.rc.msg:notice]: vlan: interface creation failed due to internal errors Partner vlan create command failed
This explains why the takeover did not work, but it does not explain why this happened in the first place. I have worked with NetApp and validated that the rc file is correct, and we have done failover and failback in the past without any issue. I also did a failover and failback after this to rule out a configuration issue, but could not recreate the problem. I need to upgrade 3 more HA pairs, and I want to know whether anyone has seen this issue before and whether it has anything to do with the upgrade.
Thank you
Mustafa Sayla
cifs.smb2.client.enable
This option enables SMB 2.0 client capability on the Filer. When this
option is enabled, Filer-initiated connections to Windows servers use
the SMB 2.0 protocol. If the Windows server does not support the SMB
2.0 protocol, the Filer uses SMB 1.0. If a session was established
over SMB 2.0 and then this option is disabled, existing sessions are
not terminated. The Filer continues to use SMB 2.0 for the existing
sessions; new sessions do not use SMB 2.0.
Probably missing something obvious -- but would someone describe to me
a scenario in which the NetApp would act as a CIFS client?
Thanks,
Ray
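
The option text quoted in this post describes three rules: use SMB 2.0 for Filer-initiated connections when the option is on, fall back to SMB 1.0 when the server lacks SMB 2.0, and leave existing sessions alone when the option is later turned off. A toy Python model of just those rules follows. It is a sketch of the documented behaviour only, not of the filer's implementation; the class, method names, and server names are hypothetical, and it assumes that sessions opened while the option is off use SMB 1.0.

```python
class FilerSmbClientModel:
    """Toy model of the cifs.smb2.client.enable rules quoted above."""

    def __init__(self, smb2_client_enabled: bool) -> None:
        self.smb2_client_enabled = smb2_client_enabled
        self.sessions = {}  # server name -> dialect negotiated at connect time

    def connect(self, server: str, server_supports_smb2: bool) -> str:
        """Open a Filer-initiated session and record which dialect it uses."""
        if self.smb2_client_enabled and server_supports_smb2:
            dialect = "SMB 2.0"
        else:
            # Fallback stated in the option text; also assumed when the option is off.
            dialect = "SMB 1.0"
        self.sessions[server] = dialect
        return dialect

    def set_option(self, enabled: bool) -> None:
        """Changing the option affects only sessions opened afterwards."""
        self.smb2_client_enabled = enabled


if __name__ == "__main__":
    filer = FilerSmbClientModel(smb2_client_enabled=True)
    print(filer.connect("dc1", server_supports_smb2=True))
    print(filer.connect("old-server", server_supports_smb2=False))
    filer.set_option(False)
    print(filer.sessions["dc1"])  # the existing session keeps its dialect
    print(filer.connect("dc2", server_supports_smb2=True))
```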