toasters

toasters@lists.teaparty.net

1 participants
13532 discussions

nfsiostat and client NFS connections
by Randy Rue 08 Feb '12

Hello All,

Trying to troubleshoot performance problems on our v3170. This is not specifically a NetApp question, but if nothing else, the tool I'm using appears to have been written by a NetApp engineer. Does that count?

I'm running nfsiostat on a handful of impacted clients on another network and uploading the results to a timestamped log file served on a web server so I can reach them. I have a script that parses those pages, takes the last timestamped entry from each file (one file for each client), and tallies up a) the average RTT for all mounts to the vFiler and b) the worst single instance.

Now that I have it all working, the numbers don't behave like I think they should. The clients are adding an entry to each file every minute. I can see that from the timestamps before each entry. But the numbers don't change, or they change by a few hundredths of a ms. My gather script returns an average that varies only by .01 ms or so, and my "worst" figure has been steady for an hour now.

We're running nfsiostat with no arguments. Am I getting real-time readings or some average? If average, since when? Is there a way to get real-time?

Hope to hear from you,

Randy Rue
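One possible explanation, offered as an assumption and not confirmed anywhere in this thread: if each logged figure is an average over counters that accumulate from mount time, a fresh entry every minute will barely move it, however bad the last minute was. The sketch below shows how a per-interval average could be derived from two such cumulative snapshots. The `Snapshot` fields and the sample numbers are hypothetical and are not taken from real nfsiostat output.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Cumulative counters since mount (hypothetical field names)."""
    ops: int             # operations completed since mount
    total_rtt_ms: float  # summed round-trip time since mount, in ms


def cumulative_avg_rtt(s: Snapshot) -> float:
    """Average RTT over the whole lifetime of the mount."""
    return s.total_rtt_ms / s.ops if s.ops else 0.0


def interval_avg_rtt(prev: Snapshot, curr: Snapshot) -> float:
    """Average RTT for only the operations between two snapshots."""
    delta_ops = curr.ops - prev.ops
    if delta_ops <= 0:
        return 0.0  # no new operations (or counters were reset)
    return (curr.total_rtt_ms - prev.total_rtt_ms) / delta_ops


# Made-up numbers: a long-lived mount averaging 2 ms, then one bad minute
# in which 1000 operations each took 50 ms.
prev = Snapshot(ops=1_000_000, total_rtt_ms=2_000_000.0)
curr = Snapshot(ops=1_001_000, total_rtt_ms=2_050_000.0)

print(round(cumulative_avg_rtt(curr), 3))  # lifetime average barely moves
print(interval_avg_rtt(prev, curr))        # the bad minute stands out
```

With these made-up numbers the lifetime average moves by only a few hundredths of a millisecond while the interval average jumps, which resembles the behavior described above. Whether that is really what is happening depends on how the tool is invoked, so the nfsiostat man page is worth checking for its interval argument.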

9 16

Re: FlexShare
by Jack Lyons 08 Feb '12

I had an SE say the same thing and stayed away from it for the same reason. We have an older filer that gets pounded in the evenings, and I was wondering if anyone had a big success story.

Thanks
Jack

------Original Message------
From: Scott Eno
To: Jack Lyons
Cc: toasters(a)teaparty.net
Subject: Re: FlexShare
Sent: Feb 8, 2012 9:06 AM

Hi Jack,

I have done some small testing with it, but I shied away from implementing it when a NetApp instructor said that to get the best results you have to apply it to ALL the volumes on the filer hosting the particular volume(s) you are looking to improve performance on. Not sure if that's true or not, but it was his suggestion.

I figured if that were the case I would be spending a lot of time tweaking all volumes on a filer just to possibly squeeze a bit more out of one or two.

---
Scott Eno
(First time poster, long time reader)

On Feb 8, 2012, at 7:20 AM, Jack Lyons <jack1729(a)gmail.com> wrote:

> Has anyone used FlexShare and had extremely positive results? It seems
> that the improvement would be minor and misconfiguration could lead to
> no improvement for intended volumes and degradation for others. From
> reading, it seems that you need to assign priorities to each volume,
> otherwise the volumes in the default queue will experience degradation.
>
> Would anyone like to share details of how they implemented FlexShare and
> the impact it had on the performance issue they were working to resolve?
>
> Would also like to know if anyone has tried to 'schedule FlexShare'
> settings using rsh or API, e.g. we want to give higher priorities to
> certain volumes from 6pm to 11pm nightly.
>
> Thanks
> Jack
> _______________________________________________
> Toasters mailing list
> Toasters(a)teaparty.net
> http://www.teaparty.net/mailman/listinfo/toasters

Sent from my Verizon Wireless BlackBerry
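The scheduling idea in the original question (higher priority for certain volumes from 6pm to 11pm) could be sketched roughly as below. This is only an illustration under several assumptions: the filer name, the volume names, and the High/Medium levels are hypothetical, and the `priority set volume ... level=...` form is the 7-mode FlexShare syntax as I recall it, so verify it against the priority man page on your own release before relying on it. The script only builds the rsh command lines; it does not run them.

```python
from datetime import time

# Hypothetical values for illustration only.
FILER = "filer1"
EVENING_VOLUMES = ["vol_home", "vol_db"]
WINDOW_START, WINDOW_END = time(18, 0), time(23, 0)  # 6pm to 11pm


def level_for(now: time) -> str:
    """Pick a priority level depending on whether we are inside the window."""
    return "High" if WINDOW_START <= now < WINDOW_END else "Medium"


def build_commands(now: time) -> list[list[str]]:
    """Build one rsh argument list per volume (commands are not executed)."""
    level = level_for(now)
    return [
        # Assumed FlexShare syntax; check it on your Data ONTAP version.
        ["rsh", FILER, "priority", "set", "volume", vol, f"level={level}"]
        for vol in EVENING_VOLUMES
    ]


for cmd in build_commands(time(19, 30)):
    print(" ".join(cmd))
```

Run from cron at the start and end of the window, a script like this would flip the levels twice a day. Passing the argument lists to `subprocess.run` would be the natural next step once the command syntax has been confirmed.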

2 1

vfiler migrate
by Randy Rue 08 Feb '12

A co-worker of mine already posted on this but we're coming up with more questions.

We successfully tested the migration of a test vFiler from one node to the other and back. It took 3-4 minutes each way for a vFiler with one aggregate and two "disks" (it's actually got a 3Par SAN on the back end).

Then we tried a vFiler with two aggregates of 14 and 15 "disks." It went from A to B with no trouble and took closer to six minutes. When moving it back from B to A it blew up. We got the following at the CLI:

    tungsten-a> vfiler migrate -m nocopy fhdata@tungsten-b
    tungsten-b's Administrative login: root
    tungsten-b's Administrative password:
    Stopping remote vfiler....
    vfiler migrate: Aggregate 'fhdata_aggr1' has failed and cannot be brought online.
    Problem bringing aggr fhdata_aggr1 online: CR_VOL_FAILED
    vfiler migrate: No aggregate named 'fhdata_aggr2' exists.
    Couldn't find aggr fhdata_aggr2 in order to bring it online
    vfiler migrate: No volume named 'fhdata_root' exists.
    Couldn't find flex-volume fhdata_root in order to bring it online
    vfiler migrate: No volume named 'fhdata_shared' exists.
    Couldn't find flex-volume fhdata_shared in order to bring it online
    vfiler migrate: No volume named 'fhdata_viddshared' exists.
    Couldn't find flex-volume fhdata_viddshared in order to bring it online
    Problem bringing vfiler fhdata volumes online.
    tungsten-a>
    tungsten-a> vfiler status -r fhdata
    Vfiler not found: fhdata
    tungsten-a>

At first there was no sign of the vFiler, aggregates or volumes on either A or B. After a few minutes, part of one aggregate was visible on A. A few minutes after that we could see both aggregates and all disks. We brought the aggregates online and then all the volumes. The vFiler is still gone, with no sign of it on either node.

Does the vfiler migrate command support vFilers with more than one aggregate? Note that all the volumes for the vFiler live on two aggregates and no other volumes are on those aggregates.

While we're on the subject, does anyone know where the configs for a vFiler actually live? Right now my hope is to rebuild the vFiler and reattach the volumes, but I'm hoping to at least be able to see what CIFS shares it had, for example. We save a nightly tarball of the complete contents of etc$, so if it's in a config file somewhere I can find it.

I'm on the phone with NetApp support right now but I'm hoping someone has experience with this in the real world.

Hope to hear from you.

Randy
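When rebuilding after a failure like this, it can help to turn the error text into a checklist of objects to verify on each node. The sketch below is a minimal example that assumes the messages have exactly the wording pasted above; the regular expressions and the `summarize` helper are made up for illustration and are not part of any NetApp tool.

```python
import re

# Error lines copied from the CLI output quoted in the message above.
OUTPUT = """\
vfiler migrate: Aggregate 'fhdata_aggr1' has failed and cannot be brought online.
vfiler migrate: No aggregate named 'fhdata_aggr2' exists.
vfiler migrate: No volume named 'fhdata_root' exists.
vfiler migrate: No volume named 'fhdata_shared' exists.
vfiler migrate: No volume named 'fhdata_viddshared' exists.
"""

MISSING = re.compile(r"No (aggregate|volume) named '([^']+)' exists")
FAILED = re.compile(r"Aggregate '([^']+)' has failed")


def summarize(text: str) -> dict:
    """Group the objects the migrate command complained about."""
    missing = {"aggregate": [], "volume": []}
    for kind, name in MISSING.findall(text):
        missing[kind].append(name)
    return {"failed_aggregates": FAILED.findall(text), "missing": missing}


report = summarize(OUTPUT)
print(report["failed_aggregates"])
print(report["missing"]["aggregate"])
print(report["missing"]["volume"])
```

The resulting lists give the names to confirm are back online on node A before attempting to recreate the vFiler.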

2 2

Filer behavior differences between 7.3.3 and 7.3.5.
by Jeff Cleverley 07 Feb '12

Greetings,

Has anyone who upgraded from any version of 7.3.3 to a 7.3.5 variant seen any behavioral differences? I also upgraded diagnostics from 5.5 to 5.6.1.

I upgraded 2 clusters from 7.3.3P5 to 7.3.5.1P4 at the first of the year. One cluster is a 6040 and the other is a 6080. Both have similar layouts of disk space and usage, CIFS, NFS, PAMII cards, 10G networking, etc.

After the upgrade, the 6040 cluster is behaving normally. Our NMS monitoring server immediately started reporting SNMP polling timeouts for the 6080 cluster. There were over 130 errors for each head within the first 2 days, compared with 0 for the entire month of December.

The 6080 cluster has user home directories via NFS. I see periodic "hiccups" in response on my workstation, as have other users. We had one of the filers panic due to a failed SAS controller a couple of weeks after the upgrade. During that failover time, there were a lot more complaints about unacceptable performance until I got the replacement part and failed things back.

I was able to pull a performance graph from the first of December to the end of January for both filers. This data came from SNMP queries of the filers. You can clearly see that the baseline CPU utilization was ~40% before the upgrade and ~60% after the upgrade. That would also account for the poor performance during the cluster failover, since one filer head can't carry the load (60%) of both heads without degrading performance.

Through all of this, nothing changed but the OS and diagnostics firmware. I reinstalled the OS again this weekend just to make sure everything was happy with it. There has been no change.

I've got a call open with NetApp but they are not seeing anything wrong and are unable to explain what is driving the CPU higher, since everything looks acceptable to them. Unfortunately I did not do a perfstat collection prior to the upgrade, so there isn't anything to compare it to.

I did find that the nfs.mountd.trace option became very verbose after the upgrade (prior post on that) and had to be turned off. I'm wondering if there are any other issues anyone else may have found after this type of upgrade?

Thanks,

Jeff

--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611
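The failover reasoning in the message can be made explicit with a little arithmetic. This is a rough sketch that assumes both heads sit at the same baseline and that CPU load simply adds when one head takes over for its partner. Real takeover behavior is unlikely to be that linear, so treat the figures as an illustration only.

```python
def failover_load(per_head_cpu_pct: float, heads: int = 2) -> float:
    """Naive estimate of CPU demand on the surviving head during takeover.

    Assumes every head runs at the same baseline and that load adds linearly.
    """
    return per_head_cpu_pct * heads


before = failover_load(40.0)  # baseline reported before the upgrade
after = failover_load(60.0)   # baseline reported after the upgrade

print(f"before upgrade: {before:.0f}% of one head")
print(f"after upgrade:  {after:.0f}% of one head")
print("oversubscribed after upgrade:", after > 100.0)
```

Under those assumptions a 40% baseline leaves headroom during takeover (about 80% of one head), while a 60% baseline asks for about 120%, which is consistent with the complaints during the failover.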

2 1

anybody has experience using snapmover?
by Kawakubo, Ken 06 Feb '12

Dear list,

We have an active-active cluster with vfilers. One vfiler on node-1 is having performance issues. Since node-1's CPU usage is much higher than node-2's, I would like to migrate the vfiler from node-1 to node-2.

I found a feature called snapmover in the vfiler manual. The command "vfiler migrate -m nocopy" migrates a vfiler from node-1 to node-2 without actually copying data. This command merely changes the software disk ownership of the disks that make up the vfiler and is non-disruptive. Furthermore, snapmover no longer requires a license.

Sounds too good to be true? I want to know if there are any gotchas in using snapmover. Does anybody have experience using the "vfiler migrate -m nocopy" command? Our cluster is a NetApp V3170.

Regards,

Ken Kawakubo

2 3

Volume Naming Conventions?
by Ray Van Dolson 04 Feb '12

Hello; Curious what type of volume naming conventions you all have found work best? We're about to expand our environment, and looking for ideas on ways to keep names somewhat formulaic and consistent across filers. Thanks, Ray

2 1

Oracle E-Business Suite - Cisco UCS / Netapp
by Salisbury, Charles 04 Feb '12

Is anyone on this list running Oracle E-Business Suite specifically on a Cisco UCS / Netapp platform utilizing DirectNFS? Thanks in advance.

Charles Salisbury
LAN Services Administrator
Gentex Corporation
616-772-1800 x5090
charles.salisbury(a)gentex.com

2 1

Re: any licensing issues NDU 7.3.5.1P2 -> 8.1RC2?
by Fletcher Cocquyt 04 Feb '12

Hi Steven, thanks for the link. Yes, I can get to it.

Sounds like I should add a license command prior to giveback to verify all the features I need are still licensed properly? (Or would giveback flag mismatched licenses?)

thanks
-- Fletcher

On Feb 3, 2012, at 3:42 PM, Yee, Steven wrote:

> not sending this out to the dl, not sure if the KB link I’m sending has been made public yet ..
> (it should be, but one never knows)
>
> https://kb.netapp.com/support/index?page=content&actp=LIST&id=S:3013159
>
> The 8.1 licensing uses the new packaging, but the packaging mapping
> is dependent on the platform type. It looks like your standby cluster is
> a FAS 32xx or 62xx because that’s the new packaging for those machines.
> Not sure what machine you’re starting with. The only time you really have a
> problem is if you were using a feature that is now part of a “package”
> but you had never installed the “package key” (e.g. the feature that represents the whole package);
> then on upgrade the feature will be disabled until the package key is installed.
>
> The only other issue is that if you try to use one of the “always included” features
> (like multistore on a 62xx) for the first time, you will notice that there is now
> a new on/off switch. If the key was installed previously, then on upgrade that
> on/off option should be set to ‘on’, but if you have issues during upgrade you will want to check that
> first.
>
> steve.
>
>
> From: toasters-bounces(a)teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Fletcher Cocquyt
> Sent: Friday, February 03, 2012 3:05 PM
> To: toasters(a)teaparty.net
> Subject: any licensing issues NDU 7.3.5.1P2 -> 8.1RC2?
>
> Will I run into any license issues during this NDU?
> Hate to be trying to get a license code at 3am…
> The ones that are critical are NFS, Multistore and snap mirror
>
> The 7.3.5.1P2 systems have license codes for:
>
> a_sis
> cluster
> cluster_remote
> flex_clone
> flexcache_nfs
> http
> iscsi
> multistore
> nearstore_option
> nfs
> snapmirror
> snapmirror_sync
> snapmover
> snaprestore
> syncmirror_local
>
> I noticed our standby cluster (already running 8.1.RC2) lists:
>
> a_sis ENABLED
> cf CODE
> cf_remote CODE
> compression ENABLED
> disk_sanitization ENABLED
> flash_cache ENABLED
> flex_scale ENABLED
> flexcache_nfs ENABLED
> http ENABLED
> iscsi CODE
> multistore ENABLED
> nearstore_option ENABLED
> nfs CODE
> operations_manager ENABLED
> persistent_archive ENABLED
> protection_manager ENABLED
> provisioning_manager ENABLED
> snapmirror CODE
> snapmirror_sync ENABLED
> snapmover ENABLED
> snaprestore CODE
> storage_services ENABLED
> sv_linux_pri ENABLED
> sv_unix_pri ENABLED
> sv_vi_pri ENABLED
> sv_windows_ofm_pri ENABLED
> sv_windows_pri ENABLED
> syncmirror_local CODE
> vld ENABLED
>
> thanks
>
> --
> Fletcher

2 2

any licensing issues NDU 7.3.5.1P2 -> 8.1RC2?
by Fletcher Cocquyt 03 Feb '12

Will I run into any license issues during this NDU?
Hate to be trying to get a license code at 3am…
The ones that are critical are NFS, Multistore and snap mirror

The 7.3.5.1P2 systems have license codes for:

a_sis
cluster
cluster_remote
flex_clone
flexcache_nfs
http
iscsi
multistore
nearstore_option
nfs
snapmirror
snapmirror_sync
snapmover
snaprestore
syncmirror_local

I noticed our standby cluster (already running 8.1.RC2) lists:

a_sis ENABLED
cf CODE
cf_remote CODE
compression ENABLED
disk_sanitization ENABLED
flash_cache ENABLED
flex_scale ENABLED
flexcache_nfs ENABLED
http ENABLED
iscsi CODE
multistore ENABLED
nearstore_option ENABLED
nfs CODE
operations_manager ENABLED
persistent_archive ENABLED
protection_manager ENABLED
provisioning_manager ENABLED
snapmirror CODE
snapmirror_sync ENABLED
snapmover ENABLED
snaprestore CODE
storage_services ENABLED
sv_linux_pri ENABLED
sv_unix_pri ENABLED
sv_vi_pri ENABLED
sv_windows_ofm_pri ENABLED
sv_windows_pri ENABLED
syncmirror_local CODE
vld ENABLED

thanks

--
Fletcher
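A quick way to spot names that appear in the 7.3.5.1P2 list but not in the 8.1RC2 listing is a set comparison. The sketch below uses the two lists exactly as posted. The `RENAMED` mapping (cluster to cf, cluster_remote to cf_remote) is my assumption based on the similar names and is not something the thread confirms. The two listings also come from different systems, so a missing name only marks something to ask about and does not prove a licensing problem.

```python
OLD_LICENSES = """
a_sis cluster cluster_remote flex_clone flexcache_nfs http iscsi multistore
nearstore_option nfs snapmirror snapmirror_sync snapmover snaprestore
syncmirror_local
""".split()

NEW_LISTING = """
a_sis ENABLED cf CODE cf_remote CODE compression ENABLED
disk_sanitization ENABLED flash_cache ENABLED flex_scale ENABLED
flexcache_nfs ENABLED http ENABLED iscsi CODE multistore ENABLED
nearstore_option ENABLED nfs CODE operations_manager ENABLED
persistent_archive ENABLED protection_manager ENABLED
provisioning_manager ENABLED snapmirror CODE snapmirror_sync ENABLED
snapmover ENABLED snaprestore CODE storage_services ENABLED
sv_linux_pri ENABLED sv_unix_pri ENABLED sv_vi_pri ENABLED
sv_windows_ofm_pri ENABLED sv_windows_pri ENABLED syncmirror_local CODE
vld ENABLED
""".split()

# Pair each name with the status that follows it in the listing.
new_licenses = dict(zip(NEW_LISTING[::2], NEW_LISTING[1::2]))

# Assumed renames between releases (not confirmed in this thread).
RENAMED = {"cluster": "cf", "cluster_remote": "cf_remote"}

missing = sorted(
    name for name in OLD_LICENSES
    if RENAMED.get(name, name) not in new_licenses
)
critical = {name: new_licenses.get(name) for name in ("nfs", "multistore", "snapmirror")}

print("not found in the 8.1RC2 listing:", missing)
print("critical features:", critical)
```

With the lists as posted, flex_clone is the only old name without a counterpart in the standby cluster's listing, and the three critical features all appear there.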

1 0

Re: clarification about reboots during NDU
by Fletcher Cocquyt 03 Feb '12

Justin, thanks for the feedback.

What confuses me about step 13 (cf giveback) triggering another reboot of the SAME node A is that step 12 reads as if that has already been accomplished for node A.

So does node A reboot twice, and then, when you repeat the procedure, node B reboots twice?

thanks
-- Fletcher

On Feb 2, 2012, at 11:20 AM, Parisi, Justin wrote:

> In a takeover/giveback scenario, there are two reboots.
>
> The first reboot is the takeover. The second reboot is the giveback.
>
> From: toasters-bounces(a)teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Fletcher Cocquyt
> Sent: Thursday, February 02, 2012 1:52 PM
> To: toasters(a)teaparty.net
> Subject: clarification about reboots during NDU
>
> Hi - can someone clarify this from pg 56 of the upgrade guide -
>
> for instance, if step 12 causes A to "reboot the system using the new firmware and software"
>
> WHY in step 13 does the cf giveback ALSO "cause system A to reboot with the new system configuration?"
>
> Are there really 2 reboots?
>
> thanks
>
>
> 12. Enter the following command to reboot the system using the new firmware and software:
>
>     bye
>
> 13. Choose the option that describes your configuration.
>
>     If FCP or iSCSI...
>
>     Is not in use in system A:
>         Then when the "Waiting for giveback" message appears on the console of system A,
>         enter the following command at the console of system B:
>
>             cf giveback
>
>     Is in use in system A:
>         Then when the "Waiting for giveback" message appears on the console of system A,
>         wait for at least eight minutes to allow host multipathing software to stabilize,
>         then enter the following command at the console of system B:
>
>             cf giveback
>
> Attention: The cf giveback command can fail because of open client sessions (such as CIFS sessions), long-running operations, or operations that cannot be restarted (such as tape backup or SyncMirror resynchronization). If the cf giveback command fails, terminate any CIFS session or long-running operations gracefully (because the -f option will immediately terminate any CIFS sessions or long-running operations) and then enter the following command (with the -f option):
>
>     cf giveback -f
>
> For more information about the behavior of the -f option, see the cf(1) man page.
>
> The command causes system A to reboot with the new system configuration (a Data ONTAP version and any new system firmware and hardware changes) and resume normal operation as a high-availability partner.
>
> --
> Fletcher Cocquyt
> Principal Engineer
> Information Resources and Technology (IRT)
> Stanford University School of Medicine
>
> Email: fcocquyt(a)stanford.edu
> Phone: (650) 724-7485

5 8
