toasters May 2012

toasters@lists.teaparty.net

53 participants
34 discussions

snapshot cleanup and performance
by Peter D. Gray 16 May '12

16 May '12

OK, just wondering if anybody can shed light on this.

We just had a massive performance problem on one of our NetApps. One aggregate was amazingly busy, with disk drives 100% busy all the time. I/O latency went through the roof for all the volumes on the aggregate. We spent a bit of time on this, and we are not novice NetApp users. In the end we could not identify the problem. There appeared to be no relationship between the amount of I/O coming from clients and the amount of I/O on the aggregate.

So, we called NetApp support. It took them a while (a few days), but eventually they suggested changing the snapshot schedules, removing snapshots from the aggregates, and also removing hourly snapshots from the volumes on the aggregate. We complied. It took a while, but after 5 hours or so the problem went away relatively suddenly, and it seems to have stayed away for a day now.

I am assuming the snapshot cleanup was the problem and the problem stopped when it finally caught up. So, I guess my question is: why is snapshot cleanup so expensive? Do blocks freed from snapshots need to be written, and if so, why? It seems snapshot creation is cheap but deletion expensive, which makes the entire snapshot management cycle rather more trouble than you might hope.

Regards,
pdg

4 3

NFS Lock and Oracle
by Craig A. Falls 15 May '12

15 May '12

Hi,

I was wondering if anyone had a script for viewing/removing NFS locks on a filer? We have just started to use Oracle on NFS and have had an issue where a host crashed and left locks behind. These stop the DB from being started again on that node once it is recovered, and will probably (guessing) eventually stall the DB when other nodes try to access the data that is being locked.

I was planning on something like:

- check the locks for a filer/vfiler for the specific host
- remove the locks using lock status/lock break

Any thoughts/help or advice would be appreciated.

thanks
c
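A minimal sketch of the two-step plan in the post above (list the locks for one host, then break them), not a tested tool. It assumes the filer accepts commands over ssh and that `lock status -h <host>` and `lock break -h <host>` are available on the Data ONTAP release in use; both may require advanced privilege. The names `filer1` and `dbnode1` are placeholders, and vfilers are not handled.

```python
import re
import subprocess

# Conservative host-name check so nothing unexpected is passed to the filer.
_HOST_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.-]*$")


def lock_commands(filer, host):
    """Build the ssh command lines: first list, then break, the locks of `host`.

    Assumption: `lock status -h` and `lock break -h` exist on this release.
    """
    for name in (filer, host):
        if not _HOST_RE.match(name):
            raise ValueError(f"unexpected host name: {name!r}")
    return [
        ["ssh", filer, f"lock status -h {host}"],
        ["ssh", filer, f"lock break -h {host}"],
    ]


def clear_locks(filer, host, dry_run=True):
    """Show the locks held by `host`, then break them.

    With dry_run=True (the default) nothing is executed; the command
    lines are returned as strings so they can be reviewed first.
    """
    status_cmd, break_cmd = lock_commands(filer, host)
    if dry_run:
        return [" ".join(status_cmd), " ".join(break_cmd)]
    status = subprocess.run(status_cmd, check=True, capture_output=True, text=True)
    print(status.stdout)  # keep a record of what is about to be removed
    subprocess.run(break_cmd, check=True)
    return []
```

Calling `clear_locks("filer1", "dbnode1")` only returns the two command lines; passing `dry_run=False` runs them. Breaking locks for a host that is still running the database is unsafe, so the dry run is the default.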

3 2

Writing to the list via Gmane
by Oliver Brakmann 15 May '12

15 May '12

Hello all, just a heads-up: if this post reaches the list, it means that posting to the list via gmane (http://gmane.org) works again. It broke after the move to the new domain, but I contacted the Gmane maintainer recently and asked him to please fix the address, which has now happened. Regards, Oliver

1 0

What is Flash Cache caching?
by Fletcher Cocquyt 15 May '12

15 May '12

We set up a custom view in NMC to show us the hit % and Disk Reads Replaced per the Flash Cache Best Practices doc TR-3832.

Is it possible to get more granular information about what is being cached? These are NFS data stores, and one of the volumes is 500 GB of web content, so the web team would be very interested to know the net benefit to them, e.g. how much the Apache instances are benefitting. We also have Oracle instances and VMware volumes, and would like per-volume stats if possible.

I guess I am looking for the ability to apply the custom view to different vFilers and see per-volume hits, hit %, and Disk Reads Replaced.

We have the 1 TB card installed and see the Disk Reads Replaced topping out around 5000, with one spike up to ~8500. Does this look like what others are seeing in a mixed workload environment like the one I described?

thanks,
Fletcher

3 5

Cleaning up snapvault relationships.
by Jeff Cleverley 14 May '12

14 May '12

Greetings,

We retired a SV primary volume and I need to keep the destination SV volume around until the snapshots age out. The backups are done via command line/cron/snapvault snap sched setups. They are not managed by DFM/PM or any other package.

I did a snapvault stop on the destination, which seems to have worked fine. I verified that the destination qtree is gone. I also did a snapvault release on the primary and a snapvault snap unsched on both primary and secondary. I've even rebooted the destination filer, which would have the effect of stopping and starting snapvault.

The primary shows no trace of the former relationship. The secondary got rid of everything, but some of the destinations still show up in the snapvault status output. I need to figure out how to clean these up.

Here is the output of a snapvault -l of one of the destination qtrees:

Snapvault secondary is ON.

Source: source:/vol/carmel2/test
Destination: destination:/vol/carmel2_i/test
Status: Idle
Progress: -
State: Unknown
Lag: -
Mirror Timestamp: -
Base Snapshot: -
Current Transfer Type: -
Current Transfer Error: -
Contents: -
Last Transfer Type: Update
Last Transfer Size: 8 KB
Last Transfer Duration: 00:00:06
Last Transfer From: source:/vol/carmel2/test

Thanks,
Jeff

--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611
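To find every leftover entry of this kind rather than checking qtrees one at a time, the status output could be scanned with a short script. The sketch below assumes the command prints one "Key: value" field per line and starts each relationship with a "Source:" line, as in the output above. Treating "State: Unknown" together with an empty "Base Snapshot" as the mark of a leftover entry is an assumption drawn from this one example, not a documented rule.

```python
def parse_status(text):
    """Split long-format status text into one dict per relationship."""
    records, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # banner lines such as "Snapvault secondary is ON."
        key, value = key.strip(), value.strip()
        if key == "Source":
            # A "Source" line opens a new relationship record.
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


def leftover_destinations(records):
    """Return destinations that look like stale entries (assumed criterion)."""
    return [
        r.get("Destination")
        for r in records
        if r.get("State") == "Unknown" and r.get("Base Snapshot") == "-"
    ]
```

Feeding the saved status output to `parse_status` and then to `leftover_destinations` gives the list of destination paths to investigate; it does not remove anything.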

1 1

revision control for /etc/exports with post-commit exports -va API call?
by Fletcher Cocquyt 11 May '12

11 May '12

Is anyone doing revision control on /etc/exports with post-commit hooks to call exportfs -va (including vfiler:/rootvol/etc/exports)? Or, more generally, what is the recommended best practice for these command-line file updates in a team environment (multiple admins)?

There used to be a FilerView UI for updating exports, but I cannot see one in the new System Manager. Is it there?

In a team environment we'd like to avoid touching files directly with vi (and introducing errors). If we need to, we'd like revision control and automation with post-commit hooks, which has worked well for our Apache and Puppet configs.

thanks
Fletcher
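A minimal sketch of such a hook, assuming the exports files are kept in a git repository (the post does not name a revision control system) and that each filer or vfiler accepts commands over ssh. The repository paths and host names in `EXPORTS_FILES` are placeholders, and copying the committed file onto the filer's root volume is left out.

```python
import subprocess

# Hypothetical mapping from a path in the repository to the filer (or vfiler)
# that serves that exports file.
EXPORTS_FILES = {
    "filer1/etc/exports": "filer1",
    "vfiler1/etc/exports": "vfiler1",
}


def filers_to_reload(changed_paths, exports_files=EXPORTS_FILES):
    """Return the filers whose exports file was touched, without duplicates."""
    return sorted({exports_files[p] for p in changed_paths if p in exports_files})


def reload_command(filer):
    """ssh command line that re-exports everything in /etc/exports."""
    return ["ssh", filer, "exportfs -va"]


def changed_in_last_commit():
    """List the files changed by the most recent commit (git only)."""
    out = subprocess.run(
        ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def main():
    """Entry point to call from the post-commit hook script."""
    for filer in filers_to_reload(changed_in_last_commit()):
        subprocess.run(reload_command(filer), check=True)
```

Only filers whose exports file changed in the commit are contacted, so an unrelated commit does not trigger a re-export.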

1 0

SMSQL question
by Klise, Steve 10 May '12

10 May '12

I just upgraded a SQL 2005 SQL cluster from SnapDrive 5 to 6.3.1r1 and SMSQL 2.x to 5.0. I also upgraded the verification server (remote server). Now, after upgrading, on one set of drives (G:\data and L:\logs) the backup works, but DBCC fails. What is strange is that on another set of drives (H:\data, M:\logs) DBCC works.

I checked for snapshots that would still be mounted, flex clones, etc. Nothing appears to have "busy snapshots". I even checked for LUNs that would not be assigned to an igroup. Nothing.

This is the error (names/IPs changed to protect the innocent). This is the event from the verification server:

Event Type: Error
Event Source: SnapDrive
Event Category: Generic event
Event ID: 331
Date: 5/4/2012
Time: 1:19:32 PM
User: N/A
Computer: VERIFICATIONSRV
Description: Failed to connect FlexClone of Snapshot copy (sqlsnap__SQL__recent) as drive C:\Program Files\NetApp\SnapManager for SQL Server\SnapMgrMountPoint\MPDisk001\ on the computer (VERIFICATIONSRV).
LUN Name = lun0.lun
Storage Path = /vol/SQLSERVER/SQLSERVER/
Protocol Type = LUN
Storage System Name = 10.10.10.10
Error code : ZAPI: An attempt to online LUN '/vol/sdw_cl_SQLSERVER_0/SQLSERVER/lun0.lun' failed on the storage system 10.10.10.10.
Error description: Another LUN mapped at this number

This is the snap list.

Volume SQLSERVER
working...

  %/used      %/total    date          name
----------  ----------  ------------  --------
 0% ( 0%)    0% ( 0%)   May 04 13:18  sqlsnap__SQLSERVER__recent
 0% ( 0%)    0% ( 0%)   May 04 11:33  sqlsnap__SQLSERVER_05-04-2012_11.33.12__weekly
 0% ( 0%)    0% ( 0%)   May 04 10:43  sqlsnap__SQLSERVER_05-04-2012_10.42.44__weekly
 0% ( 0%)    0% ( 0%)   May 04 10:03  panstor2(1573990916)_SQLSERVER_snap.65 (snapmirror)
 0% ( 0%)    0% ( 0%)   May 04 10:03  @snapmir@{EF1F6E8C-64BA-4C23-A8F1-1566E881A373}
 0% ( 0%)    0% ( 0%)   May 04 10:01  sqlsnap__SQLSERVER_05-04-2012_10.00.07__daily
 0% ( 0%)    0% ( 0%)   May 04 07:59  sqlsnap__SQLSERVER_05-04-2012_07.58.44__weekly
 0% ( 0%)    0% ( 0%)   May 04 07:20  sqlsnap__SQLSERVER_05-04-2012_07.19.32__weekly
 0% ( 0%)    0% ( 0%)   May 04 07:12  @snapmir@{5CF5E20A-6C44-4CE9-B212-0F4BAEB90636}
 0% ( 0%)    0% ( 0%)   May 04 07:11  sqlsnap__SQLSERVER_05-04-2012_07.10.46__daily
 0% ( 0%)    0% ( 0%)   May 03 20:40  sqlsnap__SQLSERVER_05-03-2012_20.39.16__daily
 0% ( 0%)    0% ( 0%)   May 03 19:00  sqlsnap__SQLSERVER_05-03-2012_19.00.00__weekly
 0% ( 0%)    0% ( 0%)   May 03 18:30  sqlsnap__SQLSERVER_05-03-2012_18.30.00__daily
 0% ( 0%)    0% ( 0%)   May 03 14:30  sqlsnap__SQLSERVER_05-03-2012_14.30.00__daily
 0% ( 0%)    0% ( 0%)   May 03 10:30  sqlsnap__SQLSERVER_05-03-2012_10.30.00__daily
 0% ( 0%)    0% ( 0%)   May 02 19:00  sqlsnap__SQLSERVER_05-02-2012_19.00.00__weekly
 0% ( 0%)    0% ( 0%)   May 01 19:00  sqlsnap__SQLSERVER_05-01-2012_19.00.00__weekly
 0% ( 0%)    0% ( 0%)   Apr 30 19:00  sqlsnap__SQLSERVER_04-30-2012_19.00.00__weekly

Any help would be appreciated.

2 3

NDMP backups of named snapshots
by John Stoffel 09 May '12

09 May '12

Guys,

Anyone know if it's possible to do an NDMP backup (using CommVault) of a *named* snapshot? Basically I need to snapshot two or more volumes at the same time (I can do this in a pre-script) and then let the snapshots be backed up.

I guess I could be silly and just create flex clone volumes and then back those volumes up instead, which would get rid of the issue, but would complicate things a bit more.

Thanks,
John

John Stoffel - Senior Staff Systems Administrator - System LSI Group
Toshiba America Electronic Components, Inc. - http://www.toshiba.com/taec
john.stoffel(a)taec.toshiba.com - 508-486-1087

4 10

15/19 vFilers migrated - 64 bit aggr upgrade ontap next
by Fletcher Cocquyt 09 May '12

09 May '12

So we've reached the point where our 32->32 bit data motion (vFiler migrations) are almost complete (pending one vFiler stuck in a "Migrate Failed" state case, and a previously fun flex clone vFiler). We purposely destroyed the destination (> 19 TB, 64-bit) aggregates and recreated them < 16 TB to work around the 32->64 bit data motion incompatibility.

Next we will initiate non-disruptive 64-bit aggregate upgrades by adding disk (the only currently supported method).

It's beaucoup hoops, but we are almost there. Our goal is to data motion off all vFilers between partner clusters to complete our Flash Cache hardware upgrade.

thanks
Fletcher

1 1

SAS port swap.
by Jeff Cleverley 08 May '12

08 May '12

Greetings,

I have a SAS stack (plus other FC DS14s) in a FAS60xx cluster running 7.3.5.1. It has 2 quad-port SAS HBAs per filer. The stack currently uses ports 6a and 7d on each filer. I need to add a couple more stacks. To make the cabling consistent I would like to move the current 7d connection to port 7a.

When I moved the cable from 7d to 7a on one of the heads, it never brought the port online and never saw the disks on that port. I tried storage enable adapter 7a, but it still showed the port as offline. I ran various sysconfig, storage show, disk show, etc. commands to get the system to go find the disks on the new path. Nothing seemed to help. I waited 5 minutes after doing this just in case there was some type of discovery lag, but they never showed up. When I plugged back into 7d, everything was fine.

I thought moving to a new port should work. Is there something else I'm missing to make this work?

Thanks,
Jeff

--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611

2 2
