RE: wafl_cp_slovol_warning_1 with big latency spikes

16 Jan 2013


I have seen the same issue when deleting large files off CIFS volumes, resulting in astronomical latency on the whole filer. It is especially noticeable on SATA aggrs.
ONTAP/WAFL seems to have an issue reclaiming zombie blocks.
/ Marcus
...
-----Original Message-----
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Tim Parkinson
Sent: 16 January 2013 07:45
To: Fletcher Cocquyt
Cc: toasters@teaparty.net Lists
Subject: Re: wafl_cp_slovol_warning_1 with big latency spikes
Fletcher,
What ONTAP version are you running?
We've had a case open since we swapped our heads out to 3270s with
cp_slo_vols that seemed to be happening at random, but we thought we'd
narrowed it down to times when large metadata writes are occurring.
Deletions into snapshots, for example. Latterly I could reliably trigger it with
storage vmotions - it would normally occur at the end of the process (i.e.
when VMware deletes the files and they end up in snaps).
NetApp had us upgrade to 8.1.2RC2 with reference to some bug IDs I don't
have to hand at the moment. We thought this had fixed it - certainly storage
vmotions were not triggering it, however, it reappeared when a number of
LUNs were deleted at once the other day.
Regards,
Tim
On 15 January 2013 21:12, Fletcher Cocquyt fcocquyt@stanford.edu wrote:
...
resending this without the 80kb chart
Yesterday morning one of the heads on our 3270 experienced large NFS
latency spikes causing our VMware hosts and their VMs to log storage
timeouts.
...
This latency does not correlate to any external metrics like CPU,
network, OPS etc.
But the logs do show CP events on the aggregate hosting the VMs:
Jan 14 05:27:56 [n04:wafl.cp.slovol:warning]: aggregate aggr2 is
holding up the CP.
And the EMS log has CP events logged for the duration of the episode -
what can we do to prevent these issues?
<wafl_cp_toolong_warning_1
        total_ms="117825"
        total_dbufs="32276"
        clean="4312"
        v_ino="3"
        v_bm="29"
        a_ino="0"
        a_bm="3428"
        flush="1209"/>
</LR>
<LR d="14Jan2013 05:19:38" n="irt-na04" t="1358169578"
id="1335304168/148007" p="4" s="Ok" o="wafl_CP_proc" vf="" type="0"
seq="633232" >
<wafl_cp_slovol_warning_1
        voltype="aggregate"
        volowner=""
        volname="aggr2"
        volident=""
        nt="35"
        nb="22045"
        clean="1346852"
        v_ino="0"
        v_bm="113"
        a_ino="0"
        a_bm="4"
        flush="0"
        rgid="2"/>
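For anyone wanting to line these EMS records up against latency graphs, here is a rough Python sketch that pulls the attributes out of `wafl_cp_*` elements like the two quoted above. It is only an illustration: the function name `parse_cp_events` and the regex approach are my own assumptions, not a NetApp tool, and it uses regular expressions rather than an XML parser because excerpts like this one are fragments and not well-formed documents. The sample text is a shortened copy of the records above; no meaning is assigned to the individual counters.

```python
import re

# Matches one self-closing EMS event element such as
# <wafl_cp_slovol_warning_1 ... /> (the element name is captured).
EVENT_RE = re.compile(r'<(wafl_cp_\w+)\s+(.*?)/>', re.DOTALL)
# Matches name="value" attribute pairs inside an element.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_cp_events(text):
    """Return a list of (event_name, attribute_dict) tuples.

    Hypothetical helper: works on raw log fragments, ignores
    anything that is not a wafl_cp_* element.
    """
    events = []
    for name, body in EVENT_RE.findall(text):
        events.append((name, dict(ATTR_RE.findall(body))))
    return events


# Shortened copy of the records quoted in this thread.
SAMPLE = '''<wafl_cp_toolong_warning_1
        total_ms="117825"
        total_dbufs="32276"
        flush="1209"/>
</LR>
<wafl_cp_slovol_warning_1
        voltype="aggregate"
        volname="aggr2"
        nt="35"
        rgid="2"/>'''

for name, attrs in parse_cp_events(SAMPLE):
    # Print the event name plus the two fields most useful for triage.
    print(name, attrs.get("volname", "-"), attrs.get("total_ms", "-"))
```

Run against a saved copy of the EMS log, this gives one line per CP warning, which makes it easier to see which aggregate is named and how often the events recur during an episode.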
NetApp support wants me to run perfstats, but the issue is not ongoing;
things are idle.

thanks

Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
--
Tim Parkinson
Storage & Server Administrator
Corporate Information & Computing Services The University of Sheffield
10-12 Brunswick Street
Sheffield
S10 2FN
E-Mail: t.r.parkinson@sheffield.ac.uk
Tel:    +44 (0) 114 222 3039
http://www.sheffield.ac.uk/cics/
