That’s SMVI. The problem is less pronounced there because there’s
(typically) significantly less I/O to replay from the vmsnap deletion:
the amount to replay depends on how much activity accumulates while the
snapshot exists, and with SMVI that time is very short. However, with
regular vsnaps (outside of SMVI), the snapshots are created and remain
for a much longer period of time (I’ve seen some that were MONTHS old in
some environments); deleting these would have a significant impact on
CPU and storage I/O.
From: Darren Sykes [mailto:Darren.Sykes@csr.com]
Sent: Tuesday, November 04, 2008 8:53 AM
To: Glenn Walker; Karlsson Ulf Ibrahim :ULK; toasters@mathworks.com
Subject: RE: Brief outages on the filer?
That's true - snapshot deletions are very heavy on CPU.
We've got quite a bit of headroom admittedly, but on a 6070 I can't see
any spikes in I/O when SMVI commits the changes. If you think about it,
it's only the changes that happened while the machines were being backed
up, which is about 10 seconds in our environment, so the impact isn't too
great and the commit takes less than half a second.
From: Glenn Walker [mailto:ggwalker@mindspring.com]
Sent: 04 November 2008 13:45
To: Darren Sykes; Karlsson Ulf Ibrahim :ULK; toasters@mathworks.com
Subject: RE: Brief outages on the filer?
Well… maybe:
From: Darren Sykes [mailto:Darren.Sykes@csr.com]
Sent: Tuesday, November 04, 2008 8:39 AM
To: Glenn Walker; Karlsson Ulf Ibrahim :ULK; toasters@mathworks.com
Subject: RE: Brief outages on the filer?
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Glenn Walker
Sent: 04 November 2008 13:19
To: Karlsson Ulf Ibrahim :ULK; toasters@mathworks.com
Subject: RE: Brief outages on the filer?
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Glenn Walker
Sent: 3 November 2008 20:03
To: Page, Jeremy; toasters@mathworks.com
Subject: RE: Brief outages on the filer?
Any way you can predict when it will happen? Sysstat (or better yet, perfstat) would be of help here.
Something I’ve noticed on my infrastructure: VMware over NFS (unsure about other protocols) will have huge spikes where the hosts write lots of data in a quick burst. It happens only a few times a day on relatively quiet systems, but I can definitely see a spike on the filer. Perhaps you have the same thing going on; just a SWAG…
The impact on our side is not really felt, but the filer does go into back-to-back CPs (consistency points) from the massive spike (200MB/s to 350MB/s in a short window), and that could manifest itself as ‘poor disk response time’.
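A minimal sketch of how bursts like these could be picked out of a series of throughput readings, assuming the readings (in MB/s, taken at a fixed interval) have already been extracted from whatever tool collected them. The function name, the median-based baseline, and both thresholds are illustrative assumptions, not anything provided by sysstat or perfstat:

```python
from statistics import median


def find_bursts(samples, factor=5.0, min_mb_s=100.0):
    """Return (start_index, end_index, peak_mb_s) for each run of
    consecutive samples at or above the burst threshold.

    samples  -- throughput readings in MB/s at a fixed interval (assumed input)
    factor   -- how many times the median counts as a burst (assumed value)
    min_mb_s -- floor for the threshold so quiet systems don't trigger on noise
    """
    if not samples:
        return []
    threshold = max(median(samples) * factor, min_mb_s)
    bursts = []
    start = None
    for i, value in enumerate(samples):
        if value >= threshold:
            if start is None:
                start = i  # a new burst begins here
        elif start is not None:
            # the burst ended on the previous sample
            bursts.append((start, i - 1, max(samples[start:i])))
            start = None
    if start is not None:
        # the series ended while still inside a burst
        bursts.append((start, len(samples) - 1, max(samples[start:])))
    return bursts


# Made-up readings: a mostly quiet system with two short write bursts.
readings = [12, 15, 10, 220, 350, 300, 14, 11, 13, 9, 240, 12]
print(find_bursts(readings))
```

Lining up the burst indices with the times the hosts complain would show whether the two coincide; that correlation is the point of the exercise, not the exact thresholds.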
In our case, we’re running VMware over NFS and Exchange over iSCSI on the same filers, but no one is really complaining when the ‘events’ happen. It's just something I’ve noticed for a while.
This is on a FAS6070, and the busy time is recorded at around 6000 NFS IOPS. That said, we did a stress test with about 25 guests running IOMeter and were able to push 15000 NFS OPS on node 1 and 10000 NFS OPS on node 2 (a combined 400MB/s write, 300MB/s read) without any sort of reported performance problems.
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Page, Jeremy
Sent: Monday, November 03, 2008 11:02 AM
To: toasters@mathworks.com
Subject: Brief outages on the filer?
I am seeing brief outages where my VMs (NFS as the back-end protocol) and SQL LUNs (FC) both complain of poor disk response time at the same time. I don’t think it can be the infrastructure, since one is IP and the other FC. The LUNs are also on a different set of spindles and a different aggr than the NFS volumes, so I don’t think it’s a disk bottleneck. I’m on a 3070, and we rarely hit 3500 IOPS (with 90+% of that out of cache) or go above 40% on the busiest CPU (normally we’re in the 15-25% range), so I am not sure what’s going on here. Any suggestions on how to troubleshoot it?
We’re running 7.2.4; I want to wait for 7.3.1 before upgrading, since we are using NFS for VMware and there are several fixes that will be beneficial to us.