Re: Considering filers for major deployment....

14 Nov 1999


      ...
Any advice on what we should really include in our eval to really
test the box out?
I'd try to mirror/simulate as closely as possible whatever workload
you expect to place on your filers when you put them into production.
Don't forget to eval whatever backup/restore software you are
considering at the same time.
You might want to test out NA customer support. Put your filer into
some situations that require a call to support and see if support
can provide the assistance you need.
"Brian" == Brian Tao taob@risc.org writes:
    Brian> Try the extreme cases like very large directories (10k's or 100k's
    Brian> of files), very deep directory hierarchies, large files
    Brian> (2GB+) and intense file locking activity (something that
    Brian> always sucks over NFS).
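The extreme cases Brian lists can be scripted. What follows is only a
minimal Python sketch, not anything from NA or from this thread: the
function names, directory layout and sizes are made up for
illustration, and it assumes a Unix client with the filer's export
already mounted at whatever path you pass in as root. It uses only the
standard library (fcntl.lockf for POSIX byte-range locks).

```python
import fcntl
import os
import time


def make_wide_dir(root, n_files):
    """Create n_files empty files in one directory; return elapsed seconds."""
    wide = os.path.join(root, "wide")
    os.makedirs(wide, exist_ok=True)
    start = time.perf_counter()
    for i in range(n_files):
        # Opening for write and closing immediately leaves an empty file.
        with open(os.path.join(wide, f"f{i:07d}"), "w"):
            pass
    return time.perf_counter() - start


def make_deep_tree(root, depth):
    """Create a chain of `depth` nested directories; return the deepest path."""
    path = os.path.join(root, "deep")
    for i in range(depth):
        path = os.path.join(path, f"d{i}")
    os.makedirs(path, exist_ok=True)
    return path


def make_large_file(root, size_bytes):
    """Create a file of size_bytes by seeking and writing the last byte.

    The file is sparse on most filesystems, so this exercises large
    offsets rather than write throughput.
    """
    path = os.path.join(root, "large.bin")
    with open(path, "wb") as fh:
        fh.seek(size_bytes - 1)
        fh.write(b"\0")
    return path


def lock_cycles(path, n):
    """Take and release an exclusive lock n times; return elapsed seconds."""
    with open(path, "a+b") as fh:
        start = time.perf_counter()
        for _ in range(n):
            fcntl.lockf(fh, fcntl.LOCK_EX)
            fcntl.lockf(fh, fcntl.LOCK_UN)
        return time.perf_counter() - start
```

Point root at a scratch directory on the mounted export and scale the
arguments up to the sizes Brian mentions (for example 100000 files, or
a size_bytes above 2**31). The depth you can reach in make_deep_tree
is bounded by the client's maximum path length. Running lock_cycles
from several clients against the same file at once is closer to
"intense" locking than a single process.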
It's neat to see how a filer performs under those conditions, but I
expect you have a pretty good idea of the workload you plan to place
on your filers. You may know that you aren't ever going to have to
deal with 3GB files or 100k entry directories.
So, I'd definitely try some extreme cases that result from events
beyond your control (i.e., hardware failures):
Try pulling a disk and testing out RAID reconstruct. Turn the filer off
in degraded mode and see if it comes back okay. If you are thinking
about clustering, get a clustered pair on eval. Exercise the
clustering. Try every scenario you can think of to force a failover
(e.g., turn a filer off, pull a filer's fan, break a filer's FC-AL
A-loop).
Try pulling a filer's disk, then while it is doing a reconstruct, turn
it off to force a takeover. See if the partner does a proper takeover
and begins a reconstruct. Pull a disk on the opposite filer so that it
is doing two reconstructs at once.
Try adding a shelf to a clustered pair without having both filers
down. This is supposed to work and is a documented procedure from NA.
Just for kicks, try out some catastrophic things so you can see what
they look like. Pull two disks from the same RAID group. Try turning
off a shelf.
It has been my experience that NA's are great once you get them up and
running if you don't touch them. If you do _anything_ out of the norm
(e.g., any hardware maintenance procedure, or trying to use any recently
introduced feature, such as SnapMirror), you have a 50/50 chance of
exercising some bug. Our filers have actually been _less_ reliable
since we clustered them. We've had two extended downtimes (> 1 hour)
when trying to take advantage of the clustering in order to perform
zero-downtime maintenance. I'm not sure I ever want to type 'cf takeover'
on my filers again.
Before we clustered our F740's, we had an F540 and an F630 that never
went down. We then ran an F740 for over a year with no trouble. When
we finally got a pair of F740's and clustered them, we started having
problems.
Good luck. They are truly wonderful devices when they work.
j.
--
Jay Soffian jay@cimedia.com                            UNIX Systems Engineer
                                                         Cox Interactive Media
