> On Thu, 12 Jun 1997, Dave Hitz wrote:
> >
> > Although marketing certainly has input, the space restrictions also
> > come from engineering droids. During normal operation the smaller
> > machines could certainly handle more disk, but the time required for
> > something like RAID reconstruction could get dangerously long.
>
> Is this the reason why the F210 can still only use around 50GB of
> disk, even though it can physically accommodate more using 9GB drives?
In order for a RAID reconstruction to complete, you have to read all of
the data on all of the other disks. So RAID reconstruction time is
proportional to the amount of data, not proportional to the number of
disks.
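A back-of-the-envelope sketch of why that matters (the 5 MB/s effective
rate below is made up purely for illustration, not a NetApp figure):

    # reconstruction has to read all the data on all the surviving disks
    data_gb=108; mb_per_sec=5     # e.g. 12 x 9GB disks, or 24 x 4.5GB disks
    echo "$(( data_gb * 1024 / mb_per_sec / 3600 )) hours"   # prints "6 hours" either way

Double the total data and you roughly double the reconstruction window,
regardless of how many spindles it is spread across.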
Dave Hitz hitz(a)netapp.com
Network Appliance (408) 367-3106
One of my inn 1.4unoff4 news reader servers started throttling
itself just today with "Interrupted system call writing article file"
(happened twice in the past 24 hours). The spool is on an F230
running 4.0.3, 256MB of read cache and 4MB of write cache. The news
server is an Ultra 170, 512MB of RAM, ~250 to 300 readers around peak
times. The two are on a FDDI ring.
The F230 hovers around 65% CPU usage, so I don't think that's the
problem, but the Ultra is reporting 900 to 1200 packets per second
both in and out of its FDDI interface. Half of its time is spent in
the kernel, according to top(1). The mounts are NFSv3 over UDP.
Would dropping back down to NFSv2 help any? I'm trying to determine
if this is a network congestion problem, or an OS limitation (on
either the Netapp or the Sun).
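(If it helps to experiment, I figure I can force the older protocol on one
spool mount with something like the following -- Solaris syntax, and the
server and mount point names here are just placeholders, not my real ones:

    umount /news
    mount -F nfs -o vers=2,proto=udp,rsize=8192,wsize=8192 toaster:/home/news /news

-- and then watch whether the packet rate and kernel time change.)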
--
Brian Tao (BT300, taob(a)netcom.ca)
"Though this be madness, yet there is method in't"
I'm following up on a thread from mid-June. You may recall that I asked
for assistance deciding between a couple of small NetApps and a Falcon
Systems FastFilePro 7000. I've included Andy Watson's response below for
context.
We made our decision to go with NetApp last week. Two key factors:
- NetApp is the market leader and has strong momentum. With NetApp, I'm
not concerned about owning an orphaned product, and I expect that the
product line will continue to develop rapidly. NetApp's large customer
base is a major asset because I feel comfortable that the toasters user
community will be a good source of information as well as a point of
leverage for ensuring that NetApp is responsive.
- NetApp's entry-level price is very low compared to that of the
FastFilePro 7000, and upgrades are easy and penalty-free.
Our plans have changed a bit -- we're buying an F210 now for a big customer
project, and the two we originally wanted will follow this fall.
Thanks very much everyone for sharing your thoughts with me.
-- Marc Rouleau
VP and Chief Technology Officer Voice: (812) 479-1700 Fax: (812) 479-3439
World Connection Services, LLC http://www.evansville.net
On Jun 13, 4:36pm, Andy Watson wrote:
> Marc --
>
> Given that no one else has offered you opinions (none that I've
> seen on this distribution list, anyway), I thought I ought to
> comment. I'll try not to exhibit too much of a NetApp vendor bias.
>
>> Apologies if this is off the charter -- let me know (gently, please!)
>> and I'll desist -- but I'm evaluating dedicated NFS servers and have
>> narrowed things down to Network Appliance (F220, qty 2) versus Falcon
>> Systems (FastfilePro 7000, qty 1). I must admit that I'm leaning
>> toward the 7000 due to its superior expandability and greater
>> administrative flexibility (multiple filesystems and RAID sets, ability
>> to run old drives via JBOD on the SCSI port). ...
>
> As I'm sure you are aware, NetApp has made design choices towards
> simplification. The goal is not to offer "administrative flexibility"
> but instead to provide a dedicated function appliance where admin
> is streamlined to a minimal set of tasks.
>
> For example, unlike a mundane file server that requires the
> administrator(s) to load balance among multiple file systems,
> and reallocate storage between them, a NetApp filer has a single
> file system that can be *software*-partitioned among subtrees
> such that space allocation can be managed on-the-fly by simply
> editing the tree quotas in /etc/quotas, and where no load-balancing
> whatsoever need be attempted. You can export the single file system
> as if it were multiple file systems by exporting subtrees with
> various permissions and options specified in the /etc/exports file.
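> For instance (an illustrative sketch only -- the tree names are
> hypothetical, and the exact syntax for /etc/quotas and /etc/exports
> is in the System Administrator's Guide):
>
>     # /etc/quotas -- disk limits in KB for each quota tree
>     /home/eng     tree    4000000
>     /home/sales   tree    2000000
>
>     # /etc/exports -- export each subtree with its own options
>     /home/eng     -access=eng-hosts,root=adminhost
>     /home/sales   -access=sales-hosts
>
> Growing one group at the expense of another is just an edit to the
> quota limits; no repartitioning or data movement is involved.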
>
>> ... Also, the 7000 tests
>> about even with the F540 in various magazine reviews, and NetApp's own
>> LADDIS testing shows the F220 to be only half as fast as the F540 (is
>> this true in the real world?).
>
> Well, in at least one magazine review (I think it was UNIX REVIEW,
> as I recall) a benchmark called "bigdisk" was used, which showed
> Falcon's performance as comparable to a NetApp filer. Other than
> that I don't recall other reviews where Falcon matched NetApp.
> Anyway, in that article, bigdisk exercised the Falcon using NFSv2
> over UDP, because at that time Falcon's software did not support
> NFSv3 or NFS over TCP. NetApp was tested with NFSv3 over TCP.
> Running NFS over TCP is known to hurt performance compared to UDP,
> but TCP is provided because it is better for WANs and other "lossy"
> networks where packet loss can cause heavy retrans overhead.
> Anyway, we ran bigdisk internally, using NFSv2 over UDP, and
> our performance was significantly better than what that magazine
> reported for Falcon.
>
> On the broader topic of benchmarking, the industry-standard
> benchmark of NFS file server performance is SFS (also known as
> "LADDIS") from SPEC. SPEC has a peer review process whereby
> official results are submitted, and if accepted by SPEC, can
> be published in the SPEC Newsletter and on the SPEC website --
> (see http://www.specbench.org/osg/sfs93/results/results.html).
> You will find Falcon and NetApp SFS benchmark results at that
> site (and in the hardcopy Newsletter, if you have access to it).
> NetApp has also published SFS results which have been accepted
> by SPEC, but which have been inexplicably delayed in appearing
> on the SPEC website. SPEC assures me that they will appear
> very soon, probably by early next week. In the meanwhile, you
> can see our latest results for our F210, F230, F520, and F630
> filers at http://www.netapp.com/technology/performance.html.
>
> Falcon has published four results via SPEC, but none of them
> are annotated as being for a model "7000", so I'm not sure
> what that machine's SFS benchmark results might be. Also,
> in 3 out of those 4 tested configurations, Falcon used an
> SSD (solid-state disk) to significantly accelerate their
> performance. If the proposed Falcon configuration you are
> considering does not include that SSD, then you should ask
> to have that added if you want to optimize that system's
> performance. On the 4th Falcon configuration for which they
> have published SFS results, instead of SSD, they combined
> a BBU (Battery Backup Unit) with a Mylex disk controller's
> RAM cache to accelerate Write performance. If you are
> considering that approach, instead of using SSD, be sure not
> to omit the BBU, because otherwise pending Writes (asynchronously
> acknowledged to the clients) could be lost in the event of
> a power interruption. (NetApp uses battery-backed NVRAM
> (non-volatile RAM) to capture and preserve updates to the
> file system, even in the event of an interruption in power.)
>
> The SPEC results will show you that our filers outperform Falcon's
> servers in terms of average response time. This means we provide
> faster replies to clients. Also, Falcon has only published results
> for one configuration where parity RAID protection was enabled.
> That happens to be for a model 9000 system with 62 disks, with
> three RAM-caching Mylex controllers, and an SSD. Even so, two
> F220s together will present more total ops/s (1213 ops/s each,
> or 2426 together) than the large-config Falcon 9000 (2392 ops/s).
> Each F220 needed only 14 disks to achieve this performance,
> so even two of them together were exercising less than half the
> spindles of the high-end Falcon. And with our latest software
> release, and the newer F230 filer, the performance has improved
> both in terms of faster response time (4.5 ms for 1221 ops/s,
> compared to the F220's 9.4 ms for 1213 ops/s) and greater max
> throughput (1610 ops/s).
>
> But this is benchmarking numerology, to some extent. You are
> wise to ask about real-world performance. We have consistently
> found that NetApp filers deliver better performance for actual
> application environments than indicated by the SFS benchmark
> because the benchmark has a "worst-case" workload, compared
> to most application requirements. Also, unlike other vendors
> (including Falcon) where the SFS benchmark load is *perfectly*
> load-balanced across many file systems, NetApp is benchmarking
> with a single file system. Real-world applications depend on
> the performance of the individual file system where the target
> data resides. It is rare to find an individual application that
> is actively exercising data stored in multiple file systems.
>
> I've written a paper which goes into greater detail on this
> topic: "Interpreting SPEC SFS ('LADDIS') Benchmark Results"
> (http://www.netapp.com/technology/level3/3010.html). It doesn't
> specifically mention Falcon, but I think you will find it useful
> in the analysis of your requirements and the comparison of NetApp
> filer performance versus benchmark results for multiple-file system
> configurations such as Falcon's.
>
>> On the other hand, NetApp tells me that the highly random nature of my
>> traffic -- typical ISP work including email, USENET news, and webservers --
>> should point me toward multiple servers to maximize main memory cache hits.
>> They say that the main memory cache on the 7000 will become worthless as I
>> expand storage and that performance will drop off. I don't have a good
>> feel for the importance of main memory cache here -- the 7000 uses hardware
>> RAID controllers with onboard cache, so it seems like it would be less
>> important to the Falcon product than it is to the NetApp line. But I must
>> admit that NetApp's larger installed base -- 1000's versus 100's of
>> systems -- makes it a safer choice on the surface.
>
> On the above topic, you should check out Karl Swartz's paper:
> "The Brave Little Toaster Meets Usenet" (a USENIX LISA paper --
> see http://www.netapp.com/technology/level3/brave.html).
>
> In general, for large data sets and the highly randomized loads seen
> by growing ISPs, it is futile to try to solve your I/O problems with
> larger caches. Let's say you have 20 GB of data that is being accessed
> by a large population of diverse users, such that there is relatively
> little likelihood of data re-use from the cache. If even only 20% of
> that 20 GB is exercised, then you would need 4 GB of RAM, and you'd
> merely fill it with data that is retrieved once and never needed again,
> as the next 4 GB of different data flows through the cache over the
> next equivalent time interval.
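> (As a rough sketch: with uniformly random access across a working set,
> the cache hit rate is roughly the cache size over the working-set size.
> The numbers below are hypothetical:
>
>     working_set_gb=5; cache_gb=1
>     echo "$(( cache_gb * 100 / working_set_gb ))% hit rate"   # prints "20% hit rate"
>
> so even a big cache buys little until it approaches the working set.)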
>
> So what you really need to focus on is disk subsystem performance.
> And that's where we excel. Note that we have published all our
> benchmark results with relatively small disk complements, giving
> us the best per-disk (and per-file-system) performance results
> in the industry.
>
>> Anyone care to share his/her thoughts on this topic?
>
> Well, I hope this helped. I tried to provide as much relevant
> information as I could, with references, and to avoid spouting
> unsubstantiated opinions.
>
> Good luck in your exploration of alternatives!
>
> -- Andy
>
> Andy Watson
> Director, Technical Marketing watson(a)netapp.com
> Network Appliance +1 408 367 3220 voice
> 2770 San Tomas Expressway +1 408 367 3151 fax
> Santa Clara, CA 95051 <http://www.netapp.com/>
>
> "It's really stupid to be an intellectual when you're young.
> You should be an intellectual when you're a hundred years old
> and can't feel anything anymore."
> -- a character in Bruce Sterling's novel, HOLY FIRE
Is anyone out there using their NetApp to house Oracle, Informix or
Sybase databases? If so, what have been the results? Any feedback on
this issue would be greatly appreciated.
Thanks,
Tim Lewis
I have a 220 with 4.0.3 and 37 quota trees.
These mount fine on all our machines (Solaris, AIX, Linux,
Ultrix, Digital UNIX), except on SGIs with IRIX 6.2
or higher, regardless of model. On IRIX 5.2 or 5.3, there is no
problem at all.
Upon doing 'mount -a' I get the following errors:
mount: chatelier:/mn/chatelier/u37 already mounted
mount: chatelier:/mn/chatelier/u36 already mounted
mount: chatelier:/mn/chatelier/u35 already mounted
mount: chatelier:/mn/chatelier/u20 already mounted
chatelier:/mn/chatelier/u34 mounted on /mn/chatelier/u34
chatelier:/mn/chatelier/u33 mounted on /mn/chatelier/u33
chatelier:/mn/chatelier/u32 mounted on /mn/chatelier/u32
mount: chatelier:/mn/chatelier/u30 server not responding: Timed out
mount: chatelier:/mn/chatelier/u31 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u31
mount: giving up on:
/mn/chatelier/u30
mount: chatelier:/mn/chatelier/u29 server not responding: Timed out
mount: chatelier:/mn/chatelier/u28 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u29
mount: giving up on:
/mn/chatelier/u28
mount: chatelier:/mn/chatelier/u27 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u27
mount: chatelier:/mn/chatelier/u26 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u26
mount: chatelier:/mn/chatelier/u25 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u25
mount: chatelier:/mn/chatelier/u24 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u24
mount: chatelier:/mn/chatelier/u23 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u23
mount: chatelier:/mn/chatelier/u22 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u22
mount: chatelier:/mn/chatelier/u21 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u21
mount: chatelier:/mn/chatelier/u19 server not responding: Timed out
mount: giving up on:
/mn/chatelier/u19
As you can see, the SGI will mount some trees and time out on the
rest. Each time I run 'mount -av', some more trees will mount, and
the rest will time out.
Has anyone else seen this?
--
Ståle Johansen, University of Oslo.
Hello everyone,
I am quite perplexed by the behavior I am seeing below:
david@vortex ~
$ find * -type f -print | sed -e 's%.*%ls -ldtr "&" .snapshot/*/"&"; echo%' | bash > messed.up.files
.snapshot/*/Mail/.dir.tiff: No such file or directory
.snapshot/*/NeXT/Apps/.dir.tiff: No such file or directory
.snapshot/*/NeXT/Apps/.opendir.tiff: No such file or directory
.snapshot/*/NeXT/Apps/Icon.app/brushes/.places: No such file or directory
.snapshot/*/NeXT/Apps/Icon.app/help/.places: No such file or directory
.snapshot/*/NeXT/Apps/Icon.app/help/.list: No such file or directory
^C
david@vortex ~
$ ls -l .snapshot/*/NeXT/Apps/Icon.app/help/.list
-rw-r--r-- 1 david staff 1095 Nov 14 1992 .snapshot/hourly.0/NeXT/Apps/Icon.app/help/.list
-rw-r--r-- 1 david staff 1095 Nov 14 1992 .snapshot/hourly.1/NeXT/Apps/Icon.app/help/.list
-rw-r--r-- 1 david staff 1095 Nov 14 1992 .snapshot/hourly.2/NeXT/Apps/Icon.app/help/.list
-rw-r--r-- 1 david staff 1095 Nov 14 1992 .snapshot/hourly.3/NeXT/Apps/Icon.app/help/.list
-rw-r--r-- 1 david staff 1095 Nov 14 1992 .snapshot/weekly.0/NeXT/Apps/Icon.app/help/.list
Would anyone care to comment? This is greatly disturbing.
Regards,
David K. Drum
david(a)more.net
--
"That man has a rare gift for obfuscation." -- ST:DS9 * "It's hard to
be bored when you're as stupid as a line." -- Vernor Vinge * "Reality
has a tendency to be so uncomfortably real." -- Neil Peart * "You can
only measure the size of your head from the inside." -- Larry Wall
Among our NetApps we have an F330 which will soon be put into service as a fileserver
for our internal use. We have PCs, various UNIX workstations, and a few Macs. The Macs are
giving me a headache - the NetApp doesn't support AppleTalk, so we are pondering the
alternatives:
We've tried 'DAVE' from Thursby Systems, which gives the Mac CIFS client
capabilities. This works fine with NT servers; however, it does not work with NetApps.
(This is noted in bug #4140 - with no indication of whose problem it is, or, for that
matter, when it might be fixed.)
Other options include NFS client software on the Macs, or setting up a gateway which
mounts the NetApp via NFS and re-exports the filesystems via AppleTalk (using Linux and
the free AppleTalk package, for example).
Has anyone tried any of these solutions? What will work?
regards,
--
---Ketil Kirkerud, Scandinavia Online
I synergistically leverage my paradigms
At 05:16 PM 7/7/97 -0400, Christoph Doerbeck wrote:
>I'm looking for input regarding why I might (or might not) encourage
>my users to allow me to upgrade the filer OS from 3.1c to 4.0.1c.
>What features are we missing? What pitfalls can I expect? I just
>want to cover my bases...
>
>What we have currently is very stable.
As far as I am concerned, there is one major advantage to the 4.x series.
It can serve files over CIFS. It can also serve files over HTTP, but
that's much less important to me.
We have a large and growing number of PCs, and installing NFS software on
them is a pain in the rear. Being able to serve files over CIFS is very
convenient.
If you don't need CIFS or HTTP file service, then it's a much closer call.
'course, 4.x is their current release, and any performance tweaks they make
are gonna go into 4.x, which will probably make it a bit faster, and
potentially a bit less stable. If you're performance hungry, go with 4.x.
If you absolutely have to stay up, ask them for their current list of known
bugs in 3.1c and 4.0.1c, read them carefully, and decide which set you're
less likely to trigger.
They're pretty good with their known bugs list.
Amy
Had something similar happen to us. One of our Netapps failed a
drive and reconstructed its data onto a different drive (from the hot
spare pool). The failed drive should have been replaced asap but it
fell through the cracks and was forgotten.
A week later we shut it down to replace an fddi card and afterwards
it wouldn't boot up. The failed drive was apparently working well
enough so that the Netapp thought it had a RAID drive that wasn't a
valid member of the array (inconsistent disk labels). Once we removed
the problem drive the Netapp booted just fine.
I don't understand why your bad drive was added to the hot spare
pool upon reboot. It should have had a label that was inconsistent
with the other drives and the Netapp shouldn't have booted.
>> Could the Netapp somehow mark a bad drive so that the information
>> is kept across boots?
If a failed drive is working after a reboot then its label should
be inconsistent with the other drives in the array and the Netapp
shouldn't boot.
NOTE: For those wondering what label I am talking about here is an
excerpt from the System Administrator's Guide chapter on
Troubleshooting:
The Netapp writes a label on each disk indicating its position
in the RAID disk array.
regards,
Steve Gremban gremban(a)ti.com
PS: I noticed that you are running without a hot spare configured.
We normally configure a spare in order to minimize time spent
at risk in degraded mode.
Do you consider the risk minimal that another drive will
fail before the first one is replaced and rebuilt, or do you
have some plan in place to ensure that someone is notified
immediately? What about early in the morning, on weekends,
or holidays?
Anyone else out there running without hot spares?
------------- Begin Forwarded Message -------------
From madhatta(a)turing.mathworks.com Mon Jul 7 15:40:34 1997
Date: Mon, 7 Jul 1997 16:34:26 -0400 (EDT)
From: Brian Tao <taob(a)nbc.netcom.ca>
To: toasters(a)mathworks.com
Subject: Marking failed drives across boots?
MIME-Version: 1.0
We had a problem with our Netapp this morning that could
potentially be quite serious. One drive near the beginning of the
chain (ID 2, I believe) was failed out by the Netapp. Very shortly
thereafter, the filer crashed with a RAID panic and rebooted. Upon
rebooting, it noticed that drive ID 2 was not actively being used, and
proceeded to add it to the hot spare pool. Then it began
reconstructing the data on to (you guessed it) drive ID 2.
In this scenario, there was no time to pull out the bad drive, and
the Netapp happily rebuilt the data on it. I guess the correct
procedure now is to forcibly fail that drive and rebuild to our good
spare drive, and remove drive ID 2. Could the Netapp somehow mark a
bad drive so that the information is kept across boots?
--
Brian Tao (BT300, taob(a)netcom.ca)
"Though this be madness, yet there is method in't"
------------- End Forwarded Message -------------
> Anyone else out there running without hot spares?
Now, no. In the past, yes. We were space-hungry and had maxed out the
number of disks in some of our earlier, smaller NetApps, so we threw the
spares into the pool because we kept tipping over the 100% full mark and
not being able to write anything, and that was bad.
As it turned out, we were okay. The odds of two disks going bad before you
get a chance to replace one of them are fairly minimal, especially if you
set the thing to send you e-mail notifying you of a bad drive.
Especially if you set it to send e-mail to both your mailbox and your pager.
Amy -- who now has enough co-workers to not have to carry a pager
24x7x365 anymore, but who does it anyway.