Mike Horwath <drechsau(a)tiny.net> writes:
> So, with 2 raid groups...it is all still one filesystem?
It can be either one filesystem or two (each one with a single
RAID group). There is a nice picture and explanation at:
http://www.netapp.com/technology/level3/3027.html
To avoid fragmentation problems (as in "I have 50 GB free on one
filesystem and 50 GB free on another, but I need 100 GB in one
place"), all of our filers are single volume (filesystem). Using
multiple RAID groups is a much easier decision.
Dan
Actually, I thought of another way to get one RAID group: wipe the filer,
install 4.x on it, and then uprev it... B-)
I am a big fan of having only one configuration, where possible. My
other filers are at 4.3.4 right now, and when I upgrade those to 5.1Dx,
they will have only one RAID group. I see no reason for this one to
be special.
My F630's have among them thousands 'n' thousands of hours of run time.
I have yet to see a single disk failure, let alone a double-disk failure
(knock knock). And I have no plans to put more shelves on them.
Finally, if I have one more RAID group, that'll mean one more hot spare
plus one more parity disk. When you take into account the purchase of
my entire F630, two disks set aside for safety have an opportunity cost
(at list prices) of just over $12,000. Is that safety worth $12,000 to
me?
Well, maybe it is: the not-backed-up-yet data on the rest of a RAID
group could exceed $12,000 in replacement value. And maybe it isn't:
these failures seem unlikely enough to make that $12,000 feel like
earthquake insurance in Bismarck, N.D. Guess this is for me to
figure out, huh?
Brian
hitz(a)netapp.com (Dave Hitz) writes:
> But in addition, RAID reconstruction is much faster. When you
> reconstruct a failed drive, you have to read all the data on all the
> other drives, so if your RAID groups are half as big, then the
> reconstruction goes twice as fast.
My experience is that reconstruction is more than twice as fast. More
like quadruple, depending on server load and size of the RAID group.
Does NetApp have recent numbers on reconstruct time? Especially under
various NFS loads (reading/writing to disk, not just cache, so the
disks are loaded as well).
Dan
Dave Hitz wrote:
>> So at one parity disk per 14 the overhead is 1/14 = 7%, and at one
>> parity disk per 26 the overhead is 3.8%.
Brian Rice <brice(a)gnac.com> writes:
> The cost per usable gigabyte of my F630's has received *considerable*
> executive scrutiny recently. So I'm just a leeeetle sensitive about this
> kind of thing.
You *really* want to use one RAID group per 14 disks. Trust me.
Right now, we have 3 F630 systems (40 disk SCSI, 40 disk SCSI, and 26
disk FC). On our first F630, we have a 26 disk RAID group because
multiple RAID groups weren't available at the time. On that RAID group,
disk reconstructions seem to take forever. Once it took 23 hours, under
only moderate load. I tried cranking the raid.reconstruct_speed dial
higher (I think I had it set to 3), but then everyone started getting
"nfs server not responding" errors.
So, not only are you subject to greater risk during those 23 hours, but
your users will complain about how slow everything is (even if the dial is
set at 2 or 3). Let's face it, NetApps don't perform very well during disk
reconstruction.
On a 14 disk RAID group on an F630, the slowest reconstruction we've
seen finished in under 4 hours. The fastest was under 2 hours (on the F630 FC).
I guess it's important to note that disk reconstruction speed does not
increase linearly. Double the disks in a RAID group and it can easily
quadruple the time.
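Dan's rule of thumb ("double the disks in a RAID group and it can easily quadruple the time") is quadratic scaling. A toy model of that relationship, purely illustrative and not anything NetApp publishes (the 2-hour/14-disk baseline is taken from the numbers above):

```python
def reconstruct_hours(disks, base_disks=14, base_hours=2.0):
    """Toy quadratic model of reconstruction time vs. RAID group size:
    time grows with the square of the group size, so doubling the
    disks quadruples the time. Illustration only, not a NetApp formula."""
    return base_hours * (disks / base_disks) ** 2
```

Under this model a 28-disk group takes exactly four times as long as a 14-disk group, which is the behavior Dan describes.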
Dan
Dave Hitz wrote:
> So at one parity disk per 14 the overhead is 1/14 = 7%, and at one
> parity disk per 26 the overhead is 3.8%.
The cost per usable gigabyte of my F630's has received *considerable*
executive scrutiny recently. So I'm just a leeeetle sensitive about this
kind of thing.
> So are you dead-set on living dangerously?
Maybe, maybe not. My experience with my existing (identical) F630's
suggests I can get away with it. On the other hand, it's bad karma to
turn away a life jacket when somebody offers you one "just in case."
So I dunno.
Let's say, for the sake of argument, that I did decide to live dangerously.
So...what's the best way to get a vol0 with one RAID group in it?
Brian
brice(a)gnac.com
On 09/04/98 00:12:22 you wrote:
>
>I thought this follow-up (to a thread from several months ago) might be
>interesting for people at sites that stress time synchronization.
>
>Daniel Quinlan <quinlan(a)transmeta.com> wrote:
>
>> To prevent it from happening again, we're remotely monitoring the
>> system time of the adminhost (using "mon") to make sure it doesn't
>> drift off again. Incidentally, there seems to be no way to remotely
>> query the system time from a NetApp (except by creating a new file and
>> running stat() on it, or by using unsupported/hidden commands).
>
>So, I started monitoring the time with the kludgey method of creating a
>file and using stat() on it. I wasn't very happy with that method
>because it tended to give false alerts, so I eventually disabled it.
>
>Earlier this week, we had another problem with time synchronization. It
>wasn't actually the NetApp, but I still had to rule it out because it
>wasn't being monitored. I tried port-scanning a NetApp, looking for some
>hidden protocol that might let me reliably query the time. Here's what I
>found:
>
> port tcp udp
> ----- ----------- -----------
> 23 telnet -
> 80 http -
> 111 sunrpc sunrpc
> 137 - netbios-ns
> 138 - netbios-dgm
> 139 netbios-ssn -
> 161 - snmp
> 514 shell syslog
> 520 - route
> 602 - unknown
> 603 unknown -
> 604 - unknown
> 605 unknown -
> 606 - unknown
> 607 unknown -
> 608 - unknown
> 609 unknown -
> 618 - unknown
> 619 - unknown
> 620 unknown -
> 1063 - unknown
> 2049 nfs nfs
> 10000 unknown -
>
>A lot of unknown services (ones not listed in /etc/services); some are
>probably NDMP.
I thought the 600 ports were for rsh, but since they alternate
between TCP and UDP and aren't contiguous, I don't know.
I have no idea what 10000 is for.
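For what it's worth, a minimal TCP probe along the lines of the scan Dan ran can be sketched like this (the hostname is whatever your filer answers to; UDP ports need a different approach since there's no connection handshake):

```python
import socket

def tcp_open(host, port, timeout=1.0):
    """Return True if `host` accepts a TCP connection on `port`,
    False if the connection is refused or times out."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Looping this over a port range against a filer would reproduce the TCP column of the table above.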
Bruce
Daniel Quinlan <quinlan(a)transmeta.com> writes:
> I'm also interested in NTP support, but I think I'd like to see the
> current software become a lot more reliable (and complete) before
> adding another new feature. All I need is more NetApp crashes because
> of NTP bugs.
I should add that this happened to us just a few days ago:
- xntpd died on our adminhost (that ran the cron job to set the
NetApp dates)
- time drifted by 8 minutes on the adminhost and the filers over the
  course of a few days
- someone started having problems, managed to figure out that
it was because of the time drift, and we fixed it. (Other people
probably had problems, but didn't report them.)
To prevent it from happening again, we're remotely monitoring the
system time of the adminhost (using "mon") to make sure it doesn't
drift off again. Incidentally, there seems to be no way to remotely
query the system time from a NetApp (except by creating a new file and
running stat() on it, or by using unsupported/hidden commands).
However, I'm still saying that I don't want NTP until other supported
protocols (such as NDMP) start working reliably.
Dan
If you are a NetApp F630/FC-AL user and have encountered any (especially
recurring) glitches with interactions between the Fibre Channel host
adapter and OnTAP, or recurring FC hardware failures I'd appreciate
hearing from you.
On the flip side, if you are a NetApp F630/FC-AL user and have seen
perfect performance from your filer, I'd love to hear about it too.
We have had the first F630/FC-AL in Japan in our machine room for the last
month or so, and have been through some interesting experiences.
Thanks,
Matt Ghali
--matt(a)bikkle.interq.or.jp-------------------------------------------
Matt Ghali MG406/GM023JP - System Administrator, interQ, Inc AS7506
"Sub-optimal is a state of mind." -Dave Rand, <dlr(a)bungi.com>
Cliff Johnson <yrsuser(a)televar.com> wrote:
>We have a new F230 installed here (5.1R1, 1.9i firmware),
>with a Quad 10/100 card in it. I've configured the 4 ports,
>and have no problems talking to (or mounting NFS from) any
>of them from the Unix side of the house.
>
>Unfortunately, the only network connection that shows up
>on the NT side of the house is the one on the F230 mother-
>board (e0), which is not the one I want the users accessing.
>
>I can map a network drive using the 4 "other names" (only
>if I attach as "nobody", but that's probably another problem),
>but I can only "see" the main port in the network neighborhood.
>And I'd rather the users not have to memorize the names of the
>different ports.
>
>[..]
Hi Cliff,
Is your filer's CIFS configured for WINS?
If it is, then the filer should be registering all your
interfaces with the WINS server using the standard WINS
multihomed registration protocol. (This is how a multihomed NT
server would register with WINS.) In that situation, CIFS
clients that use WINS resolution will choose one of the filer's
IP addresses from the registered list. (Depending on the type of
client, the address selection algorithm varies; for example,
older Microsoft clients tend to choose the address randomly,
while NT4 SP3 clients try to be more intelligent. Microsoft's
MSDN support web site has some articles detailing this -- try
searching for "multihomed" and/or "WINS".) (By the way, as of
version 5.1 of Data ONTAP there is a new ifconfig option,
"-wins", that can be used to disable WINS registration for a
particular interface. It's documented in the na_ifconfig(1) man
page. Maybe you should try setting that option on your e0
interface?)
If your filer's CIFS is *not* configured for WINS, then CIFS
should be broadcasting its presence on each of its interfaces.
For example, if you have two interfaces, e0 at 10.0.0.1 and e1 at
192.168.0.1, CIFS will periodically send a broadcast on e0 that
reads "FILER is at 10.0.0.1", and on e1 the broadcast will read
"FILER is at 192.168.0.1". (Note that each packet only mentions
the IP address for its particular interface.) This is how an NT
server that isn't configured for WINS would announce itself.
Hope that helps some - let me know!
Matt Day <mday(a)netapp.com>
Network Appliance, Inc.
Hello,
We have a new F230 installed here (5.1R1, 1.9i firmware),
with a Quad 10/100 card in it. I've configured the 4 ports,
and have no problems talking to (or mounting NFS from) any
of them from the Unix side of the house.
Unfortunately, the only network connection that shows up
on the NT side of the house is the one on the F230 mother-
board (e0), which is not the one I want the users accessing.
I can map a network drive using the 4 "other names" (only
if I attach as "nobody", but that's probably another problem),
but I can only "see" the main port in the network neighborhood.
And I'd rather the users not have to memorize the names of the
different ports.
I can use the server manager on the PDC to add the other 4
names, but they only show up as ghosts, and the PDC tells me
that they are not responding if I try to do anything else with
them. (I can ping them, though).
Any thoughts? I really don't want to use the main motherboard
network connection (currently 10Mbps) to handle anything other
than Root telnet sessions to the filer.
Cliff Johnson
yrsuser(a)televar.com