Thus spake Mike Sphar, on Tue, Feb 05, 2002 at 01:08:55PM -0800:

I just want to say for the record that we've had vol0 on our filers hit 100% full several times, and they have never crashed because of it, at least not any time I can recall.
Not that a full vol0 is a good thing, of course, but in my experience it has always been recoverable without a crash.

-- Mike Sphar - Sr Systems Administrator - Engineering Support, Peregrine Systems, Inc.

Has anyone at NetApp disclosed a reason for this to happen? (the crash, I mean)
-----Original Message-----
From: Jim Harm [mailto:jharm@llnl.gov]
Sent: Tuesday, February 05, 2002 1:23 PM
To: toasters@mathworks.com
Subject: Re: vol0 question
An argument for isolating the root volume is that we are still warned in the NetApp admin classes that if the root volume gets full, the filer will crash. We can build qtrees to limit clients to just under 100%, or we can keep a 1% snap reserve and turn snapshots off. But will either of those really work? Will the filer potentially crash on a full volume even if it is not the root volume?
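Both of the options in Jim's question come down to the same arithmetic: never offer clients the last slice of the volume. A throwaway sketch of that arithmetic, with invented numbers (a pretend 72 GB vol0; the percentages mirror the question, but the sizes do not come from any real filer):

# Illustrative arithmetic only: how a tree quota or a small snap reserve
# keeps client data from ever reaching 100% of a volume. Sizes are hypothetical.

VOLUME_GB = 72.0  # pretend vol0 size, not taken from the thread

# Option 1: a qtree quota sized to "just under 100%" of the volume.
quota_pct = 99.0
tree_quota_gb = VOLUME_GB * quota_pct / 100.0
headroom_gb = VOLUME_GB - tree_quota_gb

# Option 2: keep a 1% snap reserve (with snapshots turned off), so that
# last 1% is never presented to clients in the first place.
snap_reserve_pct = 1.0
client_visible_gb = VOLUME_GB * (100.0 - snap_reserve_pct) / 100.0

print(f"tree quota:     {tree_quota_gb:.1f} GB (leaves {headroom_gb:.1f} GB of headroom)")
print(f"client-visible: {client_visible_gb:.1f} GB with a {snap_reserve_pct:.0f}% snap reserve")

Either way the filer keeps roughly 0.7 GB of breathing room in this example; whether that is enough to avoid the crash scenario is exactly Jim's open question.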
I'm guessing the system info is loaded into RAM on boot and so doesn't need to access disk in most cases. This would allow the filer to continue running even though data cannot be written to vol0.
You also don't need to write much, if any, system info once you're up and running.
~JK
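JK's guess amounts to a cache-at-boot design: read whatever you need from the root volume once, then treat later writes as best-effort. A minimal, generic sketch of that pattern in Python, assuming nothing about ONTAP internals (the file names, the in-memory dict, and the simulated disk-full flag are all invented for illustration):

import errno
import json

CONFIG_PATH = "registry.json"  # hypothetical stand-in for on-disk system info
STATE_PATH = "state.json"      # hypothetical best-effort persistent state


class Service:
    """Loads configuration once at startup and keeps running even when the
    disk is full and state can no longer be written back."""

    def __init__(self):
        # Read the "system info" into RAM at boot; after this point the
        # service never needs the disk again to answer requests.
        with open(CONFIG_PATH) as f:
            self.config = json.load(f)
        self.state = {"requests_served": 0}

    def handle_request(self, key):
        # Served entirely from the in-memory copy.
        self.state["requests_served"] += 1
        return self.config.get(key)

    def checkpoint(self, simulate_disk_full=False):
        # Persisting state is best-effort: on ENOSPC, skip the checkpoint
        # and keep serving instead of crashing.
        try:
            if simulate_disk_full:
                raise OSError(errno.ENOSPC, "No space left on device")
            with open(STATE_PATH, "w") as f:
                json.dump(self.state, f)
        except OSError as e:
            if e.errno == errno.ENOSPC:
                print("root volume full; skipping checkpoint, still serving")
            else:
                raise


if __name__ == "__main__":
    # Create a toy config so the sketch runs end to end.
    with open(CONFIG_PATH, "w") as f:
        json.dump({"hostname": "filer1"}, f)
    svc = Service()
    print(svc.handle_request("hostname"))    # answered from RAM
    svc.checkpoint(simulate_disk_full=True)  # a full volume is survivable

The point of the sketch is only the shape of the design: once everything needed to serve requests lives in RAM, a full root volume degrades logging and checkpoints rather than taking the whole service down, which matches what Mike and JK are describing.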
On Wed, 6 Feb 2002, Jose Celestino wrote:
Has anyone at NetApp disclosed a reason for this to happen? (the crash, I mean)
We've filled up vol0 once or twice as well; in recent versions of ONTAP it seemed to recover without a problem. I vaguely recall that back in the old 3.x or 4.x releases, when a filer with just one volume filled up, it could crash or show odd behavior...
My theory is that you're more likely to crash or hang the machine if you're writing log messages locally instead of pushing them over to a syslog host. I'm guessing that there's some kind of buffer in the ONTAP kernel for log messages; when you run out of space and can't flush that buffer to disk, eventually the logging thread blocks and things get wonky. I'm guessing that ONTAP now just drops the messages it can't flush rather than block or overflow its message buffer.
Just a theory, anyway. Anyone with a spare filer want to dink with their syslog.conf and try a couple of tests? :-)
-- Chris
-- Chris Lamb, Unix Guy MeasureCast, Inc. 503-241-1469 x247 skeezics@measurecast.com
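Chris's theory above is a classic bounded-buffer policy choice: when the flush-to-disk side stalls, the producer either blocks or drops. A small, generic Python sketch of the two policies, not ONTAP source; the buffer size and the simulated full disk are made up:

import queue

LOG_BUFFER = queue.Queue(maxsize=8)  # made-up size for the in-memory message buffer
disk_full = True                     # simulate /etc/messages living on a 100% full vol0
dropped = 0


def log_blocking(msg):
    # Old-style policy: if the buffer is full and the flusher cannot drain it,
    # this call blocks forever and whatever thread logged the message wedges.
    LOG_BUFFER.put(msg)  # blocks when the queue is full


def log_dropping(msg):
    # Drop-on-full policy: never block the caller; just count the loss.
    global dropped
    try:
        LOG_BUFFER.put_nowait(msg)
    except queue.Full:
        dropped += 1


def flush_to_disk():
    # The flusher can only drain the buffer when there is space on disk.
    while not disk_full and not LOG_BUFFER.empty():
        LOG_BUFFER.get_nowait()  # pretend this line just went to /etc/messages


if __name__ == "__main__":
    # With the "disk" full the flusher never drains, so only the dropping
    # policy lets us emit 100 messages without hanging.
    for i in range(100):
        log_dropping(f"message {i}")
        flush_to_disk()
    print(f"buffered={LOG_BUFFER.qsize()} dropped={dropped}")

With the disk full, log_blocking would hang on the ninth message, while log_dropping finishes with buffered=8 dropped=92; that difference is, roughly, the hang-versus-lost-syslog-lines behavior Chris is speculating about.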