Jay Soffian jay@cimedia.com writes:
Do parity errors during a disk scrub necessarily indicate something is wrong with the filesystem?
Not necessarily.
this rather distressing. The filer really needs a way to perform a filesystem health check w/o downtime.
Like running fsck on a mounted filesystem? Some things are better done in a quiesced state...
NA also needs to publish accurate timing data on wack via NOW. The only document I can find has ancient data.
There are way too many variables in that equation. It depends on the version of the OS, the version of wack, etc. I have seen cases where running the incorrect version of wack yielded 3-hour wacks (until someone mentioned that this should not be the case, and we restarted with the proper version).
A similar table that would be nice is raid reconstruct speeds based on volume size, filer load and disk size. Now that would be interesting.. :)
Alex
"alexei" == alexei alexei@mindspring.net writes:
>> this rather distressing. The filer really needs a way to perform
>> a filesystem health check w/o downtime.

alexei> Like running fsck on a mounted filesystem? Some things are
alexei> better done in a quiesced state...
The filer is not a Unix host serving NFS. I expect more out of it. I didn't say it needed to correct errors while serving content, I said it needed to be able to do a health-check. If you go back to my original message, I mentioned that if it required an immutable filesystem to do this, then you should be able to tag a filesystem (not just an export) read only.
You can, btw, run fsck on a mounted filesystem. Solaris happily runs 'fsck -n' on a mounted filesystem (yes, I know, it will also happily run 'rm -rf /', which doesn't mean you should do it - lots of rope and whatnot). Linux will run e2fsck after making some noise. I don't see any reason why this would be dangerous on a filesystem mounted read-only.
>> NA also needs to publish accurate timing data on wack via
>> NOW. The only document I can find has ancient data.

alexei> There are way too many variables to that equation. It
alexei> depends on the version of the OS, the version of wack,
alexei> etc. I have seen times when running the incorrect version
alexei> of wack yielded 3 hour wack's (before someone mentioned
alexei> this should not be the case; we restarted with the proper
alexei> version).
So what? NetApp could still collect stats on various configs and publish those. Then I could pick out the configuration closest to mine. How about a tool to at least provide guesstimates of running time?
alexei> A similar table that would be nice is raid reconstruct
alexei> speeds based on volume size, filer load and disk size. Now
alexei> that would be interesting.. :)
Agreed, but right now, I want wack times.
j.
--
Jay Soffian jay@cimedia.com UNIX Systems Engineer 404.572.1941 Cox Interactive Media
Hi,
I agree that an online wack is a must. Any downtime on a filer is a big problem in almost any environment. I am sure that NetApp hears us and is probably preparing a solution. Let's hope they come out with one soon!