From "Michael J. Tuciarone" on Wed, 22 Mar 2000 16:54:54 PST:
> Cores are not dumped directly to the file system--at the time the filer panics, you can't risk touching the file system lest you corrupt it. So the current state--the core--is not dumped to a single reserved area on one disk; rather, the reserved areas on all disks are filled up one by one with chunks of coredump, and savecore unwinds this after reboot.
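A toy model of the chunked dump described above, for illustration only: the core is written in per-disk chunks into each disk's reserved area, and savecore reassembles them after reboot. The per-disk reserved-area size here is made up, not the actual ONTAP value.

```python
RESERVED_AREA_BYTES = 8 * 1024 * 1024  # hypothetical per-disk reserved size

def dump_core(core: bytes, num_disks: int) -> list:
    """Fill the reserved areas one by one with chunks of the core."""
    chunks = []
    for i in range(num_disks):
        start = i * RESERVED_AREA_BYTES
        if start >= len(core):
            break  # whole core written
        chunks.append(core[start:start + RESERVED_AREA_BYTES])
    else:
        # Every disk's reserved area was used; did the core actually fit?
        if num_disks * RESERVED_AREA_BYTES < len(core):
            raise RuntimeError("ran out of reserved space before "
                               "the whole core was dumped")
    return chunks

def savecore(chunks: list) -> bytes:
    """Unwind the per-disk chunks back into a single core file."""
    return b"".join(chunks)

core = bytes(range(256)) * (80 * 1024)    # a 20 MB synthetic "core"
chunks = dump_core(core, num_disks=4)     # fits in 4 x 8 MB reserved areas
assert savecore(chunks) == core
assert len(chunks) == 3                   # 8 MB + 8 MB + 4 MB
```

With too few disks (say, num_disks=2 for the same 20 MB core), dump_core raises instead of silently truncating, which mirrors the failure mode the next paragraph describes.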
This part I was aware of.
> As filers received more main memory, we began running out of reserved disk areas before the whole core was dumped. The actual
Oh. =(
> ratio of memory to disk depends on memory size and disk model; the larger the ratio, the more likely you'll be unable to dump the whole core. In response, we implemented the compressed core feature. Now, if the filer computes that there isn't enough disk space to save the entire core uncompressed, we compress the core before writing it out.
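The decision described above reduces to a single comparison. A minimal sketch (a model of the logic as the mail explains it, not the actual ONTAP code; the 8 MB per-disk figure is an assumption):

```python
MB = 2**20
GB = 2**30

def should_compress(core_bytes: int, num_disks: int,
                    reserved_per_disk_bytes: int) -> bool:
    """Compress iff the uncompressed core cannot fit in the
    total reserved space across all disks."""
    return core_bytes > num_disks * reserved_per_disk_bytes

# With a hypothetical 8 MB reserved area per disk:
print(should_compress(100 * MB, num_disks=20,
                      reserved_per_disk_bytes=8 * MB))  # False: 100 MB < 160 MB
print(should_compress(1 * GB, num_disks=56,
                      reserved_per_disk_bytes=8 * MB))  # True: 1024 MB > 448 MB
```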
So it doesn't always compress the core during the dump? Even our filers with 56 of the 18GB drives take noticeably longer to dump. Granted, they have 1GB of RAM, but if the reserved areas of 52 4GB disks could hold a core from 512MB of RAM on an F630, why can't 56 18GB drives hold a core from 1GB of RAM on an F760? It seems like five times the amount of disk was added but only twice the amount of RAM.
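One way to work the numbers in the question, under the assumption (and it is only an assumption, not something stated in this thread) that the reserved area is a fixed size per disk rather than proportional to disk capacity--in which case dump space grows with disk count, not total capacity:

```python
GB = 2**30

# F630: 52 x 4 GB drives, 512 MB RAM; F760: 56 x 18 GB drives, 1 GB RAM
old_disks, old_ram = 52, 0.5 * GB
new_disks, new_ram = 56, 1.0 * GB

# Total raw capacity grew roughly 5x, as noted in the question:
capacity_growth = (new_disks * 18) / (old_disks * 4)   # ~4.85x

# But if each disk contributes a FIXED-size reserved area, the
# dump space grows only with the disk count:
reserved_growth = new_disks / old_disks                # ~1.08x
ram_growth = new_ram / old_ram                         # 2x

print(f"capacity x{capacity_growth:.2f}, "
      f"reserved x{reserved_growth:.2f}, RAM x{ram_growth:.0f}")
```

Under that assumption, the reserved dump space barely grew while the core doubled, which would explain why these filers always compress.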
> (The other obvious option is to change the size of the reserved disk area, but we were loath to do that, as we didn't want to make changes to the disk layout. Such changes would deeply affect both reverting to previous releases and the migration of disks to new filer heads during an upgrade. Basically, we concluded that the data layout on the disk is sacrosanct.)
Understandably so. Thanks for not changing the size--reverting would have been horrible.
> Finally, we concluded that panics were sufficiently rare events that we
Uhhh... sufficiently rare? My customer's definition of sufficiently rare downtime is none whatsoever. =)
> were willing to trade off some time during compression to ensure that we got the entire core, without too badly affecting our overall availability. Of course, the very fact that we had to make tradeoffs
In the end, of course, we'll spend the extra few minutes off-line to get the core so you folks can fix our problem(s). Unfortunately, this is the first time we've gotten a complete technical explanation of the problem (the memory-to-reserved-disk ratio). All we had heard before was "Guess what? You don't have to gzip your cores anymore!" which, obviously, didn't sit well. =)
Is there any metric we can use to know whether the filer is going to compress the core or not? All our filers seem to compress all their cores.
Thanks for the in-depth response!
-- Jeff
--
----------------------------------------------------------------------------
Jeff Krueger                          E-Mail: jeff@qualcomm.com
NetApp File Server Lead               Phone:  858-651-6709
IT Engineering and Support            Fax:    858-651-6627
QUALCOMM, Incorporated                Web:    www.qualcomm.com