Re: mini-core

23 Mar 2000


Dear Jeff:
You said--
...
Incidentally, have you noticed that it now takes about five times longer to
dump a core now that they are .nz compressed?  This is a very frustrating
"feature".  The savecore process works while the filer is ON-LINE, whereas
the dump process works while the filer is OFF-LINE.
If the goal is for the filer to be ON-LINE more than it is OFF-LINE, why
delay the time it spends dumping with compressing a core when we could all
gzip it when the filer is ON-LINE?  Or why not put the compression code in
the savecore process which can be executed after the filer is back
ON-LINE?

The problem lies in the implementation of core dump.
Cores are not dumped directly to the file system--at the time the filer
panics, you can't risk touching the file system lest you corrupt it.
So the current state--the core--is dumped to a reserved area on the disk.
More precisely, all the reserved areas on all disks are filled up one by
one with chunks of coredump, and savecore unwinds this after reboot.
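
To make the chunking idea concrete, here is a minimal sketch in Python. It
is not the filer's actual code: the function names, the in-memory "areas",
and the fixed per-area sizes are all assumptions made for illustration. It
only shows a core being split across reserved areas in order and then
reassembled after reboot.

```python
def dump_to_reserved_areas(core: bytes, area_sizes: list[int]) -> list[bytes]:
    """Fill each (hypothetical) reserved area in order with a chunk of the core.

    Raises RuntimeError if the areas are exhausted before the core is
    fully written, which is the failure the message goes on to describe.
    """
    areas = []
    offset = 0
    for size in area_sizes:
        # Each area takes the next `size` bytes; the last chunk may be short
        # or empty once the core has been fully consumed.
        areas.append(core[offset:offset + size])
        offset += size
    if offset < len(core):
        raise RuntimeError("ran out of reserved areas before the whole core was dumped")
    return areas


def savecore(areas: list[bytes]) -> bytes:
    """Reassemble the core after reboot by concatenating the chunks in order."""
    return b"".join(areas)
```

With three areas of 4 bytes each, a 10-byte core fits and round-trips; with
only two such areas, the dump fails part way through.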
As filers received more main memory, we began running out of
reserved disk areas before the whole core was dumped. The actual
ratio of memory to reserved disk space depends on memory size and disk
model; the larger the ratio, the more likely you'll be unable to dump
the whole core. In response, we implemented the compressed core feature.
Now if the filer computes there isn't enough disk space to save the
entire core uncompressed, we compress the core before writing it out.
(The other obvious fix is to enlarge the reserved disk area, but we
were loath to do that as we didn't want to make changes to the disk
layout. Such changes would deeply affect both reverting to previous
releases and the migration of disks to new filer heads during an
upgrade. Basically, we concluded that the data layout on the disk is
sacrosanct.)
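
The "compress only when it would not otherwise fit" rule above can be
sketched as follows. This is an illustration under stated assumptions, not
the real implementation: prepare_core and reserved_capacity are
hypothetical names, and zlib merely stands in for whatever compressor
produces the .nz format.

```python
import zlib


def prepare_core(core: bytes, reserved_capacity: int) -> tuple[bytes, bool]:
    """Return (payload, compressed) for a core headed to the reserved areas.

    The core is left uncompressed when it fits in the total reserved
    capacity; otherwise it is compressed before being written out.
    zlib is a stand-in compressor chosen only for this sketch.
    """
    if len(core) <= reserved_capacity:
        return core, False
    return zlib.compress(core), True
```

The point of the design is that configurations with enough reserved space
never pay the compression cost at dump time; only those that would
otherwise lose part of the core do.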
Finally, we concluded that panics were sufficiently rare events that we
were willing to trade off some time during compression to ensure that
we got the entire core, without too badly affecting our overall
availability. Of course, the very fact that we had to make tradeoffs
means that some customers in some configurations would see some
degradation.

We are constantly looking for ways to improve our self-diagnostic
capability. In fact, the guy in the next office is looking at core dumps
right now, and I'll make sure he has read your note.

Yours,
Mike Tuciarone
Platform Software
