Hello,
I have a Netapp F540 that currently has some 2-3 million files
on its filesystem. I've been testing a news server setup on it
and now I'd like to erase all these files and start over with
an empty filesystem. I did "rm -rf <dir>" (and no, I didn't
erase the /etc dir on the Netapp) and it ran for a while,
erasing lots of files, then the Netapp crashed and rebooted!
I hoped it was a freak occurrence, so when it came back up I
started another rm, but that made it crash once more.
Now, I'm thinking this may have something to do with the
snapshots. Because this is a news server, I used "options
nosnap on", following advice I got from people at Netapp.
I also tried enabling snapshots and doing "snap sched
0 0 3@1,9,17" (just to test whether it would make a
difference), but the machine was still prone to crash. I know
the snapshot reserve may be too small - can that cause the
server to crash like this?
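In case old snapshots are what's pinning the deleted blocks, my
plan before the next rm run is to clear them out from the filer
console, roughly like this (from memory, so the exact 4.0.1
syntax may differ, and "hourly.0" is just an example name):

    snap list                (list the snapshots still on the volume)
    snap delete hourly.0     (repeat for each snapshot listed)
    options nosnap on        (stop new snapshots from being taken)

As I understand it, as long as a snapshot still references a
file, rm only unlinks the active copy and the blocks stay
allocated on disk.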
Yesterday I started an rm job and the F540 crashed at 2am
this morning with the following message:
PANIC: ../common/wafl/write_alloc.c:770: Assertion failure.
version: NetApp Release 4.0.1c: Wed Jan 8 05:53:54 PST 1997
cc flags: 1
.dumping core: ........................................done
...it rebooted and ran until, I think, 4 am, when it gave the
same message, but this time it didn't come back online,
apparently because an old core file was already present:
.Old core present on disk --- not dumped.
This time it just halted. I think I read somewhere about this
write_alloc.c:770 assertion failure message/bug/whatever, but
I can't remember where.
Can I just tell the Netapp to reboot again from the "ok"
prompt or should I do something else? How do I make the
machine *not* crash when erasing lots of files?
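If there's no real fix, my fallback is to throttle the deletion
from an NFS client instead of doing one big rm -rf, something
like the untested sketch below (/mnt/filer/spool is just a
placeholder for my mount point):

    #!/bin/sh
    # Untested sketch: unlink files in small batches with a pause
    # in between, so the filer isn't flooded with remove requests.
    # /mnt/filer/spool is a placeholder for the NFS mount point.
    n=0
    find /mnt/filer/spool -type f -print |
    while read f
    do
        rm -f "$f"
        n=`expr $n + 1`
        # pause for a couple of seconds every 500 unlinks
        if [ `expr $n % 500` -eq 0 ]; then
            sleep 2
        fi
    done

Does anyone know whether pacing the unlinks like this actually
helps, or is the assertion failure independent of load?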
Also, is there some way to re-initialize a Netapp server to the
state it was in when shipped? Doing a separate unlink on each
file when you have several million files can be very
time-consuming, even if the server doesn't crash ;-)
Regards,
/Ragnar, Algonet