We have extensively tested many scenarios of filer/server failures in an Oracle environment. It is actually pretty hard to lose any data, and we have yet to find a way to corrupt data if the steps outlined in the Technical Whitepapers are followed. In particular, you MUST use hard NFS mounts with Oracle. The situation described below will only be a problem if soft NFS mounts are used.
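As a rough illustration of checking for the soft-mount case, here is a minimal Python sketch that scans a mount table for NFS filesystems mounted with the `soft` option. It assumes a Linux-style `/proc/mounts` layout (device, mount point, type, comma-separated options); the function name `find_soft_nfs_mounts`, the host `filer1`, and the paths in the sample are made up for the example and do not come from the whitepapers.

```python
from typing import List


def find_soft_nfs_mounts(mount_table: str) -> List[str]:
    """Return mount points of NFS filesystems mounted with the 'soft' option.

    Assumes each line looks like: device mount_point fs_type options ...
    (the layout of /proc/mounts on Linux). Other systems may differ.
    """
    soft_mounts = []
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if not fs_type.startswith("nfs"):
            continue  # only NFS mounts are of interest here
        # Options are comma-separated; look for an exact 'soft' entry.
        if "soft" in options.split(","):
            soft_mounts.append(mount_point)
    return soft_mounts


if __name__ == "__main__":
    # Hypothetical sample; on a Linux host you could read /proc/mounts instead.
    sample = (
        "filer1:/vol/oradata /u02/oradata nfs rw,hard,intr 0 0\n"
        "filer1:/vol/scratch /scratch nfs rw,soft,timeo=30 0 0\n"
        "/dev/sda1 / ext4 rw,relatime 0 0\n"
    )
    for mount_point in find_soft_nfs_mounts(sample):
        print("soft NFS mount, not safe for Oracle datafiles:", mount_point)
```

Any mount point this reports would need to be remounted hard before Oracle datafiles are placed on it.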
Bruce Clarke
Network Appliance, Inc.
-----Original Message-----
From: Brian Tao [mailto:taob@risc.org]
Sent: Wednesday, May 24, 2000 9:43 PM
To: foo
Cc: toasters@mathworks.com
Subject: Re: Oracle and NTAP.
On Wed, 24 May 2000, foo wrote:
Specifically, they mentioned a situation they had experienced while testing a NetApp/Oracle environment which would cause total corruption and require restoration from backup. The specific situation is a case where the filer loses power but the DB server doesn't. Their claim is that if this occurs there is no method to recover the DB other than from backup.
Can they publish the exact steps they used to produce this behaviour? We have various flavours of Oracle 7 and 8 (not 8i) with tables stored over NFS on a clustered pair of F740s. We ran a simple test that repeatedly inserted 100 rows, deleted 99 rows, inserted 100 rows, deleted 99 rows, ad infinitum. In the middle of that, we yanked the power from one of the F740s. This triggered a failover to the other F740. During the failover, NFS service was suspended, and we could see transactions from sqlplus hang for about 90 seconds. The NVRAM logs were replayed during the takeover, NFS service was restored, and the test script continued running. All records that we expected to see in the database were accounted for.
Granted, this is a very simple test, so I'd like to know how your DBAs were able to produce the behaviour they claimed to have seen.
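For anyone wanting to repeat the insert/delete test described above, here is a minimal Python sketch of the loop. It is not the script we ran: it uses the standard-library `sqlite3` module as a stand-in for Oracle so it runs anywhere, and the table name `churn`, the function `run_churn_test`, and the choice to delete the 99 oldest rows of each batch are assumptions made for illustration. The useful property is that each cycle leaves a net gain of one row, so the expected row count after N committed cycles is N.

```python
import sqlite3


def run_churn_test(conn: sqlite3.Connection, iterations: int, batch: int = 100) -> int:
    """Insert `batch` rows, delete all but one of them, commit, and repeat.

    Returns the final row count, which should equal `iterations` when the
    table starts out empty. sqlite3 is only a stand-in for the real database.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS churn ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "iteration INTEGER NOT NULL)"
    )
    for i in range(iterations):
        # Insert one batch of rows tagged with the current iteration number.
        cur.executemany(
            "INSERT INTO churn (iteration) VALUES (?)", [(i,)] * batch
        )
        # Delete every row of this batch except the one with the highest id.
        cur.execute(
            "DELETE FROM churn WHERE iteration = ? "
            "AND id <> (SELECT MAX(id) FROM churn WHERE iteration = ?)",
            (i, i),
        )
        conn.commit()  # one transaction per insert/delete cycle
    cur.execute("SELECT COUNT(*) FROM churn")
    return cur.fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    cycles = 10
    rows = run_churn_test(connection, cycles)
    print("cycles committed:", cycles, "rows found:", rows)
```

Against a real database the check after a filer failover is the same: compare the number of cycles the script reports as committed with the number of rows actually present.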