Hi Steve,

Where no updates are required, people have been doing what you describe with NetApp for over half a decade, simply by running queries against the previous night's snapshots.

Where updates prior to querying are required (as you describe), the answer differs depending on whether you are using file or block I/O.

Block I/O (iSCSI and FCP) has had read/write snapshot support for some time, by virtue of block-level virtualisation and indirection in a container file on top of WAFL.

File I/O (NFS, CIFS) gains read/write snapshot support in Data ONTAP 7G. This is basically the same, at least in effect, as what you describe. It's achieved using block-level virtualisation and indirection underneath WAFL.
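As a sketch, creating a writable clone of a database volume from a snapshot takes seconds with FlexClone. The volume and snapshot names below are examples only, and exact syntax depends on your Data ONTAP release (check the vol man page):

```shell
# Snapshot the production volume, then create a writable clone backed
# by that snapshot. No data is copied; blocks are shared until changed.
filer> snap create dbvol nightly
filer> vol clone create dbclone -b dbvol nightly
```

If the clone later needs to become independent of its parent, it can be split off with `vol clone split start`.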

Your DBAs will argue against using snapshots on the basis of disk contention, and in extreme environments this is something to be careful of. Generally, though, the read cache will take care of it and they will not notice a significant hit. Where contention does become an issue, clones are more appropriate. NetApp do a very good job of hiding contention compared to most storage solutions: temporal locality in writes reduces seek overhead, which makes quite a difference to smoothing out read-cache loading. (I'm going a little too deep here; suffice it to say it works well enough that sharing snapshots for reporting on NetApp doesn't cause anywhere near the hit to production that it does on other vendors' arrays.)

Whether you should be using file or block I/O depends on the nature of your database. In general, smaller transactions perform better with file I/O, while larger records perform better with block I/O (the opposite of traditional thinking). This comes down to the metadata and CPU overhead versus the actual payload: for large I/Os (e.g. binary large objects) these overheads are proportionately small and block I/O streams better. For small I/Os the opposite is true and NFS is a good solution; plus, with a hard mount it is very robust (a cable disconnected for extended periods won't crash the database).
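For illustration, a hard-mounted Oracle datafile volume on Solaris might be mounted along these lines. Hostnames, paths and option values here are examples only; follow current NetApp and Oracle guidance for your platform:

```shell
# "hard" retries indefinitely rather than returning an error to Oracle,
# so a transient network outage stalls I/O instead of corrupting the
# database; "intr" lets an administrator interrupt a hung operation.
mount -F nfs -o rw,hard,intr,proto=tcp,vers=3,rsize=32768,wsize=32768 \
    filer1:/vol/oradata /u02/oradata
```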

It sounds like your DBAs are fairly traditional and will resist a modern analysis and modern recommendations; they will still want to feel that they are managing the I/O characteristics of the disk solution. This is where EMC appeals to many old-fashioned DBAs, as it allows them to retain a hands-on approach to disk configuration. They may have a hard time swallowing that NetApp's virtualise-everything approach can actually lead to better performance in some circumstances, and that even where it doesn't, the management benefits more than make up for any small difference in I/O throughput. They will have an even harder time swallowing NFS mounts, even when you tell them that Oracle themselves mount many of their in-house databases that way.

DBAs in general need to start leaving disk management to storage people and concentrating on their applications, though they tend to get their backs up when this is suggested. What they need to realise is that they aren't dealing with host-managed JBOD anymore: modern storage subsystems take care of I/O contention without the traditional manual disk tuning. The value a true DBA brings is in writing SQL, not partitioning hard disks.

This is where Oracle Consulting Services (OCS) can step in and help - the argument carries more weight when it comes straight "from the horse's mouth".

Hope this helps,

Alan McLachlan
___________________________________
Solution Architect - Data Centre Solutions
Dimension Data Australia
Alan.McLachlan@didata.com.au
Tel. +61 (0)2 61225123
Fax +61 (0)2 62486346
Mobile 0428 655644






Steve Losen <scl@sasha.acc.virginia.edu>
Sent by: owner-toasters@mathworks.com

05/02/2005 12:47 AM

       
        To:        toasters@mathworks.com
        cc:        
        Subject:        question about copying an Oracle instance




We currently have our Oracle stuff running on Solaris boxes with
(essentially) locally attached raid.  We also have a clustered pair
of Netapp filers that we use for home directories and other NAS storage.
And our email system is running on Solaris with locally attached raid.

We are looking into replacing our aging filers and locally attached
raid disks with some kind of SAN/NAS solution.  I'm the filer admin,
so naturally I would like to see us buy Netapp.  But I think the Oracle
folks are going to push for EMC.

Our Oracle folks have this situation where they need to make a copy
of a 500G production Oracle instance every night and ship it to a
data warehouse server.  Then they do some automatic updates on the
warehouse instance and make it available for querying.

I know that both Netapp and EMC have efficient ways of handling
this where you only need to copy part of the data.  For example, on
a Netapp we could use snapmirror and EMC claims to have similar
functionality.

However, the Oracle folks want to know what a worst case scenario would
be, where you simply must copy all the data rather than just snapmirror
the changes.  It looks like EMC would probably be able to handle
this sort of bulk copy faster than a comparable filer.

But I was thinking that Netapp might have a better way to accomplish this.
I don't know if this is supported or not, but theoretically WAFL should
be able to handle it.

Instead of actually copying the Oracle instance data (which would be
either a 500G LUN or a 500G NFS file) you create a virtual copy.  To do
this, you would allocate a new inode for the copy, but link in the
same data blocks as the original inode.  This would be MUCH faster
than copying the data, practically instantaneous.  Of course the two
copies would be in the same volume.

Because WAFL always makes updates on fresh blocks, never changing
existing data blocks, it would be possible to run the production
instance on the original file and the warehouse instance on the
virtual copy.  Updates to one file would have no effect on the
other.  Once a day you could remove the warehouse instance and start
over with a new "virtual copy".

Maybe Netapp already supports something like this.  But if not, they
should think about it.  I think it would be very handy in the database
world where people need to run tests on production data, but are loath
to test on their production instance.

I can see one problem right off the bat.  Linking the same data block
into more than one file is usually considered a filesystem no-no.  So
WAFL would need to be modified to support this, which might require
some tricky bookkeeping.  Perhaps each data block would need a file
reference count, to indicate how many files it belongs to.
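The copy-on-write "virtual copy" with per-block reference counts described above can be sketched in a few lines. This is illustrative only (WAFL's real bookkeeping is far more involved), but it shows why the clone is instantaneous and why a write to one file never disturbs the other:

```python
# Minimal sketch of a block store supporting instant "virtual copies":
# files are lists of block ids, blocks carry a reference count, and a
# write always allocates a fresh block instead of overwriting in place.

class CowStore:
    def __init__(self):
        self.blocks = {}      # block id -> data
        self.refcount = {}    # block id -> number of files using it
        self.next_id = 0

    def _alloc(self, data):
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = data
        self.refcount[bid] = 1
        return bid

    def create(self, data_blocks):
        """Create a file as a list of freshly allocated blocks."""
        return [self._alloc(d) for d in data_blocks]

    def clone(self, file):
        """Instant virtual copy: share the same blocks, bump refcounts."""
        for bid in file:
            self.refcount[bid] += 1
        return list(file)

    def write(self, file, index, data):
        """Copy-on-write: never modify a (possibly shared) block in place."""
        old = file[index]
        self.refcount[old] -= 1
        if self.refcount[old] == 0:
            del self.blocks[old]       # last reference gone, block freed
        file[index] = self._alloc(data)

    def read(self, file):
        return [self.blocks[bid] for bid in file]


store = CowStore()
prod = store.create([b"A", b"B", b"C"])   # "production" file
warehouse = store.clone(prod)             # instantaneous virtual copy
store.write(warehouse, 1, b"B2")          # update the clone only
# production still reads A, B, C; only one new block was allocated
```

Note that the refcount also answers the "filesystem no-no" concern: a shared block is only freed when the last file referencing it drops it.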

If Netapp already supports something like this, I need to tell our
folks who like EMC.

If the powers that be choose EMC, how difficult will the EMC NAS product
make my life?  I realize this is a Netapp-centric list, so opinions will
be biased.

Steve Losen   scl@virginia.edu    phone: 434-924-0640

University of Virginia               ITC Unix Support





