We currently run our Oracle instances on Solaris boxes with (essentially) locally attached RAID. We also have a clustered pair of Netapp filers that we use for home directories and other NAS storage. And our email system is running on Solaris with locally attached RAID.
We are looking into replacing our aging filers and locally attached RAID disks with some kind of SAN/NAS solution. I'm the filer admin, so naturally I would like to see us buy Netapp. But I think the Oracle folks are going to push for EMC.
Our Oracle folks have this situation where they need to make a copy of a 500G production Oracle instance every night and ship it to a data warehouse server. Then they do some automatic updates on the warehouse instance and make it available for querying.
I know that both Netapp and EMC have efficient ways of handling this where you only need to copy the part of the data that has changed. For example, on a Netapp we could use SnapMirror, and EMC claims to have similar functionality.
However, the Oracle folks want to know what a worst-case scenario would be, where you simply must copy all the data rather than just SnapMirror the changes. It looks like EMC would probably be able to handle this sort of bulk copy faster than a comparable filer.
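To put rough numbers on that worst case, here is a minimal Python sketch of the arithmetic. The throughput figures are purely hypothetical placeholders, not measurements from any Netapp or EMC product, and the function name full_copy_hours is my own.

```python
def full_copy_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy size_gb gigabytes at a sustained transfer rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = size_gb * 1024  # treat 1 GB as 1024 MB
    return size_mb / throughput_mb_per_s / 3600


if __name__ == "__main__":
    # Hypothetical sustained rates in MB/s; substitute measured values.
    for rate in (50, 100, 200):
        hours = full_copy_hours(500, rate)
        print(f"{rate:4d} MB/s -> {hours:.2f} hours for a 500G instance")
```

Whatever the real rates turn out to be, the full copy scales linearly with the size of the instance, which is what makes the nightly window the thing to worry about.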
But I was thinking that Netapp might have a better way to accomplish this. I don't know if this is supported or not, but theoretically WAFL should be able to handle it.
Instead of actually copying the Oracle instance data (which would be either a 500G LUN or a 500G NFS file) you create a virtual copy. To do this, you would allocate a new inode for the copy, but link in the same data blocks as the original inode. This would be MUCH faster than copying the data, practically instantaneous. Of course the two copies would be in the same volume.
Because WAFL always makes updates on fresh blocks, never changing existing data blocks, it would be possible to run the production instance on the original file and the warehouse instance on the virtual copy. Updates to one file would have no effect on the other. Once a day you could remove the warehouse instance and start over with a new "virtual copy".
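The idea can be illustrated with a toy model in Python. This is only a sketch of the concept, not how WAFL is actually implemented; the Volume class and its method names are hypothetical, and a "file" here is just a list of block numbers standing in for an inode.

```python
class Volume:
    """Toy volume in which a written block is never modified in place."""

    def __init__(self):
        self.blocks = []   # block number -> data
        self.files = {}    # file name -> list of block numbers (the "inode")

    def _alloc(self, data):
        # Every write lands on a fresh block at the end of the store.
        self.blocks.append(data)
        return len(self.blocks) - 1

    def create(self, name, chunks):
        self.files[name] = [self._alloc(c) for c in chunks]

    def virtual_copy(self, src, dst):
        # New inode, same data blocks: only the pointer list is copied.
        self.files[dst] = list(self.files[src])

    def write(self, name, index, data):
        # Put the new data on a fresh block and repoint only this file.
        self.files[name][index] = self._alloc(data)

    def read(self, name):
        return [self.blocks[b] for b in self.files[name]]


vol = Volume()
vol.create("prod", [b"a", b"b", b"c"])
vol.virtual_copy("prod", "warehouse")   # no data blocks are copied
vol.write("warehouse", 1, b"X")         # the update goes to a new block
print(vol.read("prod"))
print(vol.read("warehouse"))
```

The cost of the copy depends on the number of block pointers, not the amount of data, and an update to one file never touches a block the other file still points to.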
Maybe Netapp already supports something like this. But if not, they should think about it. I think it would be very handy in the database world where people need to run tests on production data, but are loath to test on their production instance.
I can see one problem right off the bat. Linking the same data block into more than one file is usually considered a filesystem no-no. So WAFL would need to be modified to support this, which might require some tricky bookkeeping. Perhaps each data block would need a file reference count, to indicate how many files it belongs to.
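Here is a rough Python sketch of what that reference-count bookkeeping might look like. It is an assumption about one possible design, not a description of WAFL internals; the RefCountedVolume class and its methods are hypothetical names.

```python
from collections import Counter


class RefCountedVolume:
    """Toy volume that tracks how many files reference each data block."""

    def __init__(self):
        self.blocks = {}        # block number -> data
        self.refs = Counter()   # block number -> number of files using it
        self.files = {}         # file name -> list of block numbers
        self._next = 0

    def _alloc(self, data):
        n = self._next
        self._next += 1
        self.blocks[n] = data
        return n

    def _unref(self, n):
        # Drop one reference; free the block when nobody uses it any more.
        self.refs[n] -= 1
        if self.refs[n] == 0:
            del self.refs[n]
            del self.blocks[n]

    def create(self, name, chunks):
        nums = [self._alloc(c) for c in chunks]
        for n in nums:
            self.refs[n] += 1
        self.files[name] = nums

    def virtual_copy(self, src, dst):
        nums = list(self.files[src])
        for n in nums:
            self.refs[n] += 1   # each shared block gains a reference
        self.files[dst] = nums

    def write(self, name, index, data):
        old = self.files[name][index]
        new = self._alloc(data)
        self.refs[new] += 1
        self.files[name][index] = new
        self._unref(old)        # the old block survives if still shared

    def remove(self, name):
        for n in self.files.pop(name):
            self._unref(n)


vol = RefCountedVolume()
vol.create("prod", [b"a", b"b", b"c"])
vol.virtual_copy("prod", "warehouse")
vol.write("warehouse", 0, b"X")
vol.remove("warehouse")         # frees only blocks that prod does not use
print(sorted(vol.blocks))
```

With a count per block, removing the daily warehouse copy frees just the blocks that were written for it, and the production file's blocks stay put.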
If Netapp already supports something like this, I need to tell our folks who like EMC.
If the powers that be choose EMC, how difficult will the EMC NAS product make my life? I realize this is a Netapp-centric list, so opinions will be biased.
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support