On 08/10/99 18:25:05 you wrote:
On Tue, 10 Aug 1999 sirbruce@ix.netcom.com wrote:
Given that NetApp's NVRAM allocates writes and then dumps them to disk periodically in stripes, this type of optimization would probably work well with them. Perhaps they do it already. :)
I'm sure they do it. Does NetApp's NVRAM allocate writes? I thought NetApp held raw NFS packets in the NVRAM.
Yes. I was sorta speaking in shorthand. My understanding is that the raw NFS packets get stored both in NVRAM and main memory; the point is once the NVRAM is half-full, or at 10 seconds, outstanding write data is collected and sent to disk, and the NVRAM flushed. There's some amount of allocation involved here and I think some parity stuff that actually gets stored in NVRAM as well; I didn't mean to imply, however, that actual SCSI commands or disk blocks are being stored there. My knowledge of exactly what happens in detail ends here, so I'll let someone else explain more thoroughly if you're really curious.
Bruce
Yes. I was sorta speaking in shorthand. My understanding is that the raw NFS packets get stored both in NVRAM and main memory;
The paper "File System Design for an NFS File Server Appliance" at
http://www.netapp.com/technology/level3/3002.html#I35
does say
WAFL uses non-volatile RAM (NVRAM) to keep a log of NFS requests it has processed since the last consistency point.
but that's a bit of a simplification; what's actually logged are the arguments to the WAFL operation being performed on behalf of the NFS - or CIFS, or HTTP - request, rather than the raw contents of the request.
For write operations, the bulk of that is the data being written; that is, as you note, copied both to NVRAM (for the log entry) and to main memory (the in-memory file system buffer for the changed block(s) of the file system).
the point is once the NVRAM is half-full, or at 10 seconds, outstanding write data is collected and sent to disk, and the NVRAM flushed.
The paper also discusses that:
WAFL actually divides the NVRAM into two separate logs. When one log gets full, WAFL switches to the other log and starts writing a consistency point to store the changes from the first log safely on disk. WAFL schedules a consistency point every 10 seconds, even if the log is not full, to prevent the on-disk image of the file system from getting too far out of date.
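For anyone who finds code easier to follow than prose, here is a rough toy model of the two-log scheme the paper describes. It is only a sketch, not WAFL code: the class name, method names, and the `on_disk` list are all made up for illustration, the log capacity is counted in entries rather than bytes, and the consistency point runs synchronously here, whereas the paper describes WAFL starting the consistency point and carrying on logging to the other half.

```python
import time


class TwoHalfLog:
    """Toy model of an NVRAM split into two logs (all names hypothetical)."""

    def __init__(self, half_capacity, cp_interval=10.0, clock=time.monotonic):
        self.half_capacity = half_capacity  # entries per log (assumed unit)
        self.cp_interval = cp_interval      # seconds between forced CPs
        self.clock = clock
        self.logs = [[], []]                # the two separate logs
        self.active = 0                     # index of the log taking entries
        self.last_cp = clock()
        self.on_disk = []                   # stands in for the on-disk image

    def log_operation(self, op):
        """Record the arguments of one file system operation."""
        if len(self.logs[self.active]) >= self.half_capacity:
            # Active log is full: switch logs and write a consistency point.
            self.consistency_point()
        self.logs[self.active].append(op)

    def tick(self):
        """Call periodically; forces a consistency point when the timer expires."""
        if self.clock() - self.last_cp >= self.cp_interval:
            self.consistency_point()

    def consistency_point(self):
        """Switch to the other log, then make the old log's changes durable."""
        full = self.active
        self.active = 1 - full                   # new entries go to the other log
        self.on_disk.extend(self.logs[full])     # changes are now safely on disk
        self.logs[full].clear()                  # so that log can be reused
        self.last_cp = self.clock()
```

Passing a fake `clock` makes the 10-second rule easy to exercise without actually waiting; with the default `time.monotonic`, something would need to call `tick()` on a regular basis.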
There's some amount of allocation involved here
Yes, the final resting place on the disk of modified file system blocks is chosen as part of the consistency point that writes those blocks out.
and I think some parity stuff that actually gets stored in NVRAM as well;
Yes, we keep track of which blocks in which stripes are being updated, so that if the system is unintentionally rebooted (crashes, loses power, etc.) in the middle of updating a stripe, it can fix up the parity on the stripe when it comes back up.
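To illustrate the idea of tracking in-flight stripe updates, here is a small sketch. It makes several assumptions that are not stated above: parity is a plain XOR across the data blocks of a stripe, the record kept in NVRAM is modelled as a Python set of stripe numbers (the real record also notes which blocks within the stripe), and `StripeArray`, `write_block`, and `recover` are invented names. The `crash_before_parity` flag just simulates losing power between the data write and the parity write.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class StripeArray:
    """Toy disk array with one XOR parity block per stripe (hypothetical)."""

    def __init__(self, stripes):
        self.data = [list(s) for s in stripes]
        self.parity = [xor_blocks(s) for s in self.data]
        self.dirty_stripes = set()  # stands in for the record kept in NVRAM

    def write_block(self, stripe, index, block, crash_before_parity=False):
        # Note the stripe as "being updated" before touching the disks.
        self.dirty_stripes.add(stripe)
        self.data[stripe][index] = block
        if crash_before_parity:
            return  # simulated crash: data changed, parity is now stale
        self.parity[stripe] = xor_blocks(self.data[stripe])
        self.dirty_stripes.discard(stripe)

    def recover(self):
        """After a reboot, recompute parity only for stripes caught mid-update."""
        for stripe in sorted(self.dirty_stripes):
            self.parity[stripe] = xor_blocks(self.data[stripe])
        self.dirty_stripes.clear()
```

The point of the record is that recovery only has to touch the stripes that were in flight, rather than rescanning every stripe on the array to check its parity.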