What do people do to track NFS activity on their filers? I'm basically looking for a real-time audit log on the filer showing NFS calls, file sizes, pathname info, uid, etc. I often run into situations where I can see a filer filling up very quickly (or the snapshot reserve doing the same thing), but have no way of figuring out who or what is doing it. I could sniff the network for NFS packets and decode them, but it seems the best place to look for this information is on the filer itself, which can probably make better sense of NFS streams than I can by picking off individual packets.
On a related note, are there any utilities available for more granular reporting on snapshot usage? Something like being able to tell which file, directory, or uid is occupying what percentage of the snapshot reserve? This wasn't possible a few years ago, but I figure the smart NetApp engineers would have thought of something since then. ;-)
You can use nfsstat like this:
options nfs.per_client_stats.enable on
nfsstat -z
The first command causes the filer to accumulate per-client NFS statistics, and nfsstat -z zeroes all counters. Now let the filer run for a little while, then run
nfsstat -l
This will give you a summary of total nfs ops by each client since the nfsstat -z. This will allow you to determine which clients are "hot".
To see detailed counts for a particular client, use
nfsstat -h hostname
This will show if the client is doing any writes. You still won't know which files are being written, though.
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support
On Mon, 9 Jun 2003, Steve Losen wrote:
To see detailed counts for a particular client, use
nfsstat -h hostname
This will show if the client is doing any writes. You still won't know which files are being written, though.
Yeah, that's what I'm doing right now... per-client NFS stats to see which one is obviously generating a lot of requests, then investigating the network traffic and processes on that client. This is pretty hit-and-miss though, especially on filers that service dozens or hundreds of clients. It's also difficult to detect unlinks of large files that could cause snapshots to blow up (since the actual number of NFS ops can be extremely low).
I'd love to see an audit log that might look like this:
2003:06:09:23:12:45.413 192.168.100.4 e3a getattr /vol/vol0/home/taob/.bashrc 32
2003:06:09:23:12:45.445 192.168.100.4 e3a read /vol/vol0/home/taob/.bashrc 4104
... etc. Timestamp, client IP, filer interface on which the request was received, NFS call, argument(s), size of request. Obviously, with filers hitting 30000+ ops/sec, there needs to be some sort of filtering mechanism to only log certain clients or certain interfaces or certain syscalls.
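To make the filtering idea concrete, here is a minimal Python sketch that parses lines in the wished-for format above and keeps only selected clients, interfaces, or calls. The log format itself is only the proposal from this message, not something the filer produces, and the names AuditRecord, parse_line, and filter_records are made up for illustration. It also assumes pathnames contain no whitespace.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass
class AuditRecord:
    timestamp: str   # kept as the raw string, e.g. 2003:06:09:23:12:45.413
    client: str      # client IP address
    interface: str   # filer interface that received the request
    call: str        # NFS call name, e.g. getattr, read
    path: str        # argument (pathname); assumed to contain no spaces
    size: int        # size of the request


def parse_line(line: str) -> Optional[AuditRecord]:
    """Parse one line of the hypothetical audit log; return None if malformed."""
    parts = line.split()
    if len(parts) != 6:
        return None
    ts, client, iface, call, path, size = parts
    try:
        return AuditRecord(ts, client, iface, call, path, int(size))
    except ValueError:
        return None


def filter_records(
    lines: Iterable[str],
    clients: Optional[Set[str]] = None,
    interfaces: Optional[Set[str]] = None,
    calls: Optional[Set[str]] = None,
) -> Iterator[AuditRecord]:
    """Yield only records matching every filter that was supplied."""
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        if clients is not None and rec.client not in clients:
            continue
        if interfaces is not None and rec.interface not in interfaces:
            continue
        if calls is not None and rec.call not in calls:
            continue
        yield rec


SAMPLE = [
    "2003:06:09:23:12:45.413 192.168.100.4 e3a getattr /vol/vol0/home/taob/.bashrc 32",
    "2003:06:09:23:12:45.445 192.168.100.4 e3a read /vol/vol0/home/taob/.bashrc 4104",
]

if __name__ == "__main__":
    for rec in filter_records(SAMPLE, calls={"read"}):
        print(rec.client, rec.call, rec.path, rec.size)
```

On a busy filer the filtering would have to happen before anything is written to the log, so this only illustrates the selection logic, not where it should run.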
We track daily per-user/per-volume usage by keeping quotas turned on (but with no limits) for each volume we want to watch. Our quotas file looks like:
* user@/vol/design - - -
* user@/vol/dept - - -
* user@/vol/users - - -
We do a "quota report" every morning at 6am, and diff it against the previous day's report to get usage details. If you need greater granularity, just run the reports more frequently.
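The day-over-day comparison can be sketched roughly as below. This is an illustration rather than the script we actually run, and it assumes each report has first been reduced to simple "user volume kbytes" lines. That three-column layout is a made-up intermediate format, not the real "quota report" output, and parse_report and usage_growth are hypothetical names.

```python
from typing import Dict, List, Tuple

# (user, volume) -> kilobytes used
Usage = Dict[Tuple[str, str], int]


def parse_report(text: str) -> Usage:
    """Parse a simplified, hypothetical report with 'user volume kbytes' per line.

    Lines that do not have exactly three fields, or whose last field is not
    a non-negative integer, are skipped.
    """
    usage: Usage = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[2].isdigit():
            continue
        user, volume, kb = fields
        usage[(user, volume)] = int(kb)
    return usage


def usage_growth(old: Usage, new: Usage) -> List[Tuple[Tuple[str, str], int]]:
    """Return (key, delta_kb) pairs for every entry whose usage changed.

    Entries missing from one report count as 0 KB there. Results are sorted
    with the largest growth first, ties broken by key for stable output.
    """
    deltas = []
    for key in set(old) | set(new):
        delta = new.get(key, 0) - old.get(key, 0)
        if delta != 0:
            deltas.append((key, delta))
    deltas.sort(key=lambda item: (-item[1], item[0]))
    return deltas


if __name__ == "__main__":
    yesterday = "taob design 1000\nscl users 500\n"
    today = "taob design 5000\nscl users 400\nnew dept 10\n"
    for (user, volume), delta in usage_growth(parse_report(yesterday), parse_report(today)):
        print(f"{user:10s} {volume:10s} {delta:+d} KB")
```

Running the reports more often just means keeping more snapshots and comparing adjacent pairs the same way.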
On Sunday, June 8, 2003, at 11:56 PM, Brian Tao wrote:
What do people do to track NFS activity on their filers?
On Mon, 9 Jun 2003, Andrew Siegel wrote:
We track daily per-user/per-volume usage by keeping quotas turned on (but with no limits) for each volume we want to watch. Our quotas file looks like:
* user@/vol/design - - -
* user@/vol/dept - - -
* user@/vol/users - - -
We do a "quota report" every morning at 6am, and diff it against the previous day's report to get usage details. If you need greater granularity, just run the reports more frequently.
Yep, having default quotas everywhere should definitely be a documented "best practice" if it isn't already. I've got that here ("unlimited" user quotas for tracking purposes), but it doesn't help locate files/directories with the most "activity" in them. There seem to be a lot of tools to tell you "how much" (sysstat, nfsstat, filestats, quota report, etc.), but not "where" to any degree of granularity finer than a volume or qtree.
Can DataFabricManager produce per-directory activity reports?