It appeared to be caused by the application doing readdirs (along with other operations: read/write/getattr) on a specific directory, which at that time held about 70,000 files. We fixed the problem by disabling the readdirs within the application and also reducing the number of files in that directory down to about 45,000.
The problem is likely two-fold -- large (erm, huge) directories are going to be painful for the filer in the first place, since a 'simple' request explodes out into such a large answer. Secondly, I don't know if the same holds true for Linux, but Solaris will bypass the DNLC for directories over a given size -- so what's already a nasty request gets amplified because the client isn't caching it.
We don't know exactly which fix (stopping the readdirs or removing the files) did the trick, but after that the NetApp's CPU dropped back to normal, the webservers were happy, and the site was responsive again.
I believe it's READDIR+ that you disabled (disabling READDIR altogether would be ... interesting). That would definitely help in this situation, as READDIR+ effectively does a GETATTR for every file at the same time, so if the attribute data isn't being used, that's that much extra work for nothing.
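As a rough local analogy only (a sketch, not what the NFS client actually does -- the function names here are made up), the difference is between asking for names and asking for names plus an attribute lookup on every entry:

    import os

    def names_only(path):
        # Roughly what plain READDIR asks for: just the directory
        # entries (names), nothing per-file beyond that.
        return os.listdir(path)

    def names_with_attrs(path):
        # Roughly what READDIR+ asks for: every entry also gets its
        # attributes fetched, i.e. about one GETATTR per file -- some
        # 70,000 extra lookups for a directory like the one above if
        # nothing ever uses the results.
        return {name: os.stat(os.path.join(path, name))
                for name in os.listdir(path)}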
Avoid large directories like this -- they will always be a problem on any platform. Even doing something as ugly as using Apache's mod_rewrite to hash files out across multiple directory levels will boost performance in the end.
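A minimal sketch of that hashing idea (the paths and the two-level hex layout here are just assumptions to illustrate it):

    import hashlib
    import os

    def hashed_path(root, filename, levels=2):
        # Hash the filename and peel off two hex characters per level,
        # e.g. "page42.html" -> root/3f/a1/page42.html, so entries spread
        # out instead of piling up in one enormous directory.
        digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
        parts = [digest[i * 2:(i + 1) * 2] for i in range(levels)]
        return os.path.join(root, *parts, filename)

    print(hashed_path("/web/docs", "page42.html"))

Whatever writes the files and whatever rewrites the incoming URLs (mod_rewrite or otherwise) just have to agree on the same hash.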
You should also take a look at your config to see if you can identify what was constantly doing the lookups (e.g. Apache's mod_speling will do this). Even if it's not destroying things the way it was here, it will still impact performance, so you'll probably want to address it anyway.