New subject: Solaris/Oracle/filer tuning question

22 Aug 2001


Chris,
I normally recommend setting max_threads to at least 24.
Have you got gigabit patch 106764?  I think this affects reads across the board, but it's worth checking.
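
For reference, a minimal sketch of how that value might be set, assuming "max_threads" here means the nfs:nfs3_max_threads tunable named in the original question. Solaris kernel tunables of this kind are normally set in /etc/system and take effect at the next reboot. The value 24 is simply the minimum suggested above, not a tested optimum for this configuration.

```
* /etc/system fragment (lines starting with '*' are comments)
* Assumption: the NFSv3 client async thread pool is the parameter meant above.
set nfs:nfs3_max_threads=24
```
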
Regards, 
Andrew Bond 				Network Appliance UK 
Consulting Systems Engineer 		Waterview House, 1 Roundwood Avenue, 
tel +44 (0)20-8756-6722 		Stockley Park, 
mobile +44 (0)7801-383566 		Middlesex, UB11 1EJ 
fax +44 (0)20-8756-6701 		http://www.netapp.com 
Get answers NOW! - NetApp On the Web - http://now.netapp.com
...
-----Original Message-----
From: Chris Lamb [mailto:skeezics@measurecast.com]
Sent: 21 August 2001 18:08
To: toasters@mathworks.com
Subject: Solaris/Oracle/filer tuning question
Howdy, all.
This is more a Solaris question, but since there's a filer involved and
y'all are so helpful, I'll start with you guys. :-)
The setup:
   Sun E4500, Solaris 7 HW11/99+patches, GigE
   NetApp F760, 5.3.6R2, GigE
   Oracle 8.1.6.x
   Two volumes for Oracle, NFSv3/UDP, 32K rsize/wsize
The problem(s):
   I/O from any single Oracle process tops out around 3-4MB/sec;
   Even with 'disk_asynch_io = false' set in init.ora, we
   are getting sporadic and seemingly nonsensical "resource
   temporarily unavailable" errors (had a case open for this one).
The question:
   Are any of the Solaris kernel NFS tunables useful in increasing
   that apparent limit?  Specifically:
   	nfs:nfs3_max_threads
   	nfs:nfs3_async_clusters
   	nfs:nfs3_nra
The DBA and our NetApp reps/support team have gone over our config, and
things are generally stable.  After turning off asynch_io we noticed a
slight performance hit, but thought that the annoying "resource
temporarily unavailable" errors had been solved.  We had another one
yesterday, first time in almost two months.  :-/
In diagnosing the performance sluggishness, we noticed that for a
non-partitioned full table scan a single Oracle "reader" process was
limited to about 2.5-4MB/sec, while a full scan of a partitioned table
(using 6 readers) was getting 18-22MB/sec.  (Obvious workaround:
increase the number of readers and writers, and partition the table with
the bottleneck...)
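
The figures above can be checked with a little arithmetic. This is only a sketch using the numbers quoted in this message (the variable names are illustrative): dividing the six-reader total by the reader count gives a per-reader rate that sits inside the single-reader range, which suggests the limit is per process rather than per mount or per link.

```python
# Figures quoted in the message, in MB/sec.
single_reader = (2.5, 4.0)   # one reader, non-partitioned full table scan
six_readers = (18.0, 22.0)   # six readers, partitioned full table scan
readers = 6

# Per-reader rate when six readers run in parallel.
per_reader = tuple(total / readers for total in six_readers)
print("per-reader rate with %d readers: %.2f-%.2f MB/sec"
      % (readers, per_reader[0], per_reader[1]))

# True if the per-reader range lies inside the single-reader range,
# i.e. throughput scaled roughly linearly with the number of readers.
scales_linearly = (single_reader[0] <= per_reader[0]
                   and per_reader[1] <= single_reader[1])
print("consistent with a per-process limit:", scales_linearly)
```
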
In each case, Solaris iostat was claiming that the NFS mount was "100%
busy", although response times were generally in the sub-6ms range and the
filer's sysstat showed less than 20% cpu usage (this is a production box,
so I can't completely isolate the load for our testing).  For _reads_ I
just can't figure out why Solaris - or Oracle itself - is throttling
performance like this... I've clocked > 50MB/sec reads in plain old NFS
tests (bonnie, cpio, etc), and I think it's generally clear when we've
saturated the PCI bus on the filer.  This Oracle bottleneck is a mystery,
but now I'm wondering if the two problems are related.
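
One way to read these numbers is as a latency-bound reader. The sketch below assumes (and this is not established anywhere in the message) that a single Oracle reader keeps only one 32K NFS read in flight at a time, and that "MB" means 2**20 bytes. Under that assumption throughput is just request size divided by round-trip time, so the observed 3-4MB/sec corresponds to roughly 8-10ms per request, and a 6ms round trip would cap a single reader at a little over 5MB/sec.

```python
RSIZE_BYTES = 32 * 1024   # 32K rsize from the mount options in the message
MB = 1024 * 1024          # assumption: MB means 2**20 bytes

def throughput_mb_per_sec(latency_ms):
    """Throughput of a reader with one request in flight at a time."""
    requests_per_sec = 1000.0 / latency_ms
    return requests_per_sec * RSIZE_BYTES / MB

def implied_latency_ms(mb_per_sec):
    """Per-request round-trip time implied by an observed throughput."""
    return RSIZE_BYTES / (mb_per_sec * MB) * 1000.0

print("ceiling at 6 ms per request: %.2f MB/sec" % throughput_mb_per_sec(6.0))
for rate in (3.0, 4.0):
    print("%.1f MB/sec implies %.1f ms per request"
          % (rate, implied_latency_ms(rate)))
```
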
The Solaris Tunable Parameters Reference Manual says of the
"nfs:nfs3_max_threads" knob:
   "Controls the number of kernel threads that perform asynchronous
   I/O for the NFS version 3 _client_. [emphasis mine] Since NFS is
   based on RPC and RPC is inherently synchronous, separate execution
   contexts are required to perform NFS operations that are
   asynchronous from the calling thread."
The default pool of threads is set to 8.  There's some indication that
this may be a _per filesystem_ setting, and is not global to all NFS
client activity on the machine.  Even if that's the case, then 8 threads
per mount point might still be fairly restrictive on an 8-cpu machine,
given the fat pipe and headroom still available on the filer.  I generally
shy away from mucking about with Solaris internals ("this isn't your
father's SunOS" :-) but if that's what it takes to boost performance I'm
all for it.
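
To put a rough number on "fairly restrictive", here is a back-of-the-envelope sketch. It assumes each async thread keeps exactly one 32K request in flight, that every request takes a fixed 6ms (the upper end of the response times quoted earlier), and that MB means 2**20 bytes. None of these is measured here, and the function names are made up for illustration. The result is an upper bound per thread pool, not a prediction.

```python
import math

RSIZE_BYTES = 32 * 1024   # 32K rsize from the mount options
MB = 1024 * 1024          # assumption: MB means 2**20 bytes

def pool_ceiling_mb_per_sec(threads, latency_ms):
    """Upper bound if every thread keeps one 32K request in flight."""
    return threads * (1000.0 / latency_ms) * RSIZE_BYTES / MB

def threads_needed(target_mb_per_sec, latency_ms):
    """Smallest thread count whose ceiling reaches the target rate."""
    per_thread = pool_ceiling_mb_per_sec(1, latency_ms)
    return math.ceil(target_mb_per_sec / per_thread)

for threads in (8, 24):
    print("%2d threads: about %.1f MB/sec ceiling"
          % (threads, pool_ceiling_mb_per_sec(threads, 6.0)))
print("threads needed for 50 MB/sec:", threads_needed(50.0, 6.0))
```

If the pool really is per mount point, the two Oracle volumes would each get their own ceiling under this model.
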
Has anyone running in a similar environment had to tweak any Solaris
kernel tunables?  Are there other Oracle/NetApp tuning hints not discussed
in the whitepapers on the NOW site?  My DBA and I would be most grateful
for any warnings/suggestions/hints. :-)  I'll be happy to provide more
detail off-list, then summarize results if anyone else is interested.
Thanks,
-- Chris
--
Chris Lamb, Unix Guy
MeasureCast, Inc.
503-241-1469 x247
skeezics@measurecast.com
