I apologize that this is so long but it's kind of complicated and I don't want to blow my budget on the wrong stuff.
I've got a mixed-use 3070A at 7.3.1P2D11 hosting FC-based AIX boxes (about 4k IOPS average, 90+% reads), 300 or so VMs (NFS, 3k IOPS average) and home/departmental shares for 1,000 users via CIFS and NFS (2,500 IOPS on a busy day). Oracle is starting to get a bit pokey and I've budgeted for some upgrades this coming quarter. The question is whether I should be buying spindles or PAM cards. Currently this load is spread across 231 spindles in wide stripes (minimum 12 disks per RG), so performance is not great, but it also takes a big spike to have a major negative impact.
I have good performance metrics for everything except Oracle, which for some reason is seeing poor performance even though latency and throughput look good. There is no DBA running this database, so I suspect at least part of the problem is on the Oracle side, but for now I have to assume that's not going to improve. We are already observing a very good cache hit %, so I question whether the PAM cards will give any benefit there. On the other hand, statit shows that 2/3 of the IO from disk is still reads, so I think there may be a benefit, and I know it would be great for the file shares and VMs (which are also on deduped volumes, so it's my understanding the cache from the PAM cards would be even more effective).
* How do I determine what to purchase?
* What is the "seek time" for data read from PAM?
* I'm getting to the point where the 3070 is going to work really hard in a failover situation. If PAM cards can service a lot more requests in a short period when the data is cached on SSD, will this put more load on my CPUs?
* Why did NetApp name them Pam? It's not the '60s; they should be named Shelby or Logan cards or something.
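As a rough sanity check on the workload figures above, the front-end IOPS can be added up and divided across the 231 spindles. This is only a back-of-the-envelope sketch: the workload labels are made up for illustration, and the result is an upper bound on per-disk load, because it pretends every request misses cache and ignores writes, RAID overhead, and how the load is split between the two controllers.

```python
# Front-end IOPS figures quoted in the message; the dictionary keys are
# illustrative labels, not names taken from the filer.
WORKLOAD_IOPS = {
    "AIX hosts (FC)": 4000,
    "VMs (NFS)": 3000,
    "user shares (CIFS/NFS)": 2500,
}
SPINDLES = 231


def per_spindle_upper_bound(workloads, spindles):
    """Return (total front-end IOPS, IOPS per spindle if nothing hit cache)."""
    total = sum(workloads.values())
    return total, total / spindles


if __name__ == "__main__":
    total, per_disk = per_spindle_upper_bound(WORKLOAD_IOPS, SPINDLES)
    print(f"total front-end IOPS: {total}")
    print(f"upper bound per spindle: {per_disk:.1f}")
```

The real per-disk figure should come from statit, since a good cache hit rate means far fewer requests actually reach the disks.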
Please be advised that this email may contain confidential information. If you are not the intended recipient, please do not read, copy or re-transmit this email. If you have received this email in error, please notify us by email by replying to the sender and by telephone (call us collect at +1 202-828-0850) and delete this message and any attachments. Thank you in advance for your cooperation and assistance.
In addition, Danaher and its subsidiaries disclaim that the content of this email constitutes an offer to enter into, or the acceptance of, any contract or agreement or any amendment thereto; provided that the foregoing disclaimer does not invalidate the binding effect of any digital or other electronic reproduction of a manual signature that is included in any attachment to this email.
Hi Jeremy, You may want to take a look at this webpage:
http://ctistrategy.com/2009/02/27/netapp-cache-pcs/
It talks about how to gauge some of the items you're looking at. Data ONTAP (I believe only starting in 7.3.2, though) has a new option called Predictive Cache Statistics that you can use to monitor your workload and help predict whether a PAM card would be helpful for you.
Here are some other things that might be worth looking at too.
* What does your disk utilization look like? (sysstat -x 3, or you can get a closer look with statit)
* What is your cache age? If this is really low, I think a PAM card could help.
I'll be interested in what you eventually decide, as I think we'll be looking into using a PAM card in the future.
Romeo
I read that. Unfortunately I am not at 7.3.2 yet; I do get the counters, but the format is a bit old.
My disk utilization rarely goes above 25%.
Here's what I see but since I'm not at 7.3.2 the numbers may be misleading, I don't know.
array01*> stats show -p flexscale-pcs
Instance   Blocks Usage    Hit  Miss Hit Evict Invalidate Insert
---
ec0       4194304    90  11733  3479  77  1748         57   2236
ec1       4194304    38   1203  2275  34   343        401   1748
ec2       8388608     2     95  2180   4     0         39    343
---
ec0       4194304    90   7225  3306  68   430        140    548
ec1       4194304    38    756  2550  22    82        130    430
ec2       8388608     2     62  2487   2     0         15     82
---
ec0       4194304    90   8683  4467  66   422          0    554
ec1       4194304    38   1325  3142  29    71         74    422
ec2       8388608     2    129  3012   4     0          6     71
---
ec0       4194304    90   7965  4485  63  2300        122   2940
ec1       4194304    38   1497  2988  33   415        528   2300
ec2       8388608     2    167  2821   5     0         47    415
---
ec0       4194304    90   4996  4661  51  1793         54   2346
ec1       4194304    38   1114  3547  23   316        328   1793
ec2       8388608     2    116  3431   3     0         25    316
---
ec0       4194304    90   6915  4437  60   875          0   1141
ec1       4194304    38   1725  2712  38   168        124    875
ec2       8388608     2    201  2511   7     0         10    168
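Single samples of these counters bounce around quite a bit, so it may be easier to read them as an average. The sketch below pools the Hit and Miss columns of the six samples above and reports one hit ratio per simulated cache instance. It assumes the nine-column row layout shown in the output (fourth field hits, fifth field misses); the function name and the parsing rules are illustrative and not part of any NetApp tool.

```python
from collections import defaultdict

# The six samples posted above, one row per simulated cache instance.
PCS_OUTPUT = """\
---
ec0 4194304 90 11733 3479 77 1748 57 2236
ec1 4194304 38 1203 2275 34 343 401 1748
ec2 8388608 2 95 2180 4 0 39 343
---
ec0 4194304 90 7225 3306 68 430 140 548
ec1 4194304 38 756 2550 22 82 130 430
ec2 8388608 2 62 2487 2 0 15 82
---
ec0 4194304 90 8683 4467 66 422 0 554
ec1 4194304 38 1325 3142 29 71 74 422
ec2 8388608 2 129 3012 4 0 6 71
---
ec0 4194304 90 7965 4485 63 2300 122 2940
ec1 4194304 38 1497 2988 33 415 528 2300
ec2 8388608 2 167 2821 5 0 47 415
---
ec0 4194304 90 4996 4661 51 1793 54 2346
ec1 4194304 38 1114 3547 23 316 328 1793
ec2 8388608 2 116 3431 3 0 25 316
---
ec0 4194304 90 6915 4437 60 875 0 1141
ec1 4194304 38 1725 2712 38 168 124 875
ec2 8388608 2 201 2511 7 0 10 168
"""


def pooled_hit_ratio(text):
    """Sum hits and misses per instance over all samples; return hits / (hits + misses)."""
    totals = defaultdict(lambda: [0, 0])  # instance -> [hits, misses]
    for line in text.splitlines():
        fields = line.split()
        # Skip separators, headers, and anything that is not a 9-column ecN row.
        if len(fields) != 9 or not fields[0].startswith("ec"):
            continue
        totals[fields[0]][0] += int(fields[3])
        totals[fields[0]][1] += int(fields[4])
    return {name: hits / (hits + misses) for name, (hits, misses) in totals.items()}


if __name__ == "__main__":
    for name, ratio in sorted(pooled_hit_ratio(PCS_OUTPUT).items()):
        print(f"{name}: {ratio:.1%} of lookups hit")
```

Pooling the raw counts weights busy samples more heavily than quiet ones, which seems more representative here than averaging the per-sample percentages.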
I'm by no means an expert on any of this, and this is the first time I've really looked at PCS data, but it certainly seems like at least the first level of cache (ec0) would help, with 90% usage and 11,733 hits. On the other hand, a maximum of 25% disk utilization doesn't seem very high to me. You also mentioned to me that your cache age is fairly good. So I'm not really sure. It certainly doesn't seem like a PAM card would hurt, that's for sure.
If the Oracle DB is the only thing seeing a performance issue, you might want to look at the latency on those volumes:
stats show -i 3 volume:*:avg_latency
or use Performance Advisor.
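One way to turn the first PCS sample into a sizing estimate is to accumulate the hits level by level. The sketch below assumes that ec0, ec1 and ec2 model successively larger caches, with each level seeing only what the previous one missed. The posted numbers are consistent with that reading (ec0's misses roughly equal ec1's hits plus misses, and ec0's evictions equal ec1's inserts), but it is an interpretation and not something confirmed in this thread. It also assumes 4 KiB blocks when converting the Blocks column to a size.

```python
BLOCK_BYTES = 4096  # assumption: the Blocks column counts 4 KiB blocks

# (instance, blocks, hits, misses) taken from the first sample posted in the thread.
FIRST_SAMPLE = [
    ("ec0", 4194304, 11733, 3479),
    ("ec1", 4194304, 1203, 2275),
    ("ec2", 8388608, 95, 2180),
]


def cumulative_hit_ratios(levels):
    """Return [(cumulative size in GiB, cumulative hit ratio)] for cascaded cache levels."""
    # Every lookup reaches the first level, so its hits + misses is the total.
    total_lookups = levels[0][2] + levels[0][3]
    size_bytes = 0
    hits = 0
    result = []
    for _name, blocks, level_hits, _misses in levels:
        size_bytes += blocks * BLOCK_BYTES
        hits += level_hits
        result.append((size_bytes / 2**30, hits / total_lookups))
    return result


if __name__ == "__main__":
    for size_gib, ratio in cumulative_hit_ratios(FIRST_SAMPLE):
        print(f"{size_gib:.0f} GiB of cache: {ratio:.1%} of lookups served")
```

Under these assumptions most of the gain comes from the first level, with the second adding a smaller amount and the third very little, which matches the impression that ec0 is where the benefit is.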
Romeo
I've checked the latency on both the host side (nmon) and at the disk level (statit). I am pretty sure they are OK: reads never go higher than 4 ms and writes are in the 1 ms range (on disk), and from the AIX side it's not much worse except for a few specific file systems. I want the PAM II cards for my VMs and shares (other controller) more than for Oracle, but I wanted to make sure I was not missing a glaring IO problem on the DB side before I bought something that does not make any difference for writes.
The SATA disks holding my VMs are crying out for PAM relief and, like you said, it won't (shouldn't) hurt for Oracle. I want to make sure that I am doing due diligence (as far as I can without access to the data on the Oracle side, anyway) so the folks I provide storage to are comfortable that I am making the right choice.
Thanks for all the feedback, I am learning a lot which is always good.
Hi Jeremy,
If you buy a PAM card for one of your controllers, you should always buy one for the other controller too. It will speed up your Oracle random reads in normal operation, and it will also help you avoid a performance problem if the filesystem/VM controller with the PAM card fails and your database controller has to handle all your IO.
PAM is short for Performance Acceleration Module. There are two flavours: PAM-I and PAM-II. PAM-II has a larger cache but a slower access time than PAM-I. As far as I know, PAM-II is not supported with your version of Data ONTAP, but please check this with NetApp.
You may send your flexscale-pcs statistics to NetApp (perhaps with a perfstat). They can analyze them and tell you which PAM card would best fit your IO profile, or whether you would be better off adding more spindles. Furthermore, they can help you with your Oracle performance issue.
Regarding your Oracle performance issues: it may be helpful to check whether you encounter the performance issue only in your database instances or whether the host filesystem on the LUN is slow too. For this you can use "dd" (on UNIX) to generate sequential write or read IO on your LUN. You may test with different block sizes; please be sure to include the block size of your Oracle instance in the tests. I know that nobody (especially NetApp) likes dd as a test of LUN performance, but it's easy to use and always at hand, and my experience is that if dd is slow, other performance is not much better.
If your performance without Oracle is acceptable (which is what I expect), then you can look at the Oracle SGA. Increasing it may help to cache more data on your host and decrease IO. You should analyze your select statements, too. If you see a lot of full table scans, you can use indexes to avoid them. But for this you would best have an Oracle DBA at hand.
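Where dd is awkward to script, the same kind of sequential test can be roughed out in Python. This is a minimal sketch and not a benchmark tool: the function name, the 16 MiB test size, and the list of block sizes are assumptions chosen for illustration. In particular, the 8 KiB entry is only a placeholder for the Oracle block size, which should be replaced with the instance's real db_block_size, and the demo writes to a temporary directory instead of a path on the LUN.

```python
import os
import tempfile
import time


def sequential_io_mb_per_s(path, block_size, total_bytes):
    """Write, then read back, total_bytes at path in block_size chunks.

    Returns (write MB/s, read MB/s). The read is likely to be served from the
    host's page cache unless the file is much larger than RAM, so treat the
    read figure as optimistic.
    """
    block = b"\0" * block_size
    count = total_bytes // block_size

    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        for _ in range(count):
            f.write(block)
        os.fsync(f.fileno())  # make sure the data has left the host's write cache
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while f.read(block_size):
            pass
    read_seconds = time.perf_counter() - start

    megabytes = count * block_size / 1e6
    return megabytes / write_seconds, megabytes / read_seconds


if __name__ == "__main__":
    # Hypothetical block sizes; 8192 stands in for the Oracle block size.
    block_sizes = [4096, 8192, 65536, 1048576]
    with tempfile.TemporaryDirectory() as scratch:  # point this at the LUN's filesystem
        target = os.path.join(scratch, "seqtest.bin")
        for size in block_sizes:
            write_rate, read_rate = sequential_io_mb_per_s(target, size, 16 * 1024 * 1024)
            print(f"bs={size:>8}: write {write_rate:8.1f} MB/s, read {read_rate:8.1f} MB/s")
```

As with dd, the numbers say little about random IO, which is what the database mostly generates. They are useful mainly for spotting a filesystem or path that is slow even for simple sequential access.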
Best Regards
i. A. Dipl.-Inform. (FH) Walter J. Kießl
------------------------------------------------------------ mailto:kiessl@heidenhain.de tel.: +49 8669 31 1954 fax: +49 8669 32 1954 ------------------------------------------------------------
DR. JOHANNES HEIDENHAIN GmbH Dr.-Johannes-Heidenhain-Str. 5 83301 Traunreut, Deutschland http://www.heidenhain.de/