Hi Jeremy,
1) Yeah, file alignment is a pain here. I'm keen to fix it but internal change processes mean it'll take a while.
2) Sorry, I should have said this is all FC.
3) Yeah we do have about 300 LVMs, and over 1000 Windows VMs of various versions. I do need to check crons and DBs and anything that could be running scheduled jobs as we do have spikes at regular times (as well as irregular). check_raid is one I'd forgotten about though, thanks. We do have a daily spike shortly after 4am, hmm...
4) The spikes are a mix of writes and reads; I suspect it depends on what the application is doing at the time.
5) I've only been here 3 weeks so I don't really know the history of the filer. It may well have been much fuller in the past than it is now.
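On the cron check in point 3, a minimal sketch of how one could list crontab entries that fire during a given hour, such as the 4am spike. It assumes the standard five-field crontab layout (minute, hour, day of month, month, day of week, command); the function names and the sample job paths are made up for illustration, and only the 4:22 AM Monday schedule comes from this thread.

```python
def hour_matches(field: str, hour: int) -> bool:
    """Return True if a crontab hour field ('4', '*', '2-5', '*/2', '1,4') includes `hour`."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            low, high = 0, 23
        elif "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if low <= hour <= high and (hour - low) % step == 0:
            return True
    return False


def jobs_in_hour(crontab_text: str, hour: int) -> list[str]:
    """Return the crontab lines whose hour field includes `hour`."""
    jobs = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # environment setting such as MAILTO=root, or a short line
        try:
            if hour_matches(fields[1], hour):
                jobs.append(line)
        except (ValueError, ZeroDivisionError):
            continue  # not a plain numeric schedule (e.g. @reboot); skipped in this sketch
    return jobs


# Hypothetical crontab contents; the job paths are placeholders.
SAMPLE = """\
# m h dom mon dow command
22 4 * * 1 /usr/local/bin/weekly-raid-check
0 */6 * * * /usr/local/bin/rotate-logs
30 2 * * * /usr/local/bin/backup
MAILTO=root
"""

for job in jobs_in_hour(SAMPLE, 4):
    print(job)
```

Something like this would need to be run against the output of crontab -l for each user, plus /etc/crontab and /etc/cron.d, on each guest. It does not cover month or day names or the @daily style shortcuts.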
I'm teaching myself to read statit; there's a lot in there that I haven't been paying attention to. Jeff Mohler has been really helpful here, looking at statits from my filers. He pointed out that we spend as much time writing stripes only 1 disk wide as we do writing full-width stripes and all other possible stripe widths combined, which explains a lot. I think I need to get both the misalignment and the fragmentation issues fixed. That's easily done on a technical level, harder on an internal process level!
Thanks for your help.
Peta
On 26 April 2012 23:21, Jeremy Page jeremy.page@gilbarco.com wrote:
Running 7.3.5.stuff here so please take with 8.0 grains of salt :)
1) File system alignment is the most important thing to do (you may already be aligned, just making sure others reading this are aware).
2) Am I correct in assuming you are using iSCSI for your data stores? Make sure the VMFS file systems are also aligned with the NetApp blocks.
3) Are you using LVMs? We had a problem with our CentOS boxes where crontab had a job running weekly at 4:22 AM on Mondays that did a RAID check, which occasionally brought our 3070 to its knees.
4) Are your latency spikes being measured from the vNIC to the filer interface? stats show -i 3 iscsi (nfsv3 for NFS) will give you a good overview; stats show -i 3 lun will give a per-LUN view of the same kind of thing. Are the spikes in read or write times, or network specific? If they are due to the network itself you may want to look at your hypervisor's network config.
5) If it's specifically slow on writes instead of reads you may need to run an aggr reallocate to get your free space in contiguous blocks. If this is something you need to do, it's probably because your filer was either *really* full or you added disks to an aggr late in the game.
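For anyone reading along who wants to sanity-check alignment, here is a rough sketch of the arithmetic. It assumes 512-byte sectors and the 4 KB WAFL block size; the function names are invented for this example, and the starting sector has to come from whatever partitioning tool the guest uses.

```python
WAFL_BLOCK_BYTES = 4096  # WAFL block size (4 KB)


def is_aligned(start_sector: int, sector_bytes: int = 512,
               block_bytes: int = WAFL_BLOCK_BYTES) -> bool:
    """True if a partition starting at `start_sector` begins on a block boundary."""
    return (start_sector * sector_bytes) % block_bytes == 0


def next_aligned_sector(start_sector: int, sector_bytes: int = 512,
                        block_bytes: int = WAFL_BLOCK_BYTES) -> int:
    """Smallest sector at or after `start_sector` that sits on a block boundary."""
    sectors_per_block = block_bytes // sector_bytes
    # Ceiling division, then scale back up to a sector number.
    return -(-start_sector // sectors_per_block) * sectors_per_block


# Sector 63 is the classic legacy MBR starting offset; 2048 is the 1 MiB default
# used by newer partitioning tools.
for sector in (63, 64, 2048):
    print(sector, is_aligned(sector), next_aligned_sector(sector))
```

A partition starting at sector 63 begins 32256 bytes in, which is not a multiple of 4096, so every 4 KB guest block straddles two blocks on the filer. This only checks one layer; with VMDKs on VMFS, both the VMFS start and the guest partition inside the VMDK need to line up.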
statit will give you a better picture of what your disks are doing individually than sysstat's %utilized, especially on a filer with a ton of disks. sysstat shows the busiest disk during its interval, which is not always the best metric. If you have disks in the same raid group with drastically different IO times then maybe a reallocate is worthwhile.
Finally if you're using VMDK files inside of VMFS and not mounting your iSCSI LUNs as RDMs or something you may want to consider reallocating the VMDK files as well.
If you're using EXT3 I doubt the host file system is the problem, although if it's an LVM all bets are off; we don't use them in our environment.
Jeremy Page|Senior Technical Architect|Gilbarco Veeder-Root, A Danaher Company Office:336-547-5399|Cell:336-601-7274|24x7 Emergency:336-430-8151
On 04/26/2012 12:49 AM, Peta Thames wrote:
- Is it absolutely necessary to defrag the OS before you reallocate the LUN? I'm sure I've run reallocate without defragging the OS and still seen performance improvements. I'm also assuming that this is only relevant to Windows VMs, not Linux (in our case, Red Hat/CentOS) ones.
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters