Hi,
I'm thinking of running a "reallocate start -p" on a couple volumes that are showing high latency disk accesses for luns. The volume is a source for both snapmirror and snapvault operations. I was reading up on this command and I can't find any real info on the effect of running this on the snapmirror and snapvault operations.
Question: If I run a "reallocate start -p" on a volume, will snapmirror/snapvault see this and send all reallocated blocks? Also, can snapmirror/snapvault operations be running during the reallocate?
Thanks
Rick
rrhodes@firstenergycorp.com wrote:
I'm thinking of running a "reallocate start -p" on a couple volumes that are showing high latency disk accesses for luns. The volume is a source for both snapmirror and snapvault operations.
Right, I take it you do have active snapshots (since you're mentioning SnapVault below).

    reallocate start -f -p <vol_name>

Are you seeing this high latency above all for READ I/Os on these LUNs / vols? If it's for WRITE, your issue might be a different one, and running that command won't do as much good. It could still be layout related though, e.g. free space fragmentation.
(I'm not at all sure what you mean by "disk accesses" in this context.)
I was reading up on this command and I can’t find any real info on the effect of running this on the snapmirror and snapvault operations.
Question: If I run a "reallocate start -p" on a volume, will snapmirror/snapvault see this and send all reallocated blocks? Also, can snapmirror/snapvault operations be running during the reallocate?
I'm not 100% certain (yet), but incidentally I've been digging around in this area as well for a while now -- the effects on SnapVault and the snapshots it uses on the source. And as you say, the info is rather scarce.
To my current knowledge, the answer to your first Q is: no. However, for a certain time there will be extra redirects inside WAFL when blocks from snapshots are read, and that might slow things down. No idea how much :-( This is automatically corrected by the reallocate -p, which starts a WAFL redirect scan in the background that cleans it up for the FlexVol in question. It looks like this (example):
$ xyz 'priv set -q diag; wafl scan status' | egrep redirect | sed 's/ //'
138932 redirect public phase 1 level 2 inode 5970287 of 33554409 (2%)
138942 redirect private phase 3 level 1 block 470560972 of 771868448 (22%)
139000 redirect public phase 1 level 1 inode 19865752 of 33554409 (1%)
139884 redirect public phase 1 level 1 inode 24542520 of 33554409 (3%)
139054 redirect public phase 3 level 1 block 349198358 of 410692272 (33%)
140078 redirect public phase 1 level 2 inode 32509281 of 33554409 (7%)
140519 redirect public phase 1 level 1 inode 11041527 of 113246091 (0%)
140427 redirect public phase 1 level 1 inode 2218329 of 33554409 (0%)
139956 redirect public phase 1 level 1 inode 11865625 of 33554409 (1%)
140125 redirect public phase 1 level 1 inode 17613080 of 33554409 (1%)
138844 redirect private phase 3 level 1 block 305546101 of 537103936 (21%)
138991 redirect public phase 1 level 2 inode 19646076 of 33554409 (2%)
140192 redirect public phase 1 level 1 inode 24762423 of 33554409 (3%)
139853 redirect public phase 3 level 1 block 61740937 of 258940032 (14%)
140521 redirect public phase 2 level 1 block 2279190 of 7518795 (10%)
During this scan the FlexVol continues to operate normally in every other respect, so to speak. You can even abort such a scan cleanly and have it implicitly restarted later, with no harm to anything in the FlexVol.
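If you want to keep an eye on how far those redirect scans have come, here is a rough sketch (Python, not an official tool -- it only assumes the line format shown in the example above, and "scan_progress.py" is just whatever name you save it under):

import re
import sys

# Matches lines like:
#   138932 redirect public phase 1 level 2 inode 5970287 of 33554409 (2%)
LINE_RE = re.compile(
    r"(?P<id>\d+)\s+redirect\s+(?P<vis>public|private)\s+"
    r"phase\s+(?P<phase>\d+)\s+level\s+(?P<level>\d+)\s+"
    r"(?P<unit>inode|block)\s+(?P<done>\d+)\s+of\s+(?P<total>\d+)"
)

def summarise(lines):
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        done, total = int(m.group("done")), int(m.group("total"))
        pct = 100.0 * done / total if total else 0.0
        print("scan %s: %s phase %s %s %d/%d (%.1f%%)" % (
            m.group("id"), m.group("vis"), m.group("phase"),
            m.group("unit"), done, total, pct))

if __name__ == "__main__":
    # e.g.  xyz 'priv set -q diag; wafl scan status' | egrep redirect | python3 scan_progress.py
    summarise(sys.stdin)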
To the second Q: yes
/M
The stats I'm looking at are from a "lun stats" cmd. For example, here are some of the worst lun stats lines for a 1 hr interval last Friday.
They show an average latency of ~50 ms.
 Read Write Other QFull  Read  Write Average  Queue Partner      Lun
  Ops   Ops   Ops          kB     kB Latency Length  Ops  kB
   10    61     0     0   180   1051   50.55   4.03    0   0  /vol/v_fnce20p_db/q_fnce20p_db/lun2  201401171200
   11    61     0     0   171   1070   50.16   4.03    0   0  /vol/v_fnce20p_db/q_fnce20p_db/lun3  201401171200
   10    60     0     0   168   1046   50.38   4.02    0   0  /vol/v_fnce20p_db/q_fnce20p_db/lun4  201401171200
   10    61     0     0   168   1063   49.80   4.02    0   0  /vol/v_fnce20p_db/q_fnce20p_db/lun1  201401171200
   11    60     0     0   171   1051   50.16   4.02    0   0  /vol/v_fnce20p_db/q_fnce20p_db/lun0  201401171200
It looks like they are mostly write ops, but the latency isn't broken down by read/writes. The queue length is 4.
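For what it's worth, a quick back-of-the-envelope on the lun2 row above (Python sketch; it assumes the Ops and kB columns are per-second averages over the interval, which is how I read the lun stats output):

# lun2 row from the output above
read_ops, write_ops = 10, 61      # ops/s
read_kb, write_kb = 180, 1051     # kB/s

total_ops = read_ops + write_ops
print("write share of ops: %.0f%%" % (100.0 * write_ops / total_ops))   # ~86%
print("avg read size     : %.1f kB" % (read_kb / float(read_ops)))      # ~18 kB
print("avg write size    : %.1f kB" % (write_kb / float(write_ops)))    # ~17 kB

So roughly 86% of the ops are writes, and both reads and writes average under 20 kB per op.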
The only snapshot activity is the snapmirror/snapvault operations, which occur every 6 hours. And the snapmirror/snapvault updates would have occurred for this volume during this 1 hr interval.
I have the sysstat from the same 1hr interval (5 min intervals for 1hr starting 201401171200). The head really isn't very busy. This head is SAN only, so the net activity is snapmirror/snapvault activity.
 CPU  NFS CIFS HTTP Total   Net kB/s    Disk kB/s   Tape kB/s Cache Cache   CP  CP Disk OTHER  FCP iSCSI   FCP kB/s  iSCSI kB/s
                              in   out  read  write  read write  age   hit time  ty util                   in    out   in   out
 40%    0    0    0  5050   320 12672 34895 23444     0     0   14s   96%  39%  54  39%     9 5041     0 15762 88420    0    0
 28%    0    0    0   757   530 21550 42006 15396     0     0   26s   93%  33%  42  38%     8  749     0  9284 16318    0    0
 25%    0    0    0   838   209  8578 22343 27128     0     0    3s   94%  50%  88  31%    10  828     0 15800 13103    0    0
 46%    0    0    0   700  1871 75567 83102 39238     0     0    1s   91%  43%  47  43%     7  693     0 27099 11266    0    0
 14%    0    0    0   482   147  8675 21454  9020     0     0     1   96%  37%  19  31%     9  473     0  6240 13769    0    0
 17%    0    0    0   702   143  8047 28066 12103     0     0     1   95%  36%  22  41%     9  693     0  8235 21790    0    0
 19%    0    0    0   702   137  7922 26423 24004     0     0     1   95%  38%  35  39%     8  694     0 16770 17597    0    0
 18%    0    0    0   642   156  8456 26082 14530     0     0    0s   93%  44%  20  44%     9  633     0  8983 15153    0    0
 20%    0    0    0  1005   157  8168 33210 14510     0     0     1   93%  27%  30  53%    10  995     0  9724 23917    0    0
 20%    0    0    0  1033   137  7337 25215 20989     0     0     1   92%  33%  35  45%     8 1025     0 14157 17978    0    0
 19%    0    0    0   860   136  7757 28052 14805     0     0     1   94%  27% 30f  48%     8  852     0  9464 21196    0    0
 19%    0    0    0   860   136  7757 28052 14805     0     0     1   94%  27% 30f  48%     8  852     0  9464 21196    0    0
 24%    0    0    0  1101   150  9264 33514 18724     0     0     1   94%  32%  30  49%     9 1092     0 12129 25414    0    0
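A small sketch (Python; it only assumes the sysstat -x column order shown above, i.e. CP time is the 14th field and Disk util the 16th) to pull out the columns that matter here -- shown on three of the rows above:

def parse_row(row):
    f = row.split()
    return {
        "cpu":       int(f[0].rstrip("%")),
        "cp_time":   int(f[13].rstrip("%")),
        "disk_util": int(f[15].rstrip("%")),
    }

rows = [
    "40% 0 0 0 5050 320 12672 34895 23444 0 0 14s 96% 39% 54 39% 9 5041 0 15762 88420 0 0",
    "25% 0 0 0 838 209 8578 22343 27128 0 0 3s 94% 50% 88 31% 10 828 0 15800 13103 0 0",
    "20% 0 0 0 1005 157 8168 33210 14510 0 0 1 93% 27% 30 53% 10 995 0 9724 23917 0 0",
]
parsed = [parse_row(r) for r in rows]
print("max disk util: %d%%" % max(p["disk_util"] for p in parsed))
print("max CP time  : %d%%" % max(p["cp_time"] for p in parsed))
print("avg CPU      : %.0f%%" % (sum(p["cpu"] for p in parsed) / float(len(parsed))))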
rrhodes@firstenergycorp.com wrote:
The stats I'm looking at are from a "lun stats" cmd.
Ah OK, this is per LUN. So it's about latency and ops distribution on the LUN, and it has a queue length as well and everything, just like a physical disk... :-\ Sorry, my experience with LUN based systems (iSCSI or FC-AL based) is rather limited, so I can't really say if 50 ms is very horrible or just bad -- but anything up to 5-15 ms wouldn't worry me too much from what I've seen in the past.
You can run reallocate start -p -f <vol_name>, as well as reallocate on an individual LUN, but in this case it doesn't look like there's much READ going on, so...
You ran the stats with -i 300 for 3600 s (1 h) as I understand it, and it shows a qlen of ~4. That doesn't feel very pleasant to me; it will invariably lead to fairly high latency I'd expect -- it's kind of implicit in that concept, if you know what I mean. If that qlen were 1 instead of 4 you'd be at ~12 ms avg latency on those LUNs instead of 50. The Q is: why is the qlen so high?
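That figure is nothing ONTAP-specific, just Little's law applied to the lun2 row above (sketch; it assumes the Ops columns are per-second rates over the interval):

# Little's law: avg queue length = arrival rate x avg latency
iops = 10 + 61                  # read + write ops/s for lun2
avg_latency_s = 50.55 / 1000.0  # reported avg latency

print("implied queue length: %.2f" % (iops * avg_latency_s))     # ~3.6, close to the reported 4.03
print("latency at qlen ~1  : %.1f ms" % (1000.0 / iops))         # ~14 ms, i.e. roughly the figure above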
I see you have much more write pressure here than READ, so if anything I think you may have some kind of free space fragmentation issue, especially if the Aggr this is on is getting full. Keeping Aggrs below some % full is highly recommendable; how high you want to risk going depends on your workload and patience :-)
In the sysstat output, the disk util is pretty high; ~50% is definitely a warning sign for issues. But it's not a disaster quite yet.
I'm curious just what the latency and queue length are for a lun in a "lun stats" cmd. Is it the latency of servicing a host I/O, or of writing the I/Os to disk? In other words, I/Os hit the head and get stored in memory (mirrored to the partner head, logged to NVRAM), then are written to disk during a CP. I could see the queue length being a problem if the host is waiting for the disk writes. I just can't see how it would take ~12 ms x 4 = 50 ms to write to head memory. It's almost like the head is single-threading writes from the host straight through to the backend. Other than back-to-back CPs, I don't understand how this could occur.
I was thinking of a reallocate, but since it's only writes I don't see how that will help. The aggregate is 9 TB of which only 3 TB are used, leaving 6 TB free. I don't see how it's a free space problem. It has 22 x 600 GB 15K drives in 2 x 11-disk RAID-DP raid groups.
Rick
Rick Rhodes wrote:
I'm curious just what the latency and queue length are for a lun in a "lun stats" cmd. Is it the latency of servicing a host I/O, or of writing the I/Os to disk?
Well, it should be pretty much the same semantics as for a physical disk. It also has an avg latency (for whatever time period you measure, when it serves I/O at all), a queue length, and a max queue depth. The whole idea is that a LUN should logically behave like a physical disk device from an OS PoV, so...
I assume the LUN latency is measured from the point where the ops (READ or WRITE) enter some place inside ONTAP, and the latency is the time it takes for the LUN to respond: 'OK, committed to stable storage'. There are lots of layers of abstraction beneath the LUN of course -- basically it's just a big file in a FlexVol (WAFL).
And the queue length is the number of in-flight commands on the LUN: ops sent down to it for which the response hasn't come back yet. How many ops "in parallel" (whatever that means for a LUN) an ONTAP LUN can do, I don't know.
If that parallelism is >1 (it isn't for a physical disk; the write head can only be at one particular place at a time, right?) I'm wrong about this:
If that qlen was 1 instead of 4 you'd be at ~12 ms avg latency on those LUNs instead of 50.
Rick Rhodes wrote:
In other words, I/Os hit the head and get stored in memory (mirrored to the partner head, logged to NVRAM), then are written to disk during a CP.
What I/Os are you talking about now? An I/O hitting a LUN, or physical disk I/Os underneath all those layers of abstraction, deep inside ONTAP?
I could see the queue length being a problem if the host is waiting for the disk writes. I just can't see how it would take ~12 ms x 4 = 50 ms to write to head memory. It's almost like the head is single-threading writes from the host straight through to the backend. Other than back-to-back CPs, I don't understand how this could occur.
Me neither. And I'm probably wrong about that whole thing, as I don't really know how a LUN defines (measures) latency and queue length. A LUN is essentially a special kind of file in WAFL, so in some respects it must be basically the same as when you write a block in a file.
I was thinking of a reallocate, but since it's only writes I don't see how that will help. The aggregate is 9 TB of which only 3 TB are used, leaving 6 TB free. I don't see how it's a free space problem. It has 22 x 600 GB 15K drives in 2 x 11-disk RAID-DP raid groups.
No, I agree... You've got *plenty* of free space, and 15K rpm drives as well. And the box isn't high on CPU load in any way at all, from what you've shown. Well, I don't know. Something fishy here. Your sysstat output does show high disk utilisation, that's clear. It's 40-50% and that's not very pleasant.
This Aggregate is 2 raid groups, with rgsize 11, for a total of 22 spindles. OK, they're 15K rpm, but still that's not a lot of spindles... What FAS model and ONTAP version is this? *curious*
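Very rough back-of-the-envelope on the spindle count (a sketch full of assumptions: ~175 random IOPS per 15K spindle is only a rule of thumb, the RAID-DP parity drives are counted out entirely, and WAFL's write allocation makes real write behaviour much better than raw random IOPS -- so treat it as a crude floor, not a sizing exercise):

raid_groups      = 2
rg_size          = 11     # disks per RAID-DP group
parity_per_rg    = 2
iops_per_spindle = 175    # assumed rule of thumb for 15K rpm drives

data_spindles = raid_groups * (rg_size - parity_per_rg)
print("data spindles        : %d" % data_spindles)                        # 18
print("ballpark random IOPS : %d" % (data_spindles * iops_per_spindle))   # ~3150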
/M