NwkThd_00:warning]: NFS response to client xx.xx.xx.xx for volume 0x1234567 was slow, op was v3 write, 65 > 60 (in seconds)
I have a filer head, on which I'm hosting ESX datastores. I've had a couple of instances now of this error (or one rather similar).
It correlates with VMware getting upset and VMs going read-only, but it doesn't actually give me any insight into what is going on.
Has anyone run into this, and can anyone give some further insight into what might be causing it and where I can look?
That's a very generic warning- I'd open a case.
On Mon, Feb 15, 2016 at 6:44 AM, Edward Rolison ed.rolison@gmail.com wrote:
NwkThd_00:warning]: NFS response to client xx.xx.xx.xx for volume 0x1234567 was slow, op was v3 write, 65 > 60 (in seconds)
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Are you hosting your datastores on SATA drives? I have seen this before (many times) when customers use SATA, try to host too many virtual machines, and do not turn on Storage I/O Control (a premium feature!).
Turning on SIO control does not always solve this either.
--tmac
*Tim McCarthy* *Principal Consultant*
On Mon, Feb 15, 2016 at 7:42 AM, Basil basilberntsen@gmail.com wrote:
That's a very generic warning- I'd open a case.
The long and short seems to be - I'm getting low_mbuf CPs on the filer head, and at the time the error message occurs I'm also getting back to back CPs. So I have a reboot pending, and probably a code update in the near future.
On 15 February 2016 at 13:32, tmac tmacmd@gmail.com wrote:
Are you hosting your datastores on SATA drives? I have seen this before (many times) when customers use SATA, try to host too many virtual machines, and do not turn on Storage I/O Control (a premium feature!).
Turning on SIO control does not always solve this either.
Edward -
Good luck with your corrective plan, but if you're getting back to back CPs, simply rebooting or patching (unless there's a specific patch recommended for this behavior) isn't going to do much to solve your problem. These errors are almost always caused by having too few physical disks in the underlying aggregate, or by a workload that is too aggressive for the aggregate hosting it, so you're best off relocating the workload (Storage vMotion, volume move, SnapMirror/NDMP dump) or expanding the aggregate. Another thing to check is the free space on the aggregate: above roughly 90-95% utilization, housekeeping tasks such as changed block reclamation may not have adequate free space to run, which can also cause problems.
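As a rough illustration of the free-space check described above, here is a minimal Python sketch. The function names, the kilobyte inputs, and the 90% default threshold are assumptions made for this example; the used and total figures would have to be read off the filer by hand (for example from `df -A`).

```python
def aggregate_utilization(used_kb: int, total_kb: int) -> float:
    """Return the percentage of aggregate capacity currently in use."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * used_kb / total_kb


def needs_attention(used_kb: int, total_kb: int, threshold_pct: float = 90.0) -> bool:
    """True when utilization is at or above the threshold (assumed 90% here)."""
    return aggregate_utilization(used_kb, total_kb) >= threshold_pct


# Hypothetical figures, not taken from any real system.
print(needs_attention(used_kb=950, total_kb=1000))
print(needs_attention(used_kb=500, total_kb=1000))
```

The threshold is a parameter because the thread only gives a range (90-95%), not a single cut-off.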
Anyway, to reiterate -- good luck, but keep the above in mind when considering your options.
Thanks!
On Feb 16, 2016, at 8:03 AM, Edward Rolison <ed.rolison@gmail.com> wrote:
The long and short seems to be - I'm getting low_mbuf CPs on the filer head, and at the time the error message occurs I'm also getting back to back CPs. So I have a reboot pending, and probably a code update in the near future.
Well, if it's caused by low_mbuf, those are memory buffers that can be cleared with a reboot.
However, the issue will likely resurface unless there is a fix for the memory buffer issue.
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Tony Bar
Sent: Tuesday, February 16, 2016 11:12 AM
To: Edward Rolison
Cc: toasters@teaparty.net
Subject: Re: NwkThd_00:warning: NFS response to client was slow
Edward -
Good luck with your corrective plan, but if you're getting back to back CPs, simply rebooting or patching (unless there's a specific patch recommended for this behavior) isn't going to do much to solve your problem. These errors are almost always caused by having too few physical disks in the underlying aggregate, or by a workload that is too aggressive for the aggregate hosting it, so you're best off relocating the workload (Storage vMotion, volume move, SnapMirror/NDMP dump) or expanding the aggregate. Another thing to check is the free space on the aggregate: above roughly 90-95% utilization, housekeeping tasks such as changed block reclamation may not have adequate free space to run, which can also cause problems.
Anyway, to reiterate -- good luck, but keep the above in mind when considering your options.
Thanks!
Yes - my first port of call is to see if I can make the low_mbuf CPs go away. I haven't often seen those, although I have seen something similar when I had a memory leak that was causing the same sort of problems. So an update to the latest revision of ONTAP is probably no bad thing before trying to troubleshoot further (or otherwise suggesting to my customers that we need more resources to do what they want). This is a 2240, so it's not the beefiest of filers in the first place :).
On 16 February 2016 at 16:16, Parisi, Justin Justin.Parisi@netapp.com wrote:
Well, if it’s caused by low_mbuf, those are memory buffers that can be cleared with a reboot.
However, the issue will likely resurface unless there is a fix for the memory buffer issue.
What type and speed disks are you using for the VMware environment? If you're using 15k disks there are a couple things I would start with. Check the output of the following:
priv set diag
statit -b
(wait 10 seconds)
statit -e
You'll get some good info about the disks in this output. What are the xfers (iops)?
Also check: sysstat -c 10 -x 10
What do you see in the "CP ty" column? If you see an upper or lower case "b", then you're hitting back to back CPs. Basically, this means that the controller has a problem keeping up with the incoming writes.
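To make the "CP ty" check concrete, here is a small Python sketch that flags samples showing back to back CPs. It assumes the column values have already been copied out of the sysstat output by hand; the function name and the sample values are hypothetical, and only the first character is compared in case the column carries extra characters after the type letter.

```python
from typing import Iterable, List


def back_to_back_samples(cp_types: Iterable[str]) -> List[int]:
    """Return the indexes of samples whose CP type starts with 'B' or 'b'."""
    return [
        i for i, cp in enumerate(cp_types)
        if cp.strip()[:1] in ("B", "b")
    ]


# Hypothetical "CP ty" values, as if noted down from ten sysstat samples.
# Anything other than "B"/"b" is just a placeholder for "some other CP type".
samples = ["T", "T", "B", "b", "B", "T", "-", "B", "T", "T"]
hits = back_to_back_samples(samples)
print(f"{len(hits)} of {len(samples)} samples show back to back CPs: {hits}")
```

An occasional hit may not mean much; a run of consecutive hits is the pattern the paragraph above describes as the controller struggling to keep up with incoming writes.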
If any of the above comes back abnormal I would open a case for deeper investigation.
Also, the message you're seeing gives you the volume fsid, 0x1234567. You can use the "vol read_fsid <vol-name>" command (in priv set diag) to see a volume's fsid. You'd have to run it on each volume until you find the one that matches the fsid in the slow-response message.
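The matching step can be sketched in Python as follows. This is only an illustration: the regular expression is written against the one warning line quoted in this thread, the volume names and fsid values in `known` are made up, and the mapping is assumed to have been assembled by hand from the per-volume command output rather than parsed automatically.

```python
import re
from typing import Dict, Optional

# Pattern based solely on the warning line quoted in this thread.
SLOW_NFS = re.compile(
    r"NFS response to client (?P<client>\S+) for volume (?P<fsid>0x[0-9a-fA-F]+) "
    r"was slow, op was (?P<op>.+?), (?P<took>\d+) > (?P<limit>\d+) \(in seconds\)"
)


def parse_slow_nfs(line: str) -> Optional[dict]:
    """Extract client, fsid, operation and timings from a slow-response warning."""
    m = SLOW_NFS.search(line)
    if m is None:
        return None
    return {
        "client": m.group("client"),
        "fsid": int(m.group("fsid"), 16),
        "op": m.group("op"),
        "took": int(m.group("took")),
        "limit": int(m.group("limit")),
    }


def volume_for_fsid(fsid: int, known: Dict[str, int]) -> Optional[str]:
    """Return the volume name whose fsid matches, or None if there is no match."""
    for name, value in known.items():
        if value == fsid:
            return name
    return None


line = ("NwkThd_00:warning]: NFS response to client xx.xx.xx.xx for volume "
        "0x1234567 was slow, op was v3 write, 65 > 60 (in seconds)")
# Hypothetical volume names and fsids, collected by hand for this example.
known = {"vol_esx_ds1": 0x7654321, "vol_esx_ds2": 0x1234567}
event = parse_slow_nfs(line)
print(volume_for_fsid(event["fsid"], known))
```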