I know there was discussion on this before, but I don't recall any resolution.
It started when the binmail program could not NFS-lock the file. I ran the lockdump -h command and found the processes and the files. The processes are no longer running on the host, and the file no longer exists (I removed it), but the locks are still on the filer. How can I get rid of them? And (anyone): why does the NetApp box keep locks for processes and files that don't exist?
}}}===============>> LLNL James E. Harm (Jim); jharm@llnl.gov System Administrator Compaq Clusters (925) 422-4018 Page: 423-7705x57152
It's up to the client kernel to remove the locks held by a process when that process goes away. What flavour is your client? Is its NLM implementation reliable?
The removal of the file should be neither here nor there. NFS and NLM are only tenuously related through the use of NFS filehandles as NLM tokens. [Roll on NFS v4 when locking is to become a first-class NFS activity!]
Chris Thompson University of Cambridge Computing Service, Email: cet1@ucs.cam.ac.uk New Museums Site, Cambridge CB2 3QG, Phone: +44 1223 334715 United Kingdom.
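For readers unfamiliar with the client side of this, here is a minimal sketch of what a program does when it asks for a lockf-style lock. It is not binmail's actual code; the helper names try_lock and unlock are made up for illustration, and it assumes a Unix client with Python's standard fcntl module. On an NFS mount the kernel would forward such a request to the server's lock manager; on a local file it is handled locally.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file descriptor on success, or None if the lock
    request is refused (for example because another process holds it).
    `try_lock` is a hypothetical helper, not part of any mail program.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock explicitly, then close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

If a process exits without calling something like unlock, the client kernel is the party expected to release the lock on its behalf, which is the step that appears to have gone missing in this thread.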
We are running Compaq Tru64 UNIX. Any UNIX that suffers a power outage or a crash will not clean up its locks, and maybe not even when a process is unceremoniously killed.
The real problem I had was that NetApp support said to remove the files in the /etc/sm directory and then run the "sm_mon -l $hostname" command. This of course fails, and now I understand why from the explanation by P. Albers at the bottom of this email. Thank you P.A.
Date: Mon, 10 Jul 2000 11:05:42 -0500
From: Paul Albers palbers@netapp.com
To: Jim Harm harm1@llnl.gov
Subject: Re: cannot lockf
Jim,
I'm not sure why the locks are remaining on the filer. This is usually caused by the client not telling the filer to free the lock. Of course, it could be any number of other things; I will leave this open for the locking experts out there.
To free up the locks on the filer for this client, do the following:
    filer> rc_toggle_basic
    filer> sm_mon -l <client>
This tells <client> that <filer> just rebooted! Now, the <client> must remember which processes were waiting for locks, and give the locks to them. If the processes are no longer there, they will be freed up and will not show up in your lock_dump -h output.
Yes, 'sm_mon -l' without a client specification will simulate a filer reboot: it tells all clients that the filer rebooted and that they should come and reclaim their locks. If there were any bogus locks out there, they are now all gone.
Hope that this helps,
-paul
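The reclaim behaviour described above can be pictured with a toy model. This is only a sketch of the idea and not the NLM/NSM wire protocol or anything the filer actually runs. The Filer and Client classes, the host names, and the PIDs are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Filer:
    # Each lock is recorded as (client_name, pid, path). Hypothetical model.
    locks: set = field(default_factory=set)

    def grant(self, client, pid, path):
        self.locks.add((client, pid, path))

    def simulate_reboot_for(self, client):
        """Forget every lock held by `client`, as if the filer had rebooted.

        The client is then expected to reclaim whatever it still needs.
        """
        self.locks = {lock for lock in self.locks if lock[0] != client}


@dataclass
class Client:
    name: str
    live_pids: set   # processes that still exist on the client
    held: set        # (pid, path) pairs the client believes it holds

    def reclaim(self, filer):
        """Re-request locks only for processes that are still alive."""
        self.held = {(pid, path) for pid, path in self.held
                     if pid in self.live_pids}
        for pid, path in sorted(self.held):
            filer.grant(self.name, pid, path)


filer = Filer()
filer.grant("hostA", 101, "/mail/jim")    # pid 101 has since died
filer.grant("hostA", 202, "/mail/ann")    # pid 202 is still running
filer.grant("hostB", 303, "/mail/bob")    # a different client, untouched

host_a = Client("hostA", live_pids={202},
                held={(101, "/mail/jim"), (202, "/mail/ann")})

filer.simulate_reboot_for("hostA")
host_a.reclaim(filer)
```

In this model the stale lock for pid 101 disappears because nobody reclaims it, while the lock for the live process and the other client's lock remain.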