The underlying problem may be due to a timing bug in the Microsoft client (MS has acknowledged the bug; their answer is "upgrade to Windows 2000"). It is seen more on filers than on NT servers because we are so much faster at doing opens. We have a workaround, which is to enforce a small delay before breaking the oplock on oplocked files that are re-opened. This is controlled by an option, "cifs.oplocks.opendelta". It defaults to 8 ms, which works well for most clients, but it sounds like you might want to try increasing it. The "cifs stat" console command has a value "OpLkBkNoBreakAck" which gets incremented if this problem is seen. If you have non-zero values for this, try increasing the opendelta to 16 ms and see if things get better.
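The check described above could be scripted roughly as follows. This is only a sketch in Python: it assumes the "cifs stat" output has been captured as text and that the counter appears as the name "OpLkBkNoBreakAck" followed by whitespace and an integer, which may not match the real console layout. The function names are hypothetical; only the option name, the counter name, and the 8 ms and 16 ms values come from the message.

```python
import re

DEFAULT_OPENDELTA_MS = 8     # default for cifs.oplocks.opendelta, per the message
SUGGESTED_OPENDELTA_MS = 16  # value the message suggests trying


def no_break_ack_count(stat_text: str) -> int:
    """Return the OpLkBkNoBreakAck counter found in stat_text, or 0 if absent.

    Assumption: the counter is printed as 'OpLkBkNoBreakAck <integer>'.
    The real 'cifs stat' layout may differ.
    """
    match = re.search(r"OpLkBkNoBreakAck\s+(\d+)", stat_text)
    return int(match.group(1)) if match else 0


def suggest_opendelta(stat_text: str, current_ms: int = DEFAULT_OPENDELTA_MS) -> int:
    """Suggest an opendelta value in milliseconds.

    If the counter is non-zero and the current setting is below 16 ms,
    suggest 16 ms; otherwise leave the setting unchanged.
    """
    if no_break_ack_count(stat_text) > 0 and current_ms < SUGGESTED_OPENDELTA_MS:
        return SUGGESTED_OPENDELTA_MS
    return current_ms
```

For example, feeding it captured text containing a non-zero counter with the default setting yields a suggestion of 16, while a zero or missing counter leaves the current value alone.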
As to the protocol issues, as Bruce said, we're more or less stuck with them because of Microsoft compatibility. If a session goes away when there is no request pending (i.e. we don't owe the client a response), the filer delays 10 minutes before terminating the session. This is to keep the session alive in the face of temporary network partitions, which we can't distinguish from just losing a client. Currently that 10-minute interval is not configurable on the filer, but it sounds like we might want to make it so.
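To make the timing concrete, here is a small Python sketch of the rule as stated above. It is an illustration only, not the filer's implementation: the function and constant names are hypothetical, and the case where a request is pending is deliberately left undefined because the message does not describe it.

```python
from typing import Optional

# Grace period for a lost session with no request pending, per the message.
IDLE_SESSION_GRACE_S = 10 * 60


def seconds_until_termination(request_pending: bool,
                              seconds_since_loss: float) -> Optional[float]:
    """Return how long a lost session (and its locks) would still be held.

    Returns None when a request is pending, since that case is not
    covered by the description above.
    """
    if request_pending:
        return None
    # Never report a negative remaining time once the grace period is over.
    return max(0.0, IDLE_SESSION_GRACE_S - seconds_since_loss)
```

Under this reading, a client that is switched off while idle would keep its locks for up to 600 seconds before the session is torn down.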
Mark Muhlestein -- mmm@netapp.com
-----Original Message-----
From: Dave Atkin [mailto:dla1@york.ac.uk]
Sent: Friday, June 16, 2000 3:39 AM
To: toasters@mathworks.com
Cc: AB Smith
Subject: CIFS file locking problem
We have a problem with file locking which occurs every few months. We have finally worked out at least part of what is happening.
We have a Windows 95 system filestore exported as a read-only CIFS share. There are 1200 or so PCs which access this.
Just occasionally, when a user launches the Telnet client (Hummingbird), it complains that it can't access its keyboard mapping file. Once this starts to happen, it is the same for every PC. A Unix user running as root can't read or do anything to this file either; it gets "permission denied".
Yesterday when this happened we used (rc_toggle_basic) lock_dump -f to look at the locks, and we found that a PC appeared to have taken out an exclusive read lock on the file:
========00003a31:0006b48e \95APPLIB\york.hts state=GRANTED mode=Excl-denyA host=OFFICE1
When we found the PC called "office1", it was turned off. The user said it had hung up and she had switched it off.
We managed to restore normal service by doing a "cifs terminate -t 0 office1".
This raises several issues:
- If a CIFS client is switched off, is there any sort of keepalive timer on the Filer that closes down open files and locks?
- Should a PC with read-only access to a filestore be allowed to lock files for exclusive read? Should this perhaps be an option on a qtree or share?
- This particular application obviously doesn't expect this file to be shared. But could any malicious user take out locks on random important files and freeze the whole network?
Dave Atkin
Dave Atkin, Head of Technical Services
Computing Service, University of York, YORK YO10 5DD
Phone: +44-1904-433804 (ddi) Fax: +44-1904-433740
Email: D.Atkin@york.ac.uk