Just a comment: race conditions over NFS can be common and severe if you build a piece of software (a system) that inherently assumes perfect [client] cache coherence. That is not really achievable, and client-side caching is VERY important, indeed crucial, for the performance of any distributed file system such as NFS.
(Try googling "perfect cache coherence" file system and look at the hits you get.)
The .nfsXXXX files are residues of exactly such a race condition, in the case where one client has a file open (an active file handle) and writes to it, and another client simply deletes that file. When the write data is flushed from the first client, the file is gone, and that data goes into the .nfsXXXX file in the directory where client #2 expected the file to be.
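To see the mechanism on a single client, here is a minimal sketch (the file name is invented, and it assumes the current working directory is on an NFS mount): open a file, unlink it while the descriptor is still open, and the NFS client keeps the data alive under a .nfsXXXX name until the last close.

    /* silly_rename_demo.c - open a file on an NFS mount, unlink it while
     * the descriptor is still open, then list the directory.  The NFS
     * client typically renames the file to .nfsXXXX instead of removing
     * it, so the open descriptor keeps working; the .nfsXXXX entry is
     * cleaned up on the last close. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        int fd = open("victim.txt", O_CREAT | O_WRONLY, 0644);
        if (fd < 0) { perror("open"); return 1; }

        if (unlink("victim.txt") < 0) { perror("unlink"); return 1; }

        /* The descriptor still works even though the name is gone. */
        if (write(fd, "still writable\n", 15) < 0) perror("write");

        system("ls -a .");   /* expect a .nfsXXXX entry here on NFS */

        close(fd);           /* last close removes the .nfsXXXX file */
        return 0;
    }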
It is, unfortunately, quite common for people to have totally misunderstood the semantics of UNIX and NFS in this respect. Many really believe that if one NFS client has a file open, then no other client can delete it. So they have no idea what "Stale NFS file handle" means, or how easy it is to end up in that situation if you work with a parallel system (home-brew, as it often is) over NFS with many NFS clients involved. It seems easy and straightforward, but it is not.
There is no mandatory file locking in NFS. There never has been. Locking is advisory, and before NFSv4.x it was also handled by an auxiliary protocol (the NLM system, with its own ports and not very high performance capacity).
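To illustrate what "advisory" means in practice, a minimal sketch (the file name is invented): the fcntl() lock below only excludes other processes that also take fcntl() locks; a process that never asks for the lock can still write to, or delete, the file.

    /* advisory_lock_demo.c - take a POSIX advisory write lock on a file.
     * Over NFSv3 the request travels via the NLM sideband protocol; in
     * NFSv4 locking is part of the protocol itself.  Either way the lock
     * is advisory: it only excludes processes that also call fcntl(). */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        int fd = open("shared.dat", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct flock fl = {
            .l_type   = F_WRLCK,   /* exclusive (write) lock  */
            .l_whence = SEEK_SET,
            .l_start  = 0,
            .l_len    = 0,         /* 0 = lock the whole file */
        };

        if (fcntl(fd, F_SETLKW, &fl) < 0) {  /* block until granted */
            perror("fcntl(F_SETLKW)");
            return 1;
        }

        /* Critical section: only cooperating lockers are excluded.
         * A process that never calls fcntl() can still write or unlink. */

        fl.l_type = F_UNLCK;
        fcntl(fd, F_SETLK, &fl);
        close(fd);
        return 0;
    }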
If you don't know EXACTLY what you're doing, you will shoot yourself in the foot.
Regards, /M
On 2016-12-28 14:10, andrei.borzenkov@ts.fujitsu.com wrote:
I would expect “Stale NFS file handle” if the problem were (another) client caching. But it looks like the other client actually contacts the server and gets “No such file” in response. Multiple resources on the Net suggest that this is a known NFS limitation.
I can think of at least one case where it is possible: if the target file is currently open on the same client that is doing the rename, the client is expected to rename the target to .nfsXXXX to prevent its deletion on the server, which opens up a window during which the target file is not available.
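To make that window concrete, a hypothetical sketch of the sequence on a single client (file names invented):

    /* rename_target_open.c - hold the rename target open, then rename a
     * new file over it.  Because the target is open on this same client,
     * the client first silly-renames it to .nfsXXXX, which briefly leaves
     * a window in which the target name may not resolve. */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        int fd = open("pointer", O_RDONLY);   /* target held open */
        if (fd < 0) { perror("open pointer"); return 1; }

        int tmp = open("pointer.new", O_CREAT | O_WRONLY, 0644);
        if (tmp < 0) { perror("open pointer.new"); return 1; }
        write(tmp, "new contents\n", 13);
        close(tmp);

        /* The client must preserve the open target, so it renames it
         * to .nfsXXXX before installing pointer.new as "pointer". */
        if (rename("pointer.new", "pointer") < 0) perror("rename");

        close(fd);   /* last close lets the client reap the .nfsXXXX */
        return 0;
    }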
@Edward, do you see any .nfsXXXX files in the same directory?
From: toasters-bounces@teaparty.net On Behalf Of Steiner, Jeffrey
Sent: Wednesday, December 28, 2016 3:49 PM
To: Edward Rolison
Cc: toasters@teaparty.net
Subject: Re: Atomicity of rename on NFS
That sounds like normal behavior with the typical mount options used for NFS. What are you using, exactly? The defaults include several seconds of caching of file and directory data. The act of renaming a file is atomic, but other NFS clients will not be immediately aware of the change unless you have actimeo=0 and noac in the mount options. There are performance consequences to that, but sometimes it's unavoidable. For example, Oracle database clusters using NFS must always have a single consistent image of their data across nodes. That's why they use actimeo=0 and noac.
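For example, a Linux mount invocation with those options might look roughly like this (server name and paths invented; the rest of the option set depends on your environment):

    mount -t nfs -o rw,hard,actimeo=0,noac filer01:/vol/oradata /u02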
On 28 Dec 2016, at 12:23, Edward Rolison <ed.rolison@gmail.com> wrote:
Hello fellow NetApp Admins. I have a bit of an odd one that I'm trying to troubleshoot - and whilst I'm not sure it's specifically filer related, it's NFS related (and is happening on a filer mount).
What happens is this - there's a process that updates a file and relies on rename() being atomic: a journal is updated, then a new reference-pointer file is created and renamed over the old one.
The expectation is that this file will always be there - because "rename()" is defined as an atomic operation.
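(For reference, the pattern in question looks roughly like this minimal C sketch - file names and contents are invented: write the new pointer to a temporary name, flush it, then rename it over the old name, so a reader should see either the old file or the new one, never a partial one.)

    /* atomic_update.c - the classic write-temp-then-rename pattern.
     * rename() atomically replaces "pointer"; a reader sees either the
     * old contents or the new, never a truncated file.  Cross-client
     * visibility of the change is a separate question. */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        int fd = open("pointer.tmp", O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        if (write(fd, "ref=12345\n", 10) < 0) { perror("write"); return 1; }
        if (fsync(fd) < 0) { perror("fsync"); return 1; }  /* flush data */
        close(fd);

        /* Atomically install the new version over the old name. */
        if (rename("pointer.tmp", "pointer") < 0) { perror("rename"); return 1; }
        return 0;
    }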
But that's not quite what I'm getting - I have one NFS client doing its (atomic) rename, and another client (a different NFS host) reading it and - occasionally - reporting 'no such file or directory'.
This causes an operation to fail, which in turn means that someone has to intervene in the process. This operation (and multiple extremely similar ones) happens at 5-minute intervals, and every few days (once a week, maybe?) it fails for this reason - and our developers think that should be impossible. As such, it looks like a pretty narrow race condition. [...]