It appears that some people have a mantra about NFS that goes "use the `soft` option only when client responsiveness is more important than data integrity". Is this true in general, or only for specific usage patterns?
Considering that writing in the middle of an existing file is potentially risky even with local filesystems in case of e.g. power loss, I would expect well written user mode applications to write new files in the same directory with a temporary filename and then move the new file over existing file. I know that for local filesystem the move operation is atomic for any POSIX system.
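The write-temp-then-rename pattern described above can be sketched as follows (a minimal sketch; the helper name and error handling are my own, not from any particular source):

```python
import os

def atomic_replace(path: str, data: bytes) -> None:
    """Hypothetical helper: write to a temporary file in the same
    directory, flush it, then rename it over the target."""
    tmp = path + ".tmp"           # same directory, so rename() stays atomic
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # push data to disk (or to the NFS server)
    os.rename(tmp, path)          # atomic on POSIX: readers see old or new file
```

Because the rename is atomic, a concurrent reader opening `path` sees either the complete old contents or the complete new contents, never a partially written file.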
What about an NFS mount point with flags `soft,rw`? Is there potential for corruption if the application always writes to a unique temporary filename which is then renamed over the existing file to update the data? Is it important to check the return value from `close()` before renaming the file?
Examples of sources that claim that NFS with flags `soft,rw` is unsafe:
Broadly, the answer to the title question is "Yes, if your use cases are simple, due to close-to-open consistency".
Anatomical explanation:
Server: The server owns the filesystem, and is responsible for the filesystem consistency guarantees. These guarantees include:
The contract between the client and server is governed by the NFS spec.
Client: The client owns the data and metadata presented to applications, and the client-side data and metadata cache where this information is taken from. It is responsible for populating, invalidating and flushing this cache against the server as appropriate.
For example, when you close a file or call `fsync()`, the client is responsible for flushing its local cache and calling `COMMIT` (NFSv4 is more complex).
Importantly, if the client is performing a cache-invalidating operation such as `WRITE`, it enters a state where the attributes of the file, and sometimes the data itself, are not known, and it therefore cannot allow an application to rely on the cached version. When that happens, it just retries indefinitely. In this situation, it may sometimes be able to return `EAGAIN` to the application; however, not all system calls allow `EAGAIN` (e.g. `stat()`). Therefore, its only resort is to block such system calls until the cache is valid again. This behavior is called a "hard mount".
The contract between the application and the client is only specified to the degree that it's documented in the man page. In particular, NFS does NOT fully implement the POSIX standard. Implicitly, most kernel-based clients not only assume the server implements close-to-open consistency, but also implement it for the application - namely, they flush the cache when the file is closed as described above.
`soft`: a soft mount is a slightly different contract between the client and the application: instead of blocking until the data is available, after some timeout `EIO` is returned (there's also `softerr`, which causes `ETIMEDOUT` to be returned instead).
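On a soft mount, then, ordinary file operations can surface these timeout errors. A minimal sketch of handling them (the function name and error mapping are my own; which errno you see depends on the mount options):

```python
import errno

def read_all(path: str) -> bytes:
    """Read a file, translating soft-mount timeout errors.
    Assumption: EIO comes from `soft`, ETIMEDOUT from `softerr`."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as e:
        if e.errno in (errno.EIO, errno.ETIMEDOUT):
            raise RuntimeError("NFS server unreachable: " + path) from e
        raise  # other errors (e.g. ENOENT) propagate unchanged
```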
Given the above, let's go over what happens with `soft` mounts when you write regularly vs. when you rename:
Let's say you exported a DB table to a file by using regular writes, and then the server became unavailable, and then you closed the file.
Before the call to `close()`, some of the data was flushed from the client's cache to the server (using NFS `WRITE`s), and some wasn't. This doesn't have to happen in any particular order: it could be that the last bytes of the file were flushed but the first ones weren't.
The server, in turn, may have flushed some of those `WRITE`s to disk, but not all. It may also have crashed.
Now, when the exporting application closes the file, let's say it gets `EIO` from `close()` - what does it do with it? Typically, it just prints it and exits. You can't even delete the file, because the server is down.
Then, when you try to import the table back, you might be lucky and get `EIO` because the server is still unavailable.
But if the server is back up, the reader may see the file with the right size, but a bunch of missing data, with no way to know it's missing. That's your data corruption.
Now let's say that you instead write to a temporary file and then rename.
So you start by deleting the target file, then write to a temporary file; before you close it, the server crashes, and the server's disk now contains an inconsistent table, as above.
Now, before renaming, you close the file (or call `fsync()`) and check the error. If you didn't get an error, you rename. Again, you can't delete the source or target files because the server is down.
So now, when you try to import the table back, you get `ENOENT` and you know there's a problem.
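The safe flow just described can be sketched like this (a sketch under my own naming; the point is that on a soft mount any flush failure surfaces as an `OSError` before the rename ever happens):

```python
import os

def export_table(path: str, rows: list) -> None:
    """Hypothetical exporter: rename over the target only if every
    write and the final fsync succeeded. On a failed soft mount,
    write()/fsync()/close() raise OSError (EIO), so the rename below
    is never reached and the target is never replaced by a torn file.
    The reader either sees the previous file or, if it was deleted
    first as in the scenario above, gets ENOENT."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        for row in rows:
            f.write(row)
        f.flush()
        os.fsync(f.fileno())   # an error here aborts before the rename
    os.rename(tmp, path)
```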
Note that by "simple use case" I mean:
- You call `fsync()` before the rename.
- The reader checks for errors (such as the `ENOENT` in the above example).

* Side note: close-to-open consistency is an NFS thing. POSIX doesn't guarantee that closing a file will flush it and provide data consistency; you have to explicitly call `fsync()` and check its return value. Most of the time this isn't a problem, but if consistency is critical you should call `fsync()` before `close()`. NFS guarantees close-to-open consistency, which means that `close()` automatically does the equivalent of `fsync()`.