linux, amazon-ec2, nfs, distributed-system, amazon-efs

NFS (EFS) close-to-open consistency rule not respected?


I have a simple setup in AWS where two EC2 VMs mount a common Elastic File System (EFS). EFS is accessed over NFSv4, which offers close-to-open (CTO) consistency, meaning that (quoting the Azure docs): "no matter the state of the (efs client) cache, on open the most recent data for a file is always presented to the application." In this context, "most recent data" means data whose write was concluded by a successful call to close() (see section 8 of About the NFS protocol).

In my case, VM A compiles a project and VM B checks for a resulting library file. According to the CTO rule, VM B should definitely see the file after compilation, but I do not observe this - it takes several tens of seconds until the file becomes visible to it.

Interestingly, if I disable NFS client caches on VM B, the file is visible right away - this suggests that NFS client caches are not in fact invalidated on open().

I have already made a detailed post on the AWS re:Post forum with complete steps to reproduce the issue.

Has anyone encountered something similar with NFS-based distributed storage? Thanks a lot!


Solution

  • I figured it out: I believe this is the same phenomenon described in this post, namely that CTO does not help if the NFS client has not already seen the file.

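    As explained in the "Directory entry caching" section of the man page, the NFS client caches the results of directory entry lookups, including negative ones (the entry did not exist). So if VM B looks for the file before VM A has created it, the client on VM B can keep answering "not found" from its cache until it notices that the directory's mtime has changed, and since that mtime is itself a cached attribute, this can take a while.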

    So, as I understand it now, CTO is simply more general: it also applies at the directory level. Before opening a file, one must first open its directory to see the latest entries.
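
    To illustrate the conclusion above, here is a minimal sketch of how VM B could wait for the build output. It assumes, as described above, that opening the parent directory makes the NFS client revalidate its cached view of it. The helper name wait_for_file, the timeout values, and the example path are hypothetical and not part of the original setup.

```python
import os
import time


def wait_for_file(path, timeout=60.0, interval=0.5):
    """Poll for `path`, re-opening its parent directory before every check.

    Returns True as soon as the file shows up in a fresh directory listing,
    or False if `timeout` seconds pass without it appearing.
    """
    path = os.path.abspath(path)
    parent, name = os.path.split(path)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and read the directory itself instead of only stat()-ing
            # the file. Assumption: on an NFS mount this open is what lets
            # the client notice new entries instead of reusing a cached
            # "not found" lookup result.
            with os.scandir(parent) as entries:
                names = {entry.name for entry in entries}
        except FileNotFoundError:
            # The parent directory itself does not exist (yet).
            names = set()
        if name in names:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

    On VM B this might be called as wait_for_file("/mnt/efs/build/libfoo.so") (a made-up path) before opening the library. A mount-level alternative is to tune the lookupcache, acdirmin, and acdirmax options described in the same man page, at the cost of more requests to the server.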