
Linux shell script: how to detect that an NFS mount point (or the server) is dead?


On an NFS client, how can a Bash shell script detect that a mount point is no longer available, or that the server is dead?

Normally I do:

if ls '/var/data' 2>&1 | grep 'Stale file handle';
then
   echo "failing";
else
   echo "ok";
fi

The problem is that when the NFS server is completely dead or stopped, even the ls command on that directory hangs on the client side, so the script above is no longer usable.

Is there a way to detect this case as well?


Solution

  • The stat command is a somewhat cleaner way:

    statresult=`stat /my/mountpoint 2>&1 | grep -i "stale"`
    if [ "${statresult}" != "" ]; then
      #result not empty: mountpoint is stale; remove it
      umount -f /my/mountpoint
    fi
    

    Additionally, you can use rpcinfo to check whether the NFS service on the remote server is responding:

    rpcinfo -t remote.system.net nfs > /dev/null 2>&1
    if [ $? -eq 0 ]; then
      echo Remote NFS share available.
    fi
    

    Added 2013-07-15T14:31:18-05:00:

    I looked into this further as I am also working on a script that needs to recognize stale mountpoints. Inspired by one of the replies to "Is there a good way to detect a stale NFS mount", I think the following may be the most reliable way to check for staleness of a specific mountpoint in bash:

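    # read returns 1 if stat exits without printing anything (stale handle),
    # and a status above 128 (bash 4+) if it times out while stat hangs.
    read -t1 < <(stat -t "/my/mountpoint")
    if [ $? -ne 0 ]; then
       echo "NFS mount stale. Removing..."
       umount -f -l /my/mountpoint
    fi


    The read -t1 construct gives up after one second if the stat command hangs for some reason, so the script itself does not block.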

    Added 2013-07-17T12:03:23-05:00:

    Although read -t1 < <(stat -t "/my/mountpoint") works, there doesn't seem to be a way to mute its error output when the mountpoint is stale. Adding > /dev/null 2>&1, either within the subshell or at the end of the command line, breaks it. Using a simple test, if [ -d /path/to/mountpoint ] ; then ... fi, also works and may be preferable in scripts. After much testing, it is what I ended up using.

    Added 2013-07-19T13:51:27-05:00:

    A reply to my question "How can I use read timeouts with stat?" provided additional detail about muting the output of stat (or rpcinfo) when the target is not available and the command hangs for a few minutes before it would time out on its own. While [ -d /some/mountpoint ] can be used to detect a stale mountpoint, there is no similar alternative for rpcinfo, and hence the read -t1 redirection is the best option. The error output from the subshell can be muted with 2>&-. Here is an example from CodeMonkey's response:

    mountpoint="/my/mountpoint"
    read -t1 < <(stat -t "$mountpoint" 2>&-)
    # REPLY stays empty when stat printed nothing or read timed out
    if [[ -z "$REPLY" ]]; then
      echo "NFS mount stale. Removing..."
      umount -f -l "$mountpoint"
    fi
    

    Perhaps now this question is fully answered :).
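
The read -t pattern from the answer above can be wrapped in a small helper so the same check works for both stat and rpcinfo. This is only a minimal sketch, not something from the quoted answers: the function name probe and the variable PROBE_TIMEOUT are made up for illustration, and it assumes bash 4 or later, where read returns a status greater than 128 when its timeout expires. It discards stderr with 2>/dev/null rather than closing it with 2>&-, which is a design choice on my part.

```shell
#!/usr/bin/env bash
# probe: hypothetical helper (the name is an assumption, not from the answers).
# Runs the given command in a process substitution and waits at most
# PROBE_TIMEOUT seconds (default 1) for its first line of output.
# Prints one of: ok, no-output, timeout.
probe() {
  local reply status
  # stdout of the probed command feeds read; its stderr is discarded.
  IFS= read -r -t "${PROBE_TIMEOUT:-1}" reply < <("$@" 2>/dev/null)
  status=$?
  if [ "$status" -eq 0 ]; then
    echo ok           # a full line arrived in time
  elif [ "$status" -gt 128 ]; then
    echo timeout      # read gave up waiting: the command is still hanging
  else
    echo no-output    # command exited without printing a line (e.g. stale handle)
  fi
}

# Possible usage (paths and host name taken from the examples above):
#   [ "$(probe stat -t /my/mountpoint)" = ok ] || umount -f -l /my/mountpoint
#   [ "$(probe rpcinfo -t remote.system.net nfs)" = ok ] && echo "NFS service reachable"
```

Distinguishing timeout from no-output is optional, but it can be handy in logs: the first suggests the server is not answering at all, the second that the command returned quickly with an error. Note that the probed command is not killed when read times out; it keeps running in the background until it gives up on its own.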