Tags: c, remote-access, file-descriptor, mount-point

Correctly close file descriptor when opened through network mount


I'm currently trying to figure out how to correctly close a file descriptor when it points to a remote file and the connection is lost.

I have a simple example program which opens a file on an sshfs-mounted folder and starts writing to it.

I can't find out how to handle the case where the connection is lost.

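#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

void *write_thread(void* arg);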

int main()
{
    pthread_t thread;
    int fd = -1;

    if(-1 == (fd = open("/mnt/testfile.txt", O_CREAT | O_RDWR | O_NONBLOCK, S_IRWXU)))
    {
        fprintf(stderr, "Error opening file : %m\n");
        return EXIT_FAILURE;
    }
    else
    {
        /* pthread_create returns an error number (not -1) and does not set errno */
        if(0 != (errno = pthread_create(&thread, NULL, write_thread, &fd)))
        {
            fprintf(stderr, "Error launching thread : %m\n");
            return EXIT_FAILURE;
        }
        fprintf(stdout, "Waiting 10 seconds before closing\n");
        sleep(10);
        if(0 > close(fd))
        {
            fprintf(stderr, "Error closing file descriptor: %m\n");
        }
    }
}

void *write_thread(void* arg)
{
    int fd = *(int*)arg;
    int ret;

    while(1)
    {
        fprintf(stdout, "Write to file\n");
        if(0 > ( ret = write(fd, "Test\n", 5)))
        {
            fprintf(stderr, "Error writing to file : %m\n");
            if(errno == EBADF)
            {
                if(-1 == close(fd))
                {
                    fprintf(stderr, "Close failed : %m\n");
                }
                return NULL;
            }
        }
        else if(0 == ret)
        {
            fprintf(stderr, "Nothing happened\n");
        }
        else
        {
            fprintf(stderr, "%d bytes written\n", ret);
        }
        sleep(1);
    }
}

When the connection is lost (e.g. I unplug the Ethernet cable between my boards), the close in the main thread always blocks, whether or not I use the O_NONBLOCK flag.

The write call sometimes fails immediately with EBADF, and sometimes keeps going for a long time before failing.

My problem is that the write call doesn't always fail when the connection is lost, so I can't detect the loss inside the thread, and I can't handle it from the main thread either because close blocks forever.

So my question is: how do I correctly handle this case in C?


Solution

  • After some digging around, I found that the SSH connection behind the mount can be configured to drop and disconnect from the server when the other side stops responding.


    Setting ServerAliveInterval X on the client side makes the client send a keepalive probe after X seconds without data from the server.

    Setting ClientAliveInterval X on the server side does the same in the other direction, probing an unresponsive client after X seconds.

    ServerAliveCountMax Y and ClientAliveCountMax Y set how many unanswered probes are tolerated before the connection is dropped, so a dead peer is detected after roughly X * Y seconds.


    With this configuration applied, the sshfs mount is automatically removed when the connection becomes unresponsive.

    The write call then fails first with Input/output error (EIO) and afterwards with Transport endpoint is not connected (ENOTCONN).

    This is enough to detect that the connection is lost and to clean up before exiting.
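
As a rough sketch of how the writer could act on those errors, assuming the keepalive settings above are in place: the helper names connection_lost and write_all are hypothetical and not part of the original program. Treating EBADF as "lost" is also an assumption, based only on the behaviour described in the question.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns true when the errno value `err` is one of
 * the errors observed after the sshfs connection dropped (EIO, then
 * ENOTCONN), or the EBADF seen in the question. */
bool connection_lost(int err)
{
    return err == EIO || err == ENOTCONN || err == EBADF;
}

/* Hypothetical helper: writes all `len` bytes of `buf` to `fd`, retrying
 * when interrupted by a signal and continuing after short writes.
 * Returns 0 on success, or -1 with errno set by the failing write(). */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;          /* caller inspects errno */
        }
        buf += n;               /* skip the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}
```

In write_thread, a failed write_all(fd, "Test\n", 5) for which connection_lost(errno) is true would be the point to leave the loop and start the cleanup, instead of checking for EBADF alone.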