c fgets posix-select

fgets() blocking when buffer too large


I'm currently using select() to check whether there is data to be read on a file descriptor, because I don't want fgets() to block. This works 99% of the time. However, if select() reports data on fp but that data contains no newline and is shorter than my buffer, fgets() blocks until a newline arrives, the buffer fills, or end-of-file is reached. Is there any way to tell how many bytes are ready to be read?

    //See if there is data in fp, waiting up to 5 seconds
    select_rv = checkForData(fp, 5);

    //If there is data in fp...
    if (select_rv > 0)
    {
        //Blocks if the data has no newline and is shorter than command_out!!
        //(fgets() already leaves room for the terminating NUL, so no -1 is needed)
        if (fgets(command_out, sizeof(command_out), fp) != NULL)
        {
            printf("WGET: %s", command_out);
        }
    }
    else if (select_rv == 0)
    {
        printf("select() timed out... No output from command process!\n");
    }

I guess what I really want is a way to know if a full line is ready to be read before calling fgets.


Solution

  • As MBlanc mentions, implementing your own buffering using read() is the way to go here.

    Here's a program that demonstrates the general method. I don't recommend using it exactly as written, because:

    1. The function presented here keeps its state in static variables, so it works for only one file and becomes unusable once that file reaches end-of-file. In real code, you'd set up a separate struct for each file, store that file's state in it, and pass it into your function on each call.

    2. It maintains the buffer by simply memmove()ing the remaining data to the front after some is removed. In real code, a circular queue would probably be a better approach, although the basic usage would be the same.

    3. If the output buffer is larger than the internal buffer, the extra space is never used. In real code, you'd either resize the internal buffer, or copy the internal buffer into the output string and then try a second read() call before returning.

    Implementing all of this would add too much complexity to an example program, though, and the general approach here is enough to show how to accomplish the task.
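    To make point 1 concrete, here is a minimal sketch of the same idea with the state moved into a per-descriptor struct. The names `struct line_reader`, `line_reader_init()`, `line_reader_poll()` and `LR_CAPACITY` are made up for this illustration, and the extra -2 return value for errors is my own choice rather than part of the program below.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

#define LR_CAPACITY 1024    /* hypothetical internal buffer size */

/* Per-descriptor state, replacing the static variables. */
struct line_reader {
    int    fd;
    char   buf[LR_CAPACITY];
    size_t len;             /* number of bytes currently held in buf */
    bool   eof;
};

void line_reader_init(struct line_reader *lr, int fd)
{
    lr->fd  = fd;
    lr->len = 0;
    lr->eof = false;
}

/* Returns 1 if a line was copied to out, 0 if no line is ready yet,
 * -1 at end-of-file with nothing left, -2 if select() or read() fails. */
int line_reader_poll(struct line_reader *lr, char *out, size_t size)
{
    assert(size > 1);

    if (!lr->eof && lr->len < LR_CAPACITY) {
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(lr->fd, &fds);
        struct timeval tv = {0, 0};         /* poll, do not wait */

        int status = select(lr->fd + 1, &fds, NULL, NULL, &tv);
        if (status == -1)
            return -2;
        if (status > 0) {
            ssize_t n = read(lr->fd, lr->buf + lr->len,
                             LR_CAPACITY - lr->len);
            if (n == -1)
                return -2;
            if (n == 0)
                lr->eof = true;
            else
                lr->len += (size_t)n;
        }
    }

    /* A line ends at the first newline. At end-of-file, or when the
     * buffer is full, whatever is left is handed out as well. */
    char *nl = memchr(lr->buf, '\n', lr->len);
    size_t line_len;
    if (nl)
        line_len = (size_t)(nl - lr->buf) + 1;
    else if (lr->eof || lr->len == LR_CAPACITY)
        line_len = lr->len;
    else
        return 0;

    if (line_len == 0)
        return -1;                          /* EOF and buffer drained */

    size_t ncopy = line_len < size - 1 ? line_len : size - 1;
    memcpy(out, lr->buf, ncopy);
    out[ncopy] = '\0';
    memmove(lr->buf, lr->buf + ncopy, lr->len - ncopy);
    lr->len -= ncopy;
    return 1;
}
```

    Each descriptor gets its own `struct line_reader`, initialised once with `line_reader_init()`, so several pipes can be polled in turn without their buffers interfering with one another.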

    To simulate delays in receiving input, the main program reads, through a pipe, the output of the following program, which writes a few chunks of text, some ending in a newline and some not, and sleep()s between them:

    delayed_output.c:

    #define _POSIX_C_SOURCE 200809L
    
    #include <stdio.h>
    #include <unistd.h>
    
    int main(void)
    {
        printf("Here is some input...");
        fflush(stdout);
    
        sleep(3);
    
        printf("and here is some more.\n");
        printf("Yet more output is here...");
        fflush(stdout);
    
        sleep(3);
    
        printf("and here's the end of it.\n");
        printf("Here's some more, not ending with a newline. ");
        printf("There are some more words here, to exceed our buffer.");
        fflush(stdout);
    
        return 0;
    }
    

    The main program:

    buffer.c:

    #define _POSIX_C_SOURCE 200809L
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdbool.h>
    #include <stdarg.h>
    #include <unistd.h>
    #include <sys/select.h>
    
    #define INTBUFFERSIZE 1024
    #define BUFFERSIZE 60
    #define GET_LINE_DEBUG true
    
    /*  Prints a debug message if debugging is on  */
    
    void get_line_debug_msg(const char * msg, ...)
    {
        va_list ap;
        va_start(ap, msg);
        if ( GET_LINE_DEBUG ) {
            vfprintf(stderr, msg, ap);
        }
        va_end(ap);
    }
    
    /*
     *  Gets a line from a file if one is available.
     *
     *  Returns:
     *    1 if a line was successfully gotten
     *    0 if a line is not yet available
     *    -1 on end-of-file (no more input available)
     *
     *  NOTE: This function can be used only with one file, and will
     *  be unusable once that file has reached the end.
     */
    
    int get_line_if_ready(int fd, char * out_buffer, const size_t size)
    {
        static char int_buffer[INTBUFFERSIZE + 1] = {0};  /*  Internal buffer  */
        static char * back = int_buffer;    /*  Next available space in buffer */
        static bool end_of_file = false;
    
        if ( !end_of_file ) {
    
            /*  Check if input is available  */
    
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(fd, &fds);
            struct timeval tv = {0, 0};
    
            int status;
            if ( (status = select(fd + 1, &fds, NULL, NULL, &tv)) == -1 ) {
                perror("error calling select()");
                exit(EXIT_FAILURE);
            }
            else if ( status > 0 ) {
    
                /*  Get as much available input as will fit in buffer.
                 *  If select() reported nothing new, this read is
                 *  skipped, but we still fall through in case a full
                 *  line is already waiting in the internal buffer.    */
    
                const size_t bufferspace = INTBUFFERSIZE - (back - int_buffer) - 1;
                const ssize_t numread = read(fd, back, bufferspace);
                if ( numread == -1 ) {
                    perror("error calling read()");
                    exit(EXIT_FAILURE);
                }
                else if ( numread == 0 ) {
                    end_of_file = true;
                }
                else {
                    const char * oldback = back;
                    back += numread;
                    *back = 0;
    
                    get_line_debug_msg("(In function, just read [%s])\n"
                                       "(Internal buffer is [%s])\n",
                                       oldback, int_buffer);
                }
            }
        }
    
        /*  Write line to output buffer if a full line is available,
         *  or if we have reached the end of the file.                */
    
        char * endptr;
        const size_t bufferspace = INTBUFFERSIZE - (back - int_buffer) - 1;
        if ( (endptr = strchr(int_buffer, '\n')) ||
             bufferspace == 0 ||
             end_of_file ) {
            const size_t buf_len = back - int_buffer;
            if ( end_of_file && buf_len == 0 ) {
    
                /*  Buffer empty, we're done  */
    
                return -1;
            }
    
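            /*  Prefer a real line boundary if there is one; only hand out
             *  the whole buffer when no newline is present (end of file
             *  or internal buffer full).                                  */
    
            endptr = endptr ? endptr + 1 : back;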
            const size_t line_len = endptr - int_buffer;
            const size_t numcopy = line_len > (size - 1) ? (size - 1) : line_len;
    
            strncpy(out_buffer, int_buffer, numcopy);
            out_buffer[numcopy] = 0;
            memmove(int_buffer, int_buffer + numcopy, INTBUFFERSIZE - numcopy);
            back -= numcopy;
    
            return 1;
        }
    
        /*  No full line available, and not yet
         *  at end of file, so return 0.          */
    
        return 0;
    }
    
    int main(void)
    {
        char buffer[BUFFERSIZE];
    
        FILE * fp = popen("./delayed_output", "r");
        if ( !fp ) {
            perror("error calling popen()");
            return EXIT_FAILURE;
        }
    
        sleep(1);       /*  Give child process some time to write output  */
    
        int n = 0;
        while ( n != -1 ) {
    
            /*  Loop until we get a line  */
    
            while ( !(n = get_line_if_ready(fileno(fp), buffer, BUFFERSIZE)) ) {
    
                /*  Here's where you could do other stuff if no line
                 *  is available. Here, we'll just sleep for a while.  */
    
                printf("Line is not ready. Sleeping for five seconds.\n");
                sleep(5);
            }
    
            /*  Output it if we're not at end of file  */
    
            if ( n != -1 ) {
                const size_t len = strlen(buffer);
                if ( len > 0 && buffer[len - 1] == '\n' ) {
                    buffer[len - 1] = 0;
                }
    
                printf("Got line: %s\n", buffer);
            }
        }
    
        if ( pclose(fp) == -1 ) {
            perror("error calling pclose()");
            return EXIT_FAILURE;
        }
    
        return 0;
    }
    

    and the output:

    paul@thoth:~/src/sandbox/buffer$ ./buffer
    (In function, just read [Here is some input...])
    (Internal buffer is [Here is some input...])
    Line is not ready. Sleeping for five seconds.
    (In function, just read [and here is some more.
    Yet more output is here...])
    (Internal buffer is [Here is some input...and here is some more.
    Yet more output is here...])
    Got line: Here is some input...and here is some more.
    Line is not ready. Sleeping for five seconds.
    (In function, just read [and here's the end of it.
    Here's some more, not ending with a newline. There are some more words here, to exceed our buffer.])
    (Internal buffer is [Yet more output is here...and here's the end of it.
    Here's some more, not ending with a newline. There are some more words here, to exceed our buffer.])
    Got line: Yet more output is here...and here's the end of it.
    Got line: Here's some more, not ending with a newline. There are some
    Got line:  more words here, to exceed our buffer.
    paul@thoth:~/src/sandbox/buffer$
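
    As for point 2 above, a circular queue avoids the memmove() because removing a line only advances an index. The following is a rough sketch of that data structure on its own, without the select()/read() part. `struct ring`, `ring_push()`, `ring_pop_line()` and `RING_CAPACITY` are hypothetical names, and the capacity is kept tiny so the wrap-around is easy to see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 16    /* hypothetical size, small for illustration */

struct ring {
    char   data[RING_CAPACITY];
    size_t head;            /* index of the oldest stored byte */
    size_t count;           /* number of bytes stored          */
};

/* Appends up to n bytes from src; returns how many actually fit. */
size_t ring_push(struct ring *r, const char *src, size_t n)
{
    size_t pushed = 0;
    while (pushed < n && r->count < RING_CAPACITY) {
        r->data[(r->head + r->count) % RING_CAPACITY] = src[pushed++];
        r->count++;
    }
    return pushed;
}

/* Copies one newline-terminated line into out (NUL-terminated, at most
 * size - 1 bytes) and removes the copied bytes from the ring. Returns
 * false, leaving the ring untouched, if no complete line is stored. */
bool ring_pop_line(struct ring *r, char *out, size_t size)
{
    assert(size > 1);

    size_t line_len = 0;
    for (size_t i = 0; i < r->count; i++) {
        if (r->data[(r->head + i) % RING_CAPACITY] == '\n') {
            line_len = i + 1;
            break;
        }
    }
    if (line_len == 0)
        return false;

    size_t ncopy = line_len < size - 1 ? line_len : size - 1;
    for (size_t i = 0; i < ncopy; i++)
        out[i] = r->data[(r->head + i) % RING_CAPACITY];
    out[ncopy] = '\0';

    /* Removing data only advances head; nothing is shifted. */
    r->head = (r->head + ncopy) % RING_CAPACITY;
    r->count -= ncopy;
    return true;
}
```

    To plug this into the function above, the bytes returned by read() would be passed to `ring_push()`, either from a small temporary buffer or by reading directly into the contiguous free part of `data`. The end-of-file and buffer-full cases would still need the same handling as before.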