How safe is using !feof in searching a file?


I read here that using feof, or more precisely !feof, as the loop condition when searching a file for information is a bad habit.

As I understand it, it's bad because it reads information from the FILE pointer before the function is called, or something like that.

Wouldn't it be fine to have a do/while loop with fscanf inside and !feof as the exit condition?

This is a search function that I did:

typedef struct
{
    char lname[20] , fname[20];
    int nchildren;
}employee;
void searchemployee(char *filename , char *str)
{
    employee e;
    FILE *f;
    int c;
    f = fopen(filename, "r");
    if (f == NULL)
        printf("file couldn't be loaded\n");
    else {
        c = 0;
        do {
            fscanf(f, "%s %s %d\n", e.fname, e.lname, &e.nchildren);
            if (strcmp(e.fname, str) == 0)
                c = 1;
        } while (c == 0 && !feof(f));
        if (c != 1)
            printf("employee not found\n");
        else
            printf("employee : %s %s| children : %d\n", e.fname, e.lname, e.nchildren);
        fclose(f);  /* close only if the file was actually opened */
    }
}

Solution

  • The return value of the function feof specifies whether a previous input operation has already encountered the end of the file. This function does not specify whether the next input will encounter the end of the file.
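
    To make this concrete, here is a minimal sketch (the helper name feof_before_and_after is made up for illustration, and it assumes tmpfile() can create a temporary file). It queries feof on an empty stream before and after one read attempt:

```c
#include <stdio.h>

/* Hypothetical helper: reports feof() on an empty temporary file
   before and after one read attempt.
   Returns 0 on success, -1 if the temporary file could not be created. */
int feof_before_and_after(int *before, int *after)
{
    FILE *f = tmpfile();      /* empty stream opened for update */
    if (f == NULL)
        return -1;

    *before = feof(f) != 0;   /* no read attempted yet: indicator is clear */
    (void)fgetc(f);           /* this read attempt hits the end of the file */
    *after = feof(f) != 0;    /* only now is the indicator set */

    fclose(f);
    return 0;
}
```

    Even though the file is empty from the start, feof reports the end of the file only after a read has failed.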

    The problem with

    do{
        fscanf(f,"%s %s %d\n",e.fname,e.lname,&e.nchildren);
        if (strcmp(e.fname,str)==0)
            c=1;
    }while(c==0 && !feof(f));
    

    is that if fscanf fails and returns EOF due to encountering the end of the file, then it will write nothing to e.fname.

    If this happens in the first iteration of the loop, then the content of e.fname will be indeterminate and the subsequent function call strcmp(e.fname,str) will invoke undefined behavior (for example, your program may crash), unless e.fname happens to contain a terminating null character.

    If this does not happen in the first iteration, but rather in a subsequent iteration of the loop, then the content of e.fname will contain the content of the previous loop iteration, so you will effectively be processing the last successful call of fscanf twice.

    In this specific case, processing the last successful call of fscanf twice is harmless, except for being a slight waste of CPU and memory resources. However, in most other cases, processing the last input twice will result in the program not working as intended.
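
    The double processing can be observed with a small sketch. It uses a made-up helper, count_records, and a simplified file of three integers instead of employee records, so it is an illustration of the effect rather than your exact situation:

```c
#include <stdio.h>

/* Hypothetical helper: writes three integers to a temporary file and
   counts how many times the chosen loop style "processes" a value.
   Returns the count, or -1 if the temporary file could not be created. */
int count_records(int use_feof)
{
    FILE *f = tmpfile();
    int value = 0, count = 0;

    if (f == NULL)
        return -1;
    fputs("1\n2\n3\n", f);
    rewind(f);

    if (use_feof) {
        /* pattern from the question: the result of fscanf is never checked */
        do {
            fscanf(f, "%d", &value);
            count++;          /* the last pass reuses the stale value 3 */
        } while (!feof(f));
    } else {
        /* the loop ends as soon as a read fails */
        while (fscanf(f, "%d", &value) == 1)
            count++;
    }

    fclose(f);
    return count;
}
```

    With this input, count_records(1) counts one pass more than count_records(0), because the failed read at the end of the file is still processed.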

    See the following question for further information:

    Why is “while( !feof(file) )” always wrong?

    If you change the loop to

    for (;;) {
        fscanf(f,"%s %s %d\n",e.fname,e.lname,&e.nchildren);
        if ( feof(f) )
            break;
        if (strcmp(e.fname,str)==0) {
            c=1;
            break;  //leave before the next fscanf overwrites the match
        }
    }
    

    so that the loop condition is checked in the middle of the loop, then the problem mentioned above will be gone.

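    However, it is generally better to check the return value of fscanf instead of calling feof. The end-of-file indicator may already be set after a successful read (here, the trailing "\n" directive keeps consuming whitespace until it reaches the end of the file), so a feof check placed directly after fscanf can cause the last record to be skipped. Checking the return value avoids this, for example like this: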

    c = 0;
    
    while ( c == 0 && fscanf(f,"%s %s %d\n",e.fname,e.lname,&e.nchildren) == 3 ) {
        if (strcmp(e.fname,str)==0)
            c=1;
    }
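
    Comparing against 3 also rejects incomplete records, not just the end of the file. The following sketch (scan_results is a hypothetical helper, and the sample data is invented) records what fscanf returns for a complete record, a truncated record, and the end of the file:

```c
#include <stdio.h>

/* Hypothetical helper: scans a temporary file that contains one complete
   record followed by a truncated one, and stores the return values of
   three consecutive fscanf calls in results[0..2].
   Returns 0 on success, -1 if the temporary file could not be created. */
int scan_results(int results[3])
{
    char fname[20], lname[20];
    int nchildren;
    FILE *f = tmpfile();

    if (f == NULL)
        return -1;
    fputs("Ann Lee 2\nBob Ray\n", f);   /* second record lacks the number */
    rewind(f);

    for (int i = 0; i < 3; i++)
        results[i] = fscanf(f, "%19s %19s %d\n", fname, lname, &nchildren);

    fclose(f);
    return 0;
}
```

    The first call converts all three fields, the second converts only the two names, and the third fails before any conversion and therefore returns EOF. Only the first result passes the == 3 test.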
    

    Also, you don't need the flag variable c. I suggest that you incorporate the lines

    if (c!=1)
        printf("employee not found\n");
    else
        printf("employee : %s %s| children : %d\n",e.fname,e.lname,e.nchildren);
    

    partially into the loop, like this:

    void searchemployee( char *filename, char *str )
    {
        employee e;
        FILE *f = NULL;
    
        //attempt to open file
        f = fopen( filename, "r" );
        if ( f == NULL )
        {
            printf( "file couldn't be loaded\n" );
            goto cleanup;
        }
    
        //process one employee record per loop iteration
        while ( fscanf( f, "%s %s %d\n", e.fname, e.lname, &e.nchildren ) == 3 )
        {
            //check whether we found the target record
            if ( strcmp(e.fname,str) == 0 )
            {
                printf(
                    "employee : %s %s| children : %d\n",
                    e.fname, e.lname, e.nchildren
                );
                goto cleanup;
            }
        }
    
        printf( "employee not found.\n");
    
    cleanup:
        if ( f != NULL )
            fclose(f);
    }
    

    Another issue is that when using %s with scanf or fscanf, you should generally also add a width limit, to prevent a possible buffer overflow. For example, if e.fname has a size of 100 characters, you should use %99s to limit the number of bytes written to 99 plus the terminating null character.
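
    Applied to the struct from the question, where both arrays hold 20 characters, the limit would be %19s. Here is a minimal sketch; parse_record is a hypothetical helper that reads one record from a string via a temporary file, and the sample names are invented:

```c
#include <stdio.h>

typedef struct
{
    char lname[20], fname[20];
    int nchildren;
} employee;

/* Hypothetical helper: parses one record from text via a temporary file.
   Returns 1 if a complete record was read, 0 if not,
   and -1 if the temporary file could not be created. */
int parse_record(const char *text, employee *e)
{
    FILE *f = tmpfile();
    int ok;

    if (f == NULL)
        return -1;
    fputs(text, f);
    rewind(f);

    /* %19s stores at most 19 characters plus the terminating null
       character, so the 20-byte arrays cannot overflow */
    ok = fscanf(f, "%19s %19s %d", e->fname, e->lname, &e->nchildren) == 3;

    fclose(f);
    return ok;
}
```

    With an overlong first name, the excess characters are left in the stream and read as the next field, so the record fails the == 3 check instead of overflowing the buffer.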