
y with umlaut in file


I'm working on an example problem where I have to reverse the text in a text file using fseek() and ftell(). Printing the reversed text to the console works, but when I write the same output to a file I get some weird results. The text file I used as input was the following:

redivider
racecar
kayak
civic
level
refer
These are all palindromes

The result on the command line looks fine. In the text file that I create, however, I get the following:

ÿsemordnilap lla era esehTT
referr
levell
civicc
kayakk
racecarr
redivide

I am aware, from the answer to this question, that the ÿ corresponds to the text file version of EOF in C. I'm just confused as to why the command line and text file outputs are different.

#include <stdio.h>
#include <stdlib.h>

/**********************************
This program is designed to read in a text file and then reverse the order 
of the text.
The reversed text then gets output to a new file.
The new file is then opened and read.
**********************************/

int main()
{
    //Open our files and check for NULL
    FILE *fp = NULL;
    fp = fopen("mainText.txt","r");
    if (!fp)
        return -1;

    FILE *fnew = NULL;
    fnew = fopen("reversedText.txt","w+");
    if (!fnew)
        return -2;

    //Go to the end of the file so we can reverse it
    int i = 1;
    fseek(fp, 0, SEEK_END);
    int endNum = ftell(fp);
    while(i < endNum+1)
    {
        fseek(fp,-i,SEEK_END);
        printf("%c",fgetc(fp));
        fputc(fgetc(fp),fnew);
        i++;
    }

    fclose(fp);
    fclose(fnew);
    fp = NULL;
    fnew = NULL;

    return 0;
}

No errors, I just want identical outputs.


Solution

  • The outputs are different because your loop reads two characters from fp per iteration.

    For example, in the first iteration i is 1 and so fseek sets the current file position of fp just before the last byte:

    ...
    These are all palindromes
                            ^
    

    Then printf("%c",fgetc(fp)); reads a byte (s) and prints it to the console. Having read the s, the file position is now

    ...
    These are all palindromes
                             ^
    

    i.e. we're at the end of the file.

    Then fputc(fgetc(fp),fnew); attempts to read another byte from fp. This fails, and fgetc returns EOF (a negative value, usually -1) instead. However, your code is not prepared for this and blindly passes -1 on as if it were a character code. fputc converts its argument to unsigned char before writing, so -1 becomes 255, which is the character code for ÿ in the ISO-8859-1 encoding. This byte is written to your file.

    In the next iteration of the loop we seek back to the e:

    ...
    These are all palindromes
                           ^
    

    Again the loop reads two characters: e is written to the console, and s is written to the file.

    This continues backwards until we reach the beginning of the input file:

    redivider
    ^
    

    Yet again the loop reads two characters: r is written to the console, and e is written to the file.

    This ends the loop. The end result is that your output file starts with one byte that doesn't exist in the input (from the attempt to read past the end of the input file) and never receives the first character of the input.

    The fix is to only call fgetc once per loop:

    while(i < endNum+1)
    {
        fseek(fp,-i,SEEK_END);
        int c = fgetc(fp);
        if (c == EOF) {
            perror("error reading from mainText.txt");
            exit(EXIT_FAILURE);
        }
        printf("%c", c);
        fputc(c, fnew);
        i++;
    }