c++ file file-io seekg

Unusual behaviour of get() (reading from a file in c++)


// Print the last n lines of a file, i.e. implement your own tail command
#include <iostream>
#include <fstream>
#include <string>

int main()
{
    std::ifstream rd("D:\\BigFile.txt");
    int cnt = 0;
    char c;
    std::string data;
    rd.seekg(0, rd.end);
    int pos = rd.tellg();

    while (1)
    {
        rd.seekg(--pos, std::ios_base::beg);
        rd.get(c);
        if (c == '\n')
        {
            cnt++;
            // std::cout << pos << "\t" << rd.tellg() << "\n";
        }

        if (cnt == 10)
            break;
    }

    rd.seekg(pos + 1);
    while (std::getline(rd, data))
    {
        std::cout << data << "\n";
    }
}

So, I wrote this program to print the last 10 lines of a text file. However, it prints only the last 5. For some reason, every time it encounters an actual '\n', the next get() also returns '\n', leading to incorrect output. Here is my input file:

Hello
Trello
Capello
Morsello
Odello
Othello
HelloTrello
sdasd
qerrttt
mkoilll
qwertyfe 

I am using Notepad on Windows, and this is my output:

HelloTrello
sdasd
qerrttt
mkoilll
qwertyfe

I can't figure out why this is happening. Please help.


Solution

  • Do not do arithmetic on file positions when the file is opened in text mode. It will not give you the correct result.

    In text mode, one character does not always correspond to one byte, and whether a file position refers to a specific character or to a byte is unspecified.

    In your case, the problem is that on Windows a newline is two bytes long ("\r\n"). Text streams convert it into the single character '\n' so that you don't have to worry about platform differences or the actual byte sequences used.

    So your first read picks up the last byte of the two-byte line ending, which happens to have the same value as '\n' in ASCII. The next read lands at the beginning of the two-byte line ending, and the stream correctly converts it into '\n'. Each line break is therefore counted twice, which is why you get 5 lines instead of 10.