I'm working with a program that produces text lines with tab-separated numbers as output, about 1 line every 2 minutes, each line having at most 51 characters (including the tabs).
I piped its stdout to a C++ program that keeps a text "FIFO buffer", a text file that always keeps the last 100 lines the first program outputted.
This is the code of the software in question:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;

string lines[100];
int used = 0;
ofstream outfile("last100.txt", ios_base::out);

int main()
{
    for (string line; getline(cin, line);)
    {
        if (used < 100) //adds new lines to the end of the array while not full
        {
            lines[used] = line;
            used++;
        }
        else //if full, then shift lines one position back and add on the end
        {
            for (int i = 0; i < 99; i++)
            {
                lines[i] = lines[i+1];
            }
            lines[99] = line;
        }
        outfile.seekp(0, ios::beg); //seek to the beginning of file
        for (int i = 0; i < used; i++) //save the array to the text file and flush
        {
            outfile.write((lines[i]+'\n').data(), (lines[i]+'\n').size());
            outfile.flush();
        }
    }
    return 0;
}
The problem is that it works fine at first and keeps the file at 100 lines for a while, but after some time it starts writing nonsense, garbage lines at the end of the file, like this:
1742596581 8249 1 0 1426631794 218844 33705 0
1742596615 8250 0 0 1426631828 947697 37462 0
1742597058 3274 -125 0 1426632268 159607 19100 0
1742597194 8264 1 0 1426632407 659981 34200 0
0
0
0
0
70 0
0
I couldn't find any flaw in my code, and debugging it showed nothing wrong: the variable "used" never went above 100, as expected.
Also, the first program never produced anything like that by itself (I used to save its output directly to a file).
In any case, my intention was to integrate this functionality into the first program, which is written in C, so I made a C version of the same algorithm. Here are the relevant parts:
char lines[100][256];
int used = 0;

int main(int argc, char *argv[])
{
    FILE *outfile = fopen("last100.txt", "w");
    if (outfile == NULL)
    {
        printf("Error opening file for writing.\n");
        return 1;
    }
    // other pieces of the program come here
    if (used < 100) //again, add to the end if not full
    {
        snprintf(lines[used], 256, /* a bunch of integer variables come here */);
        used++;
    }
    else //if full, shift lines one position back and add to the last position
    {
        for (int i = 0; i < 99; i++)
        {
            strcpy(lines[i], lines[i+1]);
        }
        snprintf(lines[99], 256, /* a bunch of integer variables come here */);
    }
    rewind(outfile); //seek to the beginning of the file
    for (int i = 0; i < used; i++) //write to the file and flush
    {
        fprintf(outfile, "%s\n", lines[i]);
        fflush(outfile);
    }
}
This ended up behaving exactly the same way: the same garbage lines appear at the end of the file.
What is happening here?
The used variable never goes out of bounds; it stops at 100.
This is running on Ångström Linux ("The Ångström Distribution v2018.12"), on a DE10-Standard board.
I also tried compiling without any optimization, and it didn't change anything.
Thanks!
Here's a hint as to what's wrong.
Let's say the first 100 lines are really long like the following:
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
....
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
So you have a file that is approximately 20KB in size.
Then, after those 100 lines are written, the input switches to very short lines for the next 100:
1
2
3
4
5
6
7
...
97
98
99
100
At this point you've rewritten roughly 300 bytes of valid characters at the start of your logfile, but the remaining 19,700 or so bytes are the old content that was there before. If you print the file after the shorter lines have been written, it will look something like this:
...
97
98
99
100
NG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
A REALLY REALLY REALLY LONG LONG LONG VERY LONG LINE THAT IS 200 CHARACTERS WIDE...\n
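So the file is corrupt because you didn't reset its length on each re-write: seeking back to the beginning only moves the write position, it does not shrink the file. The same thing happens on a smaller scale whenever your 100 current lines add up to fewer bytes than an earlier write did.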
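If you want to see the effect in isolation, here is a minimal sketch. The function name `overwrite_without_truncate` and the file path passed to it are made up for illustration. It writes one string, rewinds, writes a shorter string over it, and reports the resulting file size.

```c
#include <stdio.h>

/* Hypothetical helper: writes `first`, rewinds, writes `second` over it
   WITHOUT truncating, and returns the final file size in bytes
   (-1 if the file cannot be opened). */
long overwrite_without_truncate(const char *path,
                                const char *first, const char *second)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    fputs(first, f);        /* initial, longer content */
    fflush(f);

    rewind(f);              /* moves the write position only */
    fputs(second, f);       /* overwrites just the first bytes */
    fflush(f);

    fseek(f, 0, SEEK_END);  /* the old tail is still part of the file */
    long size = ftell(f);
    fclose(f);
    return size;
}
```

Calling it with an 11-byte first string such as "AAAAAAAAAA\n" and the 2-byte second string "1\n" still leaves an 11-byte file: the new "1\n" followed by the leftover "AAAAAAAA\n".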
For the C version of your program, you probably want to call ftruncate after you re-write the file from the beginning, so that the file is cut off at the end of the new content.
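Something along these lines should work, as a rough sketch rather than a drop-in replacement. It assumes a POSIX system (for `fileno` and `ftruncate`), and the function name `rewrite_file` is made up here. The `lines` array and `used` counter mirror the ones in your code.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes fileno() and ftruncate() */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: rewrites the whole file from the start, then
   truncates it at the current position so no stale tail is left.
   Returns 0 on success, -1 on failure. */
int rewrite_file(FILE *outfile, char lines[][256], int used)
{
    rewind(outfile);                        /* back to the beginning */

    for (int i = 0; i < used; i++) {
        if (fprintf(outfile, "%s\n", lines[i]) < 0)
            return -1;
    }

    if (fflush(outfile) != 0)               /* push buffered data out first */
        return -1;

    long end = ftell(outfile);              /* end of the fresh content */
    if (end < 0)
        return -1;

    /* Cut the file off here; anything left from a longer,
       earlier write is discarded. */
    return ftruncate(fileno(outfile), (off_t)end);
}
```

You would call `rewrite_file(outfile, lines, used)` in place of the rewind-and-write loop. Flushing before truncating keeps the stdio buffer and the underlying file descriptor in agreement about where the content ends.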