I'm trying to read a .txt file and save all sentences that end with ., ! or ? into an array. I use getline and strtok to do this. When I save the sentences, it seems to work. But when I try to retrieve the data later by index, the first sentence is missing.
The input is in a file input.txt with the content below:
The wandering earth! In 2058, the aging Sun? is about to turn into a red .giant and threatens to engulf the Earth's orbit!
Below is my code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
    FILE *fp = fopen("input.txt", "r+");
    char *line = NULL;
    size_t len = 0;
    char *sentences[100];
    if (fp == NULL) {
        perror("Cannot open file!");
        exit(1);
    }
    char delimit[] = ".!?";
    int i = 0;
    while (getline(&line, &len, fp) != -1) {
        char *p = strtok(line, delimit);
        while (p != NULL) {
            sentences[i] = p;
            printf("sentences [%d]=%s\n", i, sentences[i]);
            i++;
            p = strtok(NULL, delimit);
        }
    }
    for (int k = 0; k < i; k++) {
        printf("sentence is ----%s\n", sentences[k]);
    }
    return 0;
}
The output is:
sentences [0]=The wandering earth
sentences [1]= In 2058, the aging Sun
sentences [2]= is about to turn into a red
sentences [3]=giant and threatens to engulf the Earth's orbit
sentence is ----
sentence is ---- In 2058, the aging Sun
sentence is ---- is about to turn into a red
sentence is ----giant and threatens to engulf the Earth's orbit
When I use strtok to split a string directly, it works fine.
I introduced the constant DELIMITERS and added '\n' to it. You may or may not want that '\n' in there, but I would need to see the expected output now that you supplied the input. vim, at least, ends the last line with a '\n', which would generate at least one '\n' token at the end. The other option is to remove leading and trailing white space, and if you end up with an empty string, don't add it as a sentence.
I combined the two strtok() calls into one (DRY).
getline() reuses the line buffer on each call, so the pointers stored in sentences no longer make sense. The easiest fix is to strdup() each string. Another approach would be to retain an array of line pointers (for subsequent free()) and have getline() allocate a new line each time by resetting len = 0 and line = NULL.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define DELIMITERS ".!?\n"
#define SENTENCES_LEN 100
int main() {
    FILE *fp = fopen("input.txt", "r");
    if (!fp) {
        perror("Cannot open file!");
        return 1;
    }
    char *line = NULL;
    size_t len = 0;
    char *sentences[SENTENCES_LEN];
    int i = 0;
    while (getline(&line, &len, fp) != -1) {
        char *s = line;
        for(; i < SENTENCES_LEN; i++) {
            char *sentence = strtok(s, DELIMITERS);
            if(!sentence)
                break;
            sentences[i] = strdup(sentence);
            printf("sentences [%d]=%s\n", i, sentences[i]);
            s = NULL;
        }
    }
    for (int k = 0; k < i; k++) {
        printf("sentence is ----%s\n", sentences[k]);
        free(sentences[k]);
    }
    free(line);
    fclose(fp);
}
Using the supplied input file, the matching output is:
sentences [0]=The wandering earth
sentences [1]= In 2058, the aging Sun
sentences [2]= is about to turn into a red
sentences [3]=giant and threatens to engulf the Earth's orbit
sentence is ----The wandering earth
sentence is ---- In 2058, the aging Sun
sentence is ---- is about to turn into a red
sentence is ----giant and threatens to engulf the Earth's orbit
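If you would rather not strdup() every token, the two alternatives mentioned above (keeping each getline() buffer alive, and trimming white space while skipping empty tokens) could look roughly like the sketch below. The names trim, read_sentences and LINES_LEN are made up for this illustration and are not part of the code above; it is one possible arrangement under those assumptions, not the only way to do it.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DELIMITERS ".!?\n"
#define SENTENCES_LEN 100
#define LINES_LEN 100 /* hypothetical cap on retained line buffers */

/* Hypothetical helper: strip leading and trailing white space in place
 * and return a pointer to the first non-space character. */
static char *trim(char *s) {
    while (isspace((unsigned char)*s))
        s++;
    char *end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Hypothetical helper: read fp line by line and store pointers to the
 * trimmed, non-empty sentences in sentences[]. Every getline() buffer is
 * kept in lines[] so those pointers stay valid. Returns the number of
 * sentences and stores the number of retained buffers in *nlines. */
static int read_sentences(FILE *fp, char *sentences[], char *lines[],
                          int *nlines) {
    int count = 0;
    *nlines = 0;
    while (*nlines < LINES_LEN && count < SENTENCES_LEN) {
        char *line = NULL; /* NULL and 0 make getline() allocate a new buffer */
        size_t len = 0;
        if (getline(&line, &len, fp) == -1) {
            free(line); /* getline() may allocate even when it fails */
            break;
        }
        lines[(*nlines)++] = line;
        char *s = line;
        while (count < SENTENCES_LEN) {
            char *tok = strtok(s, DELIMITERS);
            if (!tok)
                break;
            s = NULL;
            tok = trim(tok);
            if (*tok) /* skip tokens that are empty after trimming */
                sentences[count++] = tok;
        }
    }
    return count;
}
```

The caller passes two arrays sized SENTENCES_LEN and LINES_LEN, uses the returned sentences, and then calls free() on lines[0] through lines[nlines - 1]. The sentence pointers must not be freed individually because they point into those line buffers. With the supplied input this stores the same four sentences as before, only without the leading and trailing spaces.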