Tags: c, csv, pointers, structure

Reading CSV file contents into a structure array and displaying them on the console


I'm a novice C programmer. I'm stuck reading the contents of a CSV file into an array of structures, and I have some issues with pointers.

The contents of the mwecsv.csv file are:

ID,DateWeek
Schd5,Monday
Schd3,Tuesday

My implementation is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


struct Sample
{
    char *id;
    char *day;
}Sample;


int main (){

    struct Sample sample[100];


    char buffer[1000];
    char *data;
    FILE *fp;

    fp = fopen("mwecsv.csv","r");

    if (fp == NULL)
    {
        printf("Cannot open file");
        exit(-1);
    }
    printf("file exists and opened\n");

    //read the first line=> Header
    fgets(buffer,sizeof(buffer),fp);
    printf("%s\n",buffer);

    int counter = 0;
    while(fgets(buffer,sizeof(buffer),fp)){
        printf("***********\n%s\n******************\n",buffer);
        // parsing
        data = strtok(buffer,",");

        printf("first token is %s\n", data);
        printf("dataStr is %s\n", data);
        sample[counter].id = data;
        
        data = strtok(NULL,",");
        printf("second token is %s\n", data);
        sample[counter].day= data;
        counter++;
    }
    fclose(fp);

      for (int i = 0; i < counter; i++)
        printf("%s %s\n",
                sample[i].id,
                sample[i].day);
      printf("\n");

    return 0;
}

This program outputs:

file exists and opened
ID,DateWeek

***********
Schd5,Monday

******************
first token is Schd5
dataStr is Schd5
second token is Monday

***********
Schd3,Tuesday
******************
first token is Schd3
dataStr is Schd3
second token is Tuesday
Schd3 Tuesday
Schd3 Tuesday

The last two lines in the output should be:

Schd5 Monday
Schd3 Tuesday

I think it has something to do with the pointers in my code, but unfortunately I can't figure it out. I tried using strcpy and memcpy, but they did not solve the issue. Any hint would be helpful.


Solution

  • The issue is that your struct contains pointers...

    struct Sample
    {
        char *id;
        char *day;
    }Sample;
    

    An unrelated note: the trailing Sample after the closing brace declares a global variable of type struct Sample named Sample. The struct works as you expect without the extra variable:

    struct Sample
    {
        char *id;
        char *day;
    };
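
    If the trailing name was meant as a type alias rather than a variable (an assumption about the intent, not something the question states), a typedef is the usual way to get it. A minimal sketch, where same_day is a hypothetical helper added only to show the alias in use:

```c
#include <assert.h>
#include <string.h>

/* Assumption: the trailing name was meant as a type alias, not a variable. */
typedef struct Sample {
    char *id;
    char *day;
} Sample;

/* Hypothetical helper showing the alias used as a parameter type.
   Returns 1 if both rows fall on the same day, 0 otherwise. */
int same_day(const Sample *a, const Sample *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->day, b->day) == 0;
}
```

    With the typedef in place, both `struct Sample` and plain `Sample` name the same type.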
    

    In the loop where you are parsing, you are assigning different parts of the buffer to sample[counter].id and sample[counter].day. When you do this:

    data = strtok(buffer,",");
    

    data will be pointing into buffer. strtok doesn't create anything new; it edits buffer in place (overwriting each delimiter it finds with a '\0') and gives you back a pointer into it. So when you assign to your struct members:

    sample[counter].id = data;
    ...
    sample[counter].day= data;
    

    You are assigning those pointers to an address inside buffer. Please note that when you do this, C is not magically copying the contents of the buffer being pointed to. It's just copying the address itself. C isn't going to go behind your back and do anything for you. That's one of its powers. Very little in the way of implicit behavior = full control.

    Then, on the next iteration of the loop, this completely overwrites what is in buffer:

    while(fgets(buffer,sizeof(buffer),fp))
    

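    At this point, the struct members you set in earlier iterations still point into buffer, but buffer now holds the new line. Every entry therefore shows the text of the last line read, which is why Schd3 Tuesday is printed twice.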
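
    The effect can be reproduced without a file. The sketch below replays two iterations on one shared buffer; replay_two_rows is a hypothetical helper, and strcpy stands in for the fgets call that refills the buffer:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical demo helper (not from the question): replays two loop
   iterations on one shared buffer and writes into out what the pointer
   saved during the first iteration sees afterwards. */
void replay_two_rows(char *out, size_t out_size)
{
    char buffer[64];
    assert(out != NULL && out_size > 0);

    /* Iteration 1: strtok returns a pointer to buffer[0]. */
    strcpy(buffer, "Schd5,Monday");
    char *first_id = strtok(buffer, ",");

    /* Iteration 2: the same buffer is overwritten, as fgets would do. */
    strcpy(buffer, "Schd3,Tuesday");
    char *second_id = strtok(buffer, ",");

    /* Both saved pointers hold the same address inside buffer. */
    assert(first_id == second_id);

    /* So the "first" id now reads as the second row's id. */
    snprintf(out, out_size, "%s", first_id);
}
```

    Calling replay_two_rows with a small output array leaves "Schd3" in it, even though first_id was saved while the buffer still held Schd5.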

    There are 2 ways to handle this. You can keep pointers in your struct and allocate space to those pointers that then get a copy of data, or you can make the struct contain buffers themselves.

    Method 1: Keep the same struct, but allocate space to the pointers that can contain a duplicate.

    This just changes the 2 lines where you set your struct members.

    sample[counter].id = strdup(data);
    ...
    sample[counter].day= strdup(data);
    

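    Note that strdup allocates the copy for you with malloc, so that memory is on the heap and you should free() each member once you are done with it. strdup can also return NULL if the allocation fails.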
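
    Here is a minimal sketch of Method 1 with the cleanup included. strdup comes from POSIX (and C23) rather than older ISO C, so this sketch uses copy_string, a hypothetical stand-in that does the same job with malloc; parse_row and free_row are likewise illustrative names, not part of the question's code. The second strtok call also treats '\r' and '\n' as delimiters so the trailing newline left by fgets is not stored with the day.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Sample {
    char *id;
    char *day;
};

/* Hypothetical stand-in for strdup: heap-allocates a copy of s.
   Returns NULL if the allocation fails. */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;          /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Parses one "id,day" line (modified in place by strtok) into *out.
   Returns 0 on success, -1 on a malformed line or allocation failure. */
int parse_row(char *line, struct Sample *out)
{
    assert(line != NULL && out != NULL);

    char *id = strtok(line, ",");
    char *day = strtok(NULL, ",\r\n");   /* also drops the trailing newline */
    if (id == NULL || day == NULL)
        return -1;

    /* Copy the tokens out of the shared buffer before it is reused. */
    out->id = copy_string(id);
    out->day = copy_string(day);
    if (out->id == NULL || out->day == NULL) {
        free(out->id);                   /* free(NULL) is a no-op */
        free(out->day);
        out->id = out->day = NULL;
        return -1;
    }
    return 0;
}

/* Releases the copies owned by one row. */
void free_row(struct Sample *s)
{
    free(s->id);
    free(s->day);
    s->id = s->day = NULL;
}
```

    In the question's loop this would be called as parse_row(buffer, &sample[counter]), incrementing counter only when it returns 0, with free_row called on each entry after the final printing loop.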

    Method 2: Change the struct layout to contain its own buffers

    struct Sample {
        char id[100];
        char day[25];
    };
    

    Then, change your struct member assignments like this:

    strcpy(sample[counter].id, data);
    ...
    strcpy(sample[counter].day, data);
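
    One caveat with fixed-size arrays is that strcpy does not check whether the token fits. A minimal sketch of Method 2 with a length check follows; parse_row_fixed is a hypothetical helper name, and rejecting over-long fields (rather than truncating them) is a design choice made for this example:

```c
#include <assert.h>
#include <string.h>

struct Sample {
    char id[100];
    char day[25];
};

/* Copies one "id,day" line into the struct's own arrays.
   Returns 0 on success, -1 if a field is missing or would not fit. */
int parse_row_fixed(char *line, struct Sample *out)
{
    assert(line != NULL && out != NULL);

    char *id = strtok(line, ",");
    char *day = strtok(NULL, ",\r\n");   /* also drops the trailing newline */
    if (id == NULL || day == NULL)
        return -1;

    /* Each array must hold the token plus its terminating NUL. */
    if (strlen(id) >= sizeof out->id || strlen(day) >= sizeof out->day)
        return -1;

    strcpy(out->id, id);                 /* safe: lengths checked above */
    strcpy(out->day, day);
    return 0;
}
```

    Nothing needs to be freed with this layout, at the cost of a fixed 125 bytes per row regardless of how short the fields are.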
    

    The goal of both of these approaches is to copy the data out of buffer before it gets overwritten on the next iteration of the loop.