c, variables, mpi, overwrite, name-conflict

Strange global variable behaviour: the issue disappears once the variable name is changed


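During a university exercise I came across strange behaviour of a variable. The global nCellX is read correctly as 10, but its value becomes 0 right after the next fscanf call (see the comments in the code).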

/* Main parameters                                                          */
double sizeX, sizeY;      /* Size of the global domain                      */
int nPartX, nPartY;       /* Particle number in x, y direction              */
int nPart;                /* Total number of particles                      */
int nCellX, nCellY;       /* (Global) number of cells in x, y direction     */
int steps;                /* Number of timesteps                            */
double dt;                /* Stepsize for timesteps                         */
int logs;                 /* Whether or not we want to keep logfiles        */

void ReadInput(const char *fname)
{
  FILE *fp;
  char c;

  Debug("ReadInput", 0);
  if(rank == 0)
  {
    fp = fopen(fname, "r");
    if(!fp) Debug("Cannot open input file", 1);
    if(fscanf(fp, "sizeX: %lf\n", &sizeX) != 1) Debug("sizeX?",  1);
    if(fscanf(fp, "sizeY: %lf\n", &sizeY) != 1) Debug("sizeY?",  1);
    if(fscanf(fp, "nPartX:%i\n", &nPartX) != 1) Debug("nPartX?", 1);
    if(fscanf(fp, "nPartY:%i\n", &nPartY) != 1) Debug("nPartY?", 1);
    if(fscanf(fp, "nCellX:%i\n", &nCellX) != 1) Debug("nCellX?", 1); //read value is 10
    if(fscanf(fp, "nCellY:%i\n", &nCellY) != 1) Debug("nCellY?", 1);    
    if(fscanf(fp, "steps: %li\n", &steps) != 1) Debug("steps?",  1);    
    // here the value of nCellX (10) is somehow changed to 0
    if(fscanf(fp, "dt:    %lf\n", &dt)    != 1) Debug("dt?",     1);
    if(fscanf(fp, "logs:  %c\n",  &c)     != 1) Debug("logs?",   1);
    logs = (c == 'y');
    fclose(fp);
  }

  printf("(%i) reporting in...\n", rank);

  MPI_Bcast(&sizeX, 1, MPI_DOUBLE, 0, grid_comm);  
  MPI_Bcast(&sizeY, 1, MPI_DOUBLE, 0, grid_comm);
  MPI_Bcast(&nPartX,1, MPI_INT,    0, grid_comm);  
  MPI_Bcast(&nPartY,1, MPI_INT,    0, grid_comm);
  MPI_Bcast(&nCellX,1, MPI_INT,    0, grid_comm);
  MPI_Bcast(&nCellY,1, MPI_INT,    0, grid_comm);
  MPI_Bcast(&steps, 1, MPI_INT,    0, grid_comm);
  MPI_Bcast(&dt,    1, MPI_DOUBLE, 0, grid_comm);
  MPI_Bcast(&logs,  1, MPI_INT,    0, grid_comm);
  nPart = nPartX * nPartY;
  dt2 = dt * dt;
}

My teacher and I found that if we change the variable name from "nCellX" to "nCellX_2", the problem disappears and the code works as expected. Another interesting thing is that only this single global variable has the problem; the other variables work correctly. Has anyone come across this type of problem as well? Any guidance or explanation would be appreciated.

If the problem is not clear enough, let me know. I can also provide the full code if required. In general, the code is a parallel implementation of a Particle-in-Cell algorithm.


Solution

  • It is possible that the following line of code is causing a problem:

    if(fscanf(fp, "steps: %li\n", &steps) != 1) Debug("steps?",  1);
    

    The %li conversion expects a pointer to a long integer, which might be 64 bits wide, while steps is an int, which might be 32 bits. The format specifier should be %i instead of %li.

    Whether there is an actual problem depends on the environment (it is most likely an issue if you are building a 64-bit application where long is 64 bits). If there is such a 64-bit vs 32-bit mismatch, the fscanf call will write past the end of steps and possibly destroy whatever variable follows steps in the memory layout (and that could be nCellX). Note that compiling with the -Wall option should warn you about this situation. Why renaming nCellX masks the problem is not clear, but it would seem that changing the name changes the layout of the variables in memory, so that a different object ends up after steps; I doubt that is disallowed by the C standard (although I have not looked).
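As a minimal sketch of the suggested fix, the snippet below reads the "steps:" line with a conversion that matches the type of the destination. The helper names parse_steps and read_params_demo and the struct params are hypothetical and only serve the illustration (the original code uses separate globals and fscanf on a file); sscanf is used here so that the example needs no input file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the globals in the question:
   an int neighbour placed right after steps. */
struct params {
    int steps;
    int nCellX;
};

/* Hypothetical helper: parse a "steps: N" line into an int.
   %i expects an int *, so exactly sizeof(int) bytes are written.
   Returns 1 on success, 0 otherwise. */
int parse_steps(const char *line, int *steps)
{
    return sscanf(line, "steps: %i", steps) == 1;
}

/* Parse steps and check that the neighbouring field keeps its value.
   Returns 1 if everything is as expected. */
int read_params_demo(void)
{
    struct params p = { 0, 10 };
    int ok = parse_steps("steps: 100\n", &p.steps);

    assert(p.nCellX == 10);   /* the matching specifier leaves the neighbour alone */
    return ok && p.steps == 100 && p.nCellX == 10;
}
```

If steps really needs the range of a long, the alternative is to declare it as long and keep %li; in that case the MPI_Bcast call for steps would have to use MPI_LONG instead of MPI_INT.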