c while-loop switch-statement c-strings kernighan-and-ritchie

Can someone help me understand this unexplained behaviour in C char arrays?


I have a function called escape that copies a character array into a new one, replacing each tab and newline character with its visible escape sequence, i.e. the single character '\n' becomes the two characters '\' and 'n', and '\t' becomes '\' and 't'.

It seems to work, but when I print the source char array after the target has been populated, it no longer prints the original value. I'm not sure what I'm missing here, since I do not make any changes to the original array in the escape function.

#include <stdio.h>

// max length that can be stored in target array
#define MAXLINE 8

// copies source to target, changing '\t' and '\n' to '\\','t' and
// '\\','n' respectively, and returns the number of chars from source that
// could fit in target
int escape(char target[], char source[], int lim) {
  int it, is;
  it = is = 0;

  while ((source[is] != '\0') && it < (lim - 1)) {
    // printing source inside the function; its value looks fine here
    printf("source: %s\n", source);
    switch (source[is]) {
    case '\t':
      target[it++] = '\\';
      target[it++] = 't';
      ++is;
      break;
    case '\n':
      target[it++] = '\\';
      target[it++] = 'n';
      ++is;
      break;
    default:
      target[it++] = source[is++];
      break;
    }
  }
  target[it] = '\0';
  return is;
}

int main() {
  char source[] = {'a', '\t', 'e', '\n', '\n', 'a', '\t', 's', '\0'};
  char target[MAXLINE];
  int i = escape(target, source, MAXLINE);
  printf("final target: %s\n", target);
  printf("final source: %s\n", source);
  printf("%d\n", i);

  return 0;
}

Output:

./a.out 
source: a       e

a       s
source: a       e

a       s
source: a       e

a       s
source: a       e

a       s
source: a       e

a       s
final target: a\te\n\n
final source: 
5

Solution

  • The problem is that in these substatements of the switch statement

        case '\t':
          target[it++] = '\\';
          target[it++] = 't';
          ++is;
          break;
        case '\n':
          target[it++] = '\\';
          target[it++] = 'n';
          ++is;
          break;
    

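    the index it is incremented twice, but the condition it < (lim - 1) is checked only once per iteration, so nothing verifies that the second increment still leaves room in target. With MAXLINE equal to 8, the second '\n' is processed when it is 6, which leaves it equal to 8. As a result, this statement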

    target[it] = '\0';
    

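    writes the zero character one past the end of target, which is undefined behavior. In your run that memory evidently happened to be the first element of source, which is why source prints as an empty string.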
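
    The index arithmetic can be replayed without touching any array. Below is a minimal sketch; terminator_index is a hypothetical helper (not part of the original program) that repeats the loop's increments and reports the index where target[it] = '\0' would land:

    ```c
    #include <stdio.h>

    #define MAXLINE 8

    /* Hypothetical helper: replays the index arithmetic of the original
       escape loop without writing to any array, and returns the index at
       which the terminating '\0' would be stored. */
    int terminator_index(const char source[], int lim) {
      int it = 0, is = 0;
      while (source[is] != '\0' && it < lim - 1) {
        /* '\t' and '\n' expand to two characters, anything else to one */
        it += (source[is] == '\t' || source[is] == '\n') ? 2 : 1;
        ++is;
      }
      return it;
    }

    int main(void) {
      const char source[] = "a\te\n\na\ts";
      int idx = terminator_index(source, MAXLINE);
      /* prints: terminator index: 8, last valid index: 7 */
      printf("terminator index: %d, last valid index: %d\n", idx, MAXLINE - 1);
      return 0;
    }
    ```

    Index 8 is outside an array declared as char target[8], whose valid indices are 0 through 7.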

    You could rewrite the condition of the while statement so that it reserves room for the second character of an escape sequence, for example

    while ((source[is] != '\0') && it + (source[is] == '\t' || source[is] == '\n') < (lim - 1)) {
    

    Also note that the second function parameter should be declared with the qualifier const, since the function does not modify the source array:

    const char source[]