
sscanf format specifiers do not work for '\t'


#include <stdio.h>
#include <stdlib.h>

char *tokenstring = "first,25.5,second,15";
int result, i;
double fp;
char o[10], f[10], s[10], t[10];

int main(void)
{
   /* Split the comma-separated string into four fields. */
   result = sscanf(tokenstring, "%[^','],%[^','],%[^','],%s", o, s, t, f);
   printf("%s\n %s\n %s\n %s\n", o, s, t, f);
   fp = atof(s);   /* second field as a double */
   i  = atoi(f);   /* fourth field as an int */
   printf("%s\n %lf\n %s\n %d\n", o, fp, t, i);
   return 0;
}

The code above works for comma-separated input, but the same approach does not work for '\t'. Why? I am using VC 6.0.

This does not work:

char *tokenstring = "first\t25.5\tsecond\t15";



   result = sscanf(tokenstring, "%[^'\t'],%[^'\t'],%[^'\t'],%s", o, s, t, f);

Solution

  • Look at what your format is matching:

    "%[^'\t'],%[^'\t']
     ^     ^ ^
     \     | \- match a literal comma
      \    |
       \---+- match a sequence not containing tab or ' (single quote), up to the next
              tab or single quote.
    

    So the first %[..] matches everything up to and not including the first tab in the input, and then it tries to match a comma, which doesn't match the tab, and so fails.

    The easiest fix is to replace the commas in the format string with spaces. A space in a scanf format skips any run of whitespace, which includes tabs. Using tabs in the format will do the same thing, but will confuse people into thinking you're trying to match a tab rather than skip whitespace:

    sscanf(tokenstring, "%[^\t] %[^\t] %[^\t]%s", o, s, t, f);
    

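    Note that the ' characters inside the brackets are not quoting anything; they are part of the set. You probably don't want to treat them specially, unless you really want a single quote in the input to end the match.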
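To make the whitespace-skipping fix concrete, here is a minimal sketch. The helper name `parse_ws_separated` is made up for illustration, and I have added a field width of 9 on the assumption that every buffer holds 10 bytes, as in the question.

```c
#include <stdio.h>

/* Hypothetical helper: parses four fields separated by tabs (or any
   other whitespace run). Each output buffer must hold at least 10 bytes.
   Returns the number of fields sscanf stored; 4 means a full match. */
int parse_ws_separated(const char *line, char *o, char *s, char *t, char *f)
{
    /* A space in the format skips any run of whitespace, including tabs.
       The width 9 leaves room for the terminating '\0' in each buffer. */
    return sscanf(line, "%9[^\t] %9[^\t] %9[^\t] %9s", o, s, t, f);
}
```

Called on "first\t25.5\tsecond\t15" this should return 4 and fill the buffers with first, 25.5, second and 15. Comparing the return value against 4 is the simplest way to detect a line that did not parse.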

    Now if you want to use just tabs for your separators (not just any whitespace), you need to use tab patterns:

    sscanf(tokenstring, "%[^\t]%*1[\t]%[^\t]%*1[\t]%[^\t]%s", o, s, t, f);
    

    The pattern %*1[\t] will match exactly a single tab in the input and not store it anywhere.
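As a rough illustration of the tab-only variant, the sketch below wraps that format in a function. The name `parse_tab_separated` is hypothetical, and the width of 9 assumes 10-byte buffers as in the question.

```c
#include <stdio.h>

/* Hypothetical helper: parses four fields where the first two separators
   must each be exactly one tab. Each output buffer must hold at least
   10 bytes. Returns the number of fields stored; 4 means a full match. */
int parse_tab_separated(const char *line, char *o, char *s, char *t, char *f)
{
    /* %*1[\t] consumes exactly one tab and stores nothing, so it does not
       count toward the return value. The final %9s skips any leading
       whitespace by itself, so it needs no explicit tab pattern. */
    return sscanf(line, "%9[^\t]%*1[\t]%9[^\t]%*1[\t]%9[^\t]%9s",
                  o, s, t, f);
}
```

With "first\t25.5\tsecond\t15" this should return 4. With two tabs after the first field ("first\t\t25.5\tsecond\t15") it should return 1: one tab is consumed, and the next %[^\t] then fails on the second tab because it cannot match an empty field.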

    This leads to another problem you may have noticed with your first (comma based) scanf -- a pattern like %[^,] or %[^\t] will not match an empty string -- if the next character on the input is a , (or \t in the second case), scanf will simply return without matching anything (or any of the following patterns), rather than storing an empty string.
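If empty fields have to be accepted, one option is to skip scanf and split the line by hand. This is a different technique from the sscanf patterns above, shown only as a sketch; `split_fields` and `FIELD_SIZE` are names I made up, and truncating over-long fields is my own assumption about the desired behavior.

```c
#include <string.h>

enum { FIELD_SIZE = 10 };   /* assumed buffer size, matching the question */

/* Hypothetical helper: splits line on sep into at most max_fields buffers.
   Empty fields are stored as empty strings; over-long fields are
   truncated to FIELD_SIZE - 1 characters. Returns the number of fields. */
int split_fields(const char *line, char sep,
                 char out[][FIELD_SIZE], int max_fields)
{
    const char delims[2] = { sep, '\0' };
    int n = 0;

    while (n < max_fields) {
        size_t len  = strcspn(line, delims);   /* length of this field */
        size_t copy = len < (size_t)FIELD_SIZE - 1 ? len
                                                   : (size_t)FIELD_SIZE - 1;

        memcpy(out[n], line, copy);
        out[n][copy] = '\0';
        n++;

        if (line[len] == '\0')                 /* no separator left */
            break;
        line += len + 1;                       /* step past the separator */
    }
    return n;
}
```

For "first\t\tsecond\t15" with '\t' as the separator this should yield four fields, the second one being an empty string, which is the case the %[^\t] pattern cannot handle.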

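    In addition, if any of your strings are too long for the arrays, you'll overflow and crash (or worse). So whenever you use a scanf %s or %[ pattern into a buffer, you should ALWAYS specify a maximum field width. The width must be one less than the buffer size, to leave room for the terminating null character: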

    sscanf(tokenstring, "%9[^,],%9[^,],%9[^,],%9s", o, s, t, f);
    

    Now instead of crashing or corrupting things, when the input is too long, the sscanf call will match only the first 9 characters of a field and return with the rest of the field yet to be read.
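A small sketch of that behavior, using the comma format with widths. The helper name `parse_csv_bounded` is hypothetical and the 10-byte buffers are taken from the question.

```c
#include <stdio.h>

/* Hypothetical helper: parses four comma-separated fields into buffers of
   at least 10 bytes each. Returns the number of fields stored, so any
   result other than 4 means the line did not fully match. */
int parse_csv_bounded(const char *line, char *o, char *s, char *t, char *f)
{
    /* Each conversion reads at most 9 characters, so no buffer can be
       overrun regardless of how long the input fields are. */
    return sscanf(line, "%9[^,],%9[^,],%9[^,],%9s", o, s, t, f);
}
```

For "first,25.5,second,15" this should return 4. For "abcdefghijkl,1,2,3" it should return 1: the first buffer receives abcdefghi (9 characters), and the literal comma in the format then fails to match the j that is still waiting in the input. Checking the return value is what tells you the line was rejected rather than silently misparsed.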