c parameters scanf variadic-functions

Why does scanf() read only after specified text?


I was doing a CTF reversing challenge when I came across this C code in Ghidra:

int main(void)
{
  int iVar1;
  char input[32];
  
  fwrite("Password: ",1,10,stdout);
  __isoc99_scanf("DoYouEven%sCTF",input);
  iVar1 = strcmp(input,"__dso_handle");
  if ((-1 < iVar1) && (iVar1 = strcmp(input,"__dso_handle"), iVar1 < 1)) {
    printf("Try again!");
    return 0;
  }
  iVar1 = strcmp(input,"_init");
  if (iVar1 == 0) {
    printf("Correct!");
  }
  else {
    printf("Try again!");
  }
  return 0;
}

After writing similar code of my own, I noticed that the program only stores anything in input when what I type starts with DoYouEven, and that it stores only what comes after that prefix. I have been trying to work out the reason from the source code of scanf.c and vfscanf.c, but I cannot follow the actual logic.

My snippet:

#include <stdio.h>
#include <string.h>

int main(void)
{ 
    char input[32];
    
    printf("Enter your input: ");
    /* a is the number of successful conversions reported by scanf */
    int a = scanf("DoYouEven%sCTF",input);
    printf("input is: %s\n", input);
    
    int iVar = strcmp(input,"__dsohandle");
    printf("iVar: %d\n", iVar);
    printf("strcmp __dsohandle > -1: %d\n", (-1 < iVar));    
    printf("strcmp __dso_handle: %d\n", iVar);
    printf("strcmp _init: %d\n", strcmp(input,"_init"));
    printf("%d\n",a);
    return 0;
}

Output:

Enter your input: DoYouEvenByee
input is: Byee
iVar: -29
strcmp __dsohandle > -1: 0
strcmp __dso_handle: -29
strcmp _init: -29
1

Can anybody help me understand this through source code? I am not a master at understanding C libraries.


Solution

  • To build upon @dbush's answer, here is a more detailed description of how scanf("DoYouEven%sCTF", input) behaves:

    scanf tries to match the format string against the input stream, reading one byte at a time:

    1. at the end of the format string, it returns the number of successful conversions. When the end of the format string is reached, that is simply the number of conversions in the format: 1 in this case.

    2. for any character in the format string that is neither whitespace nor a %, scanf reads the next byte from the stream and:

      2a) if end of file is reached or a read error occurs, scanf returns the number of successful conversions so far, or EOF if none has completed yet.

      2b) if the byte matches the character, the process continues at step 1

      2c) otherwise, the match fails: the byte is pushed back into the stream (as if by ungetc()) and the number of successful conversions so far is returned. If the user input does not start with DoYouEven, 0 is returned.

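    3. for any whitespace character (e.g. ' ', '\t', '\n'...), scanf reads the stream and consumes any run of whitespace bytes, not necessarily the same as the format string character. The first non-whitespace byte is pushed back and the process continues at step 1. This directive never fails by itself: end of file or a read error is handled by whatever directive comes next, as in 2a.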

    4. if the character is a %, the following characters are interpreted as a conversion specifier and the conversion is attempted:

      • for %s, scanf retrieves the next argument passed, which must be a pointer to a modifiable array of char. It reads and discards any whitespace from stdin; if end of file or a read error occurs at that point, this is an input failure handled as in 2a. Otherwise it stores the non-whitespace bytes that follow into the array, stopping at the first whitespace byte (which is pushed back), at end of file, or on a read error. A null terminator is stored after the bytes, the number of successful conversions is incremented, and the process continues at step 1.

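    These steps imply 2 problems for the specific call:

      • if the input does not start with DoYouEven, the match fails at the first differing byte: scanf returns 0 and stores nothing in input.

      • the trailing CTF in the format has no observable effect: %s stops only at whitespace or at the end of the input, so the next byte can never be C, and the return value is 1 whether or not CTF is matched.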

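    Note also other problems in this code:

      • %s has no field width, so any word longer than 31 bytes overflows the 32-byte input array.

      • the return value of scanf is ignored, so when nothing is stored the strcmp calls read an uninitialized array.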

    The goal of this reverse engineering session is to show the security flaw in the call to scanf() that can be exploited.