c scanf

Saving a character to an int variable using scanf() produces an unexpected int value


I was testing the library function isupper() to see whether the function version or the macro version is more efficient in terms of storage or run time, and I ran into a peculiar problem when using scanf() to store an input character in an int variable. Here is my code:

#include<stdio.h>

int my_isupper(int );

int main() {
    int c;

    printf("Enter a letter to check if it is upper case or not\n");
    
    //c = getchar(); // getchar(), getc() works fine though
        
    scanf("%c", &c);

    /*
    Doesn't work when c is an int that is not initialized to 0.
    Works fine on some machines whether c is a char or an int, initialized to 0 or not.
    Is it compiler/machine dependent? If so, which part (scanf()?) has the dependency?
    */

    printf("test1: %d\n", c);

    if(my_isupper(c))
        printf("%c is upper case\n", c);

    else
        printf("%c is not upper case\n", c);

    return 0;
}

int my_isupper(int c) {

    printf("test2: %c\n", c);

    int value = (c >= 'A' && c <= 'Z')? 1 : 0;

    printf("test3: %d\n", value);

    return value;
}

When the variable c is declared as char, it works fine. When it is declared as int, the program works fine with library functions such as getchar() and getc(). But with scanf(), if c is an int and not initialized to 0, scanf() stores 32577 for the char 'A', 32578 for the char 'B', and so on.

For the input A, the return value should be 1, but I get 0: the condition is not satisfied because c holds 32577 for the char 'A' (32578 for 'B', and so on) instead of 65.


Solution

  • With scanf("%c", &c); you are lying to scanf: %c says that &c points to a char, while c is actually an int. This is undefined behavior, so anything can happen.

    One likely outcome (not guaranteed, but likely) is that a single byte is written to the lowest address of the int. On a little-endian machine that is the least significant byte, so if all the other bytes of the int happen to be zero, the int ends up holding the character's value (0 to 255) and everything appears to work. On a big-endian machine, you would instead be writing to the most significant byte, resulting in a very large number.

    However, you never initialized c, so those other bytes may contain garbage values. Or possibly zeroes if you are unlucky, since that hides the bug: many debug builds zero out uninitialized local variables, which C by no means guarantees. Reading an uninitialized local variable that never has its address taken is also undefined behavior. Granted, you did take the address in the scanf call. But assuming a mainstream system with no trap representations for int, reading it is still unspecified behavior: you may get any value, but at least the program won't crash and burn unexpectedly as it could with undefined behavior.
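
As a rough sketch of the above (the helper names overwrite_lowest_byte and read_char_as_int are made up for illustration, they are not library functions): the first function models the likely outcome in a well-defined way by copying one byte over the lowest address of an int, and the second shows one way to avoid the problem by reading into a char, as %c expects, and then widening to int. The stale value 0x7F00 is only an assumption; it is what would produce the 32577 (0x7F41) from the question on a little-endian machine.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: models what scanf("%c", &c) likely did to the int.
   Only the byte at the lowest address is overwritten; the remaining bytes
   keep whatever they held before (passed in here as `stale`). */
int overwrite_lowest_byte(int stale, char ch)
{
    memcpy(&stale, &ch, 1);   /* one byte, lowest address of the int */
    return stale;
}

/* Hypothetical helper: reads one character correctly. %c gets a pointer
   to char, and the result is widened to int afterwards, the same way
   getchar() reports characters. Returns EOF if nothing could be read. */
int read_char_as_int(FILE *in)
{
    char ch;

    if (fscanf(in, "%c", &ch) != 1)
        return EOF;
    return (unsigned char)ch;
}
```

On a little-endian machine, overwrite_lowest_byte(0x7F00, 'A') gives 32577 and overwrite_lowest_byte(0, 'A') gives 65, which is why initializing c to 0 appeared to "fix" the program. In main(), c = read_char_as_int(stdin); (or simply declaring c as char) removes the mismatch altogether.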