c pointers casting

Why is defining the elements of the array b required to get the desired output?


I recently learnt about typecasting pointers. So I played around, and made this bad code:

#include <stdio.h>
int main()
{
    int* a;
    char b[]= "Hye";
    a = (int*)&b;
    *a='B';
    printf("%s\n", b);
    return 0;
}

This code gives the output 'B'.

Upon further playing around, I wrote this code:

#include <stdio.h>
int main()
{
    int* a;
    char b[]= "Hye";
    a = (int*)&b;
    *a='B';
    b[1]='y';
    b[2]='e';
    printf("%s\n", b);
    return 0;
}

The output is "Bye".

I am unable to understand what the statement *a='B'; does, and why redefining the second and third elements of the array changes the output.

I have already tried tools like GPT-4o, searched Google for things like "typecasting pointer", read articles, and watched videos on the topic to get an answer to my question, but I found none. So here I am.


Solution

  • int* a and char b[] have different types, so writing to b through a usually runs afoul of the strict aliasing rule, which makes this access pattern undefined behavior (C standard 6.5, paragraph 7).

    As @TomKarzes points out, you may also run into problems if your platform requires int to be aligned, say, on a 4-byte boundary, while char b[] may only need 1-byte alignment. On such a platform your program may fail with a "bus error" or in other ways.

    The C standard only guarantees that an int is at least 16 bits wide. On a 64-bit platform it is usually 4 bytes, so *a='B'; probably writes 4 bytes to the array b. On a little-endian platform (amd64) the first program will print "B" because b then holds {'B', 0, 0, 0}, and the second program will print "Bye" because b holds {'B', 'y', 'e', 0}. On a big-endian platform both programs would print "" because b holds {0, 0, 0, 'B'} and {0, 'y', 'e', 'B'} respectively, so the string ends at its first byte.

    You will also run into problems if sizeof b < sizeof(int), because the write then goes past the end of the array. sizeof(char) is defined to be 1. stdint.h gives you access to integer types of known size.