I'm working with C, and I've noticed some interesting behavior when handling large integer inputs using scanf. Specifically, when I input a number larger than the maximum value that can be stored in an unsigned int, the number seems to be truncated to fit within the 32 bits of unsigned int. However, when I use size_t on a 64-bit system, and input a value larger than 2^64 - 1, it doesn't seem to truncate in the same way—instead, it just stores the maximum value possible for size_t.
#include <stdio.h>

int main(void) {
    unsigned int num1;
    size_t num2;

    printf("Enter a large number for unsigned int: ");
    scanf("%u", &num1);
    printf("Stored value (unsigned int): %u\n", num1);

    printf("Enter a large number for size_t: ");
    scanf("%zu", &num2);
    printf("Stored value (size_t): %zu\n", num2);

    return 0;
}
Sample input:
For unsigned int: 12345678901234567890
For size_t: 123456789012345678901234567890
Output:
For unsigned int: A truncated value that doesn't match the input (likely a smaller, incorrect number).
For size_t: The maximum value 18446744073709551615 (the largest 64-bit unsigned integer).
Why does scanf truncate the input when storing it in an unsigned int, leading to data loss, but when using size_t on a 64-bit system, it simply stores the maximum possible value if the input exceeds the range? How does scanf handle out-of-range input differently depending on the destination type?
Specifically, I want to know how scanf behaves for every type when the input overflows.
The behavior is not defined by the C standard, per C 2018 7.21.6.2 10 (“… if the result of the conversion cannot be represented in the object, the behavior is undefined”), but the observations you make are readily explained if the scanf function operates on any request for conversion to an unsigned integer type in this way:

1. Convert the input to uintmax_t, using the strtoumax function (or equivalent).
2. Assign the result of strtoumax to the caller’s destination object, using the type requested for it.

The reason this produces the behavior you observed is that strtoumax is specified (in C 2018 7.8.2.3 3) to produce the maximum value of uintmax_t if the correct value is outside the range of representable values. So, when the input is outside the range of uintmax_t, you get that maximum value, which on your system is evidently also the maximum value of size_t. However, when you converted 12345678901234567890, it was outside the range of unsigned int but not outside the range of uintmax_t. So, it was correctly converted to a uintmax_t value, and then the low bits of this uintmax_t value were copied into your unsigned int.
As a test of this, you could try the input 123456789012345678901234567890 for unsigned int. If the above is how scanf operates, it will produce UINT_MAX in the unsigned int, because the low bits of UINTMAX_MAX are all ones.