I have this simple C program.
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
bool foo (unsigned int a) {
    return (a > -2L);
}
bool bar (unsigned long a) {
    return (a > -2L);
}
int main() {
    printf("foo returned = %d\n", foo(99));
    printf("bar returned = %d\n", bar(99));
    return 0;
}
Output when I run this:
foo returned = 1
bar returned = 0
Recreated in godbolt here
My question is why foo(99) returns true but bar(99) returns false.
To me it makes sense that bar would return false. For simplicity, let's say longs are 8 bits; then (using two's complement for signed values):
99 == 0110 0011
-2 == unsigned 254 == 1111 1110
So clearly the CMP instruction will see that 1111 1110 is bigger and return false.
But I don't understand what is going on behind the scenes in the foo function. The assembly for foo seems to be hardcoded to always return 1 (mov eax,0x1). I would have expected foo to do something similar to bar. What is going on here?
This is covered in C classes and is specified in the documentation. Here is how you use documents to figure this out.
In the 2018 C standard, you can look up > or “relational expressions” in the index to see they are discussed on pages 68-69. On page 68, you will find clause 6.5.8, which covers relational operators, including >. Reading it, paragraph 3 says:
If both of the operands have arithmetic type, the usual arithmetic conversions are performed.
“Usual arithmetic conversions” is listed in the index as defined on page 39. Page 39 has clause 6.3.1.8, “Usual arithmetic conversions.” This clause explains that operands of arithmetic types are converted to a common type, and it gives rules determining the common type. For two integer types of different signedness, such as the unsigned long and the long int in bar (a and -2L), it says that, if the unsigned type has rank greater than or equal to the rank of the other type, the signed type is converted to the unsigned type.
“Rank” is not in the index, but you can search the document to find it is discussed in clause 6.3.1.1, where it tells you the rank of long int is greater than the rank of int, and that any unsigned integer type has the same rank as the corresponding signed integer type.
Now you can consider a > -2L in bar, where a is unsigned long. Here we have an unsigned long compared with a long. They have the same rank, so -2L is converted to unsigned long. Conversion of a signed integer to unsigned is discussed in clause 6.3.1.3. It says the value is converted by wrapping it modulo ULONG_MAX+1, so converting the signed long −2 produces ULONG_MAX+1−2 = ULONG_MAX−1, which is a large integer. Then comparing a, which has the value 99, to that large integer with > yields false, so zero is returned.
For foo, we continue with the rules for the usual arithmetic conversions. When the unsigned type does not have rank greater than or equal to the rank of the signed type, but the signed type can represent all the values of the type of the operand with unsigned type, the operand with the unsigned type is converted to the type of the operand with the signed type. In foo, a is unsigned int and -2L is long int. Presumably in your C implementation, long int is 64 bits, so it can represent all the values of a 32-bit unsigned int. So this rule applies, and a is converted to long int. This does not change the value. So the original value of a, 99, is compared to −2 with >, and this yields true, so one is returned.