When I apply .Internal(inspect()) to NA_real_ and NaN, it returns:
> .Internal(inspect(NA_real_))
@0x000001e79724d0e0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) nan
> .Internal(inspect(NaN))
@0x000001e797264a88 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) nan
It seems like their only difference is the memory address.
However, when I coerce NA_real_ and NaN to character, it returns:
> as.character(c(NaN, NA_real_))
[1] "NaN" NA
I understand that it should return the above result: NaN has no character counterpart, so it is coerced into "NaN", whereas NA_real_ is coerced into NA_character_. But considering that their internals look the same, how can R return different results for them?
Thank you in advance for any suggestions!
Well, first off, remember that NA is an R concept that has no equivalent in C. So, by necessity, NA needs to be represented by some other means in C. The fact that .Internal(inspect()) does not make this distinction doesn’t mean it isn’t made elsewhere. In fact, it so happens that .Internal(inspect()) uses Rprintf to print the value’s internal double floating point representation. And, indeed, R NAs are encoded as a NaN value in a C floating point type.
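To illustrate why a printf-style dump cannot tell the two apart, here is a minimal sketch in plain C. It is not R's code: the helpers double_from_bits, low_word_of and show_both are hypothetical names made up for this example, and it assumes a 64-bit IEEE 754 double that shares its byte order with uint64_t. The first bit pattern combines the two words R writes in the snippet further down (0x7ff00000 and 1954, i.e. 0x7A2); the second is an ordinary quiet NaN.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of R): reinterpret a 64-bit pattern as a double. */
static double double_from_bits(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);   /* assumption: double is 64 bits wide */
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: the low 32 bits of a double's bit pattern. */
static uint32_t low_word_of(double d)
{
    uint64_t bits;
    assert(sizeof d == sizeof bits);
    memcpy(&bits, &d, sizeof bits);
    return (uint32_t)(bits & 0xFFFFFFFFu);
}

/* Print both values the same way. How a NaN is spelled is up to the C
 * library, but the format gives no access to the payload, so only the
 * low word printed next to it differs reliably. */
static void show_both(void)
{
    double na_like   = double_from_bits(0x7FF00000000007A2ULL); /* low word 1954 */
    double plain_nan = double_from_bits(0x7FF8000000000000ULL); /* low word 0 */

    printf("%f (isnan: %d, low word: %u)\n",
           na_like, isnan(na_like) != 0, (unsigned)low_word_of(na_like));
    printf("%f (isnan: %d, low word: %u)\n",
           plain_nan, isnan(plain_nan) != 0, (unsigned)low_word_of(plain_nan));
}
```

Calling show_both() from a main function prints two lines; both values satisfy isnan, and the payload in the low word is what differs.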
Secondly, you observe that “their only difference is the memory address.” So what? At least conceptually, distinct memory addresses would be fully sufficient to distinguish NA and NaN; nothing more would be required.
But as a matter of fact R distinguishes these values by a different route. This is possible because the IEEE 754 double precision floating point format has multiple different representations of NaN, and R reserves a specific one for NAs:
static double R_ValueOfNA(void)
{
    /* The gcc shipping with Fedora 9 gets this wrong without
     * the volatile declaration. Thanks to Marc Schwartz. */
    volatile ieee_double x;
    x.word[hw] = 0x7ff00000;
    x.word[lw] = 1954;
    return x.value;
}
Where:
typedef union
{
    double value;
    unsigned int word[2];
} ieee_double;
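And hw and lw are the array indices of the high and low 32-bit words. They take the values 0 and 1; which one gets which value depends on platform endianness (hw is 0 and lw is 1 on big-endian platforms, and the other way round on little-endian ones).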
And, furthermore:
/* is a value known to be a NaN also an R NA? */
int attribute_hidden R_NaN_is_R_NA(double x)
{
    ieee_double y;
    y.value = x;
    return (y.word[lw] == 1954);
}

int R_IsNA(double x)
{
    return isnan(x) && R_NaN_is_R_NA(x);
}

int R_IsNaN(double x)
{
    return isnan(x) && ! R_NaN_is_R_NA(x);
}
(src/main/arithmetic.c)
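If you want to experiment with this logic outside of R, the following standalone sketch mirrors the two predicates above. It is an illustration and not R's implementation: the names na_kind, classify, bits_of and from_bits are invented for this example, and instead of indexing a union with hw and lw it masks the low 32 bits of a uint64_t, which sidesteps the endianness question under the assumption that double is a 64-bit IEEE 754 type sharing its byte order with uint64_t.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical three-way classification, for illustration only. */
enum na_kind { KIND_NUMBER, KIND_NAN, KIND_NA };

/* Copy the bit pattern of a double into an integer. */
static uint64_t bits_of(double x)
{
    uint64_t bits;
    assert(sizeof x == sizeof bits);   /* assumption: double is 64 bits wide */
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Build a double from an explicit bit pattern. */
static double from_bits(uint64_t bits)
{
    double x;
    assert(sizeof x == sizeof bits);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Same idea as R_IsNA / R_IsNaN: the value must be a NaN first, and only
 * then does the low word decide between "NA" and "ordinary NaN". */
static enum na_kind classify(double x)
{
    if (!isnan(x))
        return KIND_NUMBER;
    return (uint32_t)(bits_of(x) & 0xFFFFFFFFu) == 1954u ? KIND_NA : KIND_NAN;
}
```

For example, classify(from_bits(0x7FF00000000007A2ULL)) yields KIND_NA, while classify(from_bits(0x7FF8000000000000ULL)) yields KIND_NAN. The isnan check matters: from_bits(0x00000000000007A2ULL) also has 1954 in its low word, but it is a tiny ordinary number and is classified as KIND_NUMBER.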