[SOLVED] Internal representation of int NA

Internal representation of int NA

This is question about R internals. How integer NA values are represented in R? Unlike floating there is no magic bit sequence to represent NaNs.

# Create big array. newer versions of R won't allocate memory to store data
# Instead star/end values are stored internally
a <- 1:1e6 # 

# Change some random value. This will cause and array to be allocated
a[123] <- NA
typeof(a)

At this point a is still an array of integers. How a[123] represented internally? Does R use some magic number to indicate that an integer is NA?

My primary interest in internal representation of integers is related to binary read/write (readBin/writeBin). How to handle NA when performing binary I/O with external sources, e.g. via sockets?

Solution

R uses the minimum integer value to represent NA. On a 4-byte system, valid integer values are usually -2,147,483,648 to 2,147,483,647 but in R

> .Machine$integer.max
[1] 2147483647
> -.Machine$integer.max
[1] -2147483647
> -.Machine$integer.max - 1L
[1] NA
Warning message:
In -.Machine$integer.max - 1L : NAs produced by integer overflow

Also,

> .Internal(inspect(NA_integer_))
@7fe69bbb79c0 13 INTSXP g0c1 [NAM(7)] (len=1, tl=0) -2147483648