Can the feather package in R support 64-bit integers?
When the dataset is passed to feather::write_feather(), the column is converted to a 64-bit float and loses precision. I'd like to avoid converting it to a character.
Here's a simplified example. In the real project, a database table (retrieved with the odbc package) has columns that are legit 64-bit integers (as specified in the bit64 package).
requireNamespace("bit64")
path <- base::tempfile(fileext = ".feather")
ds <-
  tibble::tibble(
    patient_id = bit64::as.integer64(1:6)
  )
ds
# # A tibble: 6 x 1
# patient_id
# <int64>
# 1 1
# 2 2
# 3 3
# 4 4
# 5 5
# 6 6
feather::write_feather(x = ds, path = path)
ds_read <- feather::read_feather(path)
ds_read
# # A tibble: 6 x 1
# patient_id
# <dbl>
# 1 Inf.Nae-324
# 2 Inf.Nae-324
# 3 1.50e-323
# 4 2.00e-323
# 5 2.50e-323
# 6 3.00e-323
as.integer(ds_read$patient_id)
# Returns: [1] 0 0 0 0 0 0
unlink(path)
Note: I don't want to store them as floats, as suggested here.
It is actually "complicated". As you probably know, R itself has only two native numeric types: 32-bit integer and 64-bit double.
So to represent 64-bit integers, Jens did quite some work in his bit64 package to use double as a "carrier" for the 64-bit payload, redefining all accessor functionality to treat it as a 64-bit (signed) integer. That works.
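To make the "carrier" idea concrete, here is a minimal sketch, assuming the bit64 package is installed. It shows that an integer64 vector is stored as a double vector with a class attribute on top, which would explain why a reader that ignores the class sees tiny denormal doubles like the ones in the question.

```r
# Minimal sketch; assumes the bit64 package is installed.
# 2^53 + 1 cannot be represented exactly as a double.
x <- bit64::as.integer64("9007199254740993")

typeof(x)        # storage type of the underlying vector
class(x)         # the S3 class that bit64 dispatches on
as.character(x)  # the value survives because it is never treated as a double

# Dropping the class exposes the same 64 bits reinterpreted as a double.
raw_bits <- unclass(bit64::as.integer64(1))
raw_bits
```

Any code that strips or ignores the class attribute, as in the last two lines, ends up working with the reinterpreted bits rather than the integer value.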
Several packages support it natively, for example data.table. I took advantage of this when I created nanotime, which uses 64-bit integers for nanoseconds since the epoch. This also works: we never convert to double in between and get faithful integer64 representation.
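One way to see the data.table support is a round trip through a CSV file. This is only a sketch, assuming data.table and bit64 are installed; the values are made up and deliberately larger than 2^53 so that any detour through double would be visible.

```r
# Minimal sketch; assumes the data.table and bit64 packages are installed.
path <- tempfile(fileext = ".csv")

dt <- data.table::data.table(
  patient_id = bit64::as.integer64(c("9007199254740993", "9007199254740995"))
)

data.table::fwrite(dt, path)

# integer64 = "integer64" is fread's default; it is spelled out here for clarity.
dt_read <- data.table::fread(path, integer64 = "integer64")

class(dt_read$patient_id)
identical(as.character(dt_read$patient_id), as.character(dt$patient_id))

unlink(path)
```

Note that fread chooses integer64 only when a column holds values outside the 32-bit range, so small ids such as 1:6 would come back as plain integer.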
I have also been following reticulate over the years, and it has very similar conversion issues with 64-bit integers (as those are native in Python), which are by now generally addressed.
So long story short: your question is more of a feature request for feather. And as those involved now focus on arrow, which appears to have 64-bit integer support, you most likely will just be asked to move to arrow. Or you could use data.table.
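If you do move to arrow, the round trip from the question might look like the sketch below. It assumes the arrow, tibble, and bit64 packages are installed, and it uses made-up ids larger than 2^53 instead of 1:6.

```r
# Minimal sketch; assumes the arrow, tibble, and bit64 packages are installed.
path <- tempfile(fileext = ".feather")

ds <- tibble::tibble(
  patient_id = bit64::as.integer64(c("9007199254740993", "9007199254740995"))
)

arrow::write_feather(ds, path)
ds_read <- arrow::read_feather(path)

class(ds_read$patient_id)
as.character(ds_read$patient_id)

unlink(path)
```

As far as I know, arrow by default hands back a plain integer column when every int64 value fits in 32 bits, so small ids may not come back as integer64 unless you set options(arrow.int64_downcast = FALSE). Check the arrow documentation for the version you have installed.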