I am currently working on an R package in which I want to build a list of frequency tables, one for each column of a matrix. When testing on macOS Monterey Version 12.6.3 (21G419) using R version 4.0.5 (2021-03-31) -- "Shake and Throw", Platform: x86_64-apple-darwin17.0 (64-bit), the following problem occurs (here reduced to an MWE):
test = matrix(sample(1:5, 1000, TRUE), 100,10)
str(test)
# int [1:100, 1:10] 4 4 2 5 3 1 2 4 5 1 ...
apply(test, 2, table, simplify = FALSE)
Error in FUN(newX[, i], ...) :
all arguments must have the same length
On a newer Apple machine and on a Windows machine, the code ran without a problem, so the error seems to be tied to the older version. My problem: I'd like to make the package available to users of the older version as well, rather than demanding an update.
Is anyone aware of a solution or workaround? The code seems correct, and the matrix has no missing values, factors, or other peculiarities. Why would such an error be thrown?
As Roland stated in the comments, apply() gained the simplify argument in R 4.1.0. The docs for earlier versions show:
apply(X, MARGIN, FUN, ...)
The ... argument is defined as:
optional arguments to FUN.
This means that, on versions before 4.1.0, apply(test, 2, table, simplify = FALSE) is effectively evaluated as:
apply(test, 2, function(x) table(x, simplify = FALSE))
As simplify is not a named argument to table(), it is absorbed into table()'s own ... and treated as a second (named) vector to cross-tabulate against the column. You are therefore doing something like:
table(1:10, 0)
Hence the error message, all arguments must have the same length: the column has 100 elements, while FALSE has only one.
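The mechanism can be checked on any R version by calling table() directly with the stray argument, since table() itself has never had a simplify parameter. This is only a small sketch; the variable names x and msg are made up for illustration, and the message text shown by R depends on the session's language settings.

```r
# Illustration only: reproduce the error without apply().
# 'x' stands in for a single column of the matrix.
set.seed(1)
x <- sample(1:5, 100, TRUE)

# simplify = FALSE falls into table()'s '...', so table() tries to
# cross-tabulate a length-100 vector with a length-1 vector and stops.
msg <- tryCatch(
  table(x, simplify = FALSE),
  error = function(e) conditionMessage(e)
)
print(msg)
```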
To circumvent this, you can use asplit() on R 3.6.0 and greater to split the matrix into a list of columns, and then iterate over it:
lapply(asplit(test, 2), table)
On versions earlier than 3.6.0, e.g. 3.5.2, you can do:
lapply(split(test, slice.index(test, 2)), table)
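If you would rather not branch on the R version inside the package at all, one possible alternative is to index the columns directly. The following is a sketch under that assumption, not something from the comments above; the helper name column_tables is hypothetical.

```r
# Hypothetical helper: one frequency table per matrix column, using only
# functions that do not depend on apply(simplify = ) or asplit().
column_tables <- function(x) {
  stopifnot(is.matrix(x))
  out <- lapply(seq_len(ncol(x)), function(j) table(x[, j]))
  names(out) <- colnames(x)  # stays unnamed if the matrix has no column names
  out
}

set.seed(42)
test <- matrix(sample(1:5, 1000, TRUE), 100, 10)
res <- column_tables(test)
length(res)    # one table per column
sum(res[[1]])  # the counts in a table add up to the number of rows
```

The trade-off is that the resulting tables carry no dimnames label from the column, which is usually fine when the list is only used for counting.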