
Why does the transpose function change numeric to character in R?


I've constructed a simple matrix in Excel with some character values and some numeric values (Screenshot of data as set up in Excel). I read it into R using the openxlsx package like so:

library(openxlsx)
data <- read.xlsx('~/desktop/data.xlsx')

After that I check the class:

sapply(data, class)
         x1         a         b          c
"character" "numeric" "numeric"  "numeric"

Which is exactly what I want. My problem occurs when I try to transpose the matrix, and then check for class again:

data <- t(data)

When I check with sapply now, every value is "character". Why are the classes not preserved when transposing?


Solution

  • First off, I don't get your result when I read in your spreadsheet, because the cells containing numbers written with a decimal comma are read as character.

    data <- read.xlsx("data.xlsx")
    data
    #  X1   a b   c
    #1  x 0,1 3 4,5
    #2  y 2,4 0 6,5
    #3  z  24 0   0
    sapply(data,class)
    #         X1           a           b           c 
    #"character" "character"   "numeric" "character" 
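
    If the decimal commas are what is tripping up the import, one possible fix is to convert those columns by hand after reading. This is only a sketch: the data frame below is a hand-typed stand-in for the spreadsheet, and num_cols is a hypothetical name for the columns you know should be numeric.

    ```r
    # Stand-in for the result of read.xlsx("data.xlsx") shown above
    data <- data.frame(X1 = c("x", "y", "z"),
                       a  = c("0,1", "2,4", "24"),
                       b  = c(3, 0, 0),
                       c  = c("4,5", "6,5", "0"),
                       stringsAsFactors = FALSE)

    # Columns holding decimal-comma numbers stored as text (picked by hand)
    num_cols <- c("a", "c")

    # Replace the decimal comma with a point, then convert to numeric
    data[num_cols] <- lapply(data[num_cols],
                             function(x) as.numeric(sub(",", ".", x, fixed = TRUE)))

    sapply(data, class)
    #         X1           a           b           c 
    #"character"   "numeric"   "numeric"   "numeric" 
    ```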
    

    But the issue you are really seeing is that t() first converts the data frame to a matrix (via as.matrix()), and a matrix can hold only one type. Because one of your columns is character, R has to convert every value to the broadest common type, which is character in this case.

    mydata<-data.frame(X1=c("x","y","z"),a=c(1,2,24),b=c(3,0,0),c=c(4,6,0),stringsAsFactors = FALSE)
    sapply(mydata,class)
    #         X1           a           b           c 
    #"character"   "numeric"   "numeric"   "numeric" 
    # what you showed
    t(mydata)
    #   [,1] [,2] [,3]
    #X1 "x"  "y"  "z" 
    #a  " 1" " 2" "24"
    #b  "3"  "0"  "0" 
    #c  "4"  "6"  "0" 
    
    mydata_t<-t(mydata)
    sapply(mydata_t,class)
    #          x           1           3           4           y           2           0           6           z          24 
    #"character" "character" "character" "character" "character" "character" "character" "character" "character" "character" 
    #          0           0 
    #"character" "character" 
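
    Note that sapply() walks over a matrix one element at a time, which is why the output above has twelve entries instead of one per column. A quicker check, sketched here with the same mydata as above, is to ask about the matrix as a whole:

    ```r
    mydata <- data.frame(X1 = c("x", "y", "z"), a = c(1, 2, 24),
                         b = c(3, 0, 0), c = c(4, 6, 0),
                         stringsAsFactors = FALSE)
    mydata_t <- t(mydata)

    is.data.frame(mydata_t)  # FALSE: t() does not return a data frame
    is.matrix(mydata_t)      # TRUE
    typeof(mydata_t)         # "character": one storage type for every cell
    ```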
    

    Do you want to work on the numbers in the transposed matrix and transpose them back after? If so, transpose a sub-matrix that has the character columns temporarily removed, then reassemble later, like so:

    sub_matrix<-t(mydata[,-1])
    sub_matrix
    #  [,1] [,2] [,3]
    #a    1    2   24
    #b    3    0    0
    #c    4    6    0
    sub_matrix2<-sub_matrix*2
    sub_matrix2
    #  [,1] [,2] [,3]
    #a    2    4   48
    #b    6    0    0
    #c    8   12    0
    cbind(X1=mydata[,1],as.data.frame(t(sub_matrix2)))
    #  X1  a b  c
    #1  x  2 6  8
    #2  y  4 0 12
    #3  z 48 0  0
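
    Another option, if you mostly need the transposed numbers, is to keep the labels as row names instead of as a column, so the matrix stays numeric throughout. This is a sketch rather than the only way to do it; num_matrix and num_t are just illustrative names.

    ```r
    mydata <- data.frame(X1 = c("x", "y", "z"), a = c(1, 2, 24),
                         b = c(3, 0, 0), c = c(4, 6, 0),
                         stringsAsFactors = FALSE)

    # Numeric columns only; the labels become dimnames rather than data
    num_matrix <- as.matrix(mydata[, -1])
    rownames(num_matrix) <- mydata$X1

    num_t <- t(num_matrix)
    num_t
    #  x y  z
    #a 1 2 24
    #b 3 0  0
    #c 4 6  0
    typeof(num_t)
    #[1] "double"
    ```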