
R data frame and matrix result in different row names when using rbind


Method 1:

df1<-data.frame(A=1:5,B=2:6)
df2<-data.frame(A=1:5,B=2:6)
df3<-rbind(df1,df2)
row.names(df3)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"

Method 2:

df1<-data.frame(matrix(data = c(1:5,2:6),
                       nrow = 5,
                       dimnames = list(c(1:5),c("A","B"))))
df2<-data.frame(matrix(data = c(1:5,2:6),
                       nrow = 5,
                       dimnames = list(c(1:5),c("A","B"))))
df3<-rbind(df1,df2)
row.names(df3)
 [1] "1"  "2"  "3"  "4"  "5"  "11" "21" "31" "41" "51"

I understand that row names must be unique, but I would like to know why Method 1 and Method 2 generate different row names.


Solution

  • In Method 1, the row names are automatic integers (stored compactly as c(NA, -5L)):

    df1 <- data.frame(A=1:5,B=2:6)
    df2 <- data.frame(A=1:5,B=2:6)
    
    dput(df1)
    structure(list(A = 1:5, B = 2:6), class = "data.frame", row.names = c(NA,
    -5L))
    

    and they are still automatic integers after rbind, now numbered 1 to 10:

    df3 <- rbind(df1, df2)
    
    dput(df3)
    structure(list(A = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L),
        B = c(2L, 3L, 4L, 5L, 6L, 2L, 3L, 4L, 5L, 6L)), row.names = c(NA,
    -10L), class = "data.frame")
    

    In Method 2, the row names are taken from the matrix dimnames and are stored as strings:

    df1 <- data.frame(matrix(data = c(1:5,2:6),
                             nrow = 5,
                             dimnames = list(c(1:5),c("A","B"))))
    df2 <- data.frame(matrix(data = c(1:5,2:6),
                             nrow = 5,
                             dimnames = list(c(1:5),c("A","B"))))
    
    dput(df1)
    structure(list(A = 1:5, B = 2:6), class = "data.frame", row.names = c("1",
    "2", "3", "4", "5"))
    

    and they keep their original class, strings, after rbind:

    df3 <- rbind(df1, df2)
    
    dput(df3)
    structure(list(A = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L),
        B = c(2L, 3L, 4L, 5L, 6L, 2L, 3L, 4L, 5L, 6L)), row.names = c("1",
    "2", "3", "4", "5", "11", "21", "31", "41", "51"), class = "data.frame")
    

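    Automatic integer row names can simply be renumbered, but string row names are treated as labels: rbind keeps them and makes duplicates unique by appending an index to the string, so the second "1" becomes "11", the second "2" becomes "21", and so on.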
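
    A minimal sketch of that de-duplication, assuming rbind treats repeated string row names the way base R's make.unique() does with an empty separator (the output above is consistent with this). .row_names_info() is a base R helper that returns a negative count when row names are automatic; the variable names are illustrative only.

    ```r
    # Row labels as rbind sees them in Method 2: the same five strings twice
    labels <- as.character(c(1:5, 1:5))

    # Duplicates get a numeric suffix appended with no separator
    deduped <- make.unique(labels, sep = "")
    print(deduped)

    # Compare how the two construction methods store their row names
    auto_df  <- data.frame(A = 1:5, B = 2:6)
    named_df <- data.frame(matrix(c(1:5, 2:6), nrow = 5,
                                  dimnames = list(1:5, c("A", "B"))))

    .row_names_info(auto_df)   # negative count: automatic row names
    .row_names_info(named_df)  # positive count: explicit (string) row names

    # The de-duplicated labels match what rbind produces for Method 2
    identical(rownames(rbind(named_df, named_df)), deduped)
    ```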

    You can make the two methods behave the same by passing row.names=NULL when creating the data frame from the matrix, which discards the matrix row names in favor of automatic ones:

    df1 <- data.frame(matrix(data = c(1:5,2:6),
                             nrow = 5,
                             dimnames = list(c(1:5),c("A","B"))), row.names=NULL)
    
    dput(df1)
    structure(list(A = 1:5, B = 2:6), class = "data.frame", row.names = c(NA,
    -5L))
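
    To check the fix end to end, the sketch below builds both data frames with row.names = NULL and binds them. It also shows two alternatives not mentioned above, for cases where the inputs already carry string row names: the make.row.names argument of rbind for data frames, and resetting the row names after binding. The helper make_df is a hypothetical name used only for this illustration.

    ```r
    # Hypothetical helper: build the Method 2 data frame, forwarding extra
    # arguments (such as row.names = NULL) to data.frame()
    make_df <- function(...) {
      m <- matrix(c(1:5, 2:6), nrow = 5, dimnames = list(1:5, c("A", "B")))
      data.frame(m, ...)
    }

    # Fix from the answer: drop the matrix row names at construction time
    fixed <- rbind(make_df(row.names = NULL), make_df(row.names = NULL))
    print(rownames(fixed))

    # Alternative 1: ask rbind not to build row names from its inputs
    no_labels <- rbind(make_df(), make_df(), make.row.names = FALSE)
    print(rownames(no_labels))

    # Alternative 2: bind first, then reset to automatic row names
    reset <- rbind(make_df(), make_df())
    rownames(reset) <- NULL
    print(rownames(reset))
    ```

    All three results should carry the row names "1" through "10", the same as Method 1.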