I have the following two sample data frames of different lengths with the same column names:
data1 <- data.frame(name = c("siva", "ramu", "giri"),
                    xx = c(1, 0, 3))
  name xx
1 siva  1
2 ramu  0
3 giri  3
data2 <- data.frame(name = c("siva", "ramya", "giri", "geetha", "pallavi"),
                    xx = c(1, 2, 3, 4, 5))
     name xx
1    siva  1
2   ramya  2
3    giri  3
4  geetha  4
5 pallavi  5
I want to compare the pair of columns in data1 with the corresponding pair of columns in data2, row by row. For example, the first row of data1 is the same as the first row of data2, so the result for this row should be TRUE. The same holds for row 3. For the other rows we should get FALSE.
I tried:
library(arsenal)
comparedf(data1,data2)
Compare Object
Function Call:
comparedf(x = data1, y = data2)
Shared: 2 non-by variables and 3 observations.
Not shared: 0 variables and 2 observations.
Differences found in 2/2 variables compared.
0 variables compared have non-identical attributes.
Is that correct? If it is, I cannot interpret this output.
If you want to use the comparedf function, you need to summarise the results. Without a "by" argument, the data frames are compared row by row (as stated in the help page):
summary(comparedf(data1, data2))
This gives (after omitting some irrelevant output):
Table: Summary of data.frames
version   arg      ncol   nrow
--------  ------  -----  -----
x         data1       2      3
y         data2       2      5
Table: Summary of overall comparison
statistic                                                      value
------------------------------------------------------------  ------
Number of by-variables                                             0
Number of non-by variables in common                               2
Number of variables compared                                       2
Number of variables in x but not y                                 0
Number of variables in y but not x                                 0
Number of variables compared with some values unequal              2
Number of variables compared with all values equal                 0
Number of observations in common                                   3
Number of observations in x but not y                              0
Number of observations in y but not x                              2
Number of observations with some compared variables unequal        1
Number of observations with all compared variables equal           2
Number of values unequal                                           2
Table: Observations not shared
version    ..row.names..   observation
--------  --------------  ------------
y                      4             4
y                      5             5
Table: Differences detected by variable
var.x   var.y     n   NAs
------  ------  ---  ----
name    name      1     0
xx      xx        1     0
Table: Differences detected
var.x   var.y    ..row.names..  values.x   values.y    row.x   row.y
------  ------  --------------  ---------  ---------  ------  ------
name    name                 2  ramu       ramya           2       2
xx      xx                   2  0          2               2       2
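If all you need is the TRUE/FALSE vector described in the question, a base R sketch along the following lines may be enough. This is not part of the comparedf output above. It assumes that rows are matched by position, as in the comparison without a "by" argument, and that rows present in only one data frame should count as FALSE. The names n_common, n_total, idx and row_match are made up for this illustration.

```r
data1 <- data.frame(name = c("siva", "ramu", "giri"),
                    xx = c(1, 0, 3))
data2 <- data.frame(name = c("siva", "ramya", "giri", "geetha", "pallavi"),
                    xx = c(1, 2, 3, 4, 5))

# Rows that exist in both data frames (matched by position, an assumption)
n_common <- min(nrow(data1), nrow(data2))
n_total  <- max(nrow(data1), nrow(data2))
idx <- seq_len(n_common)

# as.character() keeps the comparison valid if name is stored as a factor
same_name <- as.character(data1$name[idx]) == as.character(data2$name[idx])
same_xx   <- data1$xx[idx] == data2$xx[idx]

# Rows without a counterpart in the shorter data frame stay FALSE
row_match <- rep(FALSE, n_total)
row_match[idx] <- same_name & same_xx

row_match
# [1]  TRUE FALSE  TRUE FALSE FALSE
```

This agrees with the summary above: rows 1 and 3 are equal, row 2 differs in both variables, and rows 4 and 5 exist only in data2. If rows should instead be matched on a key column, comparedf accepts a by argument, for example comparedf(data1, data2, by = "name").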