I have a data frame in R which looks like:
| RIC | Date | Open |
|--------|---------------------|--------|
| S1A.PA | 2011-06-30 20:00:00 | 23.7 |
| ABC.PA | 2011-07-03 20:00:00 | 24.31 |
| EFG.PA | 2011-07-04 20:00:00 | 24.495 |
| S1A.PA | 2011-07-05 20:00:00 | 24.23 |
I want to know if there's any duplicates regarding to the combination of RIC and Date. Is there a function for that in R?
You can always try simply passing those first two columns to the function duplicated
:
duplicated(dat[,1:2])
assuming your data frame is called dat
. For more information, we can consult the help files for the duplicated
function by typing ?duplicated
at the console. This will provide the following sentences:
Determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates.
So duplicated
returns a logical vector, which we can then use to extract a subset of dat
:
ind <- duplicated(dat[,1:2])
dat[ind,]
or you can skip the separate assignment step and simply use:
dat[duplicated(dat[,1:2]),]