I'd like to get the unique values from a column in a data frame. With the R package dplyr, this should be possible: distinct(select(dataframe, column)) works great on my Mac. In RStudio on Windows 7, however, I get an error when I run this R code:
library(dplyr)
df <- data.frame(replicate(4,sample(0:1,10,rep=TRUE)))
unique_values <- distinct(select(df, X1))
EDIT
Please check if dplyr::distinct(select(df, X1)) works? – akrun
Of course - here is the console output:
EDIT
I've not used distinct, but perhaps unique would work for you? unique(df$X1)
– NPE
It does work, and it's concise too! I would still like to understand this dplyr error...
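For reference, here is a minimal base R sketch of the unique() suggestion, using the same kind of random 0/1 data frame as in the question. The set.seed() call and the variable names vals and vals_df are my own additions for illustration. No dplyr is needed here.

```r
# Base R only: no dplyr required.
set.seed(1)  # assumption: fixed seed so the example is repeatable
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))

# unique() on the column itself returns a plain vector of the distinct values.
vals <- unique(df$X1)

# unique() on a one-column data frame keeps the data frame shape,
# which is closer to what distinct(select(df, X1)) is meant to return.
vals_df <- unique(df["X1"])

print(vals)
print(vals_df)
```

The vector form is handy for quick checks. The data frame form is the one to use if later code expects a data frame.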
EDIT
Please add the output of sessionInfo() instead. – Roland
EDIT
Some comments note that dplyr_0.2 is an old version. install.packages("dplyr") resolves to a CRAN link for the old package. Now to figure out how to manually install dplyr_0.3.0.2.
Figured it out! Old R means old dplyr, which means no distinct() function.
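A quick way to confirm this diagnosis on any machine is to print the R version and the installed dplyr version, then check whether the package exports distinct() at all. This is only a sketch: the variable name has_distinct is my own, and the messages are illustrative.

```r
# Show which R is running.
cat(R.version.string, "\n")

if (requireNamespace("dplyr", quietly = TRUE)) {
  # Version of dplyr installed in the current library.
  print(packageVersion("dplyr"))

  # TRUE if this dplyr build exports a function called distinct.
  has_distinct <- "distinct" %in% getNamespaceExports("dplyr")
} else {
  message("dplyr is not installed in this library.")
  has_distinct <- FALSE
}

if (!has_distinct) {
  message("distinct() is not available; update R, then reinstall dplyr.")
}
```

Running sessionInfo() after library(dplyr) gives the same version information in one listing.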
To fix this, install the latest version of R:
source: this very nice answer
Then run the command install.packages("dplyr") in the RStudio Console.
Now you can create a data frame and use the distinct() function to get the unique values from one of its columns:
library(dplyr)
# create a data frame with 4 columns (X1..X4) of 10 random 0/1 values each
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))
df
# select one column and keep only its distinct rows (the result is a data frame)
unique_values <- distinct(select(df, X1))
unique_values
In the console you should see:
Thanks to David Arenburg and Richard Scriven for pointing out that dplyr-0.2 is old and lacks the distinct() function. This line of thinking led to the answer.