I'm working with a DataFrame, eventsDF, that includes device and event_timestamp columns.
The command I'm running is:
library(magrittr)
#subsetting the data for MAC-OS & sorting by event-timestamp.
macDF <- eventsDF %>%
  SparkR::select("device", "event_timestamp") %>%
  SparkR::filter("device = macOS") %>%
  SparkR::arrange("event_timestamp")
display(macDF)
And the error I get is:
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'arrange': unable to find an inherited method for function ‘filter’ for signature ‘"character", "missing"’
Any help would be appreciated. Thanks!
I couldn't replicate your error exactly, but I created an example eventsDF data frame in R, converted it to a Spark DataFrame, and adjusted your code a little.
Here's an example in the style you started with. Note the call to SparkR::expr, which lets you provide a SQL expression for Spark to put in the WHERE clause it builds. Because the string is parsed as SQL, the literal macOS must be quoted; unquoted, Spark would treat it as a column name:
library(magrittr)
# Build a small example data frame in R and convert it to a Spark DataFrame
eventsDF <- data.frame(
  device = c("macOS", "redhat", "macOS"),
  event_timestamp = strptime(c('2022-01-13 12:19', '2021-11-14 08:02', '2021-12-01 21:33'),
                             format = "%Y-%m-%d %H:%M")
) %>%
  SparkR::as.DataFrame()
# expr() turns the SQL string into a Column expression that filter() can use
macDF <- eventsDF %>%
  SparkR::select(eventsDF$device, eventsDF$event_timestamp) %>%
  SparkR::filter(SparkR::expr("device='macOS'")) %>%
  SparkR::arrange('event_timestamp') %>%
  display()
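As an aside, SparkR's filter() also accepts the condition as a SQL string directly, so the expr() wrapper is optional as long as the literal is quoted. The sketch below is not from the original post. It assumes a Spark installation that sparkR.session() can start locally (on Databricks a session already exists), calls the functions without pipes, and uses collect() rather than display() so it can run outside a notebook.

```r
library(SparkR)

# Assumption: a local Spark installation is available; this starts or reuses a session.
sparkR.session()

# Small local data frame mirroring the example data, with explicit UTC timestamps.
localDF <- data.frame(
  device = c("macOS", "redhat", "macOS"),
  event_timestamp = as.POSIXct(
    c("2022-01-13 12:19", "2021-11-14 08:02", "2021-12-01 21:33"),
    format = "%Y-%m-%d %H:%M", tz = "UTC"
  )
)
eventsDF <- as.DataFrame(localDF)

# filter() takes either a Column expression or a SQL condition string.
# The literal 'macOS' is quoted so Spark does not read it as a column name.
macDF <- filter(eventsDF, "device = 'macOS'")

# arrange() accepts a column name as a string; ascending order is the default.
macDF <- arrange(macDF, "event_timestamp")

# Pull the (small) result back into a local R data frame for inspection.
result <- collect(macDF)
print(result)
```

Passing the DataFrame explicitly as the first argument also makes it easy to see which object each function receives, which can help when a pipe produces a confusing method-dispatch error.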
How I might do it:
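library(dplyr)
# SparkR is attached after dplyr, so its select/filter/arrange mask dplyr's;
# dplyr is still what provides %>%
library(SparkR)
eventsDF <- data.frame(
  device = c("macOS", "redhat", "macOS"),
  event_timestamp = strptime(c('2022-01-13 12:19', '2021-11-14 08:02', '2021-12-01 21:33'),
                             format = "%Y-%m-%d %H:%M")
) %>%
  as.DataFrame()
# Here the filter condition is a Column expression rather than a SQL string
macDF <- eventsDF %>%
  select(c('device', 'event_timestamp')) %>%
  filter(eventsDF$device == 'macOS') %>%
  arrange('event_timestamp') %>%
  display()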