I am following a tutorial on web (Adobe) analytics in which I want to build a Markov chain model (http://datafeedtoolbox.com/attribution-theory-the-two-best-models-for-algorithmic-marketing-attribution-implemented-in-apache-spark-and-r/).
In the example they use the function concat_ws() (from library(sparklyr)). But it looks like the function does not exist: after installing the package and loading the library, I get an error that the function does not exist...
Comment from the blog author: "concat_ws is a Spark SQL function: https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/sql/functions.html So, you’ll have to rely on sparklyr to have that function work."
My question: are there workarounds to get access to the concat_ws() function?
What is the goal of the function? Concatenates multiple input string columns together into a single string column, using the given separator.
You can simply use paste() from base R. When it is called inside mutate() on a Spark DataFrame, sparklyr translates it into Spark SQL's concat_ws, so the concatenation still runs in Spark:
library(sparklyr)
library(dplyr)

# Connect to a local Spark instance
config <- spark_config()
sc <- spark_connect(master = "local", config = config)

# Build a small example data frame and copy it into Spark
df <- as.data.frame(cbind(c("1", "2", "3"), c("a", "b", "c")))
sdf <- sdf_copy_to(sc, df, overwrite = TRUE)

# paste() with a separator is translated by sparklyr into Spark SQL's concat_ws
sdf %>%
  mutate(concat = paste(V1, V2, sep = "-"))
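
You can also check the SQL that sparklyr generates, and as an alternative you can usually call concat_ws() directly inside mutate(), since dbplyr passes functions it does not recognise straight through to Spark SQL. A minimal sketch, assuming the same sc and sdf from above (the exact SQL output may vary by sparklyr/dbplyr version):

# Inspect the generated SQL; it should contain CONCAT_WS('-', V1, V2)
sdf %>%
  mutate(concat = paste(V1, V2, sep = "-")) %>%
  show_query()

# Alternative: call concat_ws() directly; unrecognised functions are
# passed through unchanged to Spark SQL
sdf %>%
  mutate(concat = concat_ws("-", V1, V2)) %>%
  collect()

Either way the result should have a concat column with the values "1-a", "2-b", "3-c".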