r sparklyr

Concat_ws() function in Sparklyr is missing


I am following a tutorial on web (Adobe) analytics in which I want to build a Markov chain model (http://datafeedtoolbox.com/attribution-theory-the-two-best-models-for-algorithmic-marketing-attribution-implemented-in-apache-spark-and-r/).

The example uses the function concat_ws (from library(sparklyr)), but the function does not seem to exist: after installing the package and loading the library, I get an error saying that the function does not exist.

Comment from the author of the blog: concat_ws is a Spark SQL function: https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/sql/functions.html So, you’ll have to rely on sparklyr to have that function work.

My question: are there workarounds to get access to the concat_ws() function? I tried:

What is the goal of the function? It concatenates multiple input string columns into a single string column, using the given separator.


Solution

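  • You can simply use paste from base R. Inside a dplyr verb on a Spark table, sparklyr translates paste(..., sep = ) into Spark SQL's CONCAT_WS, so the concatenation still runs in Spark.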

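    library(sparklyr)
    library(dplyr)
    
    # Connect to a local Spark instance
    config <- spark_config()
    sc <- spark_connect(master = "local", config = config)
    
    # Small example data frame with two string columns, V1 and V2
    df <- data.frame(V1 = c("1", "2", "3"), V2 = c("a", "b", "c"),
                     stringsAsFactors = FALSE)
    sdf <- sdf_copy_to(sc, df, overwrite = TRUE)
    
    # paste() is translated to Spark SQL, so this is evaluated in Spark
    sdf %>%
      mutate(concat = paste(V1, V2, sep = "-"))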
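
As an alternative, concat_ws can usually be called directly, as long as it is used inside a dplyr verb on a Spark table. As far as I know, concat_ws is not an R function exported by sparklyr (hence the error in the question), but functions that dplyr does not recognize are passed through unchanged to Spark SQL, where concat_ws does exist. The following is a minimal sketch under that assumption; it assumes a working local Spark installation, and the names concat_query and result are only illustrative.

```r
library(sparklyr)
library(dplyr)

# Assumes Spark is installed locally (e.g. via spark_install())
sc <- spark_connect(master = "local")

df <- data.frame(V1 = c("1", "2", "3"), V2 = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
sdf <- sdf_copy_to(sc, df, overwrite = TRUE)

# concat_ws() is not evaluated in R; it is passed as-is to Spark SQL
concat_query <- sdf %>%
  mutate(concat = concat_ws("-", V1, V2))

# Inspect the SQL that will be sent to Spark
show_query(concat_query)

# Bring the result back into R as a local tibble
result <- collect(concat_query)

spark_disconnect(sc)
```

Calling concat_ws("-", "1", "a") at the R prompt, outside mutate(), still fails, because no R function with that name is defined; the call only works when it is translated to SQL and run by Spark.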