I need to convert the rows of a single column into one string variable so I can use it in a WHERE condition when loading from a DB table, instead of loading the entire table.
A sample DataFrame is shown below.
depName | emp_name |
---|---|
develop | Astrid |
develop | Freja |
develop | Wilma |
sales | Maja |
sales | Alice |
personnel | John |
personnel | Marsh |
I expect output like the following. Please help.
val data = "'develop','develop','develop','sales','sales','personnel','personnel'"
I tried the logic below, but the collect method takes a long time.
val result = df.select("depName").collect().map(_.getString(0)).mkString(",")
You need to select the column and collect it, which returns an array of Row. We then map over the rows and use getString to convert each value to a string. Finally, mkString joins them into one overall string with "," as the delimiter.
import sparkSession.implicits._
val df = List(
("develop", "Astrid"),
("develop", "Freja"),
("develop", "Wilma"),
("sales", "Maja"),
("sales", "Alice"),
("personnel", "John"),
("personnel", "Marsh")
).toDF("depName", "emp_name")
val result = df.select("depName").collect().map(_.getString(0)).mkString(",")
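Note that mkString(",") on its own yields unquoted values (develop,develop,...), while the expected output wraps each value in single quotes. The following is a minimal sketch of the quoting step in plain Scala, with no Spark dependency, assuming the string is meant for an IN (...) list where duplicate values are redundant. The names InClauseDemo, quote, build, depNames, inList and whereClause are hypothetical and only serve the illustration.

```scala
// Hypothetical helper: builds a quoted, comma-separated list for a SQL IN clause.
object InClauseDemo {
  // Wraps a value in single quotes, doubling any embedded single quote
  // so the resulting SQL literal stays well-formed.
  private def quote(value: String): String =
    "'" + value.replace("'", "''") + "'"

  // Drops duplicates (keeping first-seen order), quotes each value,
  // and joins the results with commas.
  def build(values: Seq[String]): String =
    values.distinct.map(quote).mkString(",")

  // Plain Scala stand-in for the collected depName column.
  val depNames: Seq[String] =
    Seq("develop", "develop", "develop", "sales", "sales", "personnel", "personnel")

  val inList: String = build(depNames)
  val whereClause: String = s"depName IN ($inList)"
}
```

With a real DataFrame, the input would come from the collected column instead of the hard-coded Seq. Calling distinct() on df.select("depName") before collect() should also reduce the number of rows sent to the driver, which may help with the slow collect, although the actual gain depends on the data and was not measured here.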