I am trying to apply a filter similar to the one below to a DataFrame using Scala:
where col1 = 'abc' and col2 not in (0,4) and col3 in (1,2,3,4)
I tried writing something like this
val finalDf: DataFrame =
initDf.filter(col("col1") ="abc")
.filter(col("col2") <> 0)
.filter(col("col2") <> 4)
.filter(col("col3") = 1 ||col("col3") = 2 ||col("col3") = 3 ||col("col3") = 4)
or
val finalDf: DataFrame =
initDf.filter(col("col1") ="abc")
&& col("col2") != 0 && col("col2") != 4
&& (col("col3") = 1
|| col("col3") = 2
|| col("col3") = 3
|| col("col3") = 4))
Neither of these seems to work. Can anyone help me with this?
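The operators on a Column are a little different from the plain Scala ones (= is assignment, and != compares the objects instead of building a column expression):
For equality, use ===
For inequality, use =!=
If you want to use literals, you can wrap them with the lit function.
Your example may look like this: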
dfMain.filter(col("col1") === lit("abc"))
.filter(col("col2") =!= lit(0))
.filter(col("col2") =!= lit(4))
.filter(col("col3") === lit(1) || col("col3") === lit(2) || col("col3") === lit(3) || col("col3") === lit(4))
You can also use isin instead of the filter with multiple ORs.
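As a rough sketch of the isin approach, the whole condition from the question can go into a single filter. The local SparkSession, the object name IsinFilterExample, and the sample rows are made up for illustration; only the column names and the values in the condition come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object IsinFilterExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the question)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isin-filter")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data using the column names from the question
    val initDf: DataFrame = Seq(
      ("abc", 1, 2),
      ("abc", 0, 1),
      ("abc", 4, 3),
      ("xyz", 1, 1),
      ("abc", 2, 9)
    ).toDF("col1", "col2", "col3")

    // where col1 = 'abc' and col2 not in (0,4) and col3 in (1,2,3,4)
    val finalDf: DataFrame = initDf.filter(
      col("col1") === "abc" &&
        !col("col2").isin(0, 4) &&     // unary ! negates the isin condition
        col("col3").isin(1, 2, 3, 4)
    )

    // With the sample rows above, only ("abc", 1, 2) satisfies all three conditions
    finalDf.show()

    spark.stop()
  }
}
```

Note that, as with NOT IN in SQL, rows where col2 is null do not pass the negated isin condition.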
If you want to find out more about the operators available on columns, you can read this