I have a DataFrame like the one below and need to find the least value in each row, ignoring zeros, and add it as a new column named 'Least'.
Column1 | Column2 | Column3 |
---|---|---|
100.0 | 120.0 | 150.0 |
200.0 | 0.0 | 0.0 |
0.0 | 20.0 | 100.0 |
I tried the least() function, but I didn't get the expected output.
The expected output would look like this:
Column1 | Column2 | Column3 | Least |
---|---|---|---|
100.0 | 120.0 | 150.0 | 100.0 |
200.0 | 0.0 | 0.0 | 200.0 |
0.0 | 20.0 | 100.0 | 20.0 |
You can do something like this to get the least value in each row:
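// col and least come from the Spark SQL functions object
import org.apache.spark.sql.functions.{col, least}
// assumes an existing SparkSession named sparkSession; needed for toDF and $"..."
import sparkSession.implicits._
val df = List(
  (100.0, 120.0, 150.0),
  (200.0, 0.0, 0.0),
  (0.0, 20.0, 100.0)
).toDF("column1", "column2", "column3")
val columns = df.columns.toSeq
// row-wise minimum across all columns (zeros are still included here)
val leastRow = least(
  columns map col: _*
).alias("min")
df.select($"*", leastRow).show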
Try to improve the leastRow expression so that it ignores the zero values. Think about replacing the zero values with the maximum possible float value in your use case, or with Double.PositiveInfinity in general. Do not hesitate to post your work, and be sure that you'll get help! Good luck.
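For reference, here is a minimal sketch of that hint, not a definitive solution. It assumes a local SparkSession, and the names LeastNonZero and withLeastNonZero are hypothetical, introduced only for this illustration. Each zero is swapped for Double.PositiveInfinity with when/otherwise before least is applied, so a zero can never win the comparison.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, least, lit, when}

// Hypothetical helper object, for illustration only.
object LeastNonZero {

  // Adds a "Least" column holding the smallest non-zero value of each row.
  // Assumes every column of df is numeric.
  def withLeastNonZero(df: DataFrame): DataFrame = {
    // Replace each zero with +Infinity so it can never be the row minimum.
    val nonZero = df.columns.map { c =>
      when(col(c) === 0.0, lit(Double.PositiveInfinity)).otherwise(col(c))
    }
    df.withColumn("Least", least(nonZero: _*))
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this small example.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("least-non-zero")
      .getOrCreate()
    import spark.implicits._

    val df = List(
      (100.0, 120.0, 150.0),
      (200.0, 0.0, 0.0),
      (0.0, 20.0, 100.0)
    ).toDF("Column1", "Column2", "Column3")

    withLeastNonZero(df).show()
    spark.stop()
  }
}
```

For the three sample rows, the Least column comes out as 100.0, 200.0 and 20.0, which matches the expected output in the question. Note that a row made up only of zeros would get Infinity with this approach. Since least skips null values, mapping zeros to null instead (a when without otherwise) is an alternative that would give null for such a row.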