I'm using PySpark with hiveContext.sql, and I want to filter out all null and empty values from my data.
I used a simple SQL query to filter out the null values first, but it doesn't work.
My code:
hiveContext.sql("select column1 from table where column2 is not null")
The query works if I leave out the expression "where column2 is not null".
Error:
Py4JJavaError: An error occurred while calling o577.showString
I think my select statement is wrong.
Data example:
column 1 | column 2
null | 1
null | 2
1 | 3
2 | 4
null | 2
3 | 8
Objective:
column 1 | column 2
1 | 3
2 | 4
3 | 8
Thanks
This works for me:
df.na.drop(subset=["column1"])