I would like to get the SUM of each column per year, rather than displaying several individual rows for the same year.
spark.sql("""
SELECT YEAR(date) AS year,
useful, funny, cool
FROM reviews_without_text_table
ORDER by year ASC;
""").show(truncate=False)
Please see the attached screenshot of the current output: https://i.sstatic.net/nYIrA.png
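Wrap each column in SUM and add GROUP BY YEAR(date), so that all rows for the same year collapse into a single row: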
spark.sql("""
SELECT YEAR(date) AS year,
SUM(useful) AS useful, SUM(funny) AS funny, SUM(cool) AS cool
FROM reviews_without_text_table
GROUP BY YEAR(date)
ORDER by year ASC;
""").show(truncate=False)
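
To see what the grouping does without starting a Spark session, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database. SQLite is only a stand-in: it has no YEAR() function, so strftime('%Y', date) is used instead, and the sample rows are made up for illustration and are not taken from the original data.

```python
import sqlite3

# Hypothetical sample rows (date, useful, funny, cool); not from the real table.
rows = [
    ("2015-03-01", 1, 0, 2),
    ("2015-07-19", 3, 1, 0),
    ("2016-01-05", 0, 2, 1),
    ("2016-11-30", 4, 0, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews_without_text_table "
    "(date TEXT, useful INTEGER, funny INTEGER, cool INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews_without_text_table VALUES (?, ?, ?, ?)", rows
)

# SQLite has no YEAR(); strftime('%Y', date) extracts the year instead.
# GROUP BY collapses all rows of a year into one row, and SUM adds up each column.
yearly_totals = conn.execute("""
    SELECT CAST(strftime('%Y', date) AS INTEGER) AS year,
           SUM(useful) AS useful,
           SUM(funny)  AS funny,
           SUM(cool)   AS cool
    FROM reviews_without_text_table
    GROUP BY strftime('%Y', date)
    ORDER BY year ASC
""").fetchall()

for row in yearly_totals:
    print(row)
```

Each year appears once in the result, with the three columns summed over all of that year's rows. Without the GROUP BY, as in the query from the question, every review row is returned separately.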