I am writing a query on this table to get the sum of size for all the directories, group by directory where date is yesterday. I am getting no output from the below query.
test.id test.path test.size test.date
1 this/is/the/path1/fil.txt 232.24 2019-06-01
2 this/is/the/path2/test.txt 324.0 2016-06-01
3 this/is/the/path3/index.txt 12.3 2017-05-01
4 this/is/the/path4/test2.txt 134.0 2019-03-23
5 this/is/the/path1/files.json 2.23 2018-07-23
6 this/is/the/path1/code.java 1.34 2014-03-23
7 this/is/the/path2/data.csv 23.42 2016-06-23
8 this/is/the/path3/test.html 1.33 2018-09-23
9 this/is/the/path4/prog.js 6.356 2019-06-23
4 this/is/the/path4/test2.txt 134.0 2019-04-23
SELECT regexp_replace(path,'[^/]+$',''), sum(cast(size as decimal))
from test WHERE date > date_sub(current_date, 1) GROUP BY path,size;
You must not group by size
, only by regexp_replace(path,'[^/]+$','')
.
Also, since you want only yesterday's rows why do you use WHERE date > '2019%
?
You can get yesterday's date with date_sub(current_date, 1)
:
select
regexp_replace(path,'[^/]+$',''),
sum(cast(size as decimal))
from test
where date = date_sub(current_date, 1)
group by regexp_replace(path,'[^/]+$','');