I would like to know the total size of a table, as well as the file sizes of the files that comprise it.
Using describe detail works DESCRIBE DETAIL table1, but using the information as a table doesn't.
These fail
SELECT sizeInBytes FROM (Describe detail table1)
SELECT sizeInBytes FROM TABLE(Describe detail table1)
with this error
[PARSE_SYNTAX_ERROR] Syntax error at or near 'detail'. SQLSTATE: 42601 line 3, pos 39
== SQL ==
SELECT sizeInBytes FROM TABLE(Describe detail
I am using Unity Catalog for all data. I found this answer but its in Scala.
This works in SQL
select
count(*) as rows,
file_path,
file_size
from
(
select
*,
_metadata.file_path,
_metadata.file_size
FROM
table1
)
group by
file_path,
file_size
Here is Python
spark.sql("describe detail table1").select("sizeInBytes").collect()