In MySQL, we can query the table information_schema.tables and obtain useful information such as data_length or table_rows:
select data_length, table_rows
from information_schema.tables
where table_schema = 'some_db'
  and table_name = 'some_table';
+-------------+------------+
| data_length | table_rows |
+-------------+------------+
|        8368 |        198 |
+-------------+------------+
1 row in set (0.01 sec)
Is there an equivalent mechanism for SparkSQL/Hive? I am okay with using SparkSQL or a programmatic API such as HiveMetaStoreClient (the Java API org.apache.hadoop.hive.metastore.HiveMetaStoreClient). For the latter I read the API doc (here) and could not find any method related to table row counts or sizes.
There is no single command for this meta-information. Rather, there is a set of commands you may use:
Describe Table/View/Column
desc [formatted|extended] schema_name.table_name;
show table extended like part_table;
SHOW TBLPROPERTIES tblname("foo");
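Note that in Hive, row counts and sizes only appear in these outputs after statistics have been gathered with ANALYZE TABLE. A minimal sketch, assuming a placeholder table some_db.some_table:

```sql
-- Gather table-level statistics first; without this, numRows may be
-- missing or stale in the metastore.
ANALYZE TABLE some_db.some_table COMPUTE STATISTICS;

-- The statistics then appear under the "Table Parameters" section of the
-- formatted description, e.g. numRows, totalSize, rawDataSize.
DESCRIBE FORMATTED some_db.some_table;
```

The same numRows/totalSize values are stored as table parameters in the metastore, which is why they only reflect reality after an ANALYZE run (or automatic stats gathering on insert, if enabled).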
Display Column Statistics (Hive 0.14.0 and later)
DESCRIBE FORMATTED [db_name.]table_name column_name;
DESCRIBE FORMATTED [db_name.]table_name column_name PARTITION (partition_spec);
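Column statistics likewise have to be computed before they can be displayed. A sketch using the same placeholder table and hypothetical column names col1 and col2:

```sql
-- Compute column-level statistics for the chosen columns.
ANALYZE TABLE some_db.some_table COMPUTE STATISTICS FOR COLUMNS col1, col2;

-- Then display them per column (min, max, num_nulls, distinct_count, etc.).
DESCRIBE FORMATTED some_db.some_table col1;
```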