In Apache Spark SQL (Azure Databricks), I created a table like this:
CREATE TABLE t(
a BIGINT,
b BIGINT NOT NULL
)
I have verified that I have one nullable column and one non-nullable column: inserting NULL into column b raises an error, while inserting NULL into column a succeeds.
However, the following statement
DESCRIBE TABLE t
does not tell me which columns are nullable and which are not. Neither does DESCRIBE TABLE EXTENDED. How can I get the full description, including nullability?
Spark SQL auxiliary commands like DESCRIBE TABLE
and SHOW COLUMNS
do not display column NULL constraints as per the docs.
There is this command:
SHOW TABLE EXTENDED like 't'
which returns the schema in the information
column along with other details, but the output is not very readable:
//Table Properties: [transient_lastDdlTime=1643396075]
//.....
//Partition Provider: Catalog
//Schema: root
// |-- a: long (nullable = true)
// |-- b: long (nullable = false)
But if you can use spark.catalog
with the Scala/Python API, then this is simple:
spark.catalog.listColumns("t").show()
//+----+-----------+--------+--------+-----------+--------+
//|name|description|dataType|nullable|isPartition|isBucket|
//+----+-----------+--------+--------+-----------+--------+
//| a| null| bigint| true| false| false|
//| b| null| bigint| false| false| false|
//+----+-----------+--------+--------+-----------+--------+