Tags: apache-spark, apache-spark-sql, databricks, azure-databricks

DESCRIBE TABLE: see which columns are NOT NULL


In Apache Spark SQL (Azure Databricks), I created a table like this:

CREATE TABLE t(
    a BIGINT,
    b BIGINT NOT NULL
)

I have verified that I have one nullable column and one non-nullable column: I get an error if I try to insert NULL into column b, but I can insert NULL into column a.
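
A sketch of that check (run from a Scala cell, assuming the SparkSession is available as spark; the INSERT statements themselves are plain SQL):

spark.sql("INSERT INTO t VALUES (NULL, 1)")  // accepted: column a is nullable
spark.sql("INSERT INTO t VALUES (1, NULL)")  // rejected: column b is NOT NULL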

However, the following statement

DESCRIBE TABLE t

does not tell me which columns are nullable and which are not null. Neither does DESCRIBE TABLE EXTENDED. How can I get the full description?


Solution

  • Spark SQL auxiliary commands like DESCRIBE TABLE and SHOW COLUMNS do not display column NULL constraints, as noted in the docs.
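
    You can check this on the table from the question: the DESCRIBE TABLE output only has col_name, data_type and comment columns, with nothing about nullability. Roughly (sketched from a Scala session where the SparkSession is available as spark; exact formatting varies by version):

    spark.sql("DESCRIBE TABLE t").show()
    //+--------+---------+-------+
    //|col_name|data_type|comment|
    //+--------+---------+-------+
    //|       a|   bigint|   null|
    //|       b|   bigint|   null|
    //+--------+---------+-------+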

    There is this command:

    SHOW TABLE EXTENDED LIKE 't'
    

    which returns the schema in the information column, along with other details, but it is not very readable:

    //Table Properties: [transient_lastDdlTime=1643396075]
    //.....
    //Partition Provider: Catalog
    //Schema: root
    // |-- a: long (nullable = true)
    // |-- b: long (nullable = false)
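
    That Schema: root block is the same tree the DataFrame API prints, so if you just want a readable version of it and can run Scala/Python code, printSchema on the table works as well (a small sketch, assuming the usual spark session):

    spark.table("t").printSchema()
    //root
    // |-- a: long (nullable = true)
    // |-- b: long (nullable = false)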
    

    But if you can use the spark.catalog API (Scala/Python), then this is simple:

    spark.catalog.listColumns("t").show()
    //+----+-----------+--------+--------+-----------+--------+
    //|name|description|dataType|nullable|isPartition|isBucket|
    //+----+-----------+--------+--------+-----------+--------+
    //|   a|       null|  bigint|    true|      false|   false|
    //|   b|       null|  bigint|   false|      false|   false|
    //+----+-----------+--------+--------+-----------+--------+
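
    And since listColumns returns a Dataset of column metadata, you can filter it down to just the NOT NULL columns, e.g. (a Scala sketch for the same table t):

    //keep only the columns that are NOT NULL
    spark.catalog.listColumns("t")
      .filter(c => !c.nullable)
      .select("name")
      .show()
    //+----+
    //|name|
    //+----+
    //|   b|
    //+----+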