sqlapache-sparkapache-spark-sqldatabricks

Escape a single quote in plain Databricks SQL


I want to count rows dynamically from multiple tables in a schema in (preferably) plain Databricks SQL. I also want the tablename to be displayed before the number of rows.

Something like this:

table name rowcount
sales_customers 134
sales_franchises 200
sales_suppliers 99

With this query:

select 'select ' || table_name || ', count (*) from ' || table_catalog || '.' || 
table_schema || '.' || table_name || ';'
from system.information_schema.tables
where table_catalog ='samples'
and table_schema = 'bakehouse'

I get this output:

select sales_customers, count (*) from samples.bakehouse.sales_customers;
select sales_franchises, count (*) from samples.bakehouse.sales_franchises;
select sales_suppliers, count (*) from samples.bakehouse.sales_suppliers;

I would then copy and paste the output in a new window and execute it and get the rowcount per table.

But the table name needs single quotes to be able to run, like this:

select 'sales_customers', count (*) from samples.bakehouse.sales_customers
etc...

How can I put single quotes around the table name in the first query, so single quotes are around the table name in the output? I tried different things, like 3 single quotes, a backslash, backticks, but I can't get it to work. Or should I forget about Databricks SQL for this and use spark sql in python?


Solution

  • Use \ to escape single chars. You also might wanna provide an alias for first column (AS table_name).

    select 'select \'' || table_name || '\' AS t_name, count (*) from ' 
    || table_catalog || '.' || table_schema || '.' || table_name || ';'
    from system.information_schema.tables
    where table_catalog ='samples'
    and table_schema = 'bakehouse'
    

    Also if you throw in a UNION at the end then you'll get a single query (delete the last UNION of course)

    select 'select \'' || table_name || '\' AS t_name, count (*) from ' 
    || table_catalog || '.' || table_schema || '.' || table_name || ' UNION'
    from ...
    

    will give you:

    select 'sales_customers' AS t_name, count (*) from samples.bakehouse.sales_customers UNION
    select 'sales_franchises' AS t_name, count (*) from samples.bakehouse.sales_franchises UNION
    select 'sales_suppliers' AS t_name, count (*) from samples.bakehouse.sales_suppliers UNION
    

    replace last UNION with ; and it should give you exact output you want.


    It's documented here.

    char

    One character from the character set. Use \ to escape special characters (e.g., ' or ). To represent unicode characters, use 16-bit or 32-bit unicode escape of the form \uxxxx or \Uxxxxxxxx, where xxxx and xxxxxxxx are 16-bit and 32-bit code points in hexadecimal respectively (e.g., \u3042 for あ and \U0001F44D for 👍).

    >>> SELECT 'it\'s $10.' AS col;
    +---------+
    |      col|
    +---------+
    |It's $10.|
    +---------+
    

    Or Databricks' documentation here.