sqlselectdbtcolumn-alias

Why would using column aliases in a select statement fill in null values with the column name?


I am using DBT to select data with column aliases, and I am wondering why the column name would become the default value if there are no values in the column I am renaming. How would you fix this select statement so that it doesn't fill in the null values with the column name?

Walkthrough:

I have a table that is basically empty in my database which I generated from dbt seed.

my_data
----┬------┬------
 A  |  B 1 |  C 2
----┼------┼------
 g  |      |  
----┼------┼------    
 h  |      |     
----┴------┴------

and then I select this data in dbt_model and do a dbt run.

select 
    A as column1, 
    'B 1' as column2, 
    'C 2' as column3
from {{ ref(`my_data`)}}

for some reason this ends up with a table looking like this

dbt_model
---------┬---------┬---------
 column1 | column2 | column3
---------┼---------┼---------
    g    |   B 1   |   C 2  
---------┼---------┼---------   
    h    |   B 1   |   C 2  
---------┴---------┴---------

but that isn't what I wanted. I wanted this. So how do I do a column alias to rename the columns in the select statement? Why would it behave this way?

dbt_model
---------┬---------┬---------
 column1 | column2 | column3
---------┼---------┼---------
    g    |         |        
---------┼---------┼---------   
    h    |         |        
---------┴---------┴---------

Solution

  • In most SQL dialects, single quotes are used as string literals. So in your query:

    select 
        A as column1, 
        'B 1' as column2, 
        'C 2' as column3
    from {{ ref(`my_data`)}}
    

    You are not selecting from the columns called B 1 and C 2, you are selecting the string literals "B 1" and "C 2", which is why your result is what it is.

    In Postgres-like databases, you should use double quotes for names with spaces in them:

    select 
        A as column1, 
        "B 1" as column2, 
        "C 2" as column3
    from {{ ref(`my_data`)}}
    

    In MySQL-like dialects (including Google BigQuery), you should use backticks instead:

    select 
        A as column1, 
        `B 1` as column2, 
        `C 2` as column3
    from {{ ref(`my_data`)}}