sqlgoogle-bigquerysql-date-functionssqldatetime

Dynamic Column alias based on a function in BigQuery


I am trying to alias a column based on a DateTime function in BigQuery. Below is my query snippet,

SELECT
    SUM(CASE 
            WHEN CAST(AsOfDate AS DATE) >= DATE_TRUNC(CURRENT_DATE,MONTH) 
                THEN Debit - Credit 
            ELSE 0 
        END) AS FORMAT_DATETIME("%B %G", DATETIME(DATE_TRUNC(CURRENT_DATE,MONTH)))
FROM 
    TableName

Obviously, I am getting a syntax error trying to use a DateTime function as a column alias in a query.

Syntax error: Expected end of input but got "("

The code

FORMAT_DATETIME("%B %G", DATETIME(DATE_TRUNC(CURRENT_DATE, MONTH))) 

run in BigQuery returns the value "November 2024", and I want this to be the alias of the column and it would be "December 2024" in the next month (changing dynamically depending on the month/year we are in).

Is there a way to dynamically alias a column name based on a DateTime function in BigQuery?


Solution

  • You can do a lot of things in BigQuery. Often, the question you should really be asking yourself is not 'can I?' but 'should I?'

    Let's take 'can I?' first.

    The method for these types of question is almost always the same: EXECUTE IMMEDIATE, a BigQuery instruction that allows you to run a string as a SQL query, along with judicious use of the string concatenation operator ||.

    Here is a simple and relevant example:

    EXECUTE IMMEDIATE "SELECT 1 AS " || (SELECT FORMAT_DATE('`%B %G`', CURRENT_DATE))
    

    This will create two BigQuery jobs:

    1. an 'outer' job that constructs a SQL query
    2. an 'inner' job that executes the constructed SQL query

    Here's what you get as output: The two jobs that the example query creates

    You can see the two jobs. Note that the second is executing this SQL command:

    SELECT 1 AS `November 2024`
    

    (the back-ticks are required, because you have a space in your column name)

    If you click 'view results' on the second job, you get this:

    Results of the job - one row and one column, containing the value '1' and with the column named 'November 2024'

    You can see that this essentially does what you want.

    Finally, note that you can use triple quotes to multi-line your strings; this makes it a bit easier to write a long SQL query using EXECUTE IMMEDIATE. It would look like this:

    EXECUTE IMMEDIATE """
    SELECT 1 AS """ 
    || (SELECT FORMAT_DATE('`%B %G`', CURRENT_DATE)) || ","
    || """
    'example' AS col_b,
    """
    

    (this one is like the previous example, but adds a second column, col_b, with value 'example')

    Now for 'should I?'

    Similarly, the answer for these types of questions is almost always the same: no.

    As we've seen, what you are trying to do is technically possible. But you are definitely going against all sorts of principles.

    Column names are supposed to be static identifiers that tell you what sort of thing each element in a row is supposed to be.

    Column names shouldn't be dates. A date is a property of an observation/record (i.e. a row), and so it should be stored as a value in a column called something like observation_date.

    Column names should be maximally machine-readable; stick to snake_case (no capital letters, underscores used to separate words, definitely no spaces). This will save you from a world of backtick-pain.


    Usually when people ask questions like this (and I'm only guessing here) it's because they are trying to push data presentation logic into their analytic/data retrieval queries. Don't be tempted!

    Spreadsheets and data visualisation tools have a place. That place is at the end of the data workflow. One thing spreadsheets are much better at than BigQuery is constructing pivot tables, i.e. taking something like

    obs_date,    country, val
    -------------------------
    2024-01-01   GB       1
    2024-02-01   GB       5
    2024-01-01   US       3
    2024-02-01   US       2
    

    and presenting it as

       | Jan 2024   Feb 2024
    ---+--------------------
    GB |     1          3
    US |     2          5
    

    This is a perfectly valid and useful way to present data, but it's best to use the right tool for that job. Usually it's best to store, retrieve, and transmit data in a 'tidy' format and then manipulate it into the form you want it at the point of presentation (or let the person you are sending it to do so).