google-cloud-platformgoogle-bigquerylexical-analysis

What is a "path_expression" in BigQuery


BigQuery describes a path_expression in the Syntax page as follows:

A path expression describes how to navigate to an object in a graph of objects and generally follows this structure...

Examples:

foo.bar
foo.bar/25
foo/bar:25
foo/bar/25-31
/foo/bar
/25/foo/bar

What are some actual examples of a Path Expression with a valid table, for example in a CTE? My thinking was that a path expression would be something that explicitly qualifies a field or struct sub-field, such as:

myTable.myField.mySubfield

But from the above syntax, which allows for:

/:-

I'm not exactly sure what it is or how it would be used. Could someone show a real-world example of how a path expression would be used?


Solution

  • Path expressions describe where to search for (or store) data. Consider this example from the docs:

    #legacySQL
    SELECT
      weight_pounds, state, year, gestation_weeks
    FROM
      [bigquery-public-data:samples.natality]
    ORDER BY
      weight_pounds DESC
    LIMIT
      10;
    

    In this example, the portion [bigquery-public-data:samples.natality] is the path_expression. It is saying to look at the natality table in the samples database in the bigquery-public-data project.

    But from the above syntax, which allows for: /:-

    This would actually not be allowed, as a path_expression.

    This example would be parsed as:

    /:-
    
    {first_part}/{subsequent_part}:{subsequent_part}
    
    {{ unquoted_identifier | quoted_identifier }} / 
    {{ unquoted_identifier | quoted_identifier | number }} : 
    {{ unquoted_identifier | quoted_identifier | number }}
    

    An unquoted_identifier must at least begin with a letter or an underscore. A quoted_identifier can contain any character, but cannot be empty. The empty string therefore cannot be considered an unquoted_identifier, quoted_identifier, or number, so this expression is invalid (and is invalid in 3 positions).

    A possible minimal path_expression could be something like:

    a:b.c
    

    meaning look in the c table in the b database in the a project.