I'm using dbt on a Databricks platform. I'm creating a new table - something I've done dozens of times before without problems, but this one's making dbt choke. Here's my sql script:
{% set key_columns = ['transaction_originator','dlk_bnz_insert_timestamp'] %}
{{
config(alias='bnz_std_sqlbi01_user_transaction_file',
materialized='table',
unique_key=key_columns,
file_format='delta'
)
}}
SELECT
TRIM(CAST(transaction_originator AS STRING)) AS transaction_originator,
CAST(ab_number AS BIGINT) AS ab_number,
TRIM(CAST(alpha_name AS STRING)) AS alpha_name,
TRIM(CAST(emcu AS STRING)) AS emcu,
CASE WHEN LOWER(active_user) = 'true' THEN True ELSE False END AS active_user,
TRIM(CAST(location AS STRING)) AS location,
TRIM(CAST(status AS STRING)) AS status,
'sqlbi01' AS dlk_src_name,
_metadata.file_modification_time AS dlk_bnz_insert_timestamp,
CURRENT_TIMESTAMP() AS dlk_bnz_std_insert_timestamp
FROM {{ source('bronze', 'bnz_raw_sqlbi01_user_transaction_file') }}
WHERE transaction_originator IS NOT NULL
There's a matching set of entries in the schema.yml that I won't post because I don't think they're relevant. When I run dbt to create & populate the table, it fails, the relevant bit of the log file saying this:
14:09:49 Database Error in model bnz_std_sqlbi01_user_transaction_file (models/bronze/bnz_std_sqlbi01_user_transaction_file.sql)
[PARSE_SYNTAX_ERROR] Syntax error at or near ''. SQLSTATE: 42601 (line 16, pos 6)
== SQL ==
/* {"app": "dbt", "dbt_version": "1.9.4", "dbt_databricks_version": "1.10.1", "databricks_sql_connector_version": "4.0.3", "profile_name": "datalakehouse", "target_name": "databricks_cluster", "node_id": "model.datalakehouse.bnz_std_sqlbi01_user_transaction_file"} */
create or replace table `dev`.`bronze`.`bnz_std_sqlbi01_user_transaction_file`
using delta
as
------^^^
SELECT
TRIM(CAST(transaction_originator AS STRING)) AS transaction_originator,
CAST(ab_number AS BIGINT) AS ab_number,
TRIM(CAST(alpha_name AS STRING)) AS alpha_name,
TRIM(CAST(emcu AS STRING)) AS emcu,
CASE WHEN LOWER(active_user) = 'true' THEN True ELSE False END AS active_user,
TRIM(CAST(location AS STRING)) AS location,
TRIM(CAST(status AS STRING)) AS status,
'sqlbi01' AS dlk_src_name,
_metadata.file_modification_time AS dlk_bnz_insert_timestamp,
CURRENT_TIMESTAMP() AS dlk_bnz_std_insert_timestamp
FROM `prod`.`bronze`.`bnz_raw_sqlbi01_user_transaction_file`
WHERE transaction_originator IS NOT NULL
compiled code at target/run/datalakehouse/models/bronze/bnz_std_sqlbi01_user_transaction_file.sql
14:09:49
The specific character it's choking on, the one under the as, is a U+FEFF character that's being introduced from somewhere. I've checked and double checked my source files for unprintable characters - they're clean.
Where the heck is this character coming from, and how can I get rid of it?
OK, so I never figured out what was going wrong here, but abandoning the git patch I was working on and starting a new one from scratch got it to work.
Just some random glitch, I guess.