amazon-redshiftredshift-query

Amazon Redshift Literals from Union Interaction


When unioning two tables together and selecting a string literal like: SELECT 'Literal' as col1, it appears filtering where col1 = 'Literal' returns no rows.


create table schema.t1 (col1 INT8, col2 varchar(max));
insert into schema.t1 (col1, col2) VALUES (1, 'A'), (2, 'B'), (3, NULL);
create table schema.t2 (col1 INT8);
insert into schema.t2 (col1) values (4);

with cte as (
select col1, coalesce(col2, 'None') "col2" from schema.t1 
UNION 
select col1, 'Literal' from schema.t2 
)
select * from cte where col2 = 'Literal'
;

It is apparent that col2 = 'Literal' should return a row, but I see 0 results when running the above.

Wrapping Trim() around the column appears to return results:

with cte as (
select col1, coalesce(col2, 'None') "col2" from schema.t1 
UNION 
select col1, 'Literal' from schema.t2 
)

select * from cte where trim(col2) = 'Literal';

And omitting the UNION returns results as well:

with cte as (
    SELECT col1, 'Literal'::text as "col2" from schema.t2
)
select * from cte where col2 = 'Literal'

I've attempted to cast the literal to different types, but it still fails to return results.


Solution

  • Running your query on Redshift Serverless produces the expected result

    col1,      col2
    4,         Literal
    

    What appears to be happening to you is the misapplication of a WHERE clause to the t2 table. Redshift pushes WHERE clauses down to the table scan and uses metadata to decide if data is needed. It looks like Redshift is doing this for t2 which doesn't meet the criteria in its metadata. By adding trim() to the WHERE clause you break RS's ability to test the metadata allowing the t2 row to be scanned. Since I cannot reproduce what you are seeing I can only give a few ideas.

    1. Check that your Redshift cluster is up to date in version and patches. If you are out of date then this could be an already fixed issue.
    2. Test to see if explicitly naming the column in the post-union select gives a needed clue 'select col1, 'Literal' as col2 from t2'.
    3. Test if having t2 first in the UNION changes behavior but you will need to explicitly name col2 as the first select in a UNION sets the column names.

    If your cluster is up to date and is misapplying a WHERE clause you can submit a bug to AWS but check all the bases first.