sqlsnowflake-cloud-data-platform

Snowflake correlated subqueries


I've found other questions with similar issues, but the answers don't match my needs and I don't see how to adapt them. The problem, of course, is that Snowflake doesn't support most correlated subqueries and I don't know how to achieve what I want to do without one. Here's what I'd like to be able to do:

SELECT
   IH.ID
   , (SELECT "DESCRIPTION" 
      FROM DISPUTE_REASON 
      WHERE INVOICE_HEADER_ID = IH.ID 
      ORDER BY CREATED_AT DESC LIMIT 1) DISPUTE_REASON
FROM
   INVOICE_HEADER IH

The key is I need to be able to get the most-recently-created record in the DISPUTE_REASON table within the scope of an invoice header ID. I've tried using a function to do it, but I get the same error, "Unsupported subquery type cannot be evaluated". I've also tried using FIRST_VALUE with GROUP BY, but they don't play together.

Can anyone think of a way to do this without using a correlated subquery?


Solution

  • I would just use a window function like this:

    SELECT IH.ID, dr."DESCRIPTION" as DISPUTE_REASON
    FROM INVOICE_HEADER IH LEFT JOIN
         (SELECT dr.*,
                 ROW_NUMBER() OVER (PARTITION BY dr.INVOICE_HEADER_ID ORDER BY dr.CREATED_AT DESC) as seqnum
          FROM DISPUTE_REASON dr
         ) dr
         ON dr.INVOICE_HEADER_ID = IH.ID AND dr.seqnum = 1;
    

    Snowflake also supports lateral joins, so I'm surprised your version doesn't work. Try this:

    SELECT IH.ID, dr."DESCRIPTION" as DISPUTE_REASON
    FROM INVOICE_HEADER IH LEFT JOIN LATERAL
         (SELECT dr.*
          FROM DISPUTE_REASON dr
          WHERE dr.INVOICE_HEADER_ID = IH.ID
          ORDER BY dr.CREATED_AT DESC
          LIMIT 1
         ) dr;