sqlpivotcrosstab

SQL convert row values to column headers


I have the following table:

tableA
+-----------+--------+
| tableA_id |  code  |
+-----------+--------+
|         1 | code A |
|         2 | code B |
|         3 | code A |
|         3 | code C |
|         3 | code B |
|         4 | code A |
|         4 | code C |
|         4 | code B |
|         5 | code A |
|         5 | code C |
|         5 | code B |
+-----------+--------+

I want to use a query to display code A, code B, code C as the column headers and then the values would display whether or not the tableA_id entry contains that code in the code field. So something like this:

+-----------+------------------------------+
| tableA_id |  code A |  code B  |  code C |
+-----------+------------------------------+
|         1 |   yes   |          |         |
|         2 |         |   yes    |   yes   |
|         3 |   yes   |   yes    |   yes   |

etc...

Can you do this in SQL?


Solution

  • The challenge with this question is that code column can contain arbitrary list of values. Normally PIVOT requires that values inside IN are provided as constants, though with Snowflake it is possible to useANY or a subquery.

    This pattern is called "dynamic PIVOT":

    SELECT *
    FROM tableA
    PIVOT (MIN(code) FOR code IN (ANY)) AS pvt
    ORDER BY tableA_id;
    

    enter image description here

    Version with a subquery:

    SELECT *
    FROM tableA
    PIVOT (MIN(code) FOR code IN (SELECT code FROM tableA)) AS pvt
    ORDER BY tableA_id;
    

    It is also possible to handle NULLs values in pivoted columns at once by using DEFAULT ON NULL (<replacement_value>):

    SELECT *
    FROM tableA
    PIVOT (MIN(code) FOR code IN (ANY) DEFAULT ON NULL ('empty')) AS pvt
    ORDER BY tableA_id;
    


    Previous version:

    Using conditional aggregation(portable between different dialects):

    SELECT tableA_id,
           MAX(CASE WHEN code ='code A' THEN 'yes' END) AS "code A",
           MAX(CASE WHEN code ='code B' THEN 'yes' END) AS "code B",
           MAX(CASE WHEN code ='code C' THEN 'yes' END) AS "code C"
    FROM tableA
    GROUP BY tableA_id;
    

    Output:

    ╔════════════╦═════════╦═════════╦════════╗
    ║ tableA_id  ║ code A  ║ code B  ║ code C ║
    ╠════════════╬═════════╬═════════╬════════╣
    ║         1  ║ yes     ║ (null)  ║ (null) ║
    ║         2  ║ (null)  ║ yes     ║ (null) ║
    ║         3  ║ yes     ║ yes     ║ yes    ║
    ║         4  ║ yes     ║ yes     ║ yes    ║
    ║         5  ║ yes     ║ yes     ║ yes    ║
    ╚════════════╩═════════╩═════════╩════════╝
    

    There are many possibilities(search for):

    PIVOT            -> SQL Server/Oracle
    CROSSTAB         -> Postgresql
    SELF OUTER JOIN  -> All
    CONDITIONAL AGG  -> All
    ...