I have a dataset with variables ID, Vaccine, and Date. Some Vaccines have a single date, and some vaccines have two or three dates for a single ID. When I create a table in SQL, multiple rows are created for the IDs that have two or more dates for a vaccine. See an example below:
ID | Vaccine | Date |
---|---|---|
A | Flu | 1/15/2005 |
A | Flu | 3/25/2005 |
B | Flu | 1/15/2005 |
B | TB | 1/15/2005 |
B | TB | 2/15/2005 |
B | TB | 3/15/2005 |
B | Flu | 4/18/2005 |
C | Flu | 1/15/2005 |
C | TB | 5/21/2005 |
How can I create a table as below in SQL?
ID | Vaccine | Date1 | Date2 | Date3 |
---|---|---|---|---|
A | Flu | 1/15/2005 | 3/25/2005 | |
B | Flu | 1/15/2005 | 4/18/2005 | |
B | TB | 1/15/2005 | 2/15/2005 | 3/15/2005 |
C | Flu | 1/15/2005 | ||
C | TB | 5/21/2005 |
Thanks
You can try something like this:
SELECT
ID,
Vaccine,
MAX(CASE WHEN rn = 1 THEN Date END) AS Date1,
MAX(CASE WHEN rn = 2 THEN Date END) AS Date2,
MAX(CASE WHEN rn = 3 THEN Date END) AS Date3
FROM (
SELECT
ID,
Vaccine,
Date,
ROW_NUMBER() OVER (PARTITION BY ID, Vaccine ORDER BY Date) AS rn
FROM
YourTableName
) AS subquery
GROUP BY
ID, Vaccine;
In this query:
Make sure to replace YourTableName with the actual name of your table. This query will transform the data into the format you desire, with Date1, Date2, and Date3 columns indicating the different dates for each ID and Vaccine combination.