I need to modify a relatively small subset of rows that - for whatever reason I still need to investigate - did not get their data populated appropriately.
Working on health data, to determine prescription end dates, if no value is provided, we perform some logic (see below) to populate this. Unfortunately, some rows did not populate (< 100 out of tens of millions of rows) but NA values for the end date are disallowed by convention. I've already loaded everything into our staging DB, but to upload it to prod, I have to sort out these invalid dates. I'm unsure how to do this in SQL Server as I've never really ventured past selecting, filtering and insertion of data, very basic stuff.
One thing I DO NOT want to do is apply any logic to the entire table, I only want to modify the < 100 rows that are invalid.
If I was to do this in R's dplyr
library it looks something like this:
#semi-pseudocode
library(dplyr)
table <- read_csv("my_data.csv") |>
filter(is.na(drug_exposure_end_date)) |>
mutate(drug_exposure_end_date = as_date(coalesce(drug_exposure_end_datetime,
drug_exposure_start_datetime + (days_supply - 1L),
drug_exposure_start_datetime + ddays(29L))))
I'm not sure how I can get a similar result in SQL, and really the only operation I want to do is to just add 29 days, in these invalid rows, days_supply
and drug_exposure_end_datetime
are both NULL
.
This is how the data is in my database:
drug_exposure_id drug_concept_id drug_exposure_start_date drug_exposure_start_datetime drug_exposure_end_date drug_exposure_end_datetime verbatim_end_date drug_type_concept_id stop_reason refills quantity days_supply sig route_concept_id lot_number provider_id visit_occurrence_id visit_detail_id drug_source_value drug_source_concept_id route_source_value dose_unit_source_value
13903618 21059132 2015-03-24 2015-03-24 00:00:00.000 NULL NULL NULL 32833 NULL NULL NULL NULL NULL 0 NULL NULL NULL NULL [180840] PHENYLEPHRINE 2.5% EYE DROPS PRESERVA... 0 NULL NULL
14097493 21177049 2015-04-01 2015-04-01 00:00:00.000 NULL NULL NULL 32833 NULL NULL NULL NULL NULL 0 NULL NULL NULL NULL [183044] CYCLOPENTOLATE 1% EYE DROPS PRESERVAT... 0 NULL NULL
14109189 21059132 2015-01-16 2015-01-16 00:00:00.000 NULL NULL NULL 32833 NULL NULL NULL NULL NULL 0 NULL NULL NULL NULL [180840] PHENYLEPHRINE 2.5% EYE DROPS PRESERVA... 0 NULL NULL
14338276 21177049 2015-04-01 2015-04-01 00:00:00.000 NULL NULL NULL 32833 NULL NULL NULL NULL NULL 0 NULL NULL NULL NULL [183044] CYCLOPENTOLATE 1% EYE DROPS PRESERVAT... 0 NULL NULL
UPDATE your_table_name
SET drug_exposure_end_date = DATEADD(DAY, 29, drug_exposure_start_datetime)
WHERE drug_exposure_end_date IS NULL
AND drug_exposure_end_datetime IS NULL
AND days_supply IS NULL;