I have a table called referrals
CREATE TABLE [dataproduct].[referrals]
(
[person_id] DOUBLE,
[medical_service_cd] DOUBLE,
[refer_from_organization_id]
[referral_org_name] STRING,
[refer_from_provider_id] DOUBLE,
[provider_from_name] STRING,
[refer_to_provider_id] DOUBLE,
[provider_to_name] STRING,
[outbound_encntr_id DOUBLE,
[order_id] DOUBLE,
[referral_written_dt_tm] STRING,
[requested_start_dt_tm] STRING,
[medical_service] STRING,
[referral_status] STRING,
[loc_code] STRING,
[service_type] STRING,
[referral_reason] STRING,
[treatment_text] STRING
)
I want to find patients, by person_id, who were referred to Neurology FOLLOWED BY Medical Genetics (as listed under medical_service) using referral_written_dt_tm. I confirmed that some patients had both medical services, but want to find the ones who had them in the desired order. so I tried this:
SELECT r.person_id,
COUNT(DISTINCT r.medical_service) as count,
row_number() OVER (PARTITION BY person_id ORDER BY referral_written_dt_tm ASC) AS row_num
FROM matt_dataproduct.referrals r
WHERE r.medical_service IN ('Medical Genetics',
'Neurology'
)
GROUP BY r.person_id,
HAVING count > 1;
I got no results. I just want a list of patients who meet the criteria (medical genetics followed by neurology). Is there some way I should revise my query? Something about it seems off.
One way is probably something like the following:
SELECT r.person_id
FROM matt_dataproduct.referrals rmg INNER JOIN matt_dataproduct.referrals rn
ON rmg.person_id = rn.person_id
WHERE rmg.medical_service ='Medical Genetics'
AND rn.medical_service = 'Neurology'
AND rn.eferral_written_dt_tm < rmg.eferral_written_dt_tm
GROUP BY r.person_id
This simply joins the table to itself (once for Neurology and once for Medical Genetics) and finds rows where the same patient has Neurology before Medical Genetics. The group by
is being used to get distinct values (could use distinct
in the select
part instead).
You said the type of eferral_written_dt_tm
is a string (not an actual date). I'm assuming it's directly comparable without conversion (since you used it in your window function's order by
), but if not you would replace rn.eferral_written_dt_tm < rmg.eferral_written_dt_tm
with the appropriate functions or conversions so that it's comparable.
You could use the row_number
window function as you tried (or maybe a variation with lag
) by wrapping it in another query to compare values. But in this case I don't see any big advantage over the simple self-join.