I am trying to create a new data extract from a (badly designed) sql database. The customer requires that I add a distinctidentifier which I am attempting to do using the NEWID() function. Unfortunately this leads to multiple duplicate records being returned.
After a bit of research I have found that the NEWID() function does indeed 'undo' the use of the distinct keyword, but I cannot work out why or how to overcome this.
An example of the query I am trying to write is as follows:
select distinct
NEWID() as UUID
,Histo_Results_File.ISRN
,Histo_Results_File.Internal_Patient_No
,Histo_Results_File.Date_of_Birth
,Histo_Result_freetext.histo_report
,Histo_Report.Date_Report_Updated as [Investigation_Result_Date]
from apex.Histo_Results_File
inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)
If I miss out the NEWID() line in the select block, I get 569 records returned, which is correct, but if I include that line then I get in excess of 30,000 which are all duplicates of the original 569 but with different IDs. Can anyone suggest a way around this problem?
Thanks in advance
Use a sub query would be the easiest way to do it.
SELECT NEWID() as UUID
, * -- this is everything from below
FROM (
select distinct
Histo_Results_File.ISRN
,Histo_Results_File.Internal_Patient_No
,Histo_Results_File.Date_of_Birth
,Histo_Result_freetext.histo_report
,Histo_Report.Date_Report_Updated as [Investigation_Result_Date]
from apex.Histo_Results_File
inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)) as mySub