sql-server-2008distinctnewid

Use NEWID() without losing distinct?


I am trying to create a new data extract from a (badly designed) sql database. The customer requires that I add a distinctidentifier which I am attempting to do using the NEWID() function. Unfortunately this leads to multiple duplicate records being returned.

After a bit of research I have found that the NEWID() function does indeed 'undo' the use of the distinct keyword, but I cannot work out why or how to overcome this.

An example of the query I am trying to write is as follows:

select distinct

    NEWID() as UUID
    ,Histo_Results_File.ISRN
    ,Histo_Results_File.Internal_Patient_No
    ,Histo_Results_File.Date_of_Birth
    ,Histo_Result_freetext.histo_report
    ,Histo_Report.Date_Report_Updated  as [Investigation_Result_Date]

from apex.Histo_Results_File
            inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)

If I miss out the NEWID() line in the select block, I get 569 records returned, which is correct, but if I include that line then I get in excess of 30,000 which are all duplicates of the original 569 but with different IDs. Can anyone suggest a way around this problem?

Thanks in advance


Solution

  • Use a sub query would be the easiest way to do it.

    SELECT NEWID() as UUID
    , * -- this is everything from below
    FROM (
    select distinct
         Histo_Results_File.ISRN
        ,Histo_Results_File.Internal_Patient_No
        ,Histo_Results_File.Date_of_Birth
        ,Histo_Result_freetext.histo_report
        ,Histo_Report.Date_Report_Updated  as [Investigation_Result_Date]
    
    from apex.Histo_Results_File
                inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)) as mySub