sql-servert-sqlunicodecollationsupplementary

SQL Query Where Column = '' returning Emoji characters 🎃 and 🍰


Ok so I have a table with three columns:

Id, Key, Value

I would like to delete all rows where Value is empty (''). Therefore I wrote the query to select before I delete which was:

Select * from [Imaging.ImageTag] where [Value] = ''

all pretty standard so far...

Now heres the strange part. This query returned two rows shown below with commas seperating columns:

CE7C367C-5C4A-4531-9C8C-8F2A26B1B980,   ObjectType, 🎃 
F5B2F8A8-C4A8-4799-8824-E5FFEEDAB887,   Caption,    🍰

Why are these two rows matching on ''?

Extra Info

I am using Sql-Server, The [Value] column is of type NVARCHAR(300) and yes the table name really is [Imaging.ImageTag]


Solution

  • This is collation dependant.

    Matches empty string

    SELECT 1 where N'' = N'🍰'  COLLATE latin1_general_ci_as
    

    Doesn't match empty string

    SELECT 1 WHERE N'' = N'🍰'   COLLATE latin1_general_100_ci_as
    

    The 100 collations are more up-to-date (though still not bleeding edge, they have been available since 2008) and you should use more modern collations unless you have some specific reason not to. The BOL entry for 100 collations specifically calls out

    Weighting has been added to previously non-weighted characters that would have compared equally.