sql-serverunicodecollation

Is there a SQL Server collation option that will allow matching different apostrophes?


I'm currently using SQL Server 2016 with SQL_Latin1_General_CP1_CI_AI collation. As expected, queries with the letter e will match values with the letters e, è, é, ê, ë, etc because of the accent insensitive option of the collation. However, queries with a ' (U+0027) do not match values containing a ’ (U+2019). I would like to know if such a collation exists where this case would match, since it's easier to type ' than it is to know that ’ is keystroke Alt-0146.


Solution

  • I'm confident in saying no. The main thing, here, is that the two characters are different (although similar). With accents, e and ê are still both an e (just one has an accent). This enables you (for example) to do searches for things like SELECT * FROM Games WHERE [Name] LIKE 'Pokémon%'; and still have rows containing Pokemon return (because people haven't used the accent).

    The best thing I could suggest would be to use REPLACE (at least in your WHERE clause) so that both rows are returned. That is, however, likely going to get expensive.

    If you know what columns are going to be a problem you could add a Computed Column to that table and index it appropriately. Then you could use that column in your WHERE clause, but display the original one. Something like:

    USE Sandbox;
    --Create dbo.Sample table and data
    CREATE TABLE dbo.Sample (String varchar(500));
    
    INSERT INTO dbo.Sample
    VALUES ('This is a string that does not contain either apostrophe'),
           ('Where as this string, isn''t without at least one'),
           ('’I have one of them as well’'),
           ('’Well, I''m going to use both’');
    GO
    --First attempt (without the column)
    SELECT String
    FROM dbo.Sample
    WHERE String LIKE '%''%'; --Only returns 2 of the rows
    GO
    --Create a PERSISTED Column
    ALTER TABLE dbo.Sample ADD StringRplc AS REPLACE(String,'’','''');
    GO
    CREATE INDEX IX_Sample_StringRplc ON dbo.Sample (StringRplc) INCLUDE (String);
    GO
    --Second attempt
    SELECT String
    FROM dbo.Sample
    WHERE StringRplc LIKE '%''%'; --Returns 3 rows
    GO
    --Clean up
    DROP TABLE dbo.Sample;
    GO