I'm currently using SQL Server 2016 with SQL_Latin1_General_CP1_CI_AI collation. As expected, queries with the letter e will match values with the letters e, è, é, ê, ë, etc because of the accent insensitive option of the collation. However, queries with a ' (U+0027) do not match values containing a ’ (U+2019). I would like to know if such a collation exists where this case would match, since it's easier to type ' than it is to know that ’ is keystroke Alt-0146.
I'm confident in saying no. The main thing, here, is that the two characters are different (although similar). With accents, e and ê are still both an e (just one has an accent). This enables you (for example) to do searches for things like SELECT * FROM Games WHERE [Name] LIKE 'Pokémon%';
and still have rows containing Pokemon
return (because people haven't used the accent).
The best thing I could suggest would be to use REPLACE
(at least in your WHERE
clause) so that both rows are returned. That is, however, likely going to get expensive.
If you know what columns are going to be a problem you could add a Computed Column to that table and index it appropriately. Then you could use that column in your WHERE
clause, but display the original one. Something like:
USE Sandbox;
--Create dbo.Sample table and data
CREATE TABLE dbo.Sample (String varchar(500));
INSERT INTO dbo.Sample
VALUES ('This is a string that does not contain either apostrophe'),
('Where as this string, isn''t without at least one'),
('’I have one of them as well’'),
('’Well, I''m going to use both’');
GO
--First attempt (without the column)
SELECT String
FROM dbo.Sample
WHERE String LIKE '%''%'; --Only returns 2 of the rows
GO
--Create a PERSISTED Column
ALTER TABLE dbo.Sample ADD StringRplc AS REPLACE(String,'’','''');
GO
CREATE INDEX IX_Sample_StringRplc ON dbo.Sample (StringRplc) INCLUDE (String);
GO
--Second attempt
SELECT String
FROM dbo.Sample
WHERE StringRplc LIKE '%''%'; --Returns 3 rows
GO
--Clean up
DROP TABLE dbo.Sample;
GO