A MySQL database running on Debian (version 5.5.41-0+wheezy1-log).
A table hotels
with a column name VARCHAR(128)
and engine is InnoDB.
CREATE TABLE `hotels` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT
`name` varchar(128) NOT NULL DEFAULT '' COMMENT 'Hotel Name',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
There are two records in this table:
1 BEST WESTERN PREMIER LE CARRE FOLIES OPERA
2 BEST WESTERN PREMIER LE CARRÉ FOLIES OPÉRA
When executing select DISTINCT name FROM hotels
, the query is returning only 1 record, while 2 records were expected to be returned.
The DBMS doesn't seem to distinct between E and É.
How to change the table settings in order to get the expected result?
The table collation was set to utf8_general_ci
. This was the default setting of the MySQL server and the schema.
There are 3 collation names available in MySQL 5.5:
_ci
indicates a case-insensitive collation._cs
indicates a case-sensitive collation._bin
indicates a binary collation. Character comparisons are based on character binary code values.The collation had to be changed to utf8_bin
. This can be done for the table and all columns with this query:
ALTER TABLE hotels CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;