mysqlcharacter-encodingcollation

MySQL DISTINCT and accents


A MySQL database running on Debian (version 5.5.41-0+wheezy1-log).

A table hotels with a column name VARCHAR(128) and engine is InnoDB.

CREATE TABLE `hotels` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT
  `name` varchar(128) NOT NULL DEFAULT '' COMMENT 'Hotel Name',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;

There are two records in this table:

1    BEST WESTERN PREMIER LE CARRE FOLIES OPERA
2    BEST WESTERN PREMIER LE CARRÉ FOLIES OPÉRA

When executing select DISTINCT name FROM hotels, the query is returning only 1 record, while 2 records were expected to be returned.

The DBMS doesn't seem to distinct between E and É.

How to change the table settings in order to get the expected result?


Solution

  • The table collation was set to utf8_general_ci. This was the default setting of the MySQL server and the schema.

    There are 3 collation names available in MySQL 5.5:

    The collation had to be changed to utf8_bin. This can be done for the table and all columns with this query:

    ALTER TABLE hotels CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;