I want to use an ICU system-insensitive sorting collation, to avoid sorting differences between postgres11-on-mac vs postgres11-on-Ubuntu. My first test was to dump out my existing Collate=en_US.UTF-8
and pg_restore them into a db created with Collate=en-US-x-icu
Create Database doc has this to say:
To create a database music with a different locale:
CREATE DATABASE music LC_COLLATE 'sv_SE.utf8' LC_CTYPE 'sv_SE.utf8' TEMPLATE template0;
select collname, collprovider from pg_collation where collname like 'en_US%';
collname | collprovider
------------------------+--------------
en_US.UTF-8 | c
en_US | c
en_US.ISO8859-15 | c
en_US.ISO8859-1 | c
en_US | c
en_US | c
en-US-x-icu | i ๐
en-US-u-va-posix-x-icu | i ๐
(8 rows)
But no luck when creating a database with either icu locales.
CREATE DATABASE test LC_COLLATE = 'en-US-x-icu' TEMPLATE template0;
ksysdb=# CREATE DATABASE test LC_COLLATE = 'en-US-x-icu' TEMPLATE template0;
ERROR: invalid locale name: "en-US-x-icu"
LC_COLLATE
with other locales:The LC_COLLATE
clause does seem to come with some strings attached, such as watching your encoding and specifying an appropriate template. But it seems to give error hints w non-ICU locales.
This works, for example: CREATE DATABASE test LC_COLLATE = 'en_US' TEMPLATE template0;
and this one gives a helpful user message:
ksysdb=# CREATE DATABASE test LC_COLLATE = 'en_US.ISO8859-15' TEMPLATE template0;
ERROR: encoding "UTF8" does not match locale "en_US.ISO8859-15"
DETAIL: The chosen LC_COLLATE setting requires encoding "LATIN9".
Note: a related question, PostgreSQL 10 on Linux - LC_COLLATE locale en_US.utf-8 not valid, doesn't seem all that relevant, as the answer talks about generating an OS-level locale to fix the issue. While the ICU locales, as far as I understand, are expressly intended to be separated from the underlying OS.
This is possible from PostgreSQL v15 on:
CREATE DATABASE test
LOCALE_PROVIDER icu
ICU_LOCALE "en-US"
LOCALE "en_US.utf8"
TEMPLATE template0;
You still need to provide the libc locale, but the ICU library is used for collations.