[SOLVED] Power BI M code Split column from Lowercase to Uppercase won't include diacritics (accents)

Power BI M code Split column from Lowercase to Uppercase won't include diacritics (accents)

In Power BI, I have a column that I need to split all values from Lowercase to Uppercase:

Column1
Blue
Blue
Energy
Energy
BlueÉnergie
Énergie
ConformitéBlue
Blue
ConformitéÉnergie
Énergie

The obvious solution is to use the built in function which ends up looking like this:

= Table.SplitColumn(Source, "Column1", Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"}), {"Column1.1", "Column1.2"})

The problem is that I don't get the expected result, ending up with this:

Column1.1	Column1.2
Blue	Energy
Blue	null
Energy	null
Energy	Blue
BlueÉnergie	null
Énergie	Blue
ConformitéBlue	null
Blue	Conformité
ConformitéÉnergie	null
Énergie	Conformité

when the expected result should be :

Column1.1	Column1.2
Blue	Energy
Blue	null
Energy	null
Energy	Blue
Blue	Énergie
Énergie	Blue
Conformité	Blue
Blue	Conformité
Conformité	Énergie
Énergie	Conformité

How can I get the expected result as concisely as possible such that the column will properly split from ALL lowercase characters to ALL uppercase characters?

Note: spaces and special characters do appear in my data set so I cannot split on any delimiters

Solution

@michal is on the right track but rather than list out every character, you can use the unicode blocks. See https://powerquery.how/character-fromnumber/.

When you write {"a".."z"}, under the covers, PQ is saying get me the unicode block from 66 to 91. You can leverage this by seeing other unicode blocks here: https://en.wikipedia.org/wiki/List_of_Unicode_characters and just use the start and end characters for each block you want to insert. i.e. {"Ā".."ſ"} gets you 192 to 319 which is extended latin containing cedillas,acutes,carons etc