Using PyICU, how can I use a Collator to sort a list of strings by "natural order", i.e., putting 10 after 2 instead of before?
In the ICU docs http://userguide.icu-project.org/collation/customization#TOC-Default-Options, I can see that there is a "numericOrdering" option (a.k.a. UCOL_NUMERIC_COLLATION) that can be set on or off, but I can't figure out how to set that attribute from Python code.
You can use the .setAttribute method on the Collator instance. The attribute name and value come from enums (UCollAttribute and UCollAttributeValue) attached to the main icu module:
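import icu

# Create a collator for the locale and turn on numeric ordering (UCOL_NUMERIC_COLLATION).
collator = icu.Collator.createInstance(icu.Locale('en_US'))
collator.setAttribute(icu.UCollAttribute.NUMERIC_COLLATION, icu.UCollAttributeValue.ON)

# Plain sorted() compares character by character, so '10 ten' lands before '2 two'.
sorted(['3 three', '1 one', '10 ten', '2 two'])
# ['1 one', '10 ten', '2 two', '3 three']

# Sorting by the collator's sort keys compares digit runs by numeric value.
sorted(['3 three', '1 one', '10 ten', '2 two'], key=collator.getSortKey)
# ['1 one', '2 two', '3 three', '10 ten']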