What is the right way to localize a list of strings? I know that the separator can be localized to a comma or a semi-colon but does the conjunction get localized? If so, what would my format string for an arbitrary length list look like?
Example: "Bat, Cat and Dog". I could use the locale's separator and construct the LIST as follows:
LIST := UNIT
LISTMID := UNIT SEPARATOR UNIT
LISTMID := LISTMID SEPARATOR UNIT
LIST := UNIT CONJUNCTION UNIT
LIST := LISTMID CONJUNCTION UNIT
Would I have to craft this rule per language? Any libraries available to help with this?
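For concreteness, the grammar above can be sketched in Lua roughly as follows. The name join_list and its parameters are made up for illustration, and the sketch assumes the separator and conjunction are plain strings that already include any surrounding spaces.

```lua
-- Hypothetical helper: joins units with a separator and puts a conjunction
-- before the last one, mirroring the LIST / LISTMID grammar.
local function join_list(units, separator, conjunction)
    local n = #units
    if n == 0 then return "" end
    if n == 1 then return units[1] end          -- LIST := UNIT
    -- LISTMID: every unit except the last, joined by the separator.
    local listmid = table.concat(units, separator, 1, n - 1)
    -- LIST := (UNIT | LISTMID) CONJUNCTION UNIT
    return listmid .. conjunction .. units[n]
end

print(join_list({"Bat", "Cat", "Dog"}, ", ", " and "))  -- Bat, Cat and Dog
print(join_list({"Bat", "Cat"}, ", ", " and "))         -- Bat and Cat
```

This only covers languages where a list really is "separator, then one conjunction before the last item", which is the assumption the question is asking about.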
I came here looking for an answer to the same question, and ended up doing more googling, which found this: http://icu-project.org/apiref/icu4j/com/ibm/icu/text/ListFormatter.html
The class takes the parameters two, start, middle, and end.
So, for English, that would be:
- TWO := "{0} and {1}"
- START := "{0}, {1}"
- MIDDLE := "{0}, {1}"
- END := "{0} and {1}"
I wrote a quick Lua demonstration for how I imagine this works:
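-- Substitute a placeholder such as "{0}" with text. A function is used as the
-- replacement so that "%" characters in text are not treated as gsub escapes.
local function replace(template, placeholder, text)
    local str = string.gsub(template, placeholder, function() return text end)
    return str
end

local function list_format(words, templates)
    local length = #words
    if length == 0 then return "" end
    if length == 1 then return words[1] end
    if length == 2 then
        return replace(replace(templates['two'], '{0}', words[1]),
                       '{1}', words[2])
    end
    -- Build from the right: the last word fills {1} of the "end" pattern.
    local result = replace(templates['end'], '{1}', words[length])
    -- Every word between the second and the last is wrapped in "middle".
    while length > 3 do
        length = length - 1
        local mid = replace(templates['middle'], '{1}', words[length])
        result = replace(result, '{0}', mid)
    end
    -- The second word fills the remaining {0}; the first goes through "start".
    result = replace(result, '{0}', words[2])
    result = replace(templates['start'], '{1}', result)
    result = replace(result, '{0}', words[1])
    return result
end

local english = {
    ["two"] = "{0} and {1}",
    ["start"] = "{0}, {1}",
    ["middle"] = "{0}, {1}",
    ["end"] = "{0} and {1}"
}

print(list_format({"banana"}, english))                     -- banana
print(list_format({"banana", "apple"}, english))            -- banana and apple
print(list_format({"banana", "apple", "mango"}, english))   -- banana, apple and mango
print(list_format({"banana", "apple", "mango", "pineapple"}, english))
-- banana, apple, mango and pineapple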
Adapting this to another language should mostly be a matter of supplying a different table of patterns.
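As a rough illustration of swapping in other patterns, here is a self-contained sketch with two more pattern tables. The names fill and format_list are made up for this example, and the German and serial-comma pattern strings are my assumption of what the locale data looks like, so check them against the real CLDR/ICU data before relying on them.

```lua
-- Fill "{0}" and "{1}" in one pass, so an item that itself contains "{1}"
-- (or "%") is inserted literally.
local function fill(pattern, first, second)
    local out = pattern:gsub("{([01])}", function(i)
        return i == "0" and first or second
    end)
    return out
end

-- Hypothetical formatter: folds the list from the right through the patterns.
local function format_list(items, patterns)
    local n = #items
    if n == 0 then return "" end
    if n == 1 then return items[1] end
    if n == 2 then return fill(patterns.two, items[1], items[2]) end
    -- "end" joins the last two items, "middle" prepends each earlier item,
    -- and "start" prepends the first one.
    local result = fill(patterns["end"], items[n - 1], items[n])
    for i = n - 2, 2, -1 do
        result = fill(patterns.middle, items[i], result)
    end
    return fill(patterns.start, items[1], result)
end

-- Assumed patterns, not copied from ICU: German, and English with a serial comma.
local german = {
    two = "{0} und {1}", start = "{0}, {1}",
    middle = "{0}, {1}", ["end"] = "{0} und {1}",
}
local english_serial = {
    two = "{0} and {1}", start = "{0}, {1}",
    middle = "{0}, {1}", ["end"] = "{0}, and {1}",
}

print(format_list({"Banane", "Apfel", "Mango"}, german))
-- Banane, Apfel und Mango
print(format_list({"banana", "apple", "mango", "pineapple"}, english_serial))
-- banana, apple, mango, and pineapple
```

Note that the serial-comma variant needs a separate "two" pattern, which is presumably why the class takes it as its own parameter rather than reusing "end".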