In the CLDR 25 data I found the following list pattern entries for the Arabic locale (the Hebrew entries look similar):
<listPatterns>
<listPattern>
<listPatternPart type="start" draft="contributed">{0}، {1}</listPatternPart>
<listPatternPart type="middle" draft="contributed">{0}، {1}</listPatternPart>
<listPatternPart type="end" draft="contributed">{0}، و {1}</listPatternPart>
<listPatternPart type="2" draft="contributed">{0} و {1}</listPatternPart>
</listPattern>
</listPatterns>
Note that the LDML specification only mentions placeholders of the form "{0}" or "{1}", not the forms that show up on screen in the list pattern parts for the types "end" and "2". See also:
http://cldr.unicode.org/development/development-process/design-proposals/list-formatting
or
http://cldr.unicode.org/translation/lists
I suspect this has something to do with right-to-left rendering, but how exactly?
UPDATE:
I have now written a small Java program to see the actual sequence of chars.
String s = "{0} و {1}"; // as displayed in browser or IDE-window
for (char c : s.toCharArray()) {
    System.out.println(c);
}
The output is:
{
0
}
و
{
1
}
So it seems to be a display problem, not a problem with the char sequence itself. I use Internet Explorer 9 and Eclipse 4.3.
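One way to check this on screen, as a sketch only: wrapping the pattern in the Unicode control characters LEFT-TO-RIGHT OVERRIDE (U+202D) and POP DIRECTIONAL FORMATTING (U+202C) asks a bidi-aware renderer to draw every character left to right in stored order. The class and method names below are made up for illustration, and whether a particular console or IDE honours the override is an assumption.

```java
class LogicalOrderView {
    // Unicode bidi control characters (not part of the CLDR pattern itself).
    static final char LRO = '\u202D'; // LEFT-TO-RIGHT OVERRIDE
    static final char PDF = '\u202C'; // POP DIRECTIONAL FORMATTING

    // Hypothetical helper: returns the text wrapped in an LTR override,
    // intended for debugging output only, never for stored data.
    static String forceLeftToRight(String s) {
        return LRO + s + PDF;
    }

    public static void main(String[] args) {
        // The Arabic letter is written as an escape so that this source
        // file is not itself affected by bidi reordering.
        String pattern = "{0} \u0648 {1}";
        System.out.println(forceLeftToRight(pattern));
    }
}
```

The wrapped string is two chars longer than the original, so it should only be used for inspection.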
Here is the char sequence as code points:
123=>{
48=>0
125=>}
32=>
1608=>و // DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC=true
32=>
123=>{
49=>1
125=>}
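A listing like the one above can be produced with a few lines of Java. The following is a minimal sketch, assuming Java 8 or later (for String.codePoints()); the class name and the short labels are my own, and only the bidi classes that occur in this pattern are mapped.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointDump {
    // Maps the JDK directionality constant to a short bidi class label.
    // Only the classes relevant to this pattern are listed.
    static String bidiClass(int codePoint) {
        switch (Character.getDirectionality(codePoint)) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT: return "L";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT: return "R";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC: return "AL";
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER: return "EN";
            case Character.DIRECTIONALITY_WHITESPACE: return "WS";
            case Character.DIRECTIONALITY_OTHER_NEUTRALS: return "ON";
            default: return "other";
        }
    }

    // One line per code point: decimal value, the character, its bidi class.
    static List<String> dump(String s) {
        List<String> lines = new ArrayList<>();
        s.codePoints().forEach(cp ->
            lines.add(cp + "=>" + new String(Character.toChars(cp))
                + " [" + bidiClass(cp) + "]"));
        return lines;
    }

    public static void main(String[] args) {
        // Arabic letter waw written as an escape to keep the source readable.
        dump("{0} \u0648 {1}").forEach(System.out::println);
    }
}
```

The braces and spaces come out as neutral or whitespace characters and the digits as European numbers, so the Arabic letter is the only strongly directional character in the pattern.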
Unicode derives the display order not only from the characters themselves but also from their bidirectional context. Here the Unicode algorithm first applies the default LTR context to the leading chars, which preserves the char sequence "{0} " as it is.
When the algorithm reaches the Arabic char, it notes that char's right-to-left status and applies it to the neutral chars that follow. According to the W3C's article on the topic, this means:
The glyph of the opening bracket "{" is mirrored to "}" in an RTL context (right-to-left). So from the perspective of the Arabic char, the sequence displayed to its left is "1} ", which is the RTL equivalent of the usual LTR form " {1". The closing bracket after the ASCII digit "1" is no longer surrounded by right-to-left text, so the algorithm falls back to the LTR context and displays it in its normal form "}". The final visual result (though not the sequence of code points) then looks as if there were one extra closing bracket and one opening bracket too few.
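The resolved direction of each char can also be inspected programmatically. Below is a rough sketch using java.text.Bidi from the JDK with an LTR base direction; the class and method names are made up for illustration. Even levels mean left-to-right and odd levels mean right-to-left. The level reported for the braces around "1" may differ between runtimes, since it depends on which version of the bidirectional algorithm is implemented, so I make no claim about it here.

```java
import java.text.Bidi;

class BidiLevels {
    // Returns the resolved embedding level of every char in s,
    // assuming a left-to-right base direction for the paragraph.
    static int[] levels(String s) {
        Bidi bidi = new Bidi(s, Bidi.DIRECTION_LEFT_TO_RIGHT);
        int[] result = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            result[i] = bidi.getLevelAt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String pattern = "{0} \u0648 {1}";
        int[] levels = levels(pattern);
        for (int i = 0; i < levels.length; i++) {
            // Print the code point instead of the glyph so the output
            // is not itself reordered by the console.
            System.out.println(String.format("U+%04X level %d",
                (int) pattern.charAt(i), levels[i]));
        }
    }
}
```

The leading "{0} " should come out at level 0 and the Arabic letter at an odd level, which matches the observation that only the second placeholder is displayed differently.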
I hope SO readers find this explanation useful if they run into similar strange visual effects in bidirectional text.