unicodegrapheme

Unicode GraphemeBreakProperty spec including extra characters?


I was looking at the Unicode GraphemeBreakProperty spec. According to the table in Unicode Standard Annex #29, the Prepend property should include all code points with Indic_Syllabic_Category = Consonant_Preceding_Repha, Indic_Syllabic_Category = Consonant_Prefixed, or Prepended_Concatenation_Mark = Yes. The spec lists the code points as follows:

# ================================================

0600..0605    ; Prepend # Cf   [6] ARABIC NUMBER SIGN..ARABIC NUMBER MARK ABOVE
06DD          ; Prepend # Cf       ARABIC END OF AYAH
070F          ; Prepend # Cf       SYRIAC ABBREVIATION MARK
08E2          ; Prepend # Cf       ARABIC DISPUTED END OF AYAH
0D4E          ; Prepend # Lo       MALAYALAM LETTER DOT REPH
110BD         ; Prepend # Cf       KAITHI NUMBER SIGN
110CD         ; Prepend # Cf       KAITHI NUMBER SIGN ABOVE
111C2..111C3  ; Prepend # Lo   [2] SHARADA SIGN JIHVAMULIYA..SHARADA SIGN UPADHMANIYA
1193F         ; Prepend # Lo       DIVES AKURU PREFIXED NASAL SIGN
11941         ; Prepend # Lo       DIVES AKURU INITIAL RA
11A3A         ; Prepend # Lo       ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A84..11A89  ; Prepend # Lo   [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
11D46         ; Prepend # Lo       MASARAM GONDI REPHA

# Total code points: 24

A search in the UnicodeSet Utility for characters with those properties lists only 22 code points. What are U+1193F and U+11941, and why are they included in the Prepend GraphemeBreakProperty? Does the annex just fail to list them in the table? Any help figuring out why the table and the spec seem to differ would be great!

Thanks!


Solution

  • U+1193F (DIVES AKURU PREFIXED NASAL SIGN) and U+11941 (DIVES AKURU INITIAL RA) were added in Unicode 13.0. The link in the question now includes them in the listing.
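
To double-check the arithmetic, here is a minimal Python sketch that expands the ranges in the listing from the question and counts the code points, with and without the two Unicode 13.0 additions. The listing is pasted in as a string; `parse_code_points`, `PREPEND_LISTING`, and `ADDED_IN_13` are names made up for this illustration, not part of any Unicode tooling.

```python
import re

# Prepend entries copied verbatim from the listing in the question.
PREPEND_LISTING = """\
0600..0605    ; Prepend # Cf   [6] ARABIC NUMBER SIGN..ARABIC NUMBER MARK ABOVE
06DD          ; Prepend # Cf       ARABIC END OF AYAH
070F          ; Prepend # Cf       SYRIAC ABBREVIATION MARK
08E2          ; Prepend # Cf       ARABIC DISPUTED END OF AYAH
0D4E          ; Prepend # Lo       MALAYALAM LETTER DOT REPH
110BD         ; Prepend # Cf       KAITHI NUMBER SIGN
110CD         ; Prepend # Cf       KAITHI NUMBER SIGN ABOVE
111C2..111C3  ; Prepend # Lo   [2] SHARADA SIGN JIHVAMULIYA..SHARADA SIGN UPADHMANIYA
1193F         ; Prepend # Lo       DIVES AKURU PREFIXED NASAL SIGN
11941         ; Prepend # Lo       DIVES AKURU INITIAL RA
11A3A         ; Prepend # Lo       ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A84..11A89  ; Prepend # Lo   [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
11D46         ; Prepend # Lo       MASARAM GONDI REPHA
"""

# The two code points the solution says were added in Unicode 13.0.
ADDED_IN_13 = {0x1193F, 0x11941}


def parse_code_points(listing):
    """Expand 'XXXX' and 'XXXX..YYYY' entries into a sorted list of ints."""
    code_points = set()
    for line in listing.splitlines():
        # Drop the trailing comment, then keep the field before the ';'.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        field = data.split(";", 1)[0].strip()
        match = re.fullmatch(r"([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?", field)
        if match is None:
            continue
        start = int(match.group(1), 16)
        end = int(match.group(2) or match.group(1), 16)
        code_points.update(range(start, end + 1))
    return sorted(code_points)


prepend = parse_code_points(PREPEND_LISTING)
older = [cp for cp in prepend if cp not in ADDED_IN_13]

print(len(prepend))  # 24, matching "Total code points: 24"
print(len(older))    # 22, matching the UnicodeSet Utility result in the question
```

The difference of two is consistent with the search in the question having run against data from before Unicode 13.0. If you want to see which Unicode version your own Python ships, `unicodedata.unidata_version` reports it, and `unicodedata.name(chr(0x1193F), "<unassigned>")` will fall back to the default string on versions that predate these characters.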