utf-8 linguistics

Tatar language and UTF-8


I'm working on a project that involves converting Latin characters into the corresponding Cyrillic ones. The language in question is Tatar, which is spoken by one of the many peoples of Russia. I tried to find these characters in UTF-8 but have failed so far. All I need are the Unicode code points for the Tatar-specific characters. There are 6 of them. Thank you!


Solution

  • I'm not sure which "6 of them" you are referring to.

    From Wikipedia:

    The official Cyrillic version of the Tatar alphabet used in Tatarstan contains 39 letters:

    А Ә Б В Г Д Е (Ё) Ж Җ З И Й К Л М Н Ң О Ө П Р С Т У Ү Ф Х Һ Ц Ч Ш Щ Ъ Ы Ь Э Ю Я

    Unicode code points:

    U+0410 А
    U+04D8 Ә
    U+0411 Б
    U+0412 В
    U+0413 Г
    U+0414 Д
    U+0415 Е
    U+0401 Ё
    U+0416 Ж
    U+0496 Җ
    U+0417 З
    U+0418 И
    U+0419 Й
    U+041A К
    U+041B Л
    U+041C М
    U+041D Н
    U+04A2 Ң
    U+041E О
    U+04E8 Ө
    U+041F П
    U+0420 Р
    U+0421 С
    U+0422 Т
    U+0423 У
    U+04AE Ү
    U+0424 Ф
    U+0425 Х
    U+04BA Һ
    U+0426 Ц
    U+0427 Ч
    U+0428 Ш
    U+0429 Щ
    U+042A Ъ
    U+042B Ы
    U+042C Ь
    U+042D Э
    U+042E Ю
    U+042F Я
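
If the "6 of them" are the letters in the list above that fall outside the basic Russian range U+0410 to U+042F (that is, Ә Җ Ң Ө Ү Һ, which is my assumption rather than something the question states), a short Python sketch can print their code points together with the actual UTF-8 byte sequences. The names `TATAR_SPECIFIC` and `describe` are made up for this illustration.

```python
import unicodedata

# Assumption: the six Tatar-specific capitals are Ә Җ Ң Ө Ү Һ.
# Escapes are used so the code points are unambiguous in the source.
TATAR_SPECIFIC = "\u04D8\u0496\u04A2\u04E8\u04AE\u04BA"


def describe(ch: str) -> dict:
    """Return the code point, UTF-8 bytes and Unicode name of one character."""
    return {
        "char": ch,
        "code_point": f"U+{ord(ch):04X}",
        # bytes.hex(sep) needs Python 3.8 or newer
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        "name": unicodedata.name(ch),
    }


if __name__ == "__main__":
    for upper in TATAR_SPECIFIC:
        # Show both the capital and the small form of each letter.
        for ch in (upper, upper.lower()):
            info = describe(ch)
            print(info["char"], info["code_point"], info["utf8"], info["name"])
```

Note that UTF-8 is only an encoding of Unicode: the code point (for example U+04D8) identifies the letter, while the two-byte sequence is what ends up in a UTF-8 file. For a Latin-to-Cyrillic converter, the code points are normally all you need, since the language runtime handles the encoding.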