unicode character-encoding utf-16 utf-32 double-byte

True double byte encoding


Is there any true double-byte encoding (DBCS), that is, one in which every character takes exactly two bytes?

The same question for 4-byte encodings: does one exist (other than UCS-4/UTF-32)?

Thanks.


Solution

  • No, there are no double-byte character sets that satisfy your list of requirements. This is because designers back in the day used 7-bit ASCII as their starting point (good for compatibility), then put extra characters or multi-byte start codes in the upper half of the byte range (0x80 to 0xFF). The result is a variable-width encoding: ASCII characters still take one byte, and only the extended characters take two.

    Similarly for quad-byte character sets, no serious standard before Unicode even tried to provision for more than 65536 characters.

    To give one example, Chinese Big5 uses the ASCII definitions for bytes 0x00 to 0x7F, uses 0x81 to 0xFF as the first byte of an extended character, and uses {0x40 to 0x7E, 0xA1 to 0xFE} for the second byte. With those ranges it can encode at most 128 + 127 × 157 = 20067 different characters.
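
The Big5 figure can be checked with a few lines of Python. This is only a sketch: it takes the byte ranges exactly as quoted in the answer (some Big5 references give 0xFE rather than 0xFF as the upper bound for the first byte, which would lower the total), and the names `span`, `big5_capacity` and `encoded_width` are made up for illustration. It also uses Python's built-in `big5` codec to show that the encoding is variable-width rather than a true double-byte one.

```python
# Byte ranges exactly as quoted in the answer (inclusive bounds).
SINGLE_BYTE = (0x00, 0x7F)                  # ASCII-compatible single bytes
LEAD_BYTE = (0x81, 0xFF)                    # first byte of a two-byte character
TRAIL_BYTES = [(0x40, 0x7E), (0xA1, 0xFE)]  # allowed second bytes


def span(lo: int, hi: int) -> int:
    """Number of values in the inclusive range lo..hi."""
    return hi - lo + 1


def big5_capacity() -> int:
    """Single-byte codes plus every lead/trail byte combination."""
    trail = sum(span(lo, hi) for lo, hi in TRAIL_BYTES)
    return span(*SINGLE_BYTE) + span(*LEAD_BYTE) * trail


def encoded_width(ch: str, codec: str = "big5") -> int:
    """Bytes needed to encode one character in the given codec."""
    return len(ch.encode(codec))


if __name__ == "__main__":
    print(big5_capacity())                          # 20067
    print(encoded_width("A"), encoded_width("中"))  # 1 2
```

An ASCII letter occupies one byte and a Chinese character two, so a Big5 byte stream mixes both widths and has to be scanned from the start to find character boundaries.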