Unicode has characters for START OF HEADING (␁ U+0001), START OF TEXT (␂ U+0002), END OF TEXT (␃ U+0003), and END OF TRANSMISSION (␄ U+0004). What's confusing about this is that, while there is a START OF HEADING character, there is no END OF HEADING character, and while there is an END OF TRANSMISSION character, there is no START OF TRANSMISSION character.
Where are these missing characters?
How should I go about representing the start of a transmission, or the end of a heading, using Unicode?
If the answer is "just use START OF HEADING in place of START OF TRANSMISSION," then what should I do if my "transmission" doesn't have a "heading"?
If the second part of the answer is "just use START OF TEXT in place of END OF HEADING," what happens if there is something between the heading and the text?†
† I can't imagine that this happens often (if ever), but I'm asking just in case someone out there ever tries to put something between the end of the heading and the start of their text.
Stack Exchange doesn't have a Unicode site, so I'm posting this here. If someone thinks that it would fit better on one of the other Network sites, please let me know in the comments.
The characters U+0000 to U+001F are imported directly from ASCII. If a control character didn't exist in ASCII, it doesn't exist in that range of Unicode either.
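If you want to check this yourself, here is a minimal sketch using Python's standard `unicodedata` module. It relies on `unicodedata.lookup` accepting the control-character name aliases (supported since Python 3.3); the helper name `find_control` is my own, not part of any library.

```python
import unicodedata

def find_control(name):
    """Return the code point for a Unicode name or alias, or None if there is none."""
    try:
        return ord(unicodedata.lookup(name))
    except KeyError:
        # lookup() raises KeyError when no character has this name or alias.
        return None

for name in ("START OF HEADING", "END OF HEADING",
             "START OF TRANSMISSION", "END OF TRANSMISSION"):
    cp = find_control(name)
    print(name, "->", "not defined" if cp is None else f"U+{cp:04X}")
```

The two names from the question that have no ASCII counterpart should come back as not defined.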
Most are obsolete; in-band delimiters are not used much nowadays. If you're using an existing protocol with in-band delimiters, it will have rules based on ASCII usage; if you're designing a new protocol, there are probably better ways to proceed.
As far as I recall, typical usage has no need for an end-of-heading character, because the end of the heading coincides with the start of the text, which START OF TEXT already marks. There's presumably no need for a start-of-transmission character either, because the first thing you receive after synchronization is the start of the transmission (start bits in asynchronous disciplines, SYN in synchronous ones).
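To make that concrete, here is a rough sketch of how such framing could look. It is an illustration of the idea only, not any particular protocol: the function names `frame_message` and `parse_message` are made up, and it assumes the heading and text never contain the control characters themselves. A message without a heading simply starts at START OF TEXT.

```python
SOH, STX, ETX, EOT = "\x01", "\x02", "\x03", "\x04"

def frame_message(text, heading=None):
    """Wrap text (and an optional heading) in ASCII control delimiters."""
    head = SOH + heading if heading is not None else ""
    # STX both ends the heading (if any) and starts the text.
    return head + STX + text + ETX

def parse_message(message):
    """Split a framed message back into (heading, text); heading may be None."""
    heading = None
    if message.startswith(SOH):
        stx_at = message.index(STX)      # the heading runs up to the first STX
        heading = message[1:stx_at]
        message = message[stx_at:]
    if not (message.startswith(STX) and message.endswith(ETX)):
        raise ValueError("message is not delimited by STX ... ETX")
    return heading, message[1:-1]

# A transmission is then one or more messages followed by a single EOT.
transmission = frame_message("hello", "id=7") + frame_message("bye") + EOT
```

In this scheme nothing can sit between the heading and the text, which is exactly why a separate end-of-heading marker is not needed.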