I found some "funny" characters (e.g. ḓ̵̙͎̖̯̞̜̞̪̠ and •̩̩̩̩̩̩̩̩̩̩) in social media that takes more than one line. First I think it is the bug of Firefox. I tried this in Gedit and LibreOffice Writer, they are all the same. So, what is this actually? Actually I am asking about the character encoding and rendering.
I tried to find the character in GNOME Character Map, they could not be found.
I tried to check the character code of both of them in unicode (probably UTF-8). It seems they takes more than one character. How come one character is more than one character? This is the result by using Python.
Character ḓ̵̙͎̖̯̞̜̞̪̠
u'\u2022\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329
\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329\u0329'
Character •̩̩̩̩̩̩̩̩̩̩
u'\u1e13\u0335\u0319\u034e\u0316\u032f\u031e\u031c\u031e\u032a\u0320\u033c\u031e
\u0320\u034e\u033c\u0353\u034b\u036e\u034c\u0346\u0300\u035c\u0345'
U+0329 is COMBINING VERTICAL LINE BELOW. It is a combining character (and so are all the others in there except U+2022 and U+1E13), meaning that it combines with the previous one. What you see here is merely the result of someone stacking way too many combining characters on the same base.