htmlvalidationunicodenotepad++unicode-normalization

Text run is not in Unicode Normalization Form C


While I was trying to validate my site I get the following error:

Text run is not in Unicode Normalization Form C

A: What does it mean?

B: Can I fix it with notepad++ and how?

C: If B is no, How can I fix this with free tools(not dreamweaver)?


Solution

  • A. It means what it says (see dan04’s explanation for a brief answer and the Unicode Standard for a long one), but it simply indicates that the authors of the validator wanted to issue the warning. HTML5 rules do not require Normalization Form C (NFC); it is rather something generally favored by the W3C.

    B.There is no need to fix anything, unless you decide that using NFC would actually be better. If you do, then there are various tools for automatic conversion to NFC, such as the free BabelPad editor. If you only need to deal with one character not in NFC, you can use character information repositories such as Fileformat.info character search to find out the canonical decomposition of the character and use it.

    Whether you use NFC or not depends on many considerations and on the characters involved. As a rule, NFC works better, but in some cases, an alternative, non-NFC presentation produces more suitable rendering or works better in some specific processing.

    For example, in a duplicate question, the reference Ω has been reported as triggering the message. (The validator actually checks for characters entered as such references, too, instead of just plain text level NFC check.) The reference stands for U+2126 OHM SIGN “Ω”, which is defined to be canonical equivalent to U+03A9 GREEK CAPITAL LETTER OMEGA “Ω”. The Unicode Standard explicitly says that the latter is the preferred character. It is also better covered in fonts. But if you have a special reason to use OHM SIGN, you can do that, without violating current HTML5 rules, and you can ignore the validator warning.