html, html-entities

Are there some valid HTML entities without the semicolon?


Looking at this official entities.json file, some of the entities are defined without an ending semicolon.

For example:

"&Acirc": { "codepoints": [194], "characters": "\u00C2" },
"&Acirc;": { "codepoints": [194], "characters": "\u00C2" },

Where is that documented in HTML5? Or is that a browser thing¹?

¹ "Thing" as in an extension for backward compatibility.


Solution

  • Named HTML entities without a semicolon are not valid per the HTML spec, but browsers are required to support some of them anyway. (This pattern, where something is officially illegal for you to do as an HTML author but still has a single, unambiguously specified behaviour that browsers must implement, is used a lot in the HTML spec.)

    There are a few pertinent sections in the spec:

    As a final bit of corroboration that entities like &Acirc are invalid but work anyway, we can use this test document:

    <!DOCTYPE html>
    <html lang="en">
      <title>Test page</title>
      <div>&Acirc</div>
    </html>
    

    Open it in Chrome, and it works, showing an A with a circumflex accent (Â):

    screenshot

    But paste it into the Nu Html Checker (endorsed by WHATWG), and we get an error stating "Named character reference was not terminated by a semicolon.":

    screenshot

    i.e. it works, but it's invalid.
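
The same "invalid but still decoded" behaviour can be checked outside a browser. The sketch below uses Python's standard `html` module rather than anything from the HTML spec itself. It assumes that `html.entities.html5` mirrors the same named character reference table as entities.json, which is what the Python documentation describes. `Amacr` is just an illustrative pick of a name that appears only with a semicolon, and `semicolon_optional_names` is a hypothetical helper name.

```python
import html
from html.entities import html5


def semicolon_optional_names():
    """Return the names the table also lists without a trailing semicolon."""
    return sorted(name for name in html5 if not name.endswith(";"))


# "&Acirc" has a semicolon-less form in the table, so it is decoded either way.
# "&Amacr" (chosen for illustration) only exists as "&Amacr;", so the
# unterminated form is left untouched.
for sample in ["&Acirc;", "&Acirc", "&Amacr;", "&Amacr"]:
    print(f"{sample!r:12} -> {html.unescape(sample)!r}")

print("Acirc" in semicolon_optional_names())  # semicolon-less form is listed
print("Amacr" in semicolon_optional_names())  # no semicolon-less form
```

This only shows what a decoder accepts. It says nothing about validity, which is the checker's job: the semicolon-less form is still an error in a document even though it decodes.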