Tags: html, utf-8, ansi, nonblank

What is this " " called?


I am trying to parse a website and replace all occurrences of "　" in a string. It doesn't seem to be a space or a tab. What is it?

A more general question: how do you look up the name of a character you don't recognize? I tried ANSI and UTF-8 reference pages with no result.


Solution

  • It is character code 12288 (U+3000), also known as the ideographic space, which is used in, for example, many Asian languages. You can check this with this code:

    alert("\u3000".charCodeAt(0)); // 12288
    

    More info here.

    Edit: You can match this character with the regex \s, which matches any whitespace character. For example, this replaces each whitespace character, including this one, with a regular space (character 32):

    "foo\u3000bar\u3000baz".replace(/\s/g, ' '); // produces "foo bar baz"
    

    To replace this character (and any other unusual whitespace) while leaving the "normal" whitespace characters alone (space, i.e. character 32, tab, newline, carriage return), you might try this:

    "foo\u3000bar\u3000baz\tblah\tblah\nblah".replace(/(?![ \t\r\n])\s/g, ' ');
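
For the more general question of identifying an unknown character, one possible approach is to print each character's code point in the usual U+XXXX notation and then look that value up in a Unicode chart (U+3000 is listed there as IDEOGRAPHIC SPACE). The sketch below is only an illustration: the helper names describeChars and replaceIdeographicSpaces are made up for this example, and the second helper assumes you want to touch U+3000 and nothing else.

```javascript
// Hypothetical helper: list every character of a string with its code point.
function describeChars(text) {
  // Array.from iterates by code point, so characters outside the BMP stay intact.
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    const hex = codePoint.toString(16).toUpperCase().padStart(4, "0");
    return { char: ch, decimal: codePoint, label: "U+" + hex };
  });
}

// Hypothetical helper: replace only U+3000 and leave all other whitespace alone.
function replaceIdeographicSpaces(text, replacement = " ") {
  return text.replace(/\u3000/g, replacement);
}

// The middle entry has decimal 12288 and label "U+3000".
console.log(describeChars("a\u3000b"));

// The tab is preserved; only the ideographic space becomes a regular space.
console.log(JSON.stringify(replaceIdeographicSpaces("foo\u3000bar\tbaz")));
```

Matching the literal \u3000 is narrower than the \s-based patterns above, which may be preferable when other whitespace such as non-breaking spaces should be left untouched.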