javascript string ascii non-breaking-characters

JavaScript: replace non-breaking space and special space characters with ordinary space


I was trying to debug a problem of searching inside a string and it came down to the following interesting piece of code.

Both "item " and "item " seem equal but they are not!

var result = ("item " === "item ");

document.write(result);
console.log(result);

After investigating this further by pasting it into a Python interpreter, I found that the first string contains a different kind of space: Python shows it as "item\xc2\xa0", which I think is a non-breaking space.

Now, one possible way to make these strings match would be to replace \xc2\xa0 with a space, but is there a better approach that converts all special space characters to a normal space?


Solution

  • The space in the first string is character code 160 (a non-breaking space), and the space in the second string is character code 32 (a normal space), so the strings aren't equal to each other.

    console.log("item ".charCodeAt(4), "item ".charCodeAt(4));
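
    The \xc2\xa0 shown by the Python interpreter is consistent with this: 0xC2 0xA0 is the UTF-8 byte sequence for code point 160 (U+00A0). A minimal sketch to check that, assuming an environment where TextEncoder is available as a global (modern browsers and recent Node.js versions); the variable names are illustrative only:

    ```javascript
    // The non-breaking space is written as an escape so it stays visible in the source.
    const nbsp = "\u00a0";

    // Code point of the character: 160 (0xA0).
    const codePoint = nbsp.codePointAt(0);

    // UTF-8 bytes of the character, formatted as two-digit hex values.
    const utf8Hex = Array.from(new TextEncoder().encode(nbsp))
      .map(byte => byte.toString(16).padStart(2, "0"))
      .join(" ");

    console.log(codePoint, utf8Hex); // logs 160 and c2 a0
    ```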

    As for whether there is a better approach to convert all special space characters to a normal space:

    You can match whitespace characters that aren't tabs or newlines and replace them with a normal space:

    const makeSpacesNormal = str => str.replace(/(?=\s)[^\r\n\t]/g, ' ');
    console.log(makeSpacesNormal("item ") === makeSpacesNormal("item "));

    Specifically, \s matches a whole range of space-like characters:

    [\t\n\v\f\r \u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]

    Note that the zero-width space \u200b is not among them. By matching and replacing these characters (except newlines and tabs, if you want), you'll be left with ordinary spaces.
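
    Because the two kinds of space look identical in source code, it can help to write test strings with explicit escapes. The sketch below repeats makeSpacesNormal from above and, as an alternative that is not part of the original answer, adds a hypothetical makeSpaceSeparatorsNormal built on the Unicode property escape \p{Zs} (space separators only, so tabs and line breaks are never touched). It assumes an engine with ES2018 property-escape support:

    ```javascript
    // Same function as in the answer: any \s character except CR, LF and tab becomes a space.
    const makeSpacesNormal = str => str.replace(/(?=\s)[^\r\n\t]/g, ' ');

    // Hypothetical alternative: replace only characters in Unicode category Zs (space separators).
    const makeSpaceSeparatorsNormal = str => str.replace(/\p{Zs}/gu, ' ');

    const withNbsp = "item\u00a0"; // non-breaking space, code 160
    const withSpace = "item ";     // ordinary space, code 32

    console.log(withNbsp === withSpace);                                   // false
    console.log(makeSpacesNormal(withNbsp) === withSpace);                 // true
    console.log(makeSpaceSeparatorsNormal("a\u2003b\u3000c") === "a b c"); // true
    console.log(makeSpacesNormal("a\tb\nc") === "a\tb\nc");                // true: tabs and newlines are kept
    ```

    The \p{Zs} variant is narrower than \s: it leaves \v, \f, \u2028, \u2029 and \ufeff alone, which may or may not be what you want.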