javascript parsing unicode html-parsing decodeuricomponent

Difference in behavior when parsing Unicode in JavaScript


When I call JSON.parse or decodeURI directly on the variable (_) or on the regular expression result that produces the string, it either throws an error or does not decode anything. When I call the same functions with the same string typed into the console, it works. I'm sure I'm missing something simple, but I can't figure out what.

This is from the debugger console in VS Code:

[two screenshots of the debugger console; images not available]

_ comes from parsing some values out of an HTML string with a regular expression. data is an HTML string that comes from an external system. Something like:

const _ = /(?<=customer_data\['some_property'\] = ').*?(?=';)/.exec(data)?.[0] || '{}';

I can't even figure out how to approach this issue.


Solution

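  • The string you are trying to parse is already escaped: it contains literal backslash-x sequences rather than the characters they stand for. Written as a JavaScript literal, it is '\\x7b\\x222\\x22:\\x22041169082\\x22\\x7d'. Typing '\x7b...' into the console works because the JavaScript parser resolves the escapes inside a string literal, whereas the value extracted by the regular expression still holds the raw backslashes.

    Replace every \x with % and decode the result using decodeURIComponent: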

    // In source code, '\\x' is a literal backslash followed by x.
    const _ = '\\x7b\\x222\\x22:\\x22041169082\\x22\\x7d';

    console.log(
      decodeURIComponent(_.replace(/\\x/g, '%'))
    ); // {"2":"041169082"}
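For the full path from the HTML string to a parsed object, here is a minimal sketch. The `data` value is a made-up sample standing in for the external HTML, and `decodeHexEscapes` is a hypothetical helper, not part of any library. It converts each \xNN sequence with String.fromCharCode instead of decodeURIComponent, which avoids the URIError that decodeURIComponent throws when the text contains a stray % character. For bytes above 0x7f the two approaches differ (decodeURIComponent treats the bytes as UTF-8, String.fromCharCode treats each one as a single character), so which one is right depends on how the external system produced the escapes.

```javascript
// Made-up sample of the external HTML; the real `data` comes from the other system.
const data = "customer_data['some_property'] = '\\x7b\\x222\\x22:\\x22041169082\\x22\\x7d';";

// Same regular expression as in the question.
const raw = /(?<=customer_data\['some_property'\] = ').*?(?=';)/.exec(data)?.[0] || '{}';

// Hypothetical helper: turn each literal \xNN sequence into the character with that code.
function decodeHexEscapes(text) {
  return text.replace(/\\x([0-9a-fA-F]{2})/g, (_match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

const decoded = decodeHexEscapes(raw);
const parsed = JSON.parse(decoded);

console.log(decoded);     // {"2":"041169082"}
console.log(parsed['2']); // 041169082

// Typed as a literal, the escapes are resolved by the JavaScript parser itself,
// which is why the same text appears to "just work" in the console.
console.log('\x7b\x222\x22:\x22041169082\x22\x7d' === decoded); // true
```

If the regular expression finds no match, `raw` falls back to '{}', which passes through the helper unchanged and parses to an empty object.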