Using this javascript code we can remove accents/diacritics in a string.
var originalText = "éàçèñ"
var result = originalText.normalize('NFD').replace(/[\u0300-\u036f]/g, "")
console.log(result) // eacen
If we create a BigQuery UDF it does not (even with double \).
CREATE OR REPLACE FUNCTION project.remove_accent(x STRING)
RETURNS STRING
LANGUAGE js AS """
return x.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
""";
SELECT project.remove_accent("éàçèñ") --"éàçèñ"
Any thoughts on that?
Consider below approach
select originalText,
regexp_replace(normalize(originalText, NFD), r"\pM", '') output
if applied to sample data in your question - output is
You can easily wrap it with SQL UDF if you wish