Tags: python, amazon-web-services, amazon-redshift, unaccent

Python UDF in Redshift always returns NULL


I want a Redshift function that removes accents from words. I found a Stack Overflow question with Python code for doing this, and I have tried a few solutions, one of them being:

import unicodedata
def remove_accents(accented_string):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])

Then I create the function in Redshift as follows:

create function remove_accents(accented_string varchar)
returns varchar
immutable
as $$
import unicodedata
def remove_accents(accented_string):
    nfkd_form = unicodedata.normalize('NFKD', input_str)
    return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])
$$ language plpythonu;

And I apply it to a column with:

SELECT remove_accents(city) FROM info_geo

But I get only null values. The column city is of type varchar. Why am I getting nulls, and how can I fix it?


Solution

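  • Redshift treats everything between the $$ markers as the body of the UDF, with the arguments already available by name. Your body only defines an inner Python function and never calls it, so the body itself returns None, which Redshift reports as NULL. You don't need the inner function: either call it and return its result at the end of the body, or write the statements directly in the body (note that the variable must be accented_string, not input_str):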

    create function remove_accents(accented_string varchar)
    returns varchar
    immutable
    as $$
      import unicodedata
      nfkd_form = unicodedata.normalize('NFKD', accented_string)
      return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])
    $$ language plpythonu;
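
To see the difference outside Redshift, here is a minimal local sketch in plain Python. The wrapper names udf_body_original and udf_body_fixed are hypothetical and only stand in for how the UDF body is executed; the inner function uses accented_string rather than input_str so that the missing call is the only difference between the two versions.

```python
# -*- coding: utf-8 -*-
import unicodedata


def udf_body_original(accented_string):
    # Stand-in for the original UDF body: the nested function is defined
    # but never called, so the body falls off the end and returns None
    # (which a SQL engine would surface as NULL).
    def remove_accents(accented_string):
        nfkd_form = unicodedata.normalize('NFKD', accented_string)
        return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])


def udf_body_fixed(accented_string):
    # Stand-in for the corrected UDF body: the statements run directly
    # and the result is returned.
    nfkd_form = unicodedata.normalize('NFKD', accented_string)
    return u"".join([c for c in nfkd_form if not unicodedata.combining(c)])


print(udf_body_original(u"Málaga"))  # None
print(udf_body_fixed(u"Málaga"))     # Malaga
```

This only checks the Python logic locally; it does not exercise Redshift's type mapping between varchar and Python strings.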