
Convert check mark in Python


I have a dataframe which has, in a certain column, a check mark (unicode: '\u2714'). I have been trying to remove it with the following command:

import unicodedata
df['Column'].str.replace(unicodedata.lookup("\u2714"), '')

But I keep getting this error: KeyError: "undefined character name '✔'".

Do you have an idea how to solve this?


Solution

  • check mark (unicode: '\u2714')

    That is right: U+2714 is HEAVY CHECK MARK. The error comes from unicodedata.lookup, which must be given the Unicode name of a character (for example "HEAVY CHECK MARK"), not the character itself. You do not need it here, because .str.replace accepts Unicode characters directly, so rather than

    import unicodedata
    df['Column'].str.replace(unicodedata.lookup("\u2714"), '')
    

    you might do

    df['Column'].str.replace("\u2714", '')
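
If you do want to go through unicodedata, the sketch below shows the two directions: unicodedata.name maps a character to its name, and unicodedata.lookup maps a name back to the character. The variable names (check, name, looked_up) are only illustrative.

```python
import unicodedata

check = "\u2714"

# name(): character -> Unicode name
name = unicodedata.name(check)
print(name)

# lookup(): Unicode name -> character (the inverse of name())
looked_up = unicodedata.lookup("HEAVY CHECK MARK")
print(looked_up == check)

# Passing the character itself instead of its name raises KeyError,
# which is the error reported in the question.
try:
    unicodedata.lookup(check)
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```

So unicodedata.lookup("HEAVY CHECK MARK") would also work as the first argument to .str.replace, but it is just a longer way of writing "\u2714".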
    

    simple example

    import pandas as pd
    df = pd.DataFrame({'col':['YES\u2714']})
    print(df["col"].str.replace("\u2714", ''))
    

    output

    0    YES
    Name: col, dtype: object
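
Note that .str.replace returns a new Series and does not modify the dataframe. A minimal sketch of keeping the result, assuming a column called 'Column' as in the question (the sample rows are made up for illustration). Passing regex=False is optional here, but it makes explicit that the pattern is a literal string:

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Column": ["YES\u2714", "NO", "\u2714"]})

# Assign the result back, otherwise the original column is left unchanged.
# regex=False treats "\u2714" as a plain string rather than a pattern.
df["Column"] = df["Column"].str.replace("\u2714", "", regex=False)

cleaned = df["Column"].tolist()
print(cleaned)
```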