I am working with some open data in Deepnote using the pandas library. Since the data is in Spanish, the DataFrame contains accented letters and characters like 'ñ'.
After some searching I was able to solve part of the problem by passing the 'encoding' argument. However, when I publish the page, the accented characters ('á é í ó ú ñ') show up as strange symbols. Is there a way to go through the columns that contain text and replace each accented character with its unaccented equivalent?
datos = pd.read_csv("/work/avisos",delimiter = ';', encoding="ISO-8859-1")
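import unicodedata

def remove_accents(x):
    if not isinstance(x, str):  # leave NaN and other non-string values untouched
        return x
    # NFD splits 'á' into 'a' plus a combining accent; the ASCII encode drops the accent
    return (unicodedata.normalize('NFD', x)
            .encode('ascii', 'ignore')
            .decode('ascii'))

# df is your DataFrame (datos in the question); select the text (object) columns
word_cols = df.dtypes[lambda x: x.eq('object')].index.tolist()
# applymap is deprecated since pandas 2.1; there, use df[word_cols].map(remove_accents)
df[word_cols] = df[word_cols].applymap(remove_accents)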
Adapted from: How to replace accented characters?
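As a quick check, here is a minimal sketch of the same idea on a small made-up DataFrame. The column names and values are invented for illustration and are not taken from the avisos file, and the helper name strip_accents_series is hypothetical. It uses pandas' vectorized string methods instead of a per-cell function, which is an alternative way to get the same result, not the only one. Missing values pass through as missing.

```python
import pandas as pd

# Hypothetical sample data; the real file's columns are not shown in the question
df = pd.DataFrame({
    "barrio": ["Chamberí", "Peñagrande", None],
    "avisos": [3, 7, 1],
})

def strip_accents_series(s: pd.Series) -> pd.Series:
    # Decompose accented characters, then drop the non-ASCII combining marks
    return (s.str.normalize("NFD")
             .str.encode("ascii", errors="ignore")
             .str.decode("ascii"))

# Assumed list of text columns; adjust it to the columns in your own data
text_cols = ["barrio"]
for col in text_cols:
    df[col] = strip_accents_series(df[col])

print(df)
```

Note that this also turns 'ñ' into 'n', which may or may not be what you want for Spanish text.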
That said, you may only need to change the return statement of remove_accents to:
    return unicodedata.normalize('NFD', x)
for the accents to appear as expected on the published page.
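To see what NFD normalization does on its own, here is a small standard-library sketch (the sample word is only an illustration). NFD keeps the accents but stores each one as a separate combining character, so the text looks the same while its length changes; NFC is the composed form. Whether this is enough for the published page depends on how the page decodes the text, which the question does not show.

```python
import unicodedata

word = "\u00f1and\u00fa"  # "ñandú" written with precomposed characters

nfd = unicodedata.normalize("NFD", word)  # accents split into combining marks
nfc = unicodedata.normalize("NFC", nfd)   # recomposed into single characters

print(len(word), len(nfd), len(nfc))  # 5 7 5
print(nfc == word)                    # True

# Dropping the combining marks afterwards is what actually removes the accents
stripped = "".join(ch for ch in nfd if not unicodedata.combining(ch))
print(stripped)                       # nandu
```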