If this was PHP, I would probably do something like this:
function no_more_half_widths($string){
$foo = array('1','2','3','4','5','6','7','8','9','10')
$bar = array('1','2','3','4','5','6','7','8','9','10')
return str_replace($foo, $bar, $string)
}
I have tried the .translate function in python and it indicates that the arrays are not of the same size. I assume this is due to the fact that the individual characters are encoded in utf-8. Any suggestions?
The built-in unicodedata
module can do it:
>>> import unicodedata
>>> foo = u'1234567890'
>>> unicodedata.normalize('NFKC', foo)
u'1234567890'
The “NFKC” stands for “Normalization Form KC [Compatibility Decomposition, followed by Canonical Composition]”, and replaces full-width characters by half-width ones, which are Unicode equivalent.
Note that it also normalizes all sorts of other things at the same time, like separate accent marks and Roman numeral symbols.