I'm on OS X 10.5.5 (though I guess it does not matter much).
I have a set of text files with fancy characters such as double backquotes, single-character ellipses ("..."), etc.
I need to convert these files to good old plain 7-bit ASCII, preferably without losing character meaning (that is, convert those ellipses to three periods, backquotes to the usual "s, etc.).
Please recommend a smart command-line (bash) tool or script to do that.
The Elinks web browser converts Unicode characters to their ASCII equivalents, giving things like "--" for "—" and "..." for "…". There is a Python module, python-elinks, which uses the same conversion table, and it would be trivial to turn it into a shell filter, like this:
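#!/usr/bin/env python
# Python 2 script: reads UTF-8 text on stdin, writes ASCII to stdout.
import elinks  # the import is what makes the 'elinks' error handler available
import sys

for line in sys.stdin:
    line = line.decode('utf-8')
    # Characters with no ASCII encoding are handed to the 'elinks' handler.
    sys.stdout.write(line.encode('ASCII', 'elinks'))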
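If python-elinks is not available, roughly the same filter can be sketched with only the Python 3 standard library. This is not the Elinks conversion table: the PUNCTUATION mapping and the to_ascii helper below are hypothetical names with a small hand-picked set of characters, and the input is assumed to be UTF-8. Anything not covered by the table or by Unicode decomposition is silently dropped, so extend the table for the characters that actually occur in your files.

```python
#!/usr/bin/env python3
import sys
import unicodedata

# Hypothetical, hand-picked table (not the Elinks one); extend as needed.
PUNCTUATION = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}


def to_ascii(text):
    # Replace the punctuation we know about first.
    text = "".join(PUNCTUATION.get(ch, ch) for ch in text)
    # NFKD splits accented letters into a base letter plus combining marks.
    text = unicodedata.normalize("NFKD", text)
    # Drop whatever is still outside 7-bit ASCII (e.g. the combining marks).
    return text.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # Read raw bytes so the result does not depend on the locale settings.
    for raw in sys.stdin.buffer:
        sys.stdout.write(to_ascii(raw.decode("utf-8")))
```

Saved under a name of your choice and made executable, it can be used like any other filter, reading the fancy text on standard input and writing plain ASCII to standard output.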