linux xml pdf terminal pdf-to-html

Choose encoding for pdftohtml


How can I force pdftohtml to output UTF-8?

$ pdftohtml -enc utf8 my.pdf 
Error: Couldn't find unicodeMap file for the 'utf-8' encoding

And -listenc doesn't seem to be a valid option.

I think it uses ISO-8859-1 by default, although for some reason Vim opens the file and shows the special characters correctly, even though :set enc? reports utf-8.


Solution

  • Write the encoding name as UTF-8 (uppercase, with the hyphen) rather than utf8, i.e. pdftohtml -enc UTF-8 file.pdf. For example:

    $ pdftohtml -enc UTF-8 my.pdf
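
    To confirm that the generated HTML really is UTF-8, one option is to run it through iconv, which fails on the first byte sequence that is not valid UTF-8. This is only a minimal sketch: is_utf8 is a hypothetical helper name, not part of pdftohtml, and it assumes iconv is installed.

```shell
#!/bin/sh
# is_utf8 FILE
# Hypothetical helper (not part of pdftohtml): returns 0 if FILE decodes
# cleanly as UTF-8, non-zero otherwise. iconv stops with an error on the
# first invalid byte sequence, so its exit status is the check.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use (the file name is an assumption; use whichever .html file
# pdftohtml actually wrote for your PDF):
#   if is_utf8 my.html; then echo "valid UTF-8"; else echo "not UTF-8"; fi
```

    Note that a pure ASCII file also passes this check, since ASCII is a subset of UTF-8, so it is most informative on a PDF that contains accented or other non-ASCII characters.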