Tags: python, encoding, utf-8, gbk, jemdoc

A bug when using jemdoc+mathjax


I am using jemdoc+mathjax (http://www.mit.edu/~wsshin/jemdoc+mathjax.html) to build my website, and I ran into the following error when compiling. If I simply run jemdoc.py home, everything works. However, when I compile with the default mysite.conf as follows

jemdoc.py -c mysite.conf home

it fails, and here is the traceback:

Traceback (most recent call last):
  File "C:\homepage\jemdoc.py", line 1646, in <module>
    main()
  File "C:\homepage\jemdoc.py", line 1642, in main
    procfile(f)
  File "C:\homepage\jemdoc.py", line 1390, in procfile
    out(f.outf, f.conf['bodystart'])
  File "C:\homepage\jemdoc.py", line 380, in out
    f.write(s)
UnicodeEncodeError: 'gbk' codec can't encode character '\u2630' in position 747: illegal multibyte sequence

My system is Windows 10 with the language set to Chinese, but my home.jemdoc contains no Chinese characters. The same error occurs with both Python 2 and Python 3.

Does anyone know how to solve it? Thanks a lot!


Solution
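
The traceback shows that the character U+2630 cannot be written with the gbk codec, which is the default text encoding on a Chinese-language Windows system. The failure can be reproduced without jemdoc. This is a minimal sketch, assuming Python 3; `can_encode` is a hypothetical helper name and is not part of jemdoc.

```python
def can_encode(ch, encoding="gbk"):
    """Return True if `ch` can be represented in `encoding`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+2630 (Trigram For Heaven) is the character named in the traceback.
print(can_encode("\u2630"))
# U+2261 (Identical To) is the replacement suggested in this answer.
print(can_encode("\u2261"))
```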

  • Replace the ☰ character (U+2630, Trigram For Heaven), which according to the traceback comes from the bodystart block of the configuration, with a similar glyph that gbk can encode, e.g. ≡ (U+2261, Identical To).

    The gbk codec can encode this character:

    '\u2261'.encode('gbk')    # b'\xa1\xd4'
    

    Other similar glyphs that gbk can encode are ┆ (\u2506) and ┇ (\u2507).

    In Python:

    '┆ ┇'.encode('gbk')       # b'\xa9\xaa \xa9\xab'
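
To locate every character that needs replacing, the configuration text can be scanned for characters the codec rejects. This is only a sketch: `find_unencodable` is a hypothetical helper, the sample string is made up for illustration, and reading mysite.conf as UTF-8 is an assumption about how the file was saved.

```python
import unicodedata


def find_unencodable(text, encoding="gbk"):
    """Return (position, character, Unicode name) for each character
    of `text` that `encoding` cannot represent."""
    problems = []
    for pos, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((pos, ch, unicodedata.name(ch, "UNKNOWN")))
    return problems


# Illustrative string only. In practice the text would come from the
# configuration file, e.g. (assuming it is saved as UTF-8):
#     with open("mysite.conf", encoding="utf-8") as fh:
#         text = fh.read()
sample = "<div id=\"menu\">\u2630 Menu \u2261</div>"

for pos, ch, name in find_unencodable(sample):
    # Print the code point rather than the glyph itself, because a gbk
    # console could not display the offending character either.
    print(pos, "U+%04X" % ord(ch), name)
```

Only the characters reported by the scan need to be swapped for gbk-encodable ones; characters such as ≡ pass through unchanged.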