Assume the string myStr, which contains three special characters:
myStr = "emdash —; delta Δ; thin space: ;"
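Since the thin space is invisible in most editors, the three special characters can be listed explicitly with the standard library. This is only a small sketch: the name my_str_explicit is introduced here for illustration, and the code points are written as escapes (U+2014 for the em dash is assumed; U+0394 and U+2009 are the ones named in the error log further down).

```python
import unicodedata

# Copy of the string with the non-ASCII characters written as escapes,
# so that the otherwise invisible thin space is explicit.
my_str_explicit = "emdash \u2014; delta \u0394; thin space: \u2009;"

# Collect (code point, Unicode name) for every non-ASCII character.
special = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch))
    for ch in my_str_explicit
    if ord(ch) > 127
]

for code_point, name in special:
    print(code_point, name)
```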
Further assume that we wish to write this string to a LaTeX document via pylatex.
If we write the string as is to a LaTeX document, errors occur during its compilation:
import pylatex
doc = pylatex.Document()
with doc.create(pylatex.Section('myStr -- not encoded')):
    doc.append(myStr)
doc.generate_pdf("myStr_notEncoded", clean_tex=False)
...
! Package inputenc Error: Unicode character Δ (U+0394)
(inputenc) not set up for use with LaTeX.
...
! Package inputenc Error: Unicode character (U+2009)
(inputenc) not set up for use with LaTeX.
...
If we first encode the string via pylatexenc, the special characters are either represented by their respective LaTeX encoding (emdash, delta) or encoded in a way unclear to me (thin space).
import pylatexenc
from pylatexenc import latexencode
myStr_latex = pylatexenc.latexencode.unicode_to_latex(myStr)
doc = pylatex.Document()
with doc.create(pylatex.Section('myStr')):
    doc.append(myStr_latex)
doc.generate_pdf("myStr", clean_tex=False)
How should I write the string to the LaTeX document so that the special characters are printed as the actual characters when compiling with pdflatex?
Edit 1:
I also tried to change the default encoding inside the LaTeX document for the unencoded pathway but it results in a series of compilation errors as well.
doc.preamble.append(pylatex.NoEscape("\\usepackage[utf8]{inputenc}"))
You were close with your pylatexenc solution. When you encode LaTeX yourself, e.g. with pylatexenc.latexencode.unicode_to_latex(), you have to tell pylatex that the string should not be escaped a second time. To quote the pylatex documentation:
"Using regular LaTeX strings may not be as simple as it seems though, because by default almost all strings are escaped [...] there are cases where raw LaTeX strings should just be used directly in the document. This is why the NoEscape string type exists. This is just a subclass of str, but it will not be escaped."
In other words, wrap the string in NoEscape to tell pylatex that it is already encoded as LaTeX and must not be escaped again:
import pylatex
from pylatexenc import latexencode
myStr_latex = latexencode.unicode_to_latex(myStr)
doc = pylatex.Document()
with doc.create(pylatex.Section('myStr')):
    doc.append(pylatex.utils.NoEscape(myStr_latex))
doc.generate_pdf("myStr", clean_tex=False)
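To illustrate why the second round of escaping breaks the output, here is a minimal standard-library sketch of the mechanism. Everything in it is a stand-in: the ESCAPES table is a simplified assumption rather than pylatex's actual table, Raw is a hypothetical marker class playing the role of NoEscape, and the encoded sample is hand-written rather than real pylatexenc output.

```python
# Simplified stand-in for the escaping applied to plain strings.
# Assumption for illustration only; not pylatex's real escape table.
ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "%": r"\%",
    "&": r"\&",
}


class Raw(str):
    """Hypothetical marker type playing the role of NoEscape."""


def dump(text):
    """Return the text as it would be written to the .tex file."""
    if isinstance(text, Raw):
        return str(text)  # trusted LaTeX: written unchanged
    # Plain str: every special character is replaced by its escape.
    return "".join(ESCAPES.get(ch, ch) for ch in text)


# Hand-written sample of a string that is already encoded as LaTeX.
encoded = r"{\textemdash}"

escaped_twice = dump(encoded)        # the LaTeX command is destroyed
passed_through = dump(Raw(encoded))  # the LaTeX command survives

print(escaped_twice)
print(passed_through)
```

In the plain-string case the backslash and braces are themselves escaped, so LaTeX would typeset the command name as literal text instead of executing it. Marking the string as raw skips that step, which is the behaviour the NoEscape wrapper provides in the snippet above.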