python parsing latex

Blending a LaTeX document with the Python package pylatexenc


I want to blend a LaTeX document. My plan was to parse the document with the pylatexenc package, walk the resulting node structure, blend the existing text, and write the modified structure back out. Is there an example program that I could use as a starting point for my own?

I tried this,

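from pylatexenc.latexwalker import LatexWalker

# latex_source holds the document text; output is a writable file-like object
walker = LatexWalker(latex_source)
nodelist, pos, length = walker.get_latex_nodes()
...
# modify each top-level node in place
for node in nodelist:
    process_node(node)
...
# write the nodes back out as LaTeX
for node in nodelist:
    output.write(node.latex_verbatim())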

but I had difficulty changing the content of the individual elements, almost as if they were deep copies: the output was identical to the input :-(
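A possible explanation, assuming pylatexenc 2.x: `latex_verbatim()` returns the slice of the original input that a node covers (located through the node's `pos` and `len` attributes), so changes made to node attributes would not show up in the output. The sketch below works around that by collecting a replacement for every character node and splicing the replacements into the source string. The names `splice` and `rewrite_latex` and the `transform` callback are hypothetical helpers introduced for illustration; they are not part of pylatexenc.

```python
def splice(source, edits):
    """Apply (pos, length, replacement) edits to source and return the result."""
    pieces = []
    cursor = 0
    for pos, length, replacement in sorted(edits):
        if pos < cursor:
            raise ValueError("overlapping edits")
        pieces.append(source[cursor:pos])   # untouched text before the edit
        pieces.append(replacement)          # new text for the edited span
        cursor = pos + length
    pieces.append(source[cursor:])          # untouched tail
    return "".join(pieces)


def rewrite_latex(source, transform):
    """Run transform() over every plain-text span found by pylatexenc.

    Assumes pylatexenc 2.x, where nodes carry `pos` and `len`.
    """
    # Imported here so that splice() stays usable without pylatexenc installed.
    from pylatexenc.latexwalker import LatexWalker, LatexCharsNode

    def chars_nodes(nodes):
        # Depth-first walk over child nodes and macro/environment arguments.
        for node in nodes:
            if node is None:                # missing optional argument
                continue
            if node.isNodeType(LatexCharsNode):
                yield node
                continue
            yield from chars_nodes(getattr(node, "nodelist", None) or [])
            argd = getattr(node, "nodeargd", None)
            yield from chars_nodes(getattr(argd, "argnlist", None) or [])

    nodelist, _, _ = LatexWalker(source).get_latex_nodes()
    edits = []
    for node in chars_nodes(nodelist):
        original = source[node.pos:node.pos + node.len]
        edits.append((node.pos, node.len, transform(original)))
    return splice(source, edits)
```

Calling `rewrite_latex(latex_source, my_transform)` would return the modified document as a string. Note that this walk also visits macro arguments such as labels or file names, as well as math content, so a real script would probably need to skip some nodes.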


Solution

  • In the meantime, I have found a solution of my own. The original document is a DOCX file, which I convert to LaTeX with the open-source tool docx2tex. Since I could not find a solution on the LaTeX side, I looked on the DOCX side instead. With the Python package python-docx, you can read in a DOCX file (an XML-based format), process it, and write it out again. With AI support, I developed a Python script that blends the DOCX. That way it works too, of course.
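The read, process, write cycle described above might look roughly like the following. This is a minimal sketch and not the actual script: the function names and the `transform` callback are hypothetical, and it only touches body paragraphs and table cells (headers, footers, and footnotes are left out).

```python
def rewrite_paragraphs(paragraphs, transform):
    """Apply transform() to the text of every run; return the number of changed runs."""
    changed = 0
    for paragraph in paragraphs:
        for run in paragraph.runs:
            new_text = transform(run.text)
            if new_text != run.text:
                run.text = new_text   # run formatting (bold, font, ...) is kept
                changed += 1
    return changed


def rewrite_docx(src_path, dst_path, transform):
    """Read a DOCX file, transform its text, and save the result to dst_path."""
    # Imported here so that rewrite_paragraphs() works without python-docx installed.
    from docx import Document

    doc = Document(src_path)
    changed = rewrite_paragraphs(doc.paragraphs, transform)
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                changed += rewrite_paragraphs(cell.paragraphs, transform)
    doc.save(dst_path)
    return changed
```

Working run by run keeps the character formatting intact, but a word that Word has split across several runs is seen by `transform` in pieces. Merged table cells can also be visited more than once, so the transform should be safe to apply repeatedly.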