Set document title with custom sphinx parser


I am looking into using sphinx to generate documentation for a non-python (SKILL, though not relevant here) project. Most of the functions have docstrings, so as a first approach I settled on creating a custom parser to document procedures from files, using the provided docstrings.

The following is my current parser structure, which I also posted in another answer:

from docutils import nodes
from sphinx.parsers import Parser

class MySkillParser(Parser):
    supported = ("skill",)  # Declare supported languages
    def parse(self, inputstring, document):
        # Here, parse the document contained in "inputstring" and dynamically generate content
        txt = nodes.Text('HI')
        document += txt

def setup(app):  # This is called when loading the extension
    app.add_source_suffix('.il', 'skill')
    app.add_source_parser(MySkillParser)

    return {'version': '0.1'}

Besides some difficulty encountered with adding nodes, this seems to work well, except for the following warning:

WARNING: toctree contains reference to document 'myfile' that doesn't have a title: no link will be generated

This happens when referencing myfile in the table of contents of some other document, such as the index. myfile.il is parsed and an HTML file is generated from it by the parser (with the code above, it will contain "HI"). However, Sphinx does not know the document title, so it refuses to generate a table of contents entry.

My question is simple: how do I make the name of the document known to sphinx?

This is similar to several existing questions. However, these do not refer to a custom parser, but use an existing rst or md parser.

The sphinx documentation states that:

A field list near the top of a file is normally parsed by docutils as the docinfo and shown on the page. However, in Sphinx, a field list preceding any other markup is moved from the docinfo to the Sphinx environment as document metadata, and is not displayed in the output.

So I think the parser needs to modify the sphinx environment, but what exactly? Following sphinx source code and some documentation, I tried multiple things, independently and in combination, to no avail:


titletxt = "Document title"  # Name matches the references to titletxt below

def parse(self, inputstring, document):
    # The following was also tried with self.env.get_doctree(self.env.docname) instead of "document"
    document.settings.title = titletxt
    document.settings._title = titletxt  # Inspired from https://github.com/sphinx-doc/sphinx/blob/35965903177c6ed9a6afb62ccd33243a746a3fc0/sphinx/builders/latex/__init__.py#L314C20-L314C20
    document['docname'] = titletxt  # from https://github.com/sphinx-doc/sphinx/blob/35965903177c6ed9a6afb62ccd33243a746a3fc0/sphinx/builders/latex/__init__.py#L347
    document['contentsname'] = titletxt  # from https://github.com/sphinx-doc/sphinx/blob/35965903177c6ed9a6afb62ccd33243a746a3fc0/sphinx/builders/latex/__init__.py#L306C17-L306C40
    titlenode = nodes.title('Hi', nodes.Text('Title node'))
    document += titlenode
    self.env.titles[self.env.docname] = titlenode
    self.env.longtitles[self.env.docname] = titlenode
    self.env.tocs[self.env.docname] = titlenode # Less sure about this one
    document.title = 'another futile attempt'
    # Also tried: parse a rst string into the document with docutils.core.publish_doctree(rst)

After quite a few more attempts, I resolved to ask here. At this point, I am unsure if the Parser is responsible for setting the title, or if it should be set somewhere else.


Solution

  • After a lot of painful debugging, it turns out sphinx extracts the first section of the document and uses that as a metadata source, regardless of which parser is used.

    Therefore, one needs to add the necessary information such as the document title as a section in the document object:

    from docutils import nodes
    from sphinx.parsers import Parser
    
    class MySkillParser(Parser):
        supported = ("skill",)
    
        def parse(self, inputstring, document):
            # ids are necessary. See https://pydoc.dev/docutils/latest/docutils.nodes.html#make_id
            mainsection = nodes.section(ids=[self.env.docname])
            document += mainsection  # Can be added before section is populated
    
            doctitle = nodes.title(text=f"Autogenerated {self.env.docname}")
            mainsection += doctitle
    
            sub1 = nodes.section(ids=["sub1"])
            sub1 += nodes.title(text="Subsection 1")
            sub1 += nodes.paragraph(text="This paragraph is in subsection 1")
            mainsection += sub1
    

    Corresponding output for an empty test.il page:

    [Screenshot: HTML page generated by the above parser]


    Note that the page is listed in the navigation sidebar. It can also be referenced from toc elements in other pages.
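
To see why the section matters, the following sketch builds two tiny doctrees with docutils alone: one with bare content, as in the parser from the question, and one wrapped in a titled section. The helpers build_doctree and first_section_title are hypothetical names introduced for illustration. The lookup is only a simplified stand-in for what Sphinx does when it collects titles, not its actual implementation.

```python
from docutils import nodes
from docutils.utils import new_document


def build_doctree(docname, with_section):
    """Build a tiny doctree by hand, the way a custom parser's parse() would."""
    document = new_document(f"{docname}.il")
    if with_section:
        # make_id() turns an arbitrary string into a valid identifier.
        section = nodes.section(ids=[nodes.make_id(docname)])
        section += nodes.title(text=f"Autogenerated {docname}")
        section += nodes.paragraph(text="HI")
        document += section
    else:
        # Bare content without an enclosing section, as in the question.
        document += nodes.paragraph(text="HI")
    return document


def first_section_title(document):
    """Return the text of the first section's title, or None without a section."""
    section = document.next_node(nodes.section)
    if section is None:
        return None
    # The title is the first child of the section in both doctrees built above.
    return section[0].astext()


titled = build_doctree("sub/myfile", with_section=True)
untitled = build_doctree("sub/myfile", with_section=False)

print(first_section_title(titled))    # title text of the first section
print(first_section_title(untitled))  # None: nothing to use as a title
print(titled[0]["ids"])               # id derived from the docname
```

The untitled doctree gives the lookup nothing to find, which matches the situation that triggers the toctree warning. Passing the docname through nodes.make_id is an optional precaution for docnames containing characters such as "/".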