Tags: python, xml, python-3.x, elementtree

Is there a way to get a line number from an ElementTree Element?


I'm parsing some XML files using Python 3.2.1's cElementTree, and during parsing I noticed that some of the tags are missing attribute information. Is there an easy way to get the line numbers of those elements in the XML file?


Solution

  • Looking at the docs, I see no way to do this with cElementTree.

    However, I've had luck with lxml's implementation of the ElementTree API. It's supposed to be almost a drop-in replacement, built on libxml2, and its elements have a sourceline attribute. (You also get a lot of other XML features.)

    The only caveat is that I've only used it in Python 2.x. I'm not sure how, or if, it works under 3.x, but it might be worth a look.

    Addendum: their front page says:

    The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible but superior to the well-known ElementTree API. The latest release works with all CPython versions from 2.3 to 3.2. See the introduction for more information about background and goals of the lxml project. Some common questions are answered in the FAQ.

    So it looks like Python 3.x is OK.
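
As a minimal sketch of the lxml approach described above, assuming lxml is installed: the sample document, the helper name missing_attribute_lines, and the tag and attribute names are made up for illustration. The only lxml features it relies on are etree.fromstring, Element.iter, Element.get, and the sourceline attribute mentioned in the answer.

```python
from lxml import etree

# Hypothetical sample document: the second <item> lacks the "id" attribute.
XML = b"""<root>
  <item id="1"/>
  <item/>
  <item id="3"/>
</root>"""


def missing_attribute_lines(xml_bytes, tag, attribute):
    """Return the source line of every <tag> element that lacks `attribute`."""
    root = etree.fromstring(xml_bytes)
    # sourceline is the 1-based line on which lxml saw the element's start tag.
    return [el.sourceline for el in root.iter(tag) if el.get(attribute) is None]


print(missing_attribute_lines(XML, "item", "id"))  # prints [3]
```

When parsing from a file rather than a byte string, etree.parse(path).getroot() should give elements with the same sourceline attribute.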