python celementtree

Parsing XML using xml.etree.cElementTree


I have the following XML in a string named 'xml':

<?xml version="1.0" encoding="ISO-8859-1"?>
<Book>
  <Page>
    <Text>Blah</Text>
  </Page>
</Book>

I'm trying to get the value Blah out of it, but I'm having trouble with xml.etree.cElementTree. I've tried the find() and findtext() methods but got nothing. Eventually I did this:

import xml.etree.cElementTree as ET
...
root = ET.fromstring(xml)
element = root.getchildren()[0].getchildren()[0]

element now refers to the <Text> element, which is what I want (for this solution anyway), but how do I get the inner text from it? element.text does not work. Any ideas?

EDIT: element.text gives me None

PS: I am using Python 2.5 atm.

As an extra question: what is a better way to parse XML strings in Python?


Solution

  • Please explain what "does not work" means to you. My guess at the code that you ran (or should have run) worked for me (Python 2.x for x in (5, 6)); see below. It even worked on Python 2.1 with the appropriate change to the import statement. Note that I displayed element.tag to show that it refers to the desired element.

    >>> xml = """\
    ... <?xml version="1.0" encoding="ISO-8859-1"?>
    ... <Book>
    ...   <Page>
    ...     <Text>Blah</Text>
    ...   </Page>
    ... </Book>
    ... """
    >>> import xml.etree.cElementTree as ET
    >>> root = ET.fromstring(xml)
    >>> element = root.getchildren()[0].getchildren()[0]
    >>> element.tag
    'Text'
    >>> element.text
    'Blah'
    >>>
    

    Perhaps you'd like to take a rain-check on your extra question till we get the first one sorted out ;-)
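
For comparison, here is a minimal sketch of the same lookup done with find() and findtext() instead of chained getchildren() calls. It assumes the path is given relative to the root element (Page/Text). A bare tag name such as "Text" only matches direct children of the element you call it on, which is one possible reason those methods returned nothing in the question. The sketch imports xml.etree.ElementTree because cElementTree and getchildren() were removed in Python 3.9. The names xml_text, direct and anywhere are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same document as in the question; the XML declaration must be the very
# first thing in the string.
xml_text = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    "<Book>\n"
    "  <Page>\n"
    "    <Text>Blah</Text>\n"
    "  </Page>\n"
    "</Book>\n"
)

root = ET.fromstring(xml_text)  # root is the <Book> element

# A bare tag name only searches the direct children of <Book>,
# so this lookup finds nothing and returns None.
direct = root.find("Text")

# Spell out the path from the root to reach the nested element.
element = root.find("Page/Text")

# findtext() returns the text of the first match directly.
text = root.findtext("Page/Text")

# ".//Text" searches all descendants, whatever their depth.
anywhere = root.findtext(".//Text")

print(element.tag, element.text, text, anywhere)
```

If the element is found but element.text is still None, it is worth checking that the string being parsed really is the document shown in the question.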