python xml elementtree

Python xml.ElementTree - function to return parsed xml in variable to be used later


I have a function that sends a GET request and parses the response as XML:

def get_object(object_name):
    ...
    ...
    #parse xml file
    encoded_text = response.text.encode('utf-8', 'replace')
    root = ET.fromstring(encoded_text)
    tree = ET.ElementTree(root)
    return tree

Then I call this function in a loop over a list of objects to fetch their XML and store the results in a variable:

jx_task_tree = ''
for jx in jx_tasks_lst:
    jx_task_tree += str(get_object(jx))

I am not sure whether the function returns the data in a form that I can use later the way I need to.

When I try to parse the variable jx_task_tree like this:

parser = ET.XMLParser(encoding="utf-8")
print(type(jx_task_tree))
tree = ET.parse(jx_task_tree, parser=parser)
print(ET.tostring(tree))

it raises an error:

Traceback (most recent call last):
  File "import_uac_wf.py", line 59, in <module>
    tree = ET.parse(jx_task_tree, parser=parser)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 647, in parse
    source = open(source, "rb")
IOError: [Errno 36] File name too long: 
'<xml.etree.ElementTree.ElementTree 
object at 0x7ff2607c8910>\n<xml.etree.ElementTree.ElementTree object at 
0x7ff2607e23d0>\n<xml.etree.ElementTree.ElementTree object at 
0x7ff2607ee4d0>\n<xml.etree.ElementTree.ElementTree object at 
0x7ff2607d8e90>\n<xml.etree.ElementTree.ElementTree object at 
0x7ff2607e2550>\n<xml.etree.ElementTree.ElementTree object at 
0x7ff2607889d0>\n<xml.etree.ElementTree.ElementTree object at 
0x7ff26079f3d0>\n'

Could anybody tell me what get_object() should return and how to work with the result afterwards, so that the returned values can be joined into one variable and parsed?


Solution

  • Regarding your current exception:

    According to [Python.Docs]: xml.etree.ElementTree.parse(source, parser=None) (emphasis is mine):

    Parses an XML section into an element tree. source is a filename or file object containing XML data.

    If you want to load the XML from a string, use ET.fromstring instead.
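
    The difference can be sketched as follows. This is a minimal example that assumes Python 3; the sample XML string is made up for illustration and stands in for response.text. ET.fromstring takes the XML text itself, while ET.parse expects a filename or a file object, so in-memory text has to be wrapped in io.StringIO first:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical XML payload, standing in for response.text
xml_text = "<task><name>jx_1</name></task>"

# fromstring: XML text in, root Element out
root = ET.fromstring(xml_text)

# parse: needs a filename or a file object, so wrap the text in StringIO
tree = ET.parse(io.StringIO(xml_text))

print(root.tag)            # task
print(tree.getroot().tag)  # task

# str() on an ElementTree gives its default repr, not the XML text
print(str(tree).startswith("<xml.etree.ElementTree.ElementTree object"))  # True
```

    The last line shows where the text in the "File name too long" message comes from: it is the repr of each ElementTree object, which ET.parse then tried to open as a file name.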

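    Then, as you suspected, the 2nd code snippet is wrong: str() on an ElementTree object returns its repr (the <xml.etree.ElementTree.ElementTree object at 0x...> text visible in the error message), not the XML itself, so jx_task_tree never contains any XML.

    Instead, you could do something like: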

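    # Python 2: ET.tostring returns a str, so plain concatenation works.
    # (On Python 3 it returns bytes unless encoding="unicode" is passed.)
    jx_tasks_string = ""
    for jx in jx_tasks_lst:
        jx_tasks_string += ET.tostring(get_object(jx).getroot())

    # or (one liner)

    jx_tasks_string = "".join(ET.tostring(get_object(jx).getroot()) for jx in jx_tasks_lst)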
    

    Since jx_tasks_string is just the concatenation of strings serialized from XML blobs that were already parsed, there's no reason to parse it again.
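
    If the goal is a single object that can be queried later, one option is to skip the string round trip and attach each parsed root to a common parent element. A minimal sketch, assuming Python 3; the wrapper tag jx_tasks, the sample payloads, and the stand-in get_object below are hypothetical and only replace the HTTP request from the question:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the responses fetched in the question
SAMPLE_RESPONSES = {
    "jx_1": "<task><name>jx_1</name></task>",
    "jx_2": "<task><name>jx_2</name></task>",
}


def get_object(object_name):
    """Return the parsed root Element (stand-in: no HTTP request is made)."""
    return ET.fromstring(SAMPLE_RESPONSES[object_name])


jx_tasks_lst = ["jx_1", "jx_2"]

# A single parent element keeps the combined document well formed
combined_root = ET.Element("jx_tasks")
for jx in jx_tasks_lst:
    combined_root.append(get_object(jx))

# Query the combined tree directly, no re-parsing needed
names = [task.findtext("name") for task in combined_root.findall("task")]
print(names)  # ['jx_1', 'jx_2']

# Serialize only when text is needed; encoding="unicode" returns str on Python 3
combined_text = ET.tostring(combined_root, encoding="unicode")
print(combined_text)
```

    The wrapper element matters if the text is ever parsed again: plain concatenation of several serialized documents has more than one top-level element, which is not well-formed XML, so ET.fromstring would reject it.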