
Element Tree is not modifying element text


I am using ElementTree to modify an XLIFF file with text contained in an Excel sheet. I want to walk the entire file and identify elements that have a match in my Excel sheet (the match is based on the segment id, which is stored in the "mid" attribute). Once I find a match, I want to populate the element with text pulled from the Excel sheet. For this example I am using the dummy text "Target Segment{segment id}".

My code does everything I want up to that point. I can identify each element and read its text and attributes as needed. When I set the text value of the element and print the results, I can see the difference: "mrk.text" is None before, and after setting the new value it holds the correct dummy text. So everything looks like it is working correctly.

BUT - when I generate the XML file, the element text is still empty. Meanwhile, the other modifications I made to the XML (for example, registering namespaces and including the XML declaration) are working fine.

I am expecting text to appear in "mrk" elements that are children of "target" elements. But nothing gets written there.

I am not sure what I am doing wrong.

I have read through xml.etree.ElementTree documentation on python.org and have searched for the correct answer on this site and several others. I found answers which hint at being the possible solution, but nothing quite does it.

(I know that my tag references can be made without explicitly calling the namespace URIs, but I am new to Element Tree and wanted to solve my problem first before improving my code)
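
The shorter tag references mentioned above could look roughly like the sketch below, which passes a prefix-to-URI map to findall(). The prefix "x", the name NS, and the inline sample document are assumptions made for illustration; they are not part of the original script.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map; the prefix "x" is an arbitrary label chosen for this sketch.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical, cut-down sample in the same namespace as the real file.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file><body>
    <trans-unit id="a">
      <source>Foo</source>
      <target><mrk mtype="seg" mid="1328"/></target>
    </trans-unit>
  </body></file>
</xliff>"""

root = ET.fromstring(SAMPLE)

# "x:tag" is expanded to "{urn:oasis:names:tc:xliff:document:1.2}tag" via NS.
mids = [mrk.get("mid")
        for mrk in root.findall(".//x:trans-unit/x:target/x:mrk", NS)]
print(mids)
```

The map only affects how search paths are written; prefixes in the written output are still controlled by ET.register_namespace().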

Sample XML that I am trying to modify is here:

        <trans-unit id="f60d234c-2d06-47e7-b4aa-e2c7a7caf0e8">
            <source>Please select all that apply.</source>
            <seg-source>
                <mrk mtype="seg"
                     mid="1751">Please select all that apply.</mrk>
            </seg-source>
            <target>
                <mrk mtype="seg"
                     mid="1751"/>
            </target>
        </trans-unit>

Relevant python code here:

beolroot = ET.parse(filetobeol).getroot()
for tu in beolroot.findall(".//{urn:oasis:names:tc:xliff:document:1.2}trans-unit"):
    ET.register_namespace("sdl", "http://sdl.com/FileTypes/SdlXliff/1.0")
    ET.register_namespace("", "urn:oasis:names:tc:xliff:document:1.2")
    #print (tu)
    srctxt = tu.find("./{urn:oasis:names:tc:xliff:document:1.2}source")
    trg = tu.find("./{urn:oasis:names:tc:xliff:document:1.2}target")
    #print (srctxt)
    print (srctxt.text)
    #print (trg)
    for target in tu.findall("./{urn:oasis:names:tc:xliff:document:1.2}target"):
        for mrk in target.findall("./{urn:oasis:names:tc:xliff:document:1.2}mrk"):
            print ("mrk is element id " + str(mrk))
            print ("mrk text is: " +str(mrk.text))
            mid = mrk.get("mid")
            print ("segment id is: " +str(mid))
            if mid in srctrgmap.keys():
                mrk = target.find("./{urn:oasis:names:tc:xliff:document:1.2}mrk")
                targetvalue = srctrgmap[mid]
                #print(targetvalue)
                mrk.text = str(targetvalue)
                target.text = str(targetvalue)
                print ("mrk is STILL element id: " + str(mrk))
                print ("new mrk text is: " +str(mrk.text))
                print ("new target  text is: " +str(target.text))
            else:
                print("Segment Number " + str(mid) + " has no translation target text")
tree.write("output.sdlxliff", encoding="utf-8", xml_declaration=True)

Solution

  • The following code works. As suggested, I built a minimal reproducible example and, in doing so, ended up with a version that works. I am not certain what was wrong, but this code now does what I need it to do.

    Example xml here:

    <?xml version="1.0" encoding="utf-8"?>
    <xliff xmlns:sdl="http://sdl.com/FileTypes/SdlXliff/1.0"
           xmlns="urn:oasis:names:tc:xliff:document:1.2"
           version="1.2"
           sdl:version="1.0">
        <file original="C:\File\Location\example.xml">
            <body>
                <trans-unit id="a">
                    <source>Foo</source>
                    <seg-source>
                        <mrk mtype="seg"
                             mid="1328">Foo</mrk>
                    </seg-source>
                    <target>
                        <mrk mtype="seg"
                             mid="1328"/>
                    </target>
                    <sdl:seg-defs>Bar</sdl:seg-defs>
                </trans-unit>
                <trans-unit id="b">
                    <source>My Hovercraft</source>
                    <seg-source>
                        <mrk mtype="seg"
                             mid="1329">My Hovercraft</mrk>
                    </seg-source>
                    <target>
                        <mrk mtype="seg"
                             mid="1329"/>
                    </target>
                    <sdl:seg-defs>Is full of eels</sdl:seg-defs>
                </trans-unit>
                <trans-unit id="c">
                    <source>I will not buy this record</source>
                    <seg-source>
                        <mrk mtype="seg"
                             mid="1330">I will not buy this record</mrk>
                    </seg-source>
                    <target>
                        <mrk mtype="seg"
                             mid="1330"/>
                    </target>
                    <sdl:seg-defs>It is scratched</sdl:seg-defs>
                </trans-unit>
                <trans-unit id="d">
                    <source>I will not buy this tobacconist</source>
                    <seg-source>
                        <mrk mtype="seg"
                             mid="1331">I will not buy this tobacconist</mrk>
                    </seg-source>
                    <target>
                        <mrk mtype="seg"
                             mid="1331"/>
                    </target>
                    <sdl:seg-defs>It is scratched</sdl:seg-defs>
                </trans-unit>
                <trans-unit id="f">
                    <source>I want to buy</source>
                    <seg-source>
                        <mrk mtype="seg"
                             mid="1332">I want to buy</mrk>
                    </seg-source>
                    <target>
                        <mrk mtype="seg"
                             mid="1332"/>
                    </target>
                    <sdl:seg-defs>Some Matches</sdl:seg-defs>
                </trans-unit>
            </body>
        </file>
    </xliff>
    

    Working python here:

    import xml.etree.ElementTree as ET

    filetobeol = 'D:\\Stack\\example.xml'

    # Segment id (the "mid" attribute) -> target text
    srctrgmap = {'1328': 'Target Segment1328',
                 '1330': 'Target Segment1330',
                 '1332': 'Target Segment1332'
                 }

    # Register the prefixes once, before writing, so the output keeps them
    ET.register_namespace("sdl", "http://sdl.com/FileTypes/SdlXliff/1.0")
    ET.register_namespace("", "urn:oasis:names:tc:xliff:document:1.2")

    # Keep the tree object: its root is modified in place and this same tree is written
    tree = ET.parse(filetobeol)
    beolroot = tree.getroot()
    for tu in beolroot.findall(".//{urn:oasis:names:tc:xliff:document:1.2}trans-unit"):
        for target in tu.findall("./{urn:oasis:names:tc:xliff:document:1.2}target"):
            for mrk in target.findall("./{urn:oasis:names:tc:xliff:document:1.2}mrk"):
                print("mrk is element id " + str(mrk))
                print("mrk text is: " + str(mrk.text))
                mid = mrk.get("mid")
                print("segment id is: " + str(mid))
                if mid in srctrgmap:
                    # Set the text on the mrk being iterated; target.find()
                    # would always return the first mrk of the target
                    mrk.text = str(srctrgmap[mid])
                    print("new mrk text is: " + str(mrk.text))
                else:
                    print("Segment Number " + str(mid) + " has no translation target text")
    tree.write("D:\\Stack\\outexample.xml", encoding="utf-8", xml_declaration=True)
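
One visible difference between the two versions may explain the original symptom. The question's code takes its root from ET.parse(filetobeol).getroot() and later calls tree.write() on a `tree` that the snippet never assigns, while the working version writes the same tree it modified. The sketch below assumes that `tree` came from a separate, earlier parse of the file, which the question does not show. The helper names fill_targets and serialize and the inline sample are made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Hypothetical, cut-down sample with one empty target segment.
SAMPLE = (
    '<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">'
    '<file><body><trans-unit id="a"><source>Foo</source>'
    '<target><mrk mtype="seg" mid="1328"/></target>'
    '</trans-unit></body></file></xliff>'
)


def fill_targets(root, translations):
    """Set the text of every target/mrk whose mid is a key in translations."""
    for mrk in root.iterfind(f".//{{{XLIFF_NS}}}target/{{{XLIFF_NS}}}mrk"):
        mid = mrk.get("mid")
        if mid in translations:
            mrk.text = translations[mid]


def serialize(tree):
    """Write the tree to memory and return the XML as a string."""
    buf = io.BytesIO()
    tree.write(buf, encoding="utf-8", xml_declaration=True)
    return buf.getvalue().decode("utf-8")


ET.register_namespace("", XLIFF_NS)
translations = {"1328": "Target Segment1328"}

# Case 1 (assumed shape of the question's code): the root that gets edited
# belongs to a different parse than the tree that gets written.
tree = ET.parse(io.StringIO(SAMPLE))
other_root = ET.parse(io.StringIO(SAMPLE)).getroot()
fill_targets(other_root, translations)
stale_output = serialize(tree)

# Case 2 (shape of the working code): edit the root of the tree that is written.
tree2 = ET.parse(io.StringIO(SAMPLE))
fill_targets(tree2.getroot(), translations)
good_output = serialize(tree2)

print("Target Segment1328" in stale_output)  # False
print("Target Segment1328" in good_output)   # True
```

In case 1 the print statements inside a loop would still show the new text, because the edited elements really do change; they just live in a tree that is never written.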