python, xml, xml-parsing, elementtree

Parsing: String to XML


My API is supposed to take a string and parse it as XML.

But I am consistently getting this error:

ParseError: mismatched tag: line 1, column 764

XML

<?xml version="1.0" encoding="utf-8" ?>
<MasterDetails IssuerId="5" Version="12.2">
    <XMLRequest />
    <BookingDetails  Amount="768"  Comment="Hotel Travel Purchase"  CurrencyCode="INR"  PurchaseType="Hotel"  SupplierName="SomeHotel"  CardAlias="C_ALIAS"  ValidFor="-1D"  CurrencyType="B" />
    <CDFs>
        <CDF FieldName="Order Date" FieldValue="2015-01-01" />
    </CDFs>
    <SomeTag>
        <Rule Action="A" Alias="MyAlias">
            <Controls>
                <OPMCCControl Negate="False"/>
                <OPMIDControl />
                <SomeControlsTags       CumulativeLimit="768"       MaxTrans="None"                 Period="C" />
                <ValidityPeriod           ValidFrom="2015-01-01 00:00:00.0 +0000"          ValidTo="2015-01-11 00:00:00.0 +0000" />
            </Controls>
        </Rule>
    </SomeTag>
</BookingDetails>
<Email  EmailAddress="T@J.COM"/>
<MasterDetails />

Implemented via:

import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring(kk.strip()))

I am fairly sure that my XML string has all matching tags and is well formed, but I might be missing something right in front of my eyes!


Solution

  • The BookingDetails tag is self-closed on this line:

    <BookingDetails  Amount="768"  Comment="Hotel Travel Purchase"  CurrencyCode="INR"  PurchaseType="Hotel"  SupplierName="SomeHotel"  CardAlias="C_ALIAS"  ValidFor="-1D"  CurrencyType="B" />
    

    But then there is a separate closing BookingDetails tag further down:

    </BookingDetails>
    

    Also, the MasterDetails element is not properly closed on the last line: it should be </MasterDetails> instead of <MasterDetails />.


    Note that lxml.etree can still parse this XML if you create the parser in "recover" mode, which tries to parse through broken markup:

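    import lxml.etree as ET

    # data is the XML document (kk in the question); pass bytes if it starts
    # with an encoding declaration, since lxml rejects such str input
    parser = ET.XMLParser(recover=True)
    tree = ET.ElementTree(ET.fromstring(data, parser=parser))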
    

    Or use BeautifulSoup with the "xml" feature (which requires lxml to be installed):

    from bs4 import BeautifulSoup
    
    soup = BeautifulSoup(data, "xml")
    print(soup.prettify())
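
The two fixes described above can also be checked with the standard library alone. The following is only a minimal sketch: the XML is a shortened stand-in for the document in the question, the helper name `try_parse` is made up for illustration, and it assumes BookingDetails was meant to wrap the elements that follow it (removing the stray </BookingDetails> would be the other way to repair it). It relies on `ParseError.position`, which holds the (line, column) where the parser gave up.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the XML in the question, with both problems left in:
# BookingDetails is self-closed and the last line is <MasterDetails />.
broken = """<MasterDetails IssuerId="5" Version="12.2">
    <BookingDetails Amount="768" CurrencyCode="INR" />
    <CDFs>
        <CDF FieldName="Order Date" FieldValue="2015-01-01" />
    </CDFs>
</BookingDetails>
<Email EmailAddress="T@J.COM"/>
<MasterDetails />
"""

# Same document after the two fixes (assumption: BookingDetails wraps CDFs).
fixed = """<MasterDetails IssuerId="5" Version="12.2">
    <BookingDetails Amount="768" CurrencyCode="INR">
        <CDFs>
            <CDF FieldName="Order Date" FieldValue="2015-01-01" />
        </CDFs>
    </BookingDetails>
    <Email EmailAddress="T@J.COM"/>
</MasterDetails>
"""


def try_parse(text):
    """Return (root, None) on success, or (None, (line, column)) on a parse error."""
    try:
        return ET.fromstring(text.strip()), None
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the failure
        return None, err.position


root, error_position = try_parse(broken)
print("broken:", error_position)

root, error_position = try_parse(fixed)
print("fixed:", root.tag, root.find("BookingDetails").get("Amount"))
```

For the broken input the reported line is the one holding the stray </BookingDetails>, which is the same kind of hint the "line 1, column 764" message in the question gives for the original string.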