I am trying to parse an XML file using Python's xml.dom.minidom. I only want to return the first name <wd:First_Name> and last name <wd:Last_Name> from the legal name data <wd:Legal_Name_Data>. I do not want the first or last name from <wd:Preferred_Name_Data> or from any other name data in each record. Below is an example of the XML file I am trying to parse. Of course, this is just one of many records from which I need to retrieve this data.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schema.xmlsoap.org/soap/envelope/">
<env:Body>
<wd:Get_Working_Response xmlns:wd="urn:com.workway/bsvc"
wd:version="v40.1">
<wd:Request_Criteria>
<wd:Transaction_Log_Criteria_Data>
</wd:Transaction_Log_Criteria_Data>
<wd:Field_And_Parameter_Criteria_Data>
</wd:Field_And_Parameter_Criteria_Data>
<wd:Eligibility_Criteria_Data>
</wd:Eligibility_Criteria_Data>
</wd:Request_Criteria>
<wd:Response_Filter>
</wd:Response_Filter>
<wd:Response_Group>
</wd:Response_Group>
<wd:Response_Results>
</wd:Response_Results>
<wd:Response_Data>
<wd:Worker>
<wd:Worker_Reference>
<wd:ID wd:type="WID">787878787878787</wd:ID>
<wd:ID wd:type="Employee_ID">123456</wd:ID>
</wd:Worker_Reference>
<wd:Worker_Descriptor>John Smith</wd:Worker_Descriptor>
<wd:Worker_Data>
<wd:Worker_ID>123456</wd:Worker_ID>
<wd:User_ID>jsmith</wd:User_ID>
<wd:Personal_Data>
<wd:Name_Data>
<wd:Legal_Name_Data>
<wd:Name_Detail_Data wd:Formatted_Name="John J. Smith"
wd:Reporting_Name="John J. Smith">
<wd:Country_Reference>
<wd:ID wd:type="WID">89989898989989898</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">US</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">USA</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">000000</wd:ID>
</wd:Country_Reference>
<wd:First_Name>John</wd:First_Name>
<wd:Middle_Name>J.</wd:Middle_Name>
<wd:Last_Name>Smith</wd:Last_Name>
</wd:Name_Detail_Data>
</wd:Legal_Name_Data>
<wd:Preferred_Name_Data>
<wd:Name_Detail_Data wd:Formatted_Name="Johnny James Smith"
wd:Reporting_Name="Johnny James Smith">
<wd:Country_Reference>
<wd:ID wd:type="WID">89989898989989898</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">US</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">USA</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">000000</wd:ID>
</wd:Country_Reference>
<wd:First_Name>Johnny</wd:First_Name>
<wd:Middle_Name>James</wd:Middle_Name>
<wd:Last_Name>Smith</wd:Last_Name>
</wd:Name_Detail_Data>
</wd:Preferred_Name_Data>
<wd:Additional_Name_Data>
<wd:Name_Detail_Data wd:Formatted_Name="John J. Smith"
wd:Reporting_Name="John J. Smith">
<wd:Country_Reference>
<wd:ID wd:type="WID">89989898989989898</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-2_Code">US</wd:ID>
<wd:ID wd:type="ISO_3166-1_Alpha-3_Code">USA</wd:ID>
<wd:ID wd:type="ISO_3166-1_Numeric-3_Code">840</wd:ID>
</wd:Country_Reference>
<wd:First_Name>John</wd:First_Name>
<wd:Middle_Name>J.</wd:Middle_Name>
<wd:Last_Name>Smith</wd:Last_Name>
</wd:Name_Detail_Data>
<wd:Name_Type_Reference>
<wd:ID wd:type="WID">89989898989989898</wd:ID>
<wd:ID wd:type="Additional_Name_Type_ID">Preferred</wd:ID>
</wd:Name_Type_Reference>
</wd:Additional_Name_Data>
</wd:Name_Data>
So far I have tried the code below, but it returns every first and last name in each record. How can I restrict the search to <wd:Legal_Name_Data>?
from xml.dom import minidom

doc = minidom.parse('myfile.xml')
firstlist = []
lastlist = []
first = doc.getElementsByTagName('wd:First_Name')
for name in first:
    first2 = name.firstChild.nodeValue
    firstlist.append(first2)
last = doc.getElementsByTagName('wd:Last_Name')
for lasts in last:
    last2 = lasts.firstChild.nodeValue
    lastlist.append(last2)
Thanks, Nick
You need to select the <wd:Legal_Name_Data> elements first, then look up the first and last names inside each one, like this:
from xml.dom import minidom

doc = minidom.parse('myfile.xml')
firstlist = []
lastlist = []
# Restrict the search to the legal-name blocks only.
legal_names = doc.getElementsByTagName('wd:Legal_Name_Data')
for legal_name in legal_names:
    # Calling getElementsByTagName on an element searches only its descendants.
    first = legal_name.getElementsByTagName('wd:First_Name')
    for name in first:
        first2 = name.firstChild.nodeValue
        firstlist.append(first2)
    last = legal_name.getElementsByTagName('wd:Last_Name')
    for lasts in last:
        last2 = lasts.firstChild.nodeValue
        lastlist.append(last2)
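If it helps, here is a self-contained variation on the same idea that you can run without your file. It is only a sketch under a few assumptions: the inline SAMPLE string is a trimmed-down stand-in for your document, and WD_NS, element_text and legal_names are names I made up for illustration. It uses getElementsByTagNameNS so the lookup does not depend on the prefix being literally "wd", keeps each first and last name together as a pair, and returns None instead of raising if a name element is missing or empty.

```python
from xml.dom import minidom

# Namespace URI copied from the sample document in the question.
WD_NS = "urn:com.workway/bsvc"

# Trimmed-down stand-in for the real file (an assumption for this sketch).
SAMPLE = """<wd:Name_Data xmlns:wd="urn:com.workway/bsvc">
  <wd:Legal_Name_Data>
    <wd:Name_Detail_Data>
      <wd:First_Name>John</wd:First_Name>
      <wd:Last_Name>Smith</wd:Last_Name>
    </wd:Name_Detail_Data>
  </wd:Legal_Name_Data>
  <wd:Preferred_Name_Data>
    <wd:Name_Detail_Data>
      <wd:First_Name>Johnny</wd:First_Name>
      <wd:Last_Name>Smith</wd:Last_Name>
    </wd:Name_Detail_Data>
  </wd:Preferred_Name_Data>
</wd:Name_Data>"""


def element_text(parent, local_name):
    """Return the text of the first matching descendant, or None if absent or empty."""
    matches = parent.getElementsByTagNameNS(WD_NS, local_name)
    if not matches or matches[0].firstChild is None:
        return None
    return matches[0].firstChild.nodeValue


def legal_names(doc):
    """Collect one (first, last) tuple per Legal_Name_Data element."""
    return [
        (element_text(node, "First_Name"), element_text(node, "Last_Name"))
        for node in doc.getElementsByTagNameNS(WD_NS, "Legal_Name_Data")
    ]


doc = minidom.parseString(SAMPLE)
names = legal_names(doc)
print(names)  # [('John', 'Smith')]
```

With your real file you would swap minidom.parseString(SAMPLE) for minidom.parse('myfile.xml'). Keeping the names as pairs avoids the two lists drifting out of step if one record happens to lack a first or last name.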