python, python-3.x, xml, minidom

How to parse only certain information from XML using Python


I have a small, simple problem: I only want part of a parsed tag's value. When the hardware version is parsed, the terminal output is "TREE M-5TX IP67 1.00", but I only want "1.00", not the "TREE M-5TX IP67" part.

Does anybody know how to do this? Please show me an example; I am a beginner, so any help would be appreciated. I am sorry if I have not used certain terms properly.

import urllib.request
from xml.dom import minidom

# open the webpage and read the raw XML (url_str is defined elsewhere, not shown here)
xml_str = urllib.request.urlopen(url_str).read()

# parse the XML string into a DOM document
xmldoc = minidom.parseString(xml_str)

# read the order_number from the xmldoc
order_number = xmldoc.getElementsByTagName('order_number')
ord_nmr = order_number[0].firstChild.nodeValue

# read the firmware_version from the xmldoc
firmware_version = xmldoc.getElementsByTagName('firmware_version')
frm_ver = firmware_version[0].firstChild.nodeValue

# read the hardware_version from the xmldoc
hardware_version = xmldoc.getElementsByTagName('hardware_version')
hrd_ver = hardware_version[0].firstChild.nodeValue

# read the mac_address from the xmldoc
mac_address = xmldoc.getElementsByTagName('mac_address')
mac_addr = mac_address[0].firstChild.nodeValue

# print all values on one line
print("Current device information: ")
print("Order-number: ", ord_nmr, "Software-version: ", frm_ver, "Hardware version: ", hrd_ver, "MAC address: ", mac_addr)

Terminal output looks like this:

Order-number: 58183 Software-version: 1.1.0 ( Build : 1 ) Hardware version: TREE M-5TX IP67 1.00 MAC address: 00:0F:9E:F3:F8:A0


Solution

  • You haven't given the rule or specification for distinguishing the part you want ("1.00" in this specific case) from the rest. You should look at all the other possible values of 'hardware_version' and define a general rule.

    Absent that, I'll just assume the part you want is separated from the rest by whitespace (one or more spaces or tabs), and that it's the last piece of non-space text. With such a rule, it's very easy to split what you have and retrieve the last element:

    # read the hardware_version from the xmldoc and keep only its last token
    hardware_version = xmldoc.getElementsByTagName('hardware_version')
    hrd_ver = hardware_version[0].firstChild.nodeValue
    v = hrd_ver.split()[-1]
    

    v will be "1.00". By default, the split method splits on runs of whitespace and returns a list of strings; we just pick the last one.
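The approach above can be tried without the device as a minimal, self-contained sketch. The XML layout in `SAMPLE_XML` is an assumption, since the real document is not shown in the question, and the helper names `element_text`, `last_token`, and `trailing_version` are made up for illustration. The regex variant is a possible alternative rule (a dotted number at the very end of the string), not something the question specifies.

```python
import re
from xml.dom import minidom

# Hypothetical sample document; the real device XML is not shown in the question.
SAMPLE_XML = """<device>
  <order_number>58183</order_number>
  <hardware_version>TREE M-5TX IP67 1.00</hardware_version>
</device>"""


def element_text(doc, tag):
    """Return the text of the first <tag> element, or '' if it is missing or empty."""
    nodes = doc.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return ""
    return nodes[0].firstChild.nodeValue


def last_token(text):
    """Return the last whitespace-separated piece of text, or '' if there is none."""
    parts = text.split()
    return parts[-1] if parts else ""


def trailing_version(text):
    """Return a trailing dotted number such as '1.00', or None if there is none."""
    match = re.search(r"(\d+(?:\.\d+)+)\s*$", text)
    return match.group(1) if match else None


doc = minidom.parseString(SAMPLE_XML)
hrd_ver = element_text(doc, "hardware_version")

print(last_token(hrd_ver))        # rule 1: last whitespace-separated token
print(trailing_version(hrd_ver))  # rule 2: dotted number at the end of the string
```

Both rules give the same result for the value shown in the question. They differ on other inputs: `last_token` returns whatever comes last, even if it is not a version, while `trailing_version` returns `None` when the string does not end in a dotted number. Which one is appropriate depends on the other values 'hardware_version' can take.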