python xml api urllib2 minidom

Python loop to iterate through elements in an XML and get sub-elements values


I am working with XML read from an API that lists several hotel locations. Each hotel has a "code" attribute that is unique to that hotel in the XML output, and I would like to get the "latitude" and "longitude" attributes for each hotel. My code currently parses the XML and records every "latitude" and "longitude", but not as a paired lat/lon per hotel: it records every latitude in the XML, then every longitude. I am having trouble figuring out how to say: IF the hotel code is the same as the previous hotel code, record latitude/longitude together; ELSE move on to the next hotel and record that lat/lon. An example section of the XML output is below, along with my code and its output:

XML:

<hotel code="13272" name="Sonesta Fort Lauderdale Beach" categoryCode="4EST" categoryName="4 STARS" destinationCode="FLL" destinationName="Fort Lauderdale - Hollywood Area - FL" zoneCode="1" zoneName="Fort Lauderdale Beach Area" latitude="26.137508" longitude="-80.103438" minRate="1032.10" maxRate="1032.10" currency="USD"><rooms><room code="DBL.DX" name="DOUBLE DELUXE"><rates><rate rateKey="20161215|20161220|W|235|13272|DBL.DX|GC-ALL|RO||1~1~0||N@675BEABED1984D9E8073EB6154B41AEE" rateClass="NOR" rateType="BOOKABLE" net="1032.10" allotment="238" rateCommentsId="235|38788|431" paymentType="AT_WEB" packaging="false" boardCode="RO" boardName="ROOM ONLY" rooms="1" adults="1" children="0"><cancellationPolicies><cancellationPolicy amount="206.42" from="2016-12-11T23:59:00-05:00"/></cancellationPolicies></rate></rates></room></rooms></hotel>

CODE:

import time, hashlib
import urllib2
from xml.dom import minidom

# Your API Key and secret
apiKey =
Secret =

# Signature is generated by SHA256 (Api-Key + Secret + Timestamp (in seconds))
sigStr = "%s%s%d" % (apiKey,Secret,int(time.time()))
signature = hashlib.sha256(sigStr).hexdigest()

endpoint = "https://api.test.hotelbeds.com/hotel-api/1.0/hotels"

try:
    # Create http request and add headers
    req = urllib2.Request(url=endpoint)
    req.add_header("X-Signature", signature)
    req.add_header("Api-Key", apiKey)
    req.add_header("Accept", "application/xml")
    req.add_header("Content-Type", "application/xml")
    req.add_data(' <availabilityRQ xmlns="http://www.hotelbeds.com/schemas/messages" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ><stay checkIn="2016-12-15" checkOut="2016-12-20"/><occupancies><occupancy rooms="1" adults="1" children="0"/></occupancies><geolocation longitude="-80.265323" latitude="26.131510" radius="10" unit="km"/></availabilityRQ>')

    # Reading response and print-out
    file = minidom.parse(urllib2.urlopen(req))
    hotels = file.getElementsByTagName("hotel")
    lat = [items.attributes['latitude'].value for items in hotels]
    lon = [items.attributes['longitude'].value for items in hotels]
    print lat + lon

except urllib2.HTTPError, e:
    # Reading body of response
    httpResonponse = e.read()
    print "%s, reason: %s " % (str(e), httpResonponse)
except urllib2.URLError, e:
    print "Client error: %s" % e.reason
except Exception, e:
    print "General exception: %s " % str(e)

MY OUTPUT RIGHT NOW:

[u'26.144224', u'26.122569', u'26.11437', u'26.1243414605478', u'26.119195', u'26.1942424979814', u'26.145488', u'26.1632044819114', u'26.194145', u'26.1457688280936', u'26.1868547339183', u'26.1037652256159', u'26.090442389015', u'26.187242', u'-80.325579', u'-80.251829', u'-80.25315', u'-80.2564349700697', u'-80.262738', u'-80.2919112076052', u'-80.258274', u'-80.2584546734579', u'-80.261252', u'-80.2576325763948', u'-80.1963213016279', u'-80.2630081633106', u'-80.2272565662588', u'-80.20161000000002']


Solution

  • You could place the result of your XML file into an iterable structure such as a dictionary. I've taken your sample XML data and placed it in a file called hotels.xml.

    from xml.dom import minidom
    
    hotels_position = {}
    
    dom = minidom.parse('hotels.xml')
    hotels = dom.getElementsByTagName("hotel")
    
    for hotel in hotels:
        hotel_id = hotel.attributes['code'].value
        position = {}
        position['latitude'] = hotel.attributes['latitude'].value
        position['longitude'] = hotel.attributes['longitude'].value
        hotels_position[hotel_id] = position
    
    print(hotels_position)
    

    This code outputs the following structure (I added a second hotel):

    {u'13272': {'latitude': u'26.137508', 'longitude': u'-80.103438'}, u'13273': {'latitude': u'26.137508', 'longitude': u'-80.103438'}}
    

    You can now iterate through each hotel in the dictionary.

    for hotel in hotels_position:
        print("Hotel {} is located at ({},{})".format(hotel,
                                                      hotels_position[hotel]['latitude'],
                                                      hotels_position[hotel]['longitude']))
    

    Now that you have your data in an organised structure, your 'logic' will be much easier to write.
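
If the coordinates are needed as numbers rather than strings, the same idea can be sketched with (latitude, longitude) tuples of floats. This is only a minimal sketch: the `<hotels>` wrapper, the second hotel's values, and the helper name `hotel_coordinates` are assumptions made for illustration, and the XML is parsed from a string instead of the API response.

```python
from xml.dom import minidom

# Hypothetical two-hotel sample, trimmed to the attributes used here.
# The <hotels> wrapper and the second hotel are made up for illustration.
SAMPLE_XML = """<hotels>
  <hotel code="13272" latitude="26.137508" longitude="-80.103438"/>
  <hotel code="13273" latitude="26.144224" longitude="-80.325579"/>
</hotels>"""


def hotel_coordinates(xml_text):
    """Map each hotel code to a (latitude, longitude) tuple of floats."""
    dom = minidom.parseString(xml_text)
    coordinates = {}
    for hotel in dom.getElementsByTagName("hotel"):
        # Read both attributes from the same element so they stay paired.
        coordinates[hotel.getAttribute("code")] = (
            float(hotel.getAttribute("latitude")),
            float(hotel.getAttribute("longitude")),
        )
    return coordinates


coords = hotel_coordinates(SAMPLE_XML)
for code, (lat, lon) in sorted(coords.items()):
    print("Hotel {} is located at ({}, {})".format(code, lat, lon))
```

Because both values are read from the same `hotel` element inside one loop, there is no need to compare each hotel code with the previous one; the pairing follows from the structure of the XML.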