I'm quite new to Python and struggling to get my head round the logic in this for loop. My data has two values, a city and a temp. I would like to write a "for loop" that outputs the maximum temp for each city as follows:
PAR 31
LON 23
RIO 36
DUB 44
As it is to be used in Hadoop, I can't use any Python libraries.
Here is my dataset:
['PAR,31',
'PAR,18',
'PAR,14',
'PAR,18',
'LON,12',
'LON,13',
'LON,9',
'LON,23',
'LON,5',
'RIO,36',
'RIO,33',
'RIO,21',
'RIO,25',
'DUB,44',
'DUB,42',
'DUB,38',
'DUB,34']
This is my code:
current_city = None
current_max = 0
for line in lines:
    (city, temp) = line.split(',')
    temp = int(temp)
    if city == current_city:
        if current_max < temp:
            current_max == temp
    current_city = city
print(current_city, current_max)
This was my output:
DUB 0
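Two things in the loop explain that output: `current_max == temp` is a comparison, not an assignment, so `current_max` never moves from 0, and the single `print` runs only after the loop, when `current_city` holds the last city. A minimal sketch of a corrected loop follows. It assumes the lines arrive grouped by city (as they are in the dataset above); the `results` list is a hypothetical helper added only so the outcome can be checked, and the sample data is shortened.

```python
# Shortened sample input, grouped by city as in the question.
lines = ['PAR,31', 'PAR,18', 'LON,12', 'LON,23',
         'RIO,36', 'RIO,33', 'DUB,44', 'DUB,42']

results = []         # hypothetical helper: collected (city, max_temp) pairs
current_city = None
current_max = None

for line in lines:
    city, temp = line.split(',')
    temp = int(temp)
    if city == current_city:
        # Same city: keep the larger temperature (assignment, not ==).
        if temp > current_max:
            current_max = temp
    else:
        # New city: emit the finished one, then start tracking the new one.
        if current_city is not None:
            results.append((current_city, current_max))
        current_city = city
        current_max = temp

# Emit the last city, since no later line triggers it.
if current_city is not None:
    results.append((current_city, current_max))

for city, max_temp in results:
    print(city, max_temp)
```

Starting each city's maximum from its first reading, rather than from 0, also keeps the result correct if temperatures can be negative.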
Build a dictionary keyed on city name, where each value is a list of that city's temperatures.
Once the dictionary has been built, iterate over its items and take the highest value in each list of temperatures:
data = ['PAR,31',
'PAR,18',
'PAR,14',
'PAR,18',
'LON,12',
'LON,13',
'LON,9',
'LON,23',
'LON,5',
'RIO,36',
'RIO,33',
'RIO,21',
'RIO,25',
'DUB,44',
'DUB,42',
'DUB,38',
'DUB,34']
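d = {}
for e in data:
    city, temp = e.split(',')
    # Create the city's list on first sight, then add the reading (still a string).
    d.setdefault(city, []).append(temp)
for k, v in d.items():
    # Convert the readings to int before taking the maximum.
    print(k, max(map(int, v)))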
Output:
PAR 31
LON 23
RIO 36
DUB 44
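If the lists of temperatures are not needed afterwards, a variation is to keep only the highest value seen so far for each city. This is a sketch under that assumption rather than a replacement for the approach above; `max_by_city` is a hypothetical name and the sample data is shortened.

```python
data = ['PAR,31', 'PAR,18', 'LON,12', 'LON,23', 'LON,5',
        'RIO,36', 'RIO,21', 'DUB,44', 'DUB,34']

max_by_city = {}  # hypothetical name: city -> highest temperature seen so far
for entry in data:
    city, temp = entry.split(',')
    temp = int(temp)
    # Store the reading if the city is new or the reading is higher.
    if city not in max_by_city or temp > max_by_city[city]:
        max_by_city[city] = temp

for city, temp in max_by_city.items():
    print(city, temp)
```

This stores one integer per city instead of every reading, and it does not depend on the input being grouped by city.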