day city temperature windspeed event
2017-01-01 new york 32 6 Rain
2017-01-02 new york 36 7 Sunny
2017-01-03 new york 28 12 Snow
2017-01-04 new york 33 7 Sunny
2017-01-05 new york 31 7 Rain
2017-01-06 new york 33 5 Sunny
2017-01-07 new york 27 12 Rain
2017-01-08 new york 23 7 Rain
2017-01-01 mumbai 90 5 Sunny
2017-01-02 mumbai 85 12 Fog
2017-01-03 mumbai 87 15 Fog
2017-01-04 mumbai 92 5 Rain
2017-01-05 mumbai 89 7 Sunny
2017-01-06 mumbai 80 10 Fog
2017-01-07 mumbai 85 9 Sunny
2017-01-08 mumbai 89 8 Rain
2017-01-01 paris 45 20 Sunny
2017-01-02 paris 50 13 Cloudy
2017-01-03 paris 54 8 Cloudy
2017-01-04 paris 42 10 Cloudy
2017-01-05 paris 43 20 Sunny
2017-01-06 paris 48 4 Cloudy
2017-01-07 paris 40 14 Rain
2017-01-08 paris 42 15 Cloudy
2017-01-09 paris 53 8 Sunny
The above shows the contents of the .txt file.
My goal is to create 4 groups that are as evenly sized as possible and that each contain all the cities, meaning every group has rows for 'new york', 'mumbai' and 'paris'.
Since there are 25 data rows, 3 groups will have 6 rows each and 1 group will have 7 rows.
What I have in mind is this: since the data is already sorted by city, I can read the text file line by line and append each line to one of 4 groups (G1-G4) in an alternating pattern. That is, the 1st line goes to G1, the 2nd to G2, the 3rd to G3, the 4th to G4, the 5th back to G1, the 6th to G2, and so on. This should ensure that every group contains all 3 cities.
Is it possible to code it this way?
Expected result:
G1: Row 1, Row 5, Row 9, ...
G2: Row 2, Row 6, Row 10, ...
G3: Row 3, Row 7, Row 11, ...
G4: Row 4, Row 8, Row 12, and so on.
Since your input is already sorted, you can split the text into a list of lines and then slice that list with a step of 4, starting at offsets 0 to 3:
data = ''' 2017-01-01 new york 32 6 Rain
2017-01-02 new york 36 7 Sunny
2017-01-03 new york 28 12 Snow
2017-01-04 new york 33 7 Sunny
2017-01-05 new york 31 7 Rain
2017-01-06 new york 33 5 Sunny
2017-01-07 new york 27 12 Rain
2017-01-08 new york 23 7 Rain
2017-01-01 mumbai 90 5 Sunny
2017-01-02 mumbai 85 12 Fog
2017-01-03 mumbai 87 15 Fog
2017-01-04 mumbai 92 5 Rain
2017-01-05 mumbai 89 7 Sunny
2017-01-06 mumbai 80 10 Fog
2017-01-07 mumbai 85 9 Sunny
2017-01-08 mumbai 89 8 Rain
2017-01-01 paris 45 20 Sunny
2017-01-02 paris 50 13 Cloudy
2017-01-03 paris 54 8 Cloudy
2017-01-04 paris 42 10 Cloudy
2017-01-05 paris 43 20 Sunny
2017-01-06 paris 48 4 Cloudy
2017-01-07 paris 40 14 Rain
2017-01-08 paris 42 15 Cloudy
2017-01-09 paris 53 8 Sunny'''
lines = data.splitlines()
# lines[i::4] takes every 4th line starting at index i (lines i, i+4, i+8, ...)
groups = [lines[i::4] for i in range(4)]
for g in groups:
    print(g)
This outputs:
[' 2017-01-01 new york 32 6 Rain', ' 2017-01-05 new york 31 7 Rain', ' 2017-01-01 mumbai 90 5 Sunny', ' 2017-01-05 mumbai 89 7 Sunny', ' 2017-01-01 paris 45 20 Sunny', ' 2017-01-05 paris 43 20 Sunny', ' 2017-01-09 paris 53 8 Sunny']
[' 2017-01-02 new york 36 7 Sunny', ' 2017-01-06 new york 33 5 Sunny', ' 2017-01-02 mumbai 85 12 Fog', ' 2017-01-06 mumbai 80 10 Fog', ' 2017-01-02 paris 50 13 Cloudy', ' 2017-01-06 paris 48 4 Cloudy']
[' 2017-01-03 new york 28 12 Snow', ' 2017-01-07 new york 27 12 Rain', ' 2017-01-03 mumbai 87 15 Fog', ' 2017-01-07 mumbai 85 9 Sunny', ' 2017-01-03 paris 54 8 Cloudy', ' 2017-01-07 paris 40 14 Rain']
[' 2017-01-04 new york 33 7 Sunny', ' 2017-01-08 new york 23 7 Rain', ' 2017-01-04 mumbai 92 5 Rain', ' 2017-01-08 mumbai 89 8 Rain', ' 2017-01-04 paris 42 10 Cloudy', ' 2017-01-08 paris 42 15 Cloudy']
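If you would rather follow the loop described in the question (append line 1 to G1, line 2 to G2, and so on), the same grouping can be sketched with an index modulo 4. This is only a minimal sketch: the helper names round_robin_groups and city_of are made up for illustration, io.StringIO stands in for the real file (replace it with open() on your own path), and the sample uses just the first four rows per city from the question.

```python
import io

def round_robin_groups(rows, n_groups=4):
    """Append row i to group i % n_groups (the alternating pattern from the question)."""
    groups = [[] for _ in range(n_groups)]
    for i, row in enumerate(rows):
        groups[i % n_groups].append(row)
    return groups

def city_of(row):
    # Columns are: day, city, temperature, windspeed, event.
    # The city may contain spaces ("new york"), so take every field
    # between the first one and the last three.
    fields = row.split()
    return " ".join(fields[1:-3])

# Stand-in for the real text file (assumption: same layout, header on line 1).
sample = io.StringIO("""day city temperature windspeed event
2017-01-01 new york 32 6 Rain
2017-01-02 new york 36 7 Sunny
2017-01-03 new york 28 12 Snow
2017-01-04 new york 33 7 Sunny
2017-01-01 mumbai 90 5 Sunny
2017-01-02 mumbai 85 12 Fog
2017-01-03 mumbai 87 15 Fog
2017-01-04 mumbai 92 5 Rain
2017-01-01 paris 45 20 Sunny
2017-01-02 paris 50 13 Cloudy
2017-01-03 paris 54 8 Cloudy
2017-01-04 paris 42 10 Cloudy
""")

# Strip whitespace, skip blank lines, then drop the header row.
rows = [line.strip() for line in sample if line.strip()][1:]
groups = round_robin_groups(rows)

for number, group in enumerate(groups, start=1):
    cities = sorted({city_of(row) for row in group})
    print(f"G{number}: {len(group)} rows, cities = {cities}")
```

Note that this pattern only guarantees every city in every group when each city has at least 4 consecutive rows, which holds for the data in the question. Printing the set of cities per group, as above, is a quick way to confirm it on your own file.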