I have a dataset in a .txt file where each line is a separate transaction and the items on a line are separated by spaces.
The dataset looks like this:
data.txt:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
20 12 5 41 65
41 6 11 27 81 21
65 15 27 8 31 65 20 19 44 29 41
I created a dictionary whose keys are serial numbers starting from 0 and whose values are the items of each line separated by commas, like this:
{0: '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15', 1:'20,12,5,41,65', 2:'41,6,11,27,81,21', 3: '65,15,27,8,31,65,20,19,44,29,41'}
But I am not able to iterate through each value in the dict. Is there any way I can convert it into a list of values for each key?
I want to find the frequency of each item in the whole dictionary and create a table:
item | frequency |
---|---|
1 | 1 |
2 | 1 |
20 | 2 |
41 | 3 |
Something like the above.
my_dict = {}
with open('text.csv', 'r') as file:
    lines = file.readlines()
    for line in lines:
        my_dict[lines.index(line)] = line.strip()
This is the code I used to create the dictionary, but I am not sure what I should change. I also need to find the frequency of each value.
Any help would be appreciated. Thank you.
Since you're really just counting numbers over the entire file, you can just:
my_dict = {}
with open('data.txt', 'r') as file:
    for number in file.read().split():
        my_dict[number] = my_dict.get(number, 0) + 1
print(my_dict)
Result:
{'1': 1, '2': 1, '3': 1, '4': 1, '5': 2, '6': 2, '7': 1, '8': 2, '9': 1, '10': 1, '11': 2, '12': 2, '13': 1, '14': 1, '15': 2, '20': 2, '41': 3, '65': 3, '27': 2, '81': 1, '21': 1, '31': 1, '19': 1, '44': 1, '29': 1}
That just counts the strings representing the numbers. To count actual integers, convert each one with int():
my_dict = {}
with open('data.txt', 'r') as file:
    for number in file.read().split():
        my_dict[int(number)] = my_dict.get(int(number), 0) + 1
print(my_dict)
Result:
{1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 1, 8: 2, 9: 1, 10: 1, 11: 2, 12: 2, 13: 1, 14: 1, 15: 2, 20: 2, 41: 3, 65: 3, 27: 2, 81: 1, 21: 1, 31: 1, 19: 1, 44: 1, 29: 1}
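Or, on Python 3.8+, replace the loop body with an assignment expression so int(number) is only computed once (this works because the right-hand side is evaluated before the subscript on the left):
        my_dict[i] = my_dict.get(i := int(number), 0) + 1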
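If you also want the item/frequency table from the question, here is a minimal sketch using collections.Counter from the standard library. The helper names count_items and format_table are made up for illustration, and the sample string stands in for the contents of data.txt so the snippet runs on its own; in practice you would pass open('data.txt').read() instead.

```python
from collections import Counter


def count_items(text):
    """Count every whitespace-separated item in text, as integers."""
    return Counter(int(token) for token in text.split())


def format_table(counts):
    """Render counts as an 'item | frequency |' table, sorted by item."""
    rows = ["item | frequency |", "---|---|"]
    rows += [f"{item} | {freq} |" for item, freq in sorted(counts.items())]
    return "\n".join(rows)


# Sample data copied from the question (assumption: same layout as data.txt).
sample = """1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
20 12 5 41 65
41 6 11 27 81 21
65 15 27 8 31 65 20 19 44 29 41"""

counts = count_items(sample)
print(format_table(counts))
```

Counter is a dict subclass, so counts[41] works like my_dict[41] above, and counts.most_common() gives the items ordered by frequency if you would rather sort the table that way.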